$k$-Graph

A Graph Embedding for Interpretable Time Series Clustering

$k$-Graph in short

$k$-Graph is an explainable and interpretable graph-based method for time series clustering. It proceeds in three steps: (i) graph embedding, (ii) graph clustering, and (iii) consensus clustering. In practice, it first projects the time series into a graph and repeats the operation for multiple pattern lengths. For each pattern length, the corresponding graph is used to cluster the time series, based on how frequently the nodes and edges occur for each time series. A consensus is then computed across all pattern lengths and used as the final clustering labels. Because all time series are embedded into a single graph, $k$-Graph can handle variable-length time series. Moreover, $k$-Graph provides a way to select the most interpretable graph for the resulting clustering partition and lets users visualize the subsequences contained in the most representative and exclusive nodes.
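To make the consensus step more concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the $k$-Graph implementation: the helper names `co_association` and `consensus_labels` are hypothetical, the input partitions are made up, and the final grouping uses a simple majority threshold with connected components as a stand-in for whatever clustering the library applies to the consensus.

```python
def co_association(partitions):
    """Fraction of partitions in which each pair of series shares a cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(partitions) for c in row] for row in counts]


def consensus_labels(partitions, threshold=0.5):
    """Group series that are co-clustered in more than `threshold` of the partitions.

    Assumption: connected components of the thresholded matrix replace
    the clustering step that a real consensus method would use.
    """
    matrix = co_association(partitions)
    n = len(matrix)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# One partition per pattern length (hypothetical labels for 6 series).
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same grouping, different label names
    [0, 0, 1, 1, 1, 1],  # disagrees on the third series
]
print(consensus_labels(partitions))  # [0, 0, 0, 1, 1, 1]
```

The pairwise view makes the consensus independent of how each partition names its clusters, which is why the second partition above counts as full agreement with the first.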

🔍 Features

📊 Clusters time series using graph embeddings
🔄 Supports variable-length time series analysis
🧠 Provides interpretable graph visualizations

🌐 Try it Online

Explore $k$-Graph with our interactive tool: 👉 GrapHint Visualization Tool

📁 Project Structure

kGraph/
├── kgraph/             # Core implementation
├── examples/           # Example usage scripts
├── ressources/         # Visuals and images
├── utils/              # Utility methods for loading datasets
├── requirements.txt    # Dependencies
└── README.md

Getting started

The easiest way to install $k$-Graph is to run the following command:

pip install kgraph-ts

Graphviz and PyGraphviz can be used to obtain better visualizations for $k$-Graph. Neither package is required to run $k$-Graph; if they are not installed, a random layout is used to plot the graphs. To benefit from better graph visualizations, install Graphviz and PyGraphviz as follows:

For Mac:

brew install graphviz

For Linux (Ubuntu):

sudo apt install graphviz

For Windows:

Stable Windows install packages are listed here

Once Graphviz is installed, you can install PyGraphviz as follows:

pip install pygraphviz
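Because Graphviz support is optional, it can help to check which layout will be used before plotting. The sketch below relies only on the Python standard library; `has_module` is a hypothetical helper written for this example and is not part of $k$-Graph.

```python
import importlib.util


def has_module(name):
    """Return True if the top-level module `name` can be found for import."""
    return importlib.util.find_spec(name) is not None


if has_module("pygraphviz"):
    print("pygraphviz found: Graphviz layouts are available.")
else:
    print("pygraphviz not found: graphs will be drawn with a random layout.")
```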

Manual installation

You can also install $k$-Graph manually by following the instructions below. The conda environment is described in environment.yml, and the required Python packages are listed in requirements.txt and can be installed with pip:

conda env create --file environment.yml
conda activate kgraph
pip install -r requirements.txt

You can then install $k$-Graph locally with the following command:

pip install .

Usage

To try $k$-Graph, you can use the datasets of the UCR archive. The code snippet below shows how to run $k$-Graph on the Trace dataset.

import sys
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import adjusted_rand_score

# Make the dataset-loading helpers in ./utils/ importable
sys.path.insert(1, './utils/')
from utils import fetch_ucr_dataset

from kgraph import kGraph


# Load the Trace dataset and merge the train and test splits
path = "/Path/to/UCRArchive_2018/"
data = fetch_ucr_dataset('Trace', path)
X = np.concatenate([data['data_train'], data['data_test']], axis=0)
y = np.concatenate([data['target_train'], data['target_test']], axis=0)


# Run kGraph with one cluster per class and 10 pattern lengths
clf = kGraph(n_clusters=len(set(y)), n_lengths=10, n_jobs=4)
clf.fit(X)

# Compare the predicted clusters with the ground-truth classes
print("ARI score: ", adjusted_rand_score(y, clf.labels_))

Running kGraph for the following length: [36, 72, 10, 45, 81, 18, 54, 90, 27, 63] 
Graphs computation done! (36.71151804924011 s) 
Consensus done! (0.03878021240234375 s) 
Ensemble clustering done! (0.0060100555419921875 s) 
ARI score:  0.986598879940902
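The adjusted Rand index (ARI) reported above compares two partitions while ignoring how the clusters are named, which is why cluster labels can be compared directly with class labels. As a rough illustration, here is a small pure-Python version of the standard pair-counting formula. The function `ari` is a hypothetical helper written for this example; `sklearn.metrics.adjusted_rand_score` remains the function to use in practice.

```python
from collections import Counter
from math import comb


def ari(labels_a, labels_b):
    """Adjusted Rand index of two labelings (assumes at least two samples)."""
    n = len(labels_a)
    # Pairs of samples grouped together in both partitions, in A, and in B.
    pairs_joint = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    pairs_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = pairs_a * pairs_b / comb(n, 2)  # agreement expected by chance
    max_index = (pairs_a + pairs_b) / 2
    if max_index == expected:
        # Only happens when both partitions are one single cluster
        # or both are all singletons, i.e. they are identical.
        return 1.0
    return (pairs_joint - expected) / (max_index - expected)


# Same grouping with swapped label names: perfect agreement.
print(ari([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0
```

A score of 1 means the two partitions are identical up to renaming, while values near 0 indicate agreement no better than chance.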

For variable-length time series datasets, $k$-Graph has to be initialized as follows:

clf = kGraph(n_clusters=len(set(y)),variable_length=True,n_lengths=10,n_jobs=4)
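One way to see why variable lengths are manageable: if each series is summarised by how often it visits each node of a shared graph, the feature vector has one entry per node regardless of the series length. The sketch below illustrates this idea only. The node paths are made-up inputs, `node_frequencies` is a hypothetical helper, and the features actually used by $k$-Graph (which also involve edges) may be computed differently.

```python
from collections import Counter


def node_frequencies(path, n_nodes):
    """Relative frequency with which a series visits each graph node."""
    counts = Counter(path)
    return [counts[node] / len(path) for node in range(n_nodes)]


# Hypothetical node paths for two series of different lengths,
# both mapped onto the same 4-node graph.
short_path = [0, 1, 1, 2]
long_path = [0, 0, 1, 1, 1, 1, 2, 2]

print(node_frequencies(short_path, n_nodes=4))  # [0.25, 0.5, 0.25, 0.0]
print(node_frequencies(long_path, n_nodes=4))   # [0.25, 0.5, 0.25, 0.0]
```

Both series end up with vectors of the same size, so they can be compared by any standard clustering method.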

Visualization tools

We provide visualization methods to plot the graph and the identified clusters (i.e., graphoids). After running $k$-Graph, you can run the following code to plot the graph partitioned into the different clusters (grey nodes are those not associated with a specific cluster).

clf.show_graphoids(group=True,save_fig=True,namefile='Trace_kgraph')

Instead of visualizing the graph, we can directly retrieve the most representative nodes for each cluster with the following code:

nb_patterns = 1

# Get the most representative nodes for each cluster
nodes = clf.interprete(nb_patterns=nb_patterns)

plt.figure(figsize=(10, 4 * nb_patterns))
count = 0
for j in range(nb_patterns):
    for cluster in nodes:

        # Get the time series (mean and upper/lower envelopes) for the corresponding node
        mean, sup, inf = clf.get_node_ts(X=X, node=nodes[cluster][j][0])

        count += 1
        plt.subplot(nb_patterns, len(nodes), count)
        plt.fill_between(x=list(range(int(clf.optimal_length))), y1=inf, y2=sup, alpha=0.2)
        plt.plot(mean, color='black')
        plt.plot(inf, color='black', alpha=0.6, linestyle='--')
        plt.plot(sup, color='black', alpha=0.6, linestyle='--')
        plt.title('node {} for cluster {}: \n (representativity: {:.3f} \n exclusivity : {:.3f})'.format(nodes[cluster][j][0], cluster, nodes[cluster][j][3], nodes[cluster][j][2]))
plt.tight_layout()

plt.savefig('Trace_cluster_interpretation.jpg')
plt.close()

You can find a script containing all the code above here.
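The plot titles above report a representativity and an exclusivity score for each node. This README does not define them, so the sketch below assumes one plausible reading: representativity is the share of a cluster's series that pass through the node, and exclusivity is the share of the series passing through the node that belong to that cluster. The function `node_scores` and its inputs are hypothetical and are not part of the $k$-Graph API.

```python
def node_scores(series_nodes, labels, node, cluster):
    """Return (representativity, exclusivity) of `node` for `cluster`.

    series_nodes: one set of visited nodes per series (made-up input).
    labels: cluster label of each series.
    Both definitions are assumptions made for this illustration.
    """
    in_cluster = [i for i, lab in enumerate(labels) if lab == cluster]
    through_node = [i for i, visited in enumerate(series_nodes) if node in visited]
    both = [i for i in in_cluster if node in series_nodes[i]]
    representativity = len(both) / len(in_cluster) if in_cluster else 0.0
    exclusivity = len(both) / len(through_node) if through_node else 0.0
    return representativity, exclusivity


# Four hypothetical series, two per cluster.
series_nodes = [{0, 1}, {0, 2}, {0, 1}, {1, 3}]
labels = [0, 0, 1, 1]
print(node_scores(series_nodes, labels, node=0, cluster=0))
```

In this toy input, node 0 is crossed by both series of cluster 0 (representativity 1.0) but also by one series of cluster 1, so only two of the three series crossing it belong to cluster 0 (exclusivity 2/3).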

References

$k$-Graph has been accepted for publication in IEEE Transactions on Knowledge and Data Engineering (TKDE). You may find the preprint version here. If you use $k$-Graph in your project or research, please cite the following paper:

P. Boniol, D. Tiano, A. Bonifati and T. Palpanas, "$k$-Graph: A Graph Embedding for Interpretable Time Series Clustering," in IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 5, pp. 2680-2694, 2025, doi: 10.1109/TKDE.2025.3543946.

@ARTICLE{10896823,
  author={Boniol, Paul and Tiano, Donato and Bonifati, Angela and Palpanas, Themis},
  journal={IEEE Transactions on Knowledge and Data Engineering}, 
  title={$k$-Graph: A Graph Embedding for Interpretable Time Series Clustering}, 
  year={2025},
  volume={37},
  number={5},
  pages={2680-2694},
  keywords={Time series analysis;Feature extraction;Clustering algorithms;Accuracy;Heuristic algorithms;Clustering methods;Training;Shape;Partitioning algorithms;Directed graphs;Time Series;Clustering;Interpretability},
  doi={10.1109/TKDE.2025.3543946}}

Contributors

Paul Boniol, Inria, ENS, PSL University, CNRS
Donato Tiano, Università degli Studi di Modena e Reggio Emilia
Angela Bonifati, Lyon 1 University, IUF, Liris CNRS
Themis Palpanas, Université Paris Cité, IUF
