Skip to content

nla-group/ABBA

Repository files navigation

Build Status codecov License: MIT

ABBA

This repository provides ABBA, an algorithm for the adaptive symbolic aggregation of time series. The ABBA algorithm consists of two key parts: compression via an adaptive piecewise linear approximation, and digitization via mean-based clustering on the increments and lengths of each piece.

The algorithm uses a scaling parameter scl to control the weighting of the increments and lengths during the clustering. If scl = 0 or scl = np.inf, a one-dimensional clustering algorithm can be used. We use a modified C++ implementation of CKmeans from Ckmeans.1d.dp R package; see Prerequisites. If the C++ implementation is not available or a different scaling parameter is used, then ABBA uses the Kmeans algorithm from the Python package Scikit-learn.

As an example, we consider a synthetic time series and apply ABBA's compression method, which approximates the time series by a sequence of linear segments stitched together. Each segment can be represented by its change in the x-value (len) and change in y-direction (inc).

Compression

ABBA's digitization procedure clusters the tuples (len, inc), assigning a unique symbol to each cluster.

Digitization

The symbolic representation of the time series is 'abbacab'. For further information on ABBA, see [1].

Prerequisites

Install python packages:

pip install -r requirements.txt

If scl = 0 or scl = np.inf, then ABBA will attempt to use a C++ implementation of the CKmeans algorithm. For this we use SWIG, an open source Simplified Wrapper and Interface Generator. SWIG generated a shared library and corresponding python file which provides a wrapper for the C++ function. SWIG can be installed via the following commands:

Linux (Ubuntu):

sudo apt-get install swig

Mac:

brew install swig

Once installed, the Python wrapper can be constructed using the makefile.

make

Testing

Run the unit tests by the following command:

python test_ABBA.py -v

Example

>>> from ABBA import ABBA
>>> ts = [-1, 0.1, 1.3, 2, 1.9, 2.4, 1.8, 0.8, -0.5]
>>> abba = ABBA()
>>> string, centers = abba.transform(ts)
Compression rate:  77.77777777777779
Digitization: Using 4 symbols
>>> reconstructed_ts = abba.inverse_transform(string, centers, ts[0])
>>> print([round(i, 1) for i in reconstructed_ts])
[-1, 0.1, 1.3, 1.9, 1.5, 2.1, 1.8, 0.6, -0.5]

TODO

  • Make patches produce reconstruction that is the same length as original time series.
  • Isolate clustering from digitization. This will allow us to convert time series using already constructed ABBA representation.

License

This project is licensed under the MIT License - see the LICENSE.md file for details

Acknowledgments

References

[1] S. Elsworth and S. Güttel. ABBA: Adaptive Brownian bridge-based symbolic aggregation of time series, MIMS Eprint 2019.11 (http://eprints.maths.manchester.ac.uk/2712/), Manchester Institute for Mathematical Sciences, The University of Manchester, UK, 2019.

About

A symbolic time series representation building Brownian bridges

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 3

  •  
  •  
  •  

Languages

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy