selimfirat/pysad

Python Streaming Anomaly Detection (PySAD)

PySAD is an open-source Python framework for anomaly detection on streaming multivariate data.

Documentation

Features

Online Anomaly Detection

PySAD provides methods for online/sequential anomaly detection, i.e., anomaly detection on streaming data, where the model updates itself as each new instance arrives.
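
As a rough illustration of what "updating as each instance arrives" means, the sketch below scores every value against running statistics and then folds it into those statistics in constant time. `RunningZScoreDetector` is a hypothetical toy class written for this example and is not part of PySAD; its method names only echo the `fit_score_partial` call used in the Quick Start. Scoring before updating is a choice made for this sketch, not a statement about how PySAD orders the two steps.

```python
import math


class RunningZScoreDetector:
    """Hypothetical toy detector: z-score of each value against running statistics."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # Sum of squared deviations from the running mean.

    def fit_partial(self, x):
        # Welford's update: constant time and memory per instance.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)
        return self

    def score_partial(self, x):
        # With fewer than two instances there is no spread estimate yet.
        if self.count < 2:
            return 0.0
        std = math.sqrt(self.m2 / (self.count - 1))
        return abs(x - self.mean) / std if std > 0 else 0.0

    def fit_score_partial(self, x):
        # Score against what has been seen so far, then absorb the instance.
        score = self.score_partial(x)
        self.fit_partial(x)
        return score


stream = [10.0, 10.2, 9.9, 10.1, 10.0, 25.0, 10.1]
detector = RunningZScoreDetector()
scores = [detector.fit_score_partial(x) for x in stream]
print(scores)  # The sixth value (25.0) receives by far the largest score.
```

The detector never stores the stream itself, only three numbers, which is the property that makes this style of model usable on unbounded data.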

Resource-Efficient

Streaming methods efficiently handle the limited memory and processing time requirements of data streams so that they can be used in near real time. The methods store only a single instance or a small window of recent instances.
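
The "small window of recent instances" idea can be sketched with a fixed-size buffer. The class below is a hypothetical example, not a PySAD model: it keeps at most `window_size` values in a `collections.deque` and scores each new value by its distance from the window median, scaled by the median absolute deviation. The window size and the small constant added to the denominator are arbitrary choices for illustration.

```python
from collections import deque
from statistics import median


class WindowedMedianDetector:
    """Hypothetical toy detector that remembers only the most recent values."""

    def __init__(self, window_size=50):
        # deque(maxlen=...) silently drops the oldest value when full,
        # so memory use stays bounded no matter how long the stream runs.
        self.window = deque(maxlen=window_size)

    def fit_score_partial(self, x):
        if self.window:
            center = median(self.window)
            mad = median(abs(v - center) for v in self.window)
            score = abs(x - center) / (mad + 1e-9)
        else:
            score = 0.0  # Nothing to compare against yet.
        self.window.append(x)
        return score


detector = WindowedMedianDetector(window_size=20)
for i in range(1000):
    detector.fit_score_partial(float(i % 5))

print(len(detector.window))              # Still 20 after 1000 instances.
print(detector.fit_score_partial(100.0))  # A large score for the outlier.
```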

Complete

PySAD contains stream simulators, evaluators, preprocessors, statistic trackers, postprocessors, probability calibrators and more. In addition to streaming models, PySAD also provides integrations for PyOD's batch anomaly detectors so that they can be used in the streaming setting.
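
One common way to run a batch detector on a stream is to refit it periodically on a window of recent instances. The sketch below illustrates that general idea only; it does not show PySAD's integration classes. `BatchZScore` is a stand-in for a batch detector exposing `fit` and `decision_function` (the method names PyOD detectors use), simplified here to work on plain lists of floats. `PeriodicRefitWrapper`, `window_size`, and `refit_every` are hypothetical names chosen for this example.

```python
from collections import deque
from statistics import mean, pstdev


class BatchZScore:
    """Stand-in batch detector: fit on a list of values, then score new values."""

    def fit(self, values):
        self.mu = mean(values)
        self.sigma = pstdev(values) or 1.0  # Avoid division by zero.
        return self

    def decision_function(self, values):
        return [abs(v - self.mu) / self.sigma for v in values]


class PeriodicRefitWrapper:
    """Hypothetical wrapper that refits a batch model on a recent window."""

    def __init__(self, batch_model, window_size=100, refit_every=25):
        self.batch_model = batch_model
        self.window = deque(maxlen=window_size)
        self.refit_every = refit_every
        self.seen = 0
        self.fitted = False

    def fit_partial(self, x):
        self.window.append(x)
        self.seen += 1
        # Refit only every `refit_every` instances to keep per-instance cost low.
        if self.seen % self.refit_every == 0:
            self.batch_model.fit(list(self.window))
            self.fitted = True
        return self

    def score_partial(self, x):
        if not self.fitted:
            return 0.0  # No fitted model yet.
        return self.batch_model.decision_function([x])[0]


wrapper = PeriodicRefitWrapper(BatchZScore(), window_size=50, refit_every=25)
for i in range(60):
    wrapper.fit_partial(5.0 + 0.1 * (i % 3))

normal_score = wrapper.score_partial(5.1)
outlier_score = wrapper.score_partial(50.0)
print(normal_score, outlier_score)
```

The refit interval trades freshness against cost: a smaller `refit_every` tracks drift more closely but calls the batch `fit` more often.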

Comprehensive

PySAD provides models that are specifically designed for both univariate and multivariate data. Furthermore, one can experiment with PySAD in supervised, semi-supervised, and unsupervised settings.

User Friendly

Users with any experience level can easily use PySAD. One can easily design experiments and combine the tools in the framework. Moreover, the existing methods in PySAD are easy to extend.

Free and Open Source Software (FOSS)

PySAD is distributed under BSD License 2.0 and favors FOSS principles.

Community Engagement

PySAD has built a strong and active community with significant adoption across academia and industry:

Academic Recognition

  • Cited more than 50 times in academic literature (citations to the arXiv version, excluding GitHub link-only citations), with growing adoption in streaming data research. See Google Scholar for detailed citation metrics.

GitHub Community

  • 260+ GitHub Stars demonstrating widespread community interest and adoption among developers and researchers in the machine learning community.

Active Usage

  • 2K+ PyPI downloads in May 2025, with consistent weekly usage, according to pypistats.org.

Educational Content

PySAD has been featured in educational content across multiple platforms.

Third-party Integrations

PySAD has been adopted and integrated into major machine learning frameworks:

  • TurboML Integration: PySAD example documentation showing adoption in machine learning workflow platforms.
  • Apache Beam Integration: PySAD modules adapted into Apache Beam's ML package with zscore and robust_zscore anomaly detectors.
  • River ML Integration: The prominent online machine learning library has adapted PySAD algorithms, including the StandardAbsoluteDeviation detector with explicit PySAD references.

Installation

The PySAD framework can be installed via:

pip install -U pysad

Alternatively, you can install the library directly from the source code in the GitHub repository:

git clone https://github.com/selimfirat/pysad.git
cd pysad
pip install .

Required Dependencies:

  • Python 3.10+
  • numpy==2.0.2
  • scikit-learn==1.5.2
  • scipy==1.13.1
  • pyod==1.1.0
  • combo==0.1.3

Optional Dependencies:

  • rrcf==0.4.3 (Only required for pysad.models.robust_random_cut_forest.RobustRandomCutForest)
  • PyNomaly==0.3.3 (Only required for pysad.models.loop.StreamLocalOutlierProbability)
  • mmh3==2.5.1 (Only required for pysad.models.xstream.xStream)
  • pandas==2.2.3 (Only required for pysad.utils.pandas_streamer.PandasStreamer)
  • jax>=0.6.1 (Only required for pysad.models.inqmad.Inqmad; required for NumPy 2.0+ compatibility)
  • jaxlib>=0.6.1 (Only required for pysad.models.inqmad.Inqmad; required for NumPy 2.0+ compatibility)
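
Because each optional dependency is needed only by one specific class, it can help to check which ones are present before choosing a model. The helper below is a small sketch using only the standard library. It assumes each package's import name matches the name listed above (for example `import PyNomaly`), which is worth verifying in your environment. `missing_optional_dependencies` is a hypothetical helper, not a PySAD function.

```python
from importlib.util import find_spec

# Import name (assumed to match the package name above) -> feature that needs it.
OPTIONAL_DEPENDENCIES = {
    "rrcf": "pysad.models.robust_random_cut_forest.RobustRandomCutForest",
    "PyNomaly": "pysad.models.loop.StreamLocalOutlierProbability",
    "mmh3": "pysad.models.xstream.xStream",
    "pandas": "pysad.utils.pandas_streamer.PandasStreamer",
    "jax": "pysad.models.inqmad.Inqmad",
    "jaxlib": "pysad.models.inqmad.Inqmad",
}


def missing_optional_dependencies():
    """Return {import_name: feature} for optional packages that are not installed."""
    return {
        name: feature
        for name, feature in OPTIONAL_DEPENDENCIES.items()
        if find_spec(name) is None  # find_spec returns None when a top-level module is absent.
    }


missing = missing_optional_dependencies()
for name, feature in missing.items():
    print(f"{name} is not installed; {feature} will be unavailable.")
```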

Examples

Quick Start

Here's a simple example showing how to use PySAD for anomaly detection on streaming data:

# Import modules.
from pysad.evaluation import AUROCMetric
from pysad.models import LODA
from pysad.utils import Data


model = LODA()  # Init model
metric = AUROCMetric()  # Init area under the receiver operating characteristic (ROC) curve metric.
streaming_data = Data().get_iterator("arrhythmia.mat")  # Get data streamer.

for x, y_true in streaming_data:  # Stream data.
    anomaly_score = model.fit_score_partial(x)  # Fit the instance to model and score the instance.

    metric.update(y_true, anomaly_score)  # Update the AUROC metric.

# Output the resulting AUROCMetric.
print(f"Area under ROC metric is {metric.get()}.")
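
To make the reported number easier to interpret: the area under the ROC curve equals the probability that a randomly chosen anomalous instance receives a higher score than a randomly chosen normal one, with ties counted as half. The function below is a plain-Python sketch of that definition for small inputs. It is not how `AUROCMetric` is implemented, and the `auroc` name and the sample labels and scores are made up for illustration.

```python
def auroc(labels, scores):
    """Pairwise definition of AUROC; labels are 1 for anomalies and 0 for normal instances."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one instance of each class.")
    # Count anomaly/normal pairs where the anomaly is ranked higher; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect separation of anomalies from normal instances.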

Versioning

Semantic versioning is used for this project.

License

This project is licensed under the BSD License 2.0.

Citing PySAD

If you use PySAD for a scientific publication, please cite the following paper:

@article{pysad,
  title={PySAD: A Streaming Anomaly Detection Framework in Python},
  author={Yilmaz, Selim F and Kozat, Suleyman S},
  journal={arXiv preprint arXiv:2009.02572},
  year={2020}
}