Quantum-inspired anomaly detection, a QUBO formulation
Julien Mellaerts1*
1* Eviden, Les Clayes-sous-Bois, France.
arXiv:2311.03227v1 [quant-ph] 6 Nov 2023

*Corresponding author(s). E-mail(s): julien.mellaerts@eviden.com

Abstract
Anomaly detection is a crucial task in machine learning that involves
identifying unusual patterns or events in data. It has numerous applications in
various domains such as finance, healthcare, and cybersecurity. With the advent
of quantum computing, there has been a growing interest in developing quantum
approaches to anomaly detection. After reviewing traditional approaches to
anomaly detection relying on statistical or distance-based methods, we will
propose a Quadratic Unconstrained Binary Optimization (QUBO) model formulation
of anomaly detection, compare it with classical methods, and discuss its
scalability on current Quantum Processing Units (QPU).

Keywords: Anomaly Detection, QML, Quantum Annealer, QUBO

1 Introduction
Quantum computing and machine learning are two rapidly advancing fields that have
the potential to revolutionize various domains. Quantum machine learning (QML)
combines the power of quantum computing with the principles of machine learning,
opening up new possibilities for solving complex problems and enhancing the capa-
bilities of traditional machine learning algorithms. At its core, quantum computing
harnesses the principles of quantum mechanics to manipulate and process informa-
tion in ways that classical computers cannot. Quantum bits, or qubits, can exist in
a superposition of states, allowing for parallel computation and the potential to per-
form complex calculations exponentially faster than classical computers for certain
problems. This unique property of quantum computing forms the basis for developing
quantum machine learning algorithms.

Quantum machine learning aims to leverage quantum computers to enhance the speed
and efficiency of various machine learning tasks. It offers the promise of solving
computationally intensive problems that are intractable for classical machine
learning algorithms, and relies on phase estimation, quantum matrix inversion, and
amplitude amplification methods [1–4]. Several key areas within quantum machine
learning have already shown promising advancements, notably data classification,
graph applications, and pattern recognition [5, 6].

Anomaly detection refers to the identification of patterns or instances that deviate
significantly from the expected or normal behavior within a data set. These anomalies
often indicate potential fraud, errors, faults, or other exceptional events that require
attention. Anomaly detection algorithms rely on advanced statistical techniques and
are widely used in applications such as finance, healthcare, and cybersecurity [7, 8].

Current Quantum Anomaly Detection (QAD) algorithms are implemented in the
gate-based quantum computing paradigm, making use of amplitude estimation,
variational, or quantum K-means techniques [9–12]. Current Noisy Intermediate-Scale
Quantum (NISQ) computers have a limited number of noisy qubits, which restricts the
practicability of these gate-based QAD algorithms in terms of data set size or
dimensionality.

In this paper, I propose a Quadratic Unconstrained Binary Optimization (QUBO)
formulation of anomaly detection, benchmark it against classical methods on
real-world data sets, and finally discuss its scalability and effectiveness.

2 Classical anomaly detection algorithms


There are several types of anomaly detection algorithms, including statistical-based
methods, density-based methods, and machine learning-based methods. Statistical-
based anomaly detection algorithms assume that normal data points follow a known
statistical distribution, such as Gaussian (normal) distribution. These methods use
statistical techniques to model the data distribution and identify instances that fall
outside the expected range. Common statistical-based algorithms include:
Z-Score: This algorithm measures the distance of each data point from the mean in
terms of standard deviations. Points that fall above or below a certain threshold
are considered anomalies.
Gaussian Mixture Models (GMM): GMM assumes that the data is a combination
of several Gaussian distributions. It estimates the parameters of the Gaussian
components and identifies data points with low probability as anomalies.
Box Plot: A box plot summarizes the distribution of a data set by representing
quartiles, outliers (anomalies), and extreme values. Data points beyond a certain
threshold are considered outliers or anomalies.
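As an illustration of the simplest statistical method above, the following minimal sketch flags values by Z-score. The function name `zscore_outliers`, the sample readings, and the threshold of 2.5 are assumptions made for this example, not values taken from the benchmarks in this paper.

```python
from statistics import mean, pstdev


def zscore_outliers(values, threshold=3.0):
    """Return the indices of values whose absolute Z-score exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values are identical, nothing can be flagged
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical sensor readings with one obvious outlier at index 9.
readings = [10.1, 9.8, 10.0, 10.2, 9.9, 10.0, 10.1, 9.9, 10.0, 25.0]
print(zscore_outliers(readings, threshold=2.5))
```

With a population standard deviation, no point among N values can have a Z-score above sqrt(N - 1), so on very small samples a threshold of 3 may never trigger. This is why a lower threshold is used here.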
Density-based anomaly detection algorithms focus on the proximity or density
of data points in the feature space. They assume that anomalies are far from their
neighboring points or are in low-density regions. Common density-based algorithms
include:
k-Nearest Neighbors (k-NN): k-NN identifies anomalies based on the distance to their
k-nearest neighbors. Points that have few nearby neighbors or have significantly
larger distances are considered anomalies.
Local Outlier Factor (LOF): LOF measures the local density of data points com-
pared to their neighbors. Anomalies have lower local densities compared to their
neighbors.
Density-Based Spatial Clustering of Applications with Noise (DBSCAN): DBSCAN
groups data points based on their density and identifies outliers as points that do
not belong to any cluster.
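The distance-based idea behind k-NN scoring can be sketched as follows. This is a minimal illustration that assumes Euclidean distance and uses the mean distance to the k nearest neighbors as the anomaly score. The function name `knn_scores` and the sample points are hypothetical.

```python
import math


def knn_scores(points, k):
    """Score each point by the mean Euclidean distance to its k nearest neighbors."""
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


# Four clustered points and one isolated point (index 4).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
scores = knn_scores(points, k=2)
most_anomalous = max(range(len(points)), key=scores.__getitem__)
print(most_anomalous)
```

A larger score means the point sits further from its neighborhood, which is the same intuition the quadratic terms of the QUBO formulation rely on.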
Machine learning-based anomaly detection algorithms leverage advanced tech-
niques to learn the underlying patterns and characteristics of normal data. They aim to
build models that can differentiate between normal and anomalous instances. Common
machine learning-based algorithms include:
Support Vector Machines (SVM): SVM separates the data into different classes and
identifies instances that fall on the "wrong" side of the separation boundary as
anomalies.
Isolation Forest: This algorithm constructs an ensemble of isolation trees to isolate
anomalies efficiently. Anomalies require fewer splits to be isolated compared to
normal instances.
Neural Networks: Neural networks can learn complex patterns in the data and detect
anomalies based on deviations from the learned representation.
Autoencoders: Autoencoders are neural network architectures that aim to recon-
struct input data. Anomalies lead to higher reconstruction errors, allowing for their
detection.
The idea behind the proposed QUBO formulation of anomaly detection is to combine
statistical and density-based methods in order to increase the accuracy of the
algorithm.

3 Quantum-inspired anomaly detection


In this section, I describe the proposed quantum-inspired anomaly detection algorithm
using a QUBO formulation. I start with a general definition of QUBO and translate
it for anomaly detection.

3.1 QUBO formulation


Quadratic Unconstrained Binary Optimization (QUBO) is a mathematical
optimization problem that involves minimizing a quadratic objective function over
binary variables. In other words, it is an optimization problem where the goal is to
find the binary values of a set of variables that minimize a quadratic cost function.
The binary variables in QUBO can take only two values: 0 and 1, representing a
binary choice or decision. The objective function in QUBO is quadratic, meaning it
consists of quadratic terms involving the binary variables and possibly linear terms as
well. The goal is to find the assignment of binary values to the variables that minimizes
the overall value of the objective function.
A QUBO model is expressed as an optimization problem:

$$f(x) = \sum_{i=0}^{N-1} Q_{i,i}\, x_i + \sum_{\substack{i,j=0 \\ i \neq j}}^{N-1} Q_{i,j}\, x_i x_j \tag{1}$$

Diagonal terms of Q represent linear terms, since x_i^2 = x_i for a binary variable,
whereas off-diagonal terms represent quadratic terms.
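To make the definition concrete, the sketch below evaluates the QUBO objective for a given binary vector and finds its minimum by exhaustive enumeration. The 3-variable matrix is an arbitrary example, and brute force is only practical for a handful of variables.

```python
from itertools import product


def qubo_value(Q, x):
    """Evaluate f(x) = sum_i Q[i][i] x_i + sum_{i != j} Q[i][j] x_i x_j."""
    n = len(x)
    linear = sum(Q[i][i] * x[i] for i in range(n))
    quadratic = sum(Q[i][j] * x[i] * x[j]
                    for i in range(n) for j in range(n) if i != j)
    return linear + quadratic


def brute_force_minimum(Q):
    """Return the binary assignment with the lowest objective value."""
    return min(product((0, 1), repeat=len(Q)), key=lambda x: qubo_value(Q, x))


# Arbitrary example: each variable is rewarded, adjacent pairs are penalised.
Q = [[-1, 2, 0],
     [0, -1, 2],
     [0, 0, -1]]
best = brute_force_minimum(Q)
print(best, qubo_value(Q, best))
```

Here the diagonal entries reward selecting a variable and the off-diagonal entries penalise selecting neighboring pairs, so the minimum selects the first and third variables only.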

A parallel can be drawn between the QUBO formulation and anomaly detection
algorithms. Statistical-based algorithms can be seen as finding the data points most
distant from the centroid of a Gaussian-distributed data set, which corresponds to
the linear terms of the QUBO. Density-based algorithms can be seen as finding the
data points most distant from their neighbors, which corresponds to the quadratic
terms of the QUBO. This parallel allows statistical and density-based algorithms
to be combined.

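We can now give a more rigorous formulation of the problem. Given a data set
$X = (x_0, ..., x_{N-1})$ of N data points, let $d_i$ be the distance between a point
$x_i \in X$ and the centroid of the data set distribution, and $d_{i,j}$ the distance
between two data points $x_i, x_j \in X$ with $i \neq j$. In the cost function, $x_i$
denotes the binary variable that equals 1 when data point i is selected as an outlier.
The cost function to maximize is:

$$Q(x, \alpha) = \alpha \sum_{i=0}^{N-1} d_i\, x_i + (1 - \alpha) \sum_{\substack{i,j=0 \\ i \neq j}}^{N-1} d_{i,j}\, x_i x_j \tag{2}$$

subject to:

$$\sum_{i=0}^{N-1} x_i = k$$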

where:
0 ≤ α ≤ 1, is a weighting parameter for linear and quadratic terms.
0 < k ≤ N is the number of outliers in the data set.

The α parameter is specific for each data set and needs to be learnt during a
training phase.
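The cost function of equation (2) can be sketched on a toy data set as follows. This illustration makes several assumptions: Euclidean distance stands in for the distance metric, the constraint on k is enforced by enumerating subsets of exactly k points rather than by a solver, and the value α = 0.5 is arbitrary rather than learnt. The names `objective` and `best_subset` are hypothetical.

```python
import math
from itertools import combinations


def objective(points, selected, alpha):
    """Q(x, alpha) of equation (2), where x_i = 1 for every index in `selected`."""
    n = len(points)
    centroid = tuple(sum(c) / n for c in zip(*points))
    # Linear part: distance of each selected point to the centroid.
    linear = sum(math.dist(points[i], centroid) for i in selected)
    # Quadratic part: the sum runs over ordered pairs, so each pair counts twice.
    quadratic = sum(math.dist(points[i], points[j])
                    for i in selected for j in selected if i != j)
    return alpha * linear + (1 - alpha) * quadratic


def best_subset(points, k, alpha):
    """Maximise the objective over all subsets of exactly k points."""
    return max(combinations(range(len(points)), k),
               key=lambda s: objective(points, s, alpha))


# Four points near the origin and two distant points (indices 4 and 5).
points = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.2), (0.1, -0.2), (4.0, 4.0), (-4.0, 3.5)]
print(best_subset(points, k=2, alpha=0.5))
```

Exhaustive enumeration grows combinatorially with N and k, so it only serves to check small instances against the output of a QUBO solver.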

The number of outliers to select in the data set can be constrained in the QUBO
by adding a penalty to the cost function:

$$-A \left( \sum_{i=0}^{N-1} x_i - k \right)^2 \tag{3}$$

where:
A is a penalty weight that must be scaled according to the QUBO terms so that
A > max(Q).
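Expanding the penalty with x_i^2 = x_i shows how it is distributed over the QUBO coefficients: each linear term receives A(2k - 1), each ordered pair with i ≠ j receives -A, and a constant -A k^2 remains. The sketch below checks this expansion numerically. The function names and the values N = 4, k = 2, A = 10 are chosen only for illustration.

```python
from itertools import product


def penalty_coefficients(k, A):
    """Coefficients of -A * (sum_i x_i - k)^2 after using x_i^2 = x_i.

    Returns (linear, quadratic, constant): the weight added to every x_i,
    the weight added to every ordered pair x_i x_j with i != j, and the offset.
    """
    return A * (2 * k - 1), -A, -A * k * k


def penalty_direct(x, k, A):
    """The penalty of equation (3) evaluated as written."""
    return -A * (sum(x) - k) ** 2


def penalty_expanded(x, k, A):
    """The same penalty evaluated from its QUBO coefficients."""
    linear, quadratic, constant = penalty_coefficients(k, A)
    n = len(x)
    pairs = sum(x[i] * x[j] for i in range(n) for j in range(n) if i != j)
    return linear * sum(x) + quadratic * pairs + constant


# Both forms agree on every binary assignment of 4 variables.
agree = all(penalty_direct(x, 2, 10) == penalty_expanded(x, 2, 10)
            for x in product((0, 1), repeat=4))
print(agree)
```

The expansion also makes visible that the penalty adds a quadratic term between every pair of variables, whatever the sparsity of the distance terms.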

To get the most accurate results, it is important to limit the number of quadratic
terms of each variable to the number of outliers, by keeping only the terms that
link it to its k furthest neighbors.
The chosen distance metric is the Mahalanobis distance, which is an effective
multivariate distance metric available in many machine learning models, including those
that will be used to benchmark anomaly detection methods.
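The two ingredients of this paragraph can be sketched in plain Python. To stay short, the Mahalanobis helper below is restricted to 2-D points and uses the sample covariance. Both helper names are hypothetical, and the way ties and asymmetry are handled in `keep_k_furthest` is an assumption, since the text does not specify it.

```python
import math


def mahalanobis_to_centroid(points):
    """Mahalanobis distance of each 2-D point to the sample mean."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Entries of the sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    # Closed-form inverse of the 2x2 covariance matrix.
    ixx, iyy, ixy = syy / det, sxx / det, -sxy / det
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        squared = ixx * dx * dx + 2 * ixy * dx * dy + iyy * dy * dy
        distances.append(math.sqrt(max(0.0, squared)))
    return distances


def keep_k_furthest(D, k):
    """Keep, in each row of distance matrix D, only the k largest off-diagonal entries."""
    n = len(D)
    sparse = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: D[i][j], reverse=True)
        for j in others[:k]:
            sparse[i][j] = D[i][j]
    return sparse


pts = [(0.0, 0.0), (1.0, 0.5), (2.0, 1.2), (3.0, 1.0), (10.0, -4.0)]
print(mahalanobis_to_centroid(pts))
print(keep_k_furthest([[0, 1, 2, 3], [1, 0, 4, 5], [2, 4, 0, 6], [3, 5, 6, 0]], 2))
```

Note that the sparsified matrix is not necessarily symmetric, because point j can be among the k furthest neighbors of point i without the converse being true.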

3.2 Benchmarks
In this section I present benchmarking methods and results. For all benchmarks, I
compare quantum-inspired anomaly detection to:
Robust covariance: sklearn.covariance.EllipticEnvelope
One-Class SVM: sklearn.svm.OneClassSVM
One-Class SVM (SGD): sklearn.linear_model.SGDOneClassSVM
Isolation Forest: sklearn.ensemble.IsolationForest
Local Outlier Factor: sklearn.neighbors.LocalOutlierFactor
Each benchmark instance is solved using qbsolv, and the metric is the Area Under
the Receiver Operating Characteristic Curve (ROC AUC) score.
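For reference, the ROC AUC score can be computed from its pairwise definition: the probability that a randomly chosen outlier receives a higher score than a randomly chosen inlier, with ties counted as one half. The sketch below is a plain Python illustration of that definition, not the implementation used for the benchmarks. The function name and sample values are hypothetical.

```python
def roc_auc(labels, scores):
    """ROC AUC from pairwise comparisons of outlier scores against inlier scores.

    labels: 1 for an outlier, 0 for an inlier. Tied scores count as one half.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Two inliers followed by two outliers, with one outlier scored too low.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The same function accepts a binary selection vector as scores, in which case many pairs are tied and each tie contributes one half.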

3.2.1 Random Gaussian distributed data points


In this first benchmark, the quantum-inspired anomaly detection is compared to
classical methods on three different random Gaussian-distributed data sets, with three
different standard deviations.

Fig. 1 Random Gaussian distributed data points ROC AUC scores of anomaly detection algorithms
for three different standard deviations.

3.2.2 MNIST data set
The MNIST data set contains 60,000 training images and 10,000 test images of
handwritten digits and is widely used in classification benchmarks.
In this second benchmark, I compare quantum-inspired anomaly detection to
classical methods on three different MNIST data set configurations:
45 samples of handwritten 0, 5 samples of handwritten 9
45 samples of handwritten 7, 5 samples of handwritten 1
45 samples of handwritten 2, 5 samples of handwritten 3
In each configuration, the minority digit should be identified as the outlier
by anomaly detection algorithms.

Fig. 2 MNIST data set ROC AUC scores of anomaly detection algorithms for three different con-
figurations.

3.2.3 Credit card fraud detection


In this last benchmark, I compare quantum-inspired anomaly detection to classical
methods on credit card fraud detection data set [13].
The data set contains credit card transactions made over two days in September
2013 by European cardholders. There are 492 frauds out of 284,807 transactions.
The data set is highly unbalanced: the positive class (frauds) accounts for 0.172 percent
of all transactions.

Fig. 3 Credit card fraud data set ROC AUC scores of anomaly detection algorithms.

4 Conclusion
In this article, I present a novel QUBO formulation for anomaly detection which shows
improved accuracy in comparison with classical methods on random data sets as well
as real-world data sets. Limiting quadratic terms to the k-furthest neighbors improves
accuracy and potentially enables finding an embedding on actual QPUs. However, the
outlier selection penalty (equation 3) imposes all-to-all quadratic interactions, making
the QUBO not solvable by near-term QPUs. Another method of restricting outlier
selection should be used to allow this problem to be solved on connectivity-limited QPUs.
Nevertheless, this quantum-inspired anomaly detection algorithm can be solved
without hardware constraints and with a high number of variables using simulated quantum
annealing, simulated annealing, or any QUBO solver.
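As an illustration of the last point, a generic simulated annealing loop for a QUBO can be sketched in a few lines. This is a minimal stand-in and not the qbsolv solver used in the benchmarks. The temperature schedule, sweep count, and seed are arbitrary assumptions. The routine minimises, so the coefficients of a cost function to be maximised would be negated first.

```python
import math
import random


def energy(Q, x):
    """x^T Q x for binary x, where the diagonal of Q acts as the linear terms."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def simulated_annealing(Q, sweeps=200, t_start=2.0, t_end=0.01, seed=0):
    """Minimise x^T Q x over binary x with single-bit-flip Metropolis moves."""
    rng = random.Random(seed)
    n = len(Q)
    x = [0] * n
    current = energy(Q, x)
    best_x, best_e = x[:], current
    for s in range(sweeps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (s / max(1, sweeps - 1))
        for _ in range(n):
            i = rng.randrange(n)
            x[i] ^= 1  # propose flipping bit i
            candidate = energy(Q, x)
            delta = candidate - current
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current = candidate
                if current < best_e:
                    best_x, best_e = x[:], current
            else:
                x[i] ^= 1  # reject the move and restore the bit
    return best_x, best_e


Q = [[-1, 2, 0],
     [0, -1, 2],
     [0, 0, -1]]
print(simulated_annealing(Q))
```

Recomputing the full energy after each flip keeps the sketch short. A practical solver would update the energy incrementally, which matters once the number of variables grows.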

References
[1] Biamonte, J., Wittek, P., Pancotti, N., Rebentrost, P., Wiebe, N., Lloyd, S.:
Quantum machine learning. Nature 549(7671), 195–202 (2017)
https://doi.org/10.1038/nature23474

[2] Dunjko, V., Taylor, J.M., Briegel, H.J.: Quantum-enhanced machine learning.
Phys. Rev. Lett. 117, 130501 (2016)
https://doi.org/10.1103/PhysRevLett.117.130501

[3] Dunjko, V., Briegel, H.J.: Machine learning & artificial intelligence in the
quantum domain (2017)

[4] Ciliberto, C., Herbster, M., Ialongo, A.D., Pontil, M., Rocchetto, A., Severini, S.,
Wossnig, L.: Quantum machine learning: a classical perspective. Proceedings of
the Royal Society A: Mathematical, Physical and Engineering Sciences 474(2209),
20170551 (2018) https://doi.org/10.1098/rspa.2017.0551

[5] Johri, S., Debnath, S., Mocherla, A., Singh, A., Prakash, A., Kim, J., Kerenidis,
I.: Nearest Centroid Classification on a Trapped Ion Quantum Computer (2020)

[6] Henry, L.-P., Thabet, S., Dalyac, C., Henriet, L.: Quantum evolution kernel:
Machine learning on graphs with programmable arrays of qubits. Physical Review
A 104(3) (2021) https://doi.org/10.1103/physreva.104.032416

[7] Chandola, V., Banerjee, A., Kumar, V.: Anomaly detection: A survey. ACM
Comput. Surv. 41(3), 15:1–15:58 (2009)

[8] Pimentel, M.A.F., Clifton, D.A., Clifton, L., Tarassenko, L.: A review of novelty
detection. Signal Processing 99, 215–249 (2014)
https://doi.org/10.1016/j.sigpro.2013.12.026

[9] Kottmann, K., Metz, F., Fraxanet, J., Baldelli, N.: Variational quantum anomaly
detection: Unsupervised mapping of phase diagrams on a physical quantum
computer. Physical Review Research 3(4) (2021)
https://doi.org/10.1103/physrevresearch.3.043184

[10] Guo, M., Liu, H., Li, Y., Li, W., Gao, F., Qin, S., Wen, Q.: Quantum algorithms
for anomaly detection using amplitude estimation. Physica A: Statistical Mechanics
and its Applications 604, 127936 (2022)
https://doi.org/10.1016/j.physa.2022.127936

[11] Woźniak, K.A., Belis, V., Puljak, E., Barkoutsos, P., Dissertori, G., Grossi, M.,
Pierini, M., Reiter, F., Tavernelli, I., Vallecorsa, S.: Quantum anomaly detection
in the latent space of proton collision events at the LHC (2023)

[12] Alvi, S., Bauer, C.W., Nachman, B.: Quantum anomaly detection for collider
physics. Journal of High Energy Physics 2023(2) (2023)
https://doi.org/10.1007/jhep02(2023)220

[13] Chandola, V., Banerjee, A., Kumar, V.: Anomaly detection: A survey. ACM
Comput. Surv. 41(3) (2009) https://doi.org/10.1145/1541880.1541882
