Quantum-Inspired Anomaly Detection, A QUBO Formulation
Julien Mellaerts1*
1* Eviden, Les Clayes-sous-Bois, France.
arXiv:2311.03227v1 [quant-ph] 6 Nov 2023
Abstract
Anomaly detection is a crucial task in machine learning that involves iden-
tifying unusual patterns or events in data. It has numerous applications in
various domains such as finance, healthcare, and cybersecurity. With the advent
of quantum computing, there has been a growing interest in developing quantum
approaches to anomaly detection. After reviewing traditional approaches to anomaly
detection that rely on statistical or distance-based methods, we propose a Quadratic
Unconstrained Binary Optimization (QUBO) model formulation of anomaly detection,
compare it with classical methods, and discuss its scalability on current Quantum
Processing Units (QPUs).
1 Introduction
Quantum computing and machine learning are two rapidly advancing fields that have
the potential to revolutionize various domains. Quantum machine learning (QML)
combines the power of quantum computing with the principles of machine learning,
opening up new possibilities for solving complex problems and enhancing the capa-
bilities of traditional machine learning algorithms. At its core, quantum computing
harnesses the principles of quantum mechanics to manipulate and process informa-
tion in ways that classical computers cannot. Quantum bits, or qubits, can exist in
a superposition of states, allowing for parallel computation and the potential to per-
form complex calculations exponentially faster than classical computers for certain
problems. This unique property of quantum computing forms the basis for developing
quantum machine learning algorithms.
Quantum machine learning aims to leverage quantum computers to enhance the speed
and efficiency of various machine learning tasks. It offers the promise of solving com-
putationally intensive problems that are intractable for classical machine learning
algorithms, and relies on phase estimation, quantum matrix inversion and ampli-
tude amplification methods [1–4]. Several key areas within quantum machine learning
have already shown promising advancements, notably in data classification, graph
applications, and pattern recognition [5, 6].
neighboring points or are in low-density regions. Common density-based algorithms
include:
k-Nearest Neighbors (k-NN): k-NN identifies anomalies based on the distance to their
k-nearest neighbors. Points that have few nearby neighbors or have significantly
larger distances are considered anomalies.
Local Outlier Factor (LOF): LOF measures the local density of data points com-
pared to their neighbors. Anomalies have lower local densities compared to their
neighbors.
Density-Based Spatial Clustering of Applications with Noise (DBSCAN): DBSCAN
groups data points based on their density and identifies outliers as points that do
not belong to any cluster.
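As an illustration of the density-based idea, the following minimal sketch scores each point by its mean distance to its k nearest neighbors. The function name `knn_anomaly_scores`, the Euclidean metric, and the toy data are assumptions made for this example and do not come from the benchmarks.

```python
import numpy as np

def knn_anomaly_scores(points, k):
    """Score each point by its mean distance to its k nearest neighbors."""
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances, shape (N, N).
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    # Exclude each point's zero distance to itself.
    np.fill_diagonal(dist, np.inf)
    # Keep the k smallest distances per row and average them.
    nearest = np.sort(dist, axis=1)[:, :k]
    return nearest.mean(axis=1)

# Toy data (assumed): a tight cluster plus one isolated point at index 4.
data = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [5.0, 5.0]]
scores = knn_anomaly_scores(data, k=2)
most_anomalous = int(np.argmax(scores))
```

A larger score means a sparser neighborhood, so the isolated point receives the highest score here.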
Machine learning-based anomaly detection algorithms leverage advanced tech-
niques to learn the underlying patterns and characteristics of normal data. They aim to
build models that can differentiate between normal and anomalous instances. Common
machine learning-based algorithms include:
Support Vector Machines (SVM): SVM separates the data into different classes and
identifies instances that fall on the "wrong" side of the separation boundary as
anomalies.
Isolation Forest: This algorithm constructs an ensemble of isolation trees to isolate
anomalies efficiently. Anomalies require fewer splits to be isolated compared to
normal instances.
Neural Networks: Neural networks can learn complex patterns in the data and detect
anomalies based on deviations from the learned representation.
Autoencoders: Autoencoders are neural network architectures that aim to recon-
struct input data. Anomalies lead to higher reconstruction errors, allowing for their
detection.
The idea behind the proposed QUBO formulation of anomaly detection is to combine
statistical and density-based methods in order to increase the accuracy of the
algorithm.
consists of quadratic terms involving the binary variables and possibly linear terms as
well. The goal is to find the assignment of binary values to the variables that minimizes
the overall value of the objective function.
A QUBO model is expressed as an optimization problem:
$$f(x) = \sum_{i=0}^{N-1} Q_{i,i}\, x_i + \sum_{\substack{i,j=0 \\ i \neq j}}^{N-1} Q_{i,j}\, x_i x_j \tag{1}$$
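To make equation (1) concrete, here is a minimal sketch that evaluates the objective for a small matrix and finds its minimum by exhaustive search. The matrix values and the helper names `qubo_value` and `brute_force_minimum` are hypothetical. Enumeration is only feasible for a handful of variables.

```python
import itertools
import numpy as np

def qubo_value(Q, x):
    """Evaluate f(x) = sum_i Q[i,i] x_i + sum_{i != j} Q[i,j] x_i x_j."""
    x = np.asarray(x, dtype=float)
    # x_i is binary, so x_i * x_i == x_i and x^T Q x covers both sums.
    return float(x @ Q @ x)

def brute_force_minimum(Q):
    """Enumerate all 2^N assignments; only practical for small N."""
    n = Q.shape[0]
    best_x, best_val = None, float("inf")
    for bits in itertools.product([0, 1], repeat=n):
        val = qubo_value(Q, bits)
        if val < best_val:
            best_x, best_val = bits, val
    return best_x, best_val

# Illustrative coefficients (assumed): negative linear terms reward setting a
# bit, positive quadratic terms penalize setting adjacent bits together.
Q = np.array([[-1.0, 2.0, 0.0],
              [0.0, -1.0, 2.0],
              [0.0, 0.0, -1.0]])
best_x, best_val = brute_force_minimum(Q)
```

For this matrix the minimum is reached at x = (1, 0, 1) with f(x) = -2, since selecting the two non-adjacent variables collects both rewards without any penalty.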
A parallel can be drawn between the QUBO formulation and anomaly detection
algorithms. Statistical-based algorithms can be seen as finding the data points most
distant from the centroid of a Gaussian distributed data set, which corresponds to
the linear terms of the QUBO. Density-based algorithms can be seen as finding the
data points most distant from their neighbors, which corresponds to the quadratic
terms of the QUBO. This parallel allows both statistical and density-based algorithms
to be combined in a single model.
We can now give a more rigorous formulation of the problem. Given a data set
$X = (x_0, \dots, x_{N-1})$ of $N$ data points, let $d_i$ be the distance between a point
$x_i \in X$ and the centroid of the data set distribution, and $d_{i,j}$ the distance between
two data points $x_i, x_j \in X$ with $i \neq j$. The cost function to maximize is:
$$Q(x, \alpha) = \alpha \sum_{i=0}^{N-1} d_i\, x_i + (1 - \alpha) \sum_{\substack{i,j=0 \\ i \neq j}}^{N-1} d_{i,j}\, x_i x_j \tag{2}$$
subject to:
$$\sum_{i=0}^{N-1} x_i = k$$
where:
0 ≤ α ≤ 1 is a weighting parameter that balances the linear and quadratic terms.
0 < k ≤ N is the number of outliers in the data set.
The α parameter is specific for each data set and needs to be learnt during a
training phase.
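The cost function of equation (2) and its cardinality constraint can be sketched as follows. This is an illustrative implementation under stated assumptions: it uses the Euclidean distance for brevity rather than the Mahalanobis distance, fixes α by hand instead of learning it, and replaces a QUBO solver with exhaustive search over all selections of exactly k points. The function names and toy data are hypothetical.

```python
import itertools
import numpy as np

def build_cost_terms(points, alpha):
    """Return (linear, quadratic) coefficients of Q(x, alpha) in equation (2)."""
    points = np.asarray(points, dtype=float)
    centroid = points.mean(axis=0)
    # d_i: distance of each point to the centroid (Euclidean, an assumption).
    d_centroid = np.linalg.norm(points - centroid, axis=1)
    # d_{i,j}: pairwise distances, zero on the diagonal.
    d_pair = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    return alpha * d_centroid, (1.0 - alpha) * d_pair

def cost(linear, quadratic, x):
    x = np.asarray(x, dtype=float)
    # The diagonal of `quadratic` is zero, so x @ quadratic @ x sums over i != j.
    return float(linear @ x + x @ quadratic @ x)

def best_selection(points, alpha, k):
    """Maximize the cost over all selections with exactly k ones (small N only)."""
    linear, quadratic = build_cost_terms(points, alpha)
    n = len(linear)
    best, best_val = None, -np.inf
    for chosen in itertools.combinations(range(n), k):
        x = np.zeros(n)
        x[list(chosen)] = 1.0
        val = cost(linear, quadratic, x)
        if val > best_val:
            best, best_val = chosen, val
    return best, best_val

# Toy data (assumed): four clustered points and two distant ones (indices 4, 5).
data = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.2], [-0.1, 0.1], [4.0, 4.0], [-4.0, 3.5]]
outliers, value = best_selection(data, alpha=0.5, k=2)
```

With k = 2 the search selects the two distant points, which are both far from the centroid and far from each other.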
The number of outliers to select in the data set can be constrained in the QUBO
by adding a penalty to the cost function:
$$-A\left(\sum_{i=0}^{N-1} x_i - k\right)^2 \tag{3}$$
where:
A is a penalty weight that must be scaled according to the QUBO terms so that
A > max(Q).
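To see how this penalty contributes to the QUBO coefficients, it can be expanded using the fact that $x_i^2 = x_i$ for binary variables. The following derivation is a worked expansion of equation (3) and introduces no new symbols:

```latex
-A\left(\sum_{i=0}^{N-1} x_i - k\right)^2
  = -A\left(\sum_{i=0}^{N-1} x_i^2
      + \sum_{\substack{i,j=0 \\ i \neq j}}^{N-1} x_i x_j
      - 2k \sum_{i=0}^{N-1} x_i + k^2\right)
  = A(2k - 1) \sum_{i=0}^{N-1} x_i
      - A \sum_{\substack{i,j=0 \\ i \neq j}}^{N-1} x_i x_j
      - A k^2
```

The penalty therefore adds $A(2k-1)$ to every linear coefficient, adds $-A$ to the quadratic coefficient of every pair $i \neq j$, and contributes a constant $-Ak^2$ that does not affect which assignment is optimal.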
To get the most accurate results, it is important to limit the number of quadratic terms
of each variable to the number of outliers k, by filling in only the terms of its
k-furthest neighbors.
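One possible reading of this sparsification step is sketched below: for each point, only the k largest pairwise distances are kept and all other quadratic coefficients are set to zero. The helper `keep_k_furthest` and the example matrix are assumptions. The source does not spell out the exact construction, for instance whether the result is symmetrized.

```python
import numpy as np

def keep_k_furthest(dist, k):
    """Zero out all but the k largest off-diagonal entries in each row."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    work = dist.copy()
    # A point is never its own neighbor.
    np.fill_diagonal(work, -np.inf)
    # Column indices of the k largest distances in each row.
    furthest = np.argsort(work, axis=1)[:, -k:]
    sparse = np.zeros_like(dist)
    rows = np.repeat(np.arange(n), k)
    cols = furthest.ravel()
    sparse[rows, cols] = dist[rows, cols]
    return sparse

# Illustrative symmetric distance matrix (assumed values).
dist = np.array([[0.0, 1.0, 4.0, 2.0],
                 [1.0, 0.0, 3.0, 5.0],
                 [4.0, 3.0, 0.0, 6.0],
                 [2.0, 5.0, 6.0, 0.0]])
sparse = keep_k_furthest(dist, k=2)
```

Each row of `sparse` keeps exactly two non-zero entries. The result is generally not symmetric, because point j being among the furthest neighbors of point i does not imply the converse.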
The chosen distance metric is the Mahalanobis distance, which is an effective mul-
tivariate distance metric available in many machine learning models, including those
that will be used to benchmark anomaly detection methods.
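For reference, the Mahalanobis distance between two vectors u and v is sqrt((u − v)ᵀ S⁻¹ (u − v)), where S is the covariance matrix of the data. The sketch below computes both the distance of each point to the centroid and the pairwise distances. Estimating S with the sample covariance and using a pseudo-inverse are assumptions of this example, as are the function names and toy data.

```python
import numpy as np

def mahalanobis_to_centroid(points):
    """Mahalanobis distance of each point to the data set centroid."""
    points = np.asarray(points, dtype=float)
    centroid = points.mean(axis=0)
    cov = np.cov(points, rowvar=False)
    # The pseudo-inverse guards against a singular covariance matrix.
    cov_inv = np.linalg.pinv(cov)
    centered = points - centroid
    # Squared distance (x_i - mu)^T S^{-1} (x_i - mu) for every row i.
    sq = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    return np.sqrt(np.maximum(sq, 0.0))

def mahalanobis_pairwise(points):
    """Mahalanobis distance between every pair of points, shape (N, N)."""
    points = np.asarray(points, dtype=float)
    cov_inv = np.linalg.pinv(np.cov(points, rowvar=False))
    diffs = points[:, None, :] - points[None, :, :]
    sq = np.einsum("ijk,kl,ijl->ij", diffs, cov_inv, diffs)
    return np.sqrt(np.maximum(sq, 0.0))

# Toy data (assumed): a 3x3 grid of points plus one distant point at index 9.
grid = [[x, y] for x in (-1.0, 0.0, 1.0) for y in (-1.0, 0.0, 1.0)]
points = grid + [[10.0, 10.0]]
d_centroid = mahalanobis_to_centroid(points)
d_pair = mahalanobis_pairwise(points)
```

These two arrays play the roles of d_i and d_{i,j} in equation (2).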
3.2 Benchmarks
In this section I present benchmarking methods and results. For all benchmarks, I
compare quantum-inspired anomaly detection to:
Robust covariance: sklearn.covariance.EllipticEnvelope
One-Class SVM: sklearn.svm.OneClassSVM
One-Class SVM (SGD): sklearn.linear_model.SGDOneClassSVM
Isolation Forest: sklearn.ensemble.IsolationForest
Local Outlier Factor: sklearn.neighbors.LocalOutlierFactor
Each benchmark instance is solved using qbsolv, and the metric is the Area Under
the Receiver Operating Characteristic Curve (ROC AUC) score.
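The ROC AUC score can be read as the probability that a randomly chosen outlier receives a higher score than a randomly chosen inlier, with ties counted as one half. The sketch below computes it directly from that definition. The function `roc_auc` and the example labels are hypothetical, and treating the binary QUBO solution as the score vector is an assumption about how such an evaluation could be set up.

```python
import numpy as np

def roc_auc(labels, scores):
    """ROC AUC as the probability that a random outlier outranks a random inlier."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels]      # scores of true outliers
    neg = scores[~labels]     # scores of true inliers
    # Compare every (outlier, inlier) pair; ties count as half a win.
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((wins + 0.5 * ties) / (len(pos) * len(neg)))

# Illustrative case (assumed): two true outliers, one found, one false alarm.
labels = [0, 0, 0, 0, 1, 1]
binary_prediction = [0, 0, 0, 1, 1, 0]
auc = roc_auc(labels, binary_prediction)
```

Here the true positive rate is 0.5 and the false positive rate is 0.25, which gives a score of 0.625. A perfect selection gives 1.0 and a random one gives about 0.5.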
Fig. 1 ROC AUC scores of anomaly detection algorithms on random Gaussian distributed
data points, for three different standard deviations.
3.2.2 MNIST data set
The MNIST data set contains 60,000 training images and 10,000 test images of
handwritten digits and is widely used in classification benchmarks.
In this second benchmark, I compare quantum-inspired anomaly detection to
classical methods on three different MNIST data set configurations:
45 samples of handwritten 0, 5 samples of handwritten 9
45 samples of handwritten 7, 5 samples of handwritten 1
45 samples of handwritten 2, 5 samples of handwritten 3
In each configuration, the samples of the minority digit should be identified as
outliers by the anomaly detection algorithms.
Fig. 2 ROC AUC scores of anomaly detection algorithms on the MNIST data set, for three
different configurations.
The data set is highly unbalanced: the positive class (frauds) accounts for 0.172 percent
of all transactions.
Fig. 3 ROC AUC scores of anomaly detection algorithms on the credit card fraud data set.
4 Conclusion
In this article, I present a novel QUBO formulation for anomaly detection which shows
improved accuracy in comparison with classical methods on random data sets as well
as real-world data sets. Limiting quadratic terms to the k-furthest neighbors improves
accuracy and potentially enables finding an embedding on actual QPUs. However, the
outlier selection penalty (equation 3) imposes all-to-all quadratic interactions, which
makes the QUBO unsolvable on near-term QPUs. Another method of restricting outlier
selection should be used to allow this problem to be solved on connectivity-limited QPUs.
Nevertheless, this quantum-inspired anomaly detection algorithm can be solved without
hardware constraints and with a high number of variables using simulated quantum
annealing, simulated annealing, or any QUBO solver.
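As an illustration of the classical route, the following is a minimal simulated annealing sketch for minimizing a QUBO. It is not the solver used in the benchmarks. The cooling schedule, step count, seed, and example matrix are all assumptions chosen for readability. To maximize a cost function such as equation (2), one would pass the negated coefficients.

```python
import math
import random
import numpy as np

def simulated_annealing(Q, steps=2000, t_start=2.0, t_end=0.01, seed=0):
    """Minimize x^T Q x over binary x with single-bit-flip moves."""
    rng = random.Random(seed)
    n = Q.shape[0]
    x = np.zeros(n)
    energy = float(x @ Q @ x)
    best_x, best_energy = x.copy(), energy
    for step in range(steps):
        # Geometric cooling schedule from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (step / (steps - 1))
        i = rng.randrange(n)
        x[i] = 1.0 - x[i]                      # propose flipping bit i
        new_energy = float(x @ Q @ x)
        delta = new_energy - energy
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            energy = new_energy                # accept the move
            if energy < best_energy:
                best_x, best_energy = x.copy(), energy
        else:
            x[i] = 1.0 - x[i]                  # reject: undo the flip
    return best_x.astype(int).tolist(), best_energy

# Illustrative coefficients (assumed); the exact minimum is -2 at x = [1, 0, 1].
Q = np.array([[-1.0, 2.0, 0.0],
              [0.0, -1.0, 2.0],
              [0.0, 0.0, -1.0]])
sa_x, sa_energy = simulated_annealing(Q)
```

Recomputing the full energy at every step keeps the sketch short. A practical implementation would update the energy incrementally from the flipped bit's row and column.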
References
[1] Biamonte, J., Wittek, P., Pancotti, N., Rebentrost, P., Wiebe, N., Lloyd, S.:
Quantum machine learning. Nature 549(7671), 195–202 (2017) https://doi.org/
10.1038/nature23474
[2] Dunjko, V., Taylor, J.M., Briegel, H.J.: Quantum-enhanced machine learning.
Phys. Rev. Lett. 117, 130501 (2016) https://doi.org/10.1103/PhysRevLett.117.
130501
[3] Dunjko, V., Briegel, H.J.: Machine learning & artificial intelligence in the
quantum domain (2017)
[4] Ciliberto, C., Herbster, M., Ialongo, A.D., Pontil, M., Rocchetto, A., Severini, S.,
Wossnig, L.: Quantum machine learning: a classical perspective. Proceedings of
the Royal Society A: Mathematical, Physical and Engineering Sciences 474(2209),
20170551 (2018) https://doi.org/10.1098/rspa.2017.0551
[5] Johri, S., Debnath, S., Mocherla, A., Singh, A., Prakash, A., Kim, J., Kerenidis,
I.: Nearest Centroid Classification on a Trapped Ion Quantum Computer (2020)
[6] Henry, L.-P., Thabet, S., Dalyac, C., Henriet, L.: Quantum evolution kernel:
Machine learning on graphs with programmable arrays of qubits. Physical Review
A 104(3) (2021) https://doi.org/10.1103/physreva.104.032416
[7] Chandola, V., Banerjee, A., Kumar, V.: Anomaly detection: A survey. ACM
Comput. Surv. 41(3), 15:1–15:58 (2009)
[8] Pimentel, M.A.F., Clifton, D.A., Clifton, L., Tarassenko, L.: A review of novelty
detection. Signal Processing 99, 215–249 (2014) https://doi.org/10.1016/j.sigpro.
2013.12.026
[9] Kottmann, K., Metz, F., Fraxanet, J., Baldelli, N.: Variational quantum anomaly
detection: Unsupervised mapping of phase diagrams on a physical quan-
tum computer. Physical Review Research 3(4) (2021) https://doi.org/10.1103/
physrevresearch.3.043184
[10] Guo, M., Liu, H., Li, Y., Li, W., Gao, F., Qin, S., Wen, Q.: Quantum algorithms
for anomaly detection using amplitude estimation. Physica A: Statistical Mechan-
ics and its Applications 604, 127936 (2022) https://doi.org/10.1016/j.physa.2022.
127936
[11] Woźniak, K.A., Belis, V., Puljak, E., Barkoutsos, P., Dissertori, G., Grossi, M.,
Pierini, M., Reiter, F., Tavernelli, I., Vallecorsa, S.: Quantum anomaly detection
in the latent space of proton collision events at the LHC (2023)
[12] Alvi, S., Bauer, C.W., Nachman, B.: Quantum anomaly detection for collider
physics. Journal of High Energy Physics 2023(2) (2023) https://doi.org/10.1007/
jhep02(2023)220
[13] Chandola, V., Banerjee, A., Kumar, V.: Anomaly detection: A survey. ACM
Comput. Surv. 41(3) (2009) https://doi.org/10.1145/1541880.1541882