
Wei et al. AAPPS Bulletin (2022) 32:2
https://doi.org/10.1007/s43673-021-00030-3

ORIGINAL ARTICLE   Open Access

A quantum convolutional neural network on NISQ devices

ShiJie Wei1,2, YanHu Chen3, ZengRong Zhou1,2 and GuiLu Long1,2,4,5*

Abstract
Quantum machine learning is one of the most promising applications of quantum computing in the noisy intermediate-scale quantum (NISQ) era. We propose a quantum convolutional neural network (QCNN) inspired by convolutional neural networks (CNN), which greatly reduces the computing complexity compared with its classical counterpart, using O((log2 M)^6) basic gates and O(m^2 + e) variational parameters, where M is the input data size, m is the filter mask size, and e is the number of parameters in a Hamiltonian. Our model is robust to certain noise in image recognition tasks, and its parameters are independent of the input size, making it friendly to near-term quantum devices. We demonstrate the QCNN with two explicit examples. First, the QCNN is applied to image processing, and numerical simulations of three types of spatial filtering (image smoothing, sharpening, and edge detection) are performed. Second, we demonstrate the QCNN on an image recognition task, namely the recognition of handwritten numbers. Compared with previous work, this machine learning model provides implementable quantum circuits that accurately correspond to a specific classical convolutional kernel. It provides an efficient avenue to transform a CNN into a QCNN directly and opens up the prospect of exploiting quantum power to process information in the era of big data.
Keywords: Quantum computing, Quantum algorithm, Quantum machine learning

1 Introduction
Machine learning has fundamentally transformed the way people think and behave. The convolutional neural network (CNN) is an important machine learning model that has the advantage of utilizing the correlation information of data, with many interesting applications ranging from image recognition to precision medicine.

Quantum information processing (QIP) [1, 2], which exploits quantum-mechanical phenomena such as quantum superposition and quantum entanglement, allows one to overcome the limitations of classical computation and reach higher computational speed for certain problems [3–5]. Quantum machine learning, as an interdisciplinary study between machine learning and quantum information, has undergone a flurry of developments in recent years [6–15]. A machine learning algorithm consists of three components: representation, evaluation, and optimization. The quantum versions [16–20] usually concentrate on realizing the evaluation part, the fundamental construct in deep learning [21].

A CNN generally consists of three kinds of layers: convolution layers, pooling layers, and fully connected layers. The convolution layer calculates new pixel values x_{i,j}^{(ℓ)} from a linear combination of the neighborhood pixels in the preceding map with specific weights, x_{i,j}^{(ℓ)} = Σ_{a,b=1}^{m} w_{a,b} x_{i+a-2,j+b-2}^{(ℓ-1)}, where the weights w_{a,b} form an m × m matrix called a convolution kernel or filter mask. The pooling layer reduces the feature map size, e.g., by taking the average value of four contiguous pixels, and is often followed by the application of a nonlinear (activation) function. The fully connected layer computes the final output as a linear combination of all remaining pixels, with specific weights determined by the parameters of the fully connected layer. The weights in the filter mask and the fully connected layer are optimized by training on large datasets.
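For concreteness, the block below is a minimal NumPy sketch of these three classical building blocks. The 8 × 8 image, the random 3 × 3 kernel, the ReLU activation, and the two-class output head are illustrative assumptions, not the settings used in the experiments later in the paper.

```python
import numpy as np

def conv2d_3x3(x, w):
    """Valid 3x3 convolution: x_{i,j} = sum_{a,b} w[a,b] * x[i+a-2, j+b-2]."""
    M, N = x.shape
    out = np.zeros((M - 2, N - 2))
    for i in range(M - 2):
        for j in range(N - 2):
            out[i, j] = np.sum(w * x[i:i + 3, j:j + 3])
    return out

def avg_pool_2x2(x):
    """Average pooling with a 2x2 window and stride 2."""
    M, N = x.shape
    return x.reshape(M // 2, 2, N // 2, 2).mean(axis=(1, 3))

def fully_connected(x, weights, bias):
    """Fully connected layer: a linear combination of all remaining pixels."""
    return weights @ x.ravel() + bias

rng = np.random.default_rng(0)
image = rng.random((8, 8))                      # toy 8x8 input image
kernel = rng.random((3, 3))                     # 3x3 filter mask (convolution kernel)
feature = conv2d_3x3(image, kernel)             # convolution layer -> 6x6 feature map
pooled = np.maximum(avg_pool_2x2(feature), 0)   # average pooling + ReLU activation
logits = fully_connected(pooled, rng.random((2, pooled.size)), rng.random(2))
print(logits.shape)                             # (2,): a two-class output head
```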
In this article, we demonstrate the basic framework of a quantum convolutional neural network (QCNN) by sequentially realizing convolution layers, pooling layers, and fully connected layers. Firstly, we implement convolution layers based on a linear combination of unitary operators (LCU) [22–24]. Secondly, we discard some qubits in the quantum circuit to simulate the effect of the classical pooling layer. Finally, the fully connected layer is realized by measuring the expectation value of a parametrized Hamiltonian and then applying a nonlinear (activation) function to post-process the expectation value. We perform numerical demonstrations with two examples to show the validity of our algorithm. Finally, the computing complexity and trainability of our QCNN model are discussed, followed by a summary.

*Correspondence: gllong@tsinghua.edu.cn
1 Beijing Academy of Quantum Information Sciences, Beijing 100193, China. 2 State Key Laboratory of Low-Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing 100084, China. Full list of author information is available at the end of the article.
© The Author(s). 2021 Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

2 Results
2.1 Framework of quantum neural networks
2.1.1 Quantum convolution layer
The first step in performing the quantum convolution layer is to encode the image data into a quantum system. In this work, we encode the pixel positions in the computational basis states and the pixel values in the probability amplitudes, forming a pure quantum state (Fig. 1). Given a 2D image F = (F_{i,j})_{M×L}, where F_{i,j} represents the pixel value at position (i, j) with i = 1, ..., M and j = 1, ..., L, F is transformed into a vector f with ML elements by putting the first column of F into the first M elements of f, the second column into the next M elements, etc. That is,

$$f = (F_{1,1}, F_{2,1}, \ldots, F_{M,1}, F_{1,2}, \ldots, F_{i,j}, \ldots, F_{M,L})^T. \qquad (1)$$

Accordingly, the image data f can be mapped onto a pure quantum state |f⟩ = Σ_{k=0}^{2^n−1} c_k |k⟩ with n = log2(ML) qubits, where the computational basis state |k⟩ encodes the position (i, j) of each pixel and the coefficient c_k encodes the pixel value, i.e., c_k = F_{i,j}/(Σ F_{i,j}^2)^{1/2} for k < ML and c_k = 0 for k ≥ ML. Here, (Σ F_{i,j}^2)^{1/2} is a constant factor that normalizes the quantum state.

Fig. 1 Comparison of classical convolution processing and quantum convolution processing. F and G are the input and output image data, respectively. On a classical computer, an M × M image can be represented as a matrix and encoded with at least 2^n bits [n = log2(M^2)]. The classical image transformation through the convolution layer is performed by the matrix computation F ∗ W, which leads to x_{i,j}^{(ℓ)} = Σ_{a,b=1}^{m} w_{a,b} x_{i+a-2,j+b-2}^{(ℓ-1)}. The same image can be represented as a quantum state and encoded in at least n qubits on a quantum computer. The quantum image transformation is realized by the unitary evolution U on a specific quantum state.
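The following is a minimal sketch of this column-major flattening and amplitude normalization. The 4 × 4 image is illustrative, and on hardware the normalized amplitudes would be loaded by an amplitude-encoding circuit or qRAM rather than stored in a NumPy array.

```python
import numpy as np

M = L = 4                                    # toy 4x4 image, so n = log2(M*L) = 4 qubits
F = np.arange(1, M * L + 1, dtype=float).reshape(M, L)

# Column-major flattening, Eq. (1): f = (F11, F21, ..., FM1, F12, ..., FML)^T
f = F.flatten(order="F")

# Amplitude encoding: c_k = F_ij / sqrt(sum F_ij^2), so that <f|f> = 1
c = f / np.linalg.norm(f)
print(np.isclose(np.sum(c ** 2), 1.0))       # True: |f> is a normalized pure state
print(int(np.log2(M * L)), "qubits")         # 4 qubits encode the 16 pixel positions
```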

Without loss of generality, we focus on an input image with M = L = 2^n pixels. The convolution layer transforms an input image F = (F_{i,j})_{M×M} into an output image G = (G_{i,j})_{M×M} by a specific filter mask W. In the quantum context, this linear transformation, corresponding to a specific spatial filter operation, can be represented as |g⟩ = U|f⟩ with the input image state |f⟩ and the output image state |g⟩. For simplicity, we take a 3 × 3 filter mask as an example,

$$W = \begin{bmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \\ w_{31} & w_{32} & w_{33} \end{bmatrix}. \qquad (2)$$

The generalization to an arbitrary m × m filter mask is straightforward. The convolution operation transforms the input image F = (F_{i,j})_{M×M} into the output image G = (G_{i,j})_{M×M} with pixels G_{i,j} = Σ_{u,v=1}^{3} w_{uv} F_{i+u−2,j+v−2} (2 ≤ i, j ≤ M − 1). The corresponding quantum evolution U|f⟩ can be performed as follows. We represent the input image F = (F_{i,j})_{M×M} as an initial state

$$|f\rangle = \sum_{k=0}^{M^2-1} c_k |k\rangle, \qquad (3)$$

where c_k = F_{i,j}/(Σ F_{i,j}^2)^{1/2}. The M^2 × M^2 linear filtering operator U can be defined as [25]:

$$U = \begin{bmatrix} E & & & & \\ V_1 & V_2 & V_3 & & \\ & \ddots & \ddots & \ddots & \\ & & V_1 & V_2 & V_3 \\ & & & & E \end{bmatrix}, \qquad (4)$$

where E is an M-dimensional identity matrix, and V_1, V_2, V_3 are M × M matrices defined by

$$V_1 = \begin{pmatrix} 0 & & & \\ w_{11} & w_{21} & w_{31} & \\ & \ddots & \ddots & \ddots \\ & w_{11} & w_{21} & w_{31} \\ & & & 0 \end{pmatrix}_{M\times M},\quad
V_2 = \begin{pmatrix} 1 & & & \\ w_{12} & w_{22} & w_{32} & \\ & \ddots & \ddots & \ddots \\ & w_{12} & w_{22} & w_{32} \\ & & & 1 \end{pmatrix}_{M\times M},\quad
V_3 = \begin{pmatrix} 0 & & & \\ w_{13} & w_{23} & w_{33} & \\ & \ddots & \ddots & \ddots \\ & w_{13} & w_{23} & w_{33} \\ & & & 0 \end{pmatrix}_{M\times M}. \qquad (5)$$

Generally speaking, the linear filtering operator U is non-unitary and cannot be performed directly. Actually, we can embed U in a bigger system with an ancillary register and decompose it into a linear combination of four unitary operators [26], U = U_1 + U_2 + U_3 + U_4, where U_1 = (U + U†)/2 + i√(I − (U + U†)²/4), U_2 = (U + U†)/2 − i√(I − (U + U†)²/4), U_3 = (U − U†)/2i + i√(I + (U − U†)²/4), and U_4 = (U − U†)/2i − i√(I + (U − U†)²/4). However, the number of basic gates consumed to perform the U_i scales exponentially with the dimension of the quantum system, making the quantum advantage diminish. In [25], the efficient decomposition, i.e., the gate complexity of U, is left as an open question, yet gate complexity is the fundamental standard for measuring algorithm efficiency. Therefore, we present a new approach to construct the filter operator that reduces the gate complexity. For convenience, we change the elements of the first row, the last row, the first column, and the last column of the matrices V_1, V_2, and V_3, which is allowable in image processing, to the following form:

$$V_1 = \begin{pmatrix} w_{21} & w_{31} & & & w_{11} \\ w_{11} & w_{21} & w_{31} & & \\ & \ddots & \ddots & \ddots & \\ & & w_{11} & w_{21} & w_{31} \\ w_{31} & & & w_{11} & w_{21} \end{pmatrix}_{M\times M},\quad
V_2 = \begin{pmatrix} w_{22} & w_{32} & & & w_{12} \\ w_{12} & w_{22} & w_{32} & & \\ & \ddots & \ddots & \ddots & \\ & & w_{12} & w_{22} & w_{32} \\ w_{32} & & & w_{12} & w_{22} \end{pmatrix}_{M\times M},\quad
V_3 = \begin{pmatrix} w_{23} & w_{33} & & & w_{13} \\ w_{13} & w_{23} & w_{33} & & \\ & \ddots & \ddots & \ddots & \\ & & w_{13} & w_{23} & w_{33} \\ w_{33} & & & w_{13} & w_{23} \end{pmatrix}_{M\times M}. \qquad (6)$$
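As a plausibility check on this construction, the following NumPy sketch builds the circulant blocks of Eq. (6), assembles them into the block-circulant operator U′ defined in Eq. (7) just below, and verifies that U′f reproduces the classical convolution G_{i,j} = Σ_{u,v} w_{uv} F_{i+u−2,j+v−2} on all interior pixels (the modified corner entries only affect edge pixels, as shown in Appendix A). The 8 × 8 image and the random mask are illustrative; this exercises the linear-algebra structure only, not a quantum circuit.

```python
import numpy as np

M = 8
rng = np.random.default_rng(1)
F = rng.random((M, M))                   # toy 8x8 image
w = rng.random((3, 3))                   # arbitrary 3x3 filter mask, w[u-1, v-1] = w_uv

P_down = np.roll(np.eye(M), 1, axis=0)   # ones on the sub-diagonal plus the (1, M) corner
P_up = P_down.T                          # ones on the super-diagonal plus the (M, 1) corner
I = np.eye(M)

# Circulant blocks of Eq. (6): V_j = w_{1j} * shift_down + w_{2j} * I + w_{3j} * shift_up
V = [w[0, j] * P_down + w[1, j] * I + w[2, j] * P_up for j in range(3)]

# Block-circulant operator of Eq. (7): U' = shift_down (x) V1 + I (x) V2 + shift_up (x) V3
U_adj = np.kron(P_down, V[0]) + np.kron(I, V[1]) + np.kron(P_up, V[2])

f = F.flatten(order="F")                 # column-major vectorization, as in Eq. (1)
g = (U_adj @ f).reshape(M, M, order="F")

# Classical reference: G_ij = sum_{u,v} w_uv F_{i+u-2, j+v-2} on interior pixels
G = np.zeros((M, M))
for i in range(1, M - 1):
    for j in range(1, M - 1):
        G[i, j] = np.sum(w * F[i - 1:i + 2, j - 1:j + 2])

print(np.allclose(g[1:-1, 1:-1], G[1:-1, 1:-1]))   # True: interior pixels agree
```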
We define the adjusted linear filtering operator U′ as

$$U' = \begin{bmatrix} V_2 & V_3 & & & V_1 \\ V_1 & V_2 & V_3 & & \\ & \ddots & \ddots & \ddots & \\ & & V_1 & V_2 & V_3 \\ V_3 & & & V_1 & V_2 \end{bmatrix}. \qquad (7)$$

Next, we decompose V_μ (μ = 1, 2, 3) into three unitary matrices, up to normalization, V_μ = V_{1μ} + V_{2μ} + V_{3μ}, where

$$V_{1\mu} = \begin{pmatrix} & & & & w_{1\mu} \\ w_{1\mu} & & & & \\ & \ddots & & & \\ & & & w_{1\mu} & \end{pmatrix}_{M\times M},\quad
V_{2\mu} = \begin{pmatrix} w_{2\mu} & & & \\ & w_{2\mu} & & \\ & & \ddots & \\ & & & w_{2\mu} \end{pmatrix}_{M\times M},\quad
V_{3\mu} = \begin{pmatrix} & w_{3\mu} & & & \\ & & \ddots & & \\ & & & & w_{3\mu} \\ w_{3\mu} & & & & \end{pmatrix}_{M\times M}. \qquad (8)$$

Thus, the linear filtering operator U′ can be expressed as

$$U' = \sum_{\mu=1}^{3} \sum_{\nu=1}^{3} \big(V_{\mu\mu}/w_{\mu\mu}\big) \otimes V_{\nu\mu}, \qquad (9)$$

which can be simplified to

$$U' = \sum_{k=1}^{9} \beta_k Q_k, \qquad (10)$$

where Q_k = (V_{μμ}/w_{μμ}) ⊗ (V_{νμ}/w_{νμ}) is unitary and β_k is a relabelling of the coefficients w_{νμ} over the indices (μ, ν).

Now, we can perform U′ through the linear combination of the unitary operators Q_k. The number of unitary operators is equal to the size of the filter mask. The quantum circuit realizing U′ is shown in Fig. 2. The work register |f⟩ and four ancillary qubits |0000⟩_a are entangled together to form a bigger system.

Firstly, we prepare the initial state |f⟩ using the amplitude encoding method or quantum random access memory (qRAM). Then, we perform a unitary matrix S on the ancillary register to transform |0000⟩_a into a specific superposition state |ψ⟩_a,

$$S|0000\rangle_a = |\psi\rangle_a = \sum_{k=1}^{9} \frac{\beta_k}{N_c}|k\rangle, \qquad (11)$$

where N_c = (Σ_{k=1}^{9} β_k^2)^{1/2} and S satisfies

$$S_{k,1} = \begin{cases} \beta_k/N_c & \text{if } k \le 9 \\ 0 & \text{if } k > 9. \end{cases} \qquad (12)$$

S is a parameter matrix corresponding to a specific filter mask that realizes a specific task.

Then, we implement a series of ancilla-controlled operations Q_k ⊗ |k⟩⟨k| on the work system |f⟩ to realize the LCU. Next, Hadamard gates H^T = H^{⊗4} are applied to uncompute the ancillary register |ψ⟩_a. The state is transformed to

$$|g'\rangle = \frac{1}{N_c}\sum_{i=1}^{16}\sum_{k=1}^{9} |i\rangle\, H^T_{(ik)}\, S_{(k1)}\, Q_k |f\rangle, \qquad (13)$$

where H^T_{(ik)} is the element in the ith row and kth column of the matrix H^T and S_{(k1)} is the element in the kth row and first column of S. The first term equals

$$\frac{1}{N_c}|0\rangle \sum_{k=1}^{9} \beta_k Q_k |f\rangle, \qquad (14)$$

which corresponds to the filter mask W. The ith term equals the filter mask W^i (i = 2, 3, ..., 16), where

$$W^i = \begin{bmatrix} H^T_{i1} w_{11} & H^T_{i4} w_{12} & H^T_{i7} w_{13} \\ H^T_{i2} w_{21} & H^T_{i5} w_{22} & H^T_{i8} w_{23} \\ H^T_{i3} w_{31} & H^T_{i6} w_{32} & H^T_{i9} w_{33} \end{bmatrix}. \qquad (15)$$

In total, 16 filter masks are realized, corresponding to the ancilla qubits being in the 16 different states |i⟩ (i = 1, 2, ..., 16). Therefore, the whole effect of the evolution on the state |f⟩, without considering the ancilla qubits, is the linear combination of the effects of the 16 filter masks.

If we only need one filter mask W, we measure the ancillary register and condition on seeing |0000⟩. We then have the state (1/N_c)|0000⟩U′|f⟩, which is proportional to our expected result state |g⟩. The probability of detecting the ancillary state |0000⟩ is P_s = ‖Σ_{k=1}^{9} β_k Q_k|f⟩‖^2/N_c^2. After obtaining the final result (1/N_c)U′|f⟩, we can multiply by the constant factor N_c to compute |g′⟩ = U′|f⟩. In conclusion, the filter operator U′ can be decomposed into a linear combination of nine unitary operators in the case that the general filter mask is W. Only four qubits, or a nine-level ancillary system, are consumed to realize the general filter operator U′, independent of the dimensions of the image.

Fig. 2 Quantum circuit for realizing the QCNN. |f⟩ denotes the initial state of the work system after encoding the image data, and the ancillary system is a four-qubit system in the state |0000⟩_a. The squares represent unitary operations and the circles represent the state of the controlling system. The unitary operations Q_1, Q_2, ..., Q_9 are activated only when the auxiliary system is in the states |0000⟩, |0001⟩, ..., |1000⟩, respectively.
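To make the LCU recipe concrete, here is a small statevector emulation of the circuit in Fig. 2 under the conventions above: prepare the ancilla amplitudes β_k/N_c, apply the controlled Q_k, apply H⊗4, and post-select the ancilla on |0000⟩. The index ordering and the random test mask are assumptions of this example; the sketch only checks the linear algebra and is not a hardware implementation.

```python
import numpy as np

M = 8
rng = np.random.default_rng(1)
F = rng.random((M, M))
w = rng.random((3, 3))                        # w[nu-1, mu-1] = w_{nu,mu}, an arbitrary mask

P_down = np.roll(np.eye(M), 1, axis=0)        # cyclic shifts and identity
E = [P_down, np.eye(M), P_down.T]             # (these patterns reappear as E1, E2, E3 in Appendix B)

# U' = sum_k beta_k Q_k with Q_k = E_mu (x) E_nu and beta_k = w_{nu,mu}, Eqs. (9)-(10)
Q = [np.kron(E[mu], E[nu]) for mu in range(3) for nu in range(3)]
beta = np.array([w[nu, mu] for mu in range(3) for nu in range(3)])
Nc = np.linalg.norm(beta)
U_adj = sum(b * q for b, q in zip(beta, Q))

f = (F / np.linalg.norm(F)).flatten(order="F")          # amplitude-encoded |f>

anc = np.zeros(16)
anc[:9] = beta / Nc                                      # S|0000> = sum_k (beta_k/Nc)|k>, Eq. (11)
state = np.kron(anc, f)                                  # ancilla (x) work register

# Controlled-Q_k: apply Q_k when the ancilla is in basis state |k>, identity otherwise
dim = 16 * M * M
ctrlQ = np.zeros((dim, dim))
for k in range(16):
    proj = np.zeros((16, 16)); proj[k, k] = 1.0
    ctrlQ += np.kron(proj, Q[k] if k < 9 else np.eye(M * M))

H1 = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
H4 = np.kron(np.kron(H1, H1), np.kron(H1, H1))           # H^{(x)4} on the ancilla
state = np.kron(H4, np.eye(M * M)) @ (ctrlQ @ state)

g_post = state[: M * M]                                  # branch with the ancilla in |0000>
print(np.allclose(g_post, U_adj @ f / (4 * Nc)))         # True: proportional to U'|f>
print(float(np.sum(g_post ** 2)))                        # probability of measuring |0000>
```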

 
The final stage of our method is to extract useful information from the processed result |g′⟩. Clearly, the image state |g⟩ is different from |g′⟩. However, not all elements of |f⟩ are evaluated; the elements corresponding to the four edges of the original image remain unchanged. One is only interested in the pixel values that are evaluated by W in |f⟩. These pixel values in |g′⟩ are the same as those in |g⟩ (see details in Appendix A). So, we can obtain the information of G = (G_{i,j})_{M×M} (2 ≤ i, j ≤ M − 1) by evaluating |f⟩ under the operator U′ instead of U.

2.1.2 Quantum pooling layer
The function of the pooling layer after the convolutional layer is to reduce the spatial size of the representation and thereby reduce the number of parameters. We adopt average pooling, which calculates the average value for each patch of the feature map, as the pooling layer in our model. Consider a 2 × 2 pixel pooling operation applied with a stride of 2 pixels. It can be realized directly by ignoring the last qubit and the mth qubit in the quantum context. The input image |g′⟩ = (g_1, g_2, g_3, g_4, ..., g_{M^2})^T after this operation can be expressed as the output image

$$|p\rangle = \Big(\sqrt{g_1^2 + g_2^2 + g_{M+1}^2 + g_{M+2}^2},\ \sqrt{g_3^2 + g_4^2 + g_{M+3}^2 + g_{M+4}^2},\ \ldots,\ \sqrt{g_{M^2-M-1}^2 + g_{M^2-M}^2 + g_{M^2-1}^2 + g_{M^2}^2}\Big)^T. \qquad (16)$$
hij σzi σz (17)
ing which calculates the average value for each patch on i i,j
the feature map as pooling layers in our model. Consider
a 2 ∗ 2 pixel pooling operation applied with a stride of 2 where h0 , hi , hij are the parameters, and Roman indices i, j
pixels. It can be directly realized by ignoring the last qubit denote the qubit on which the operator acts, i.e., σzi means
and the mth qubit in quantum context. The input image Pauli matrix σz acting on a qubit at site i. We measure the
|g  = (g1 , g2 , g3 , g4 , . . . , . . . , gM2 )T after this operation can expectation value of the parametrized Hamiltonian f (p) =
be expressed as the output image p|H|p. As shown in [27], the local cost function f (p) is

f(p) is the final output of the whole quantum neural network. Then, we apply an activation function to nonlinearly map f(p) to R(f(p)). The parameters in the Hamiltonian H are updated by the gradient descent method, i.e., the gradients are calculated as ∂f(p)/∂h_i = ⟨p|σ_z^i|p⟩ and ∂f(p)/∂h_{ij} = ⟨p|σ_z^i σ_z^j|p⟩. We rewrite the cost function as

$$\begin{aligned} f(p) &= \frac{1}{N_c^2}\,\mathrm{Tr}\!\left[\sum_{i=1}^{16}\sum_{k=1}^{9}\sum_{i'=1}^{16}\sum_{k'=1}^{9} |i\rangle H^T_{(ik)} S_{(k1)} Q_k |f\rangle\langle f|\langle i'| H^T_{(i'k')} S_{(k'1)} Q^{\dagger}_{k'} H\right] \\ &= \frac{1}{N_c^2}\,\mathrm{Tr}\!\left[\sum_{i=1}^{16}\sum_{k=1}^{9}\sum_{k'=1}^{9} H^T_{(ik)} S_{(k1)} Q_k\, \rho_i\, H^T_{(ik')} S_{(k'1)} Q^{\dagger}_{k'} H\right], \end{aligned} \qquad (18)$$

where ρ_i = |f⟩|i⟩⟨i|⟨f|. From Eq. (18), the partial derivative of the cost function with respect to w_k is

$$\frac{\partial f(p)}{\partial w_k} = \frac{1}{N_c^2}\sum_{i=1}^{16}\sum_{k'=1}^{9} \mathrm{Tr}\!\left[ H^T_{(ik)} H^T_{(ik')} S_{(k'1)} \left( Q^{\dagger}_k H Q_{k'} + Q^{\dagger}_{k'} H Q_k \right)\rho_i \right].$$

Therefore, the parameters can be updated by measuring the expectation values of specific operators.

Now we have constructed the framework of the quantum neural network. We demonstrate the performance of our method in image processing and handwritten number recognition in the next section.

2.2 Numerical simulations
2.2.1 Image processing: edge detection, image smoothing, and sharpening
In addition to constructing the QCNN, the quantum convolutional layer can also be used for spatial filtering, a technique for image processing [25, 28–30] such as image smoothing, sharpening, edge detection, and edge enhancement. To show that the quantum convolutional layer can handle various image processing tasks, we demonstrate three types of image processing, namely edge detection, image smoothing, and sharpening, with the fixed filter masks W_de, W_sm, and W_sh respectively:

$$W_{de} = \begin{pmatrix} -1 & -1 & -1 \\ -1 & 8 & -1 \\ -1 & -1 & -1 \end{pmatrix},\quad
W_{sm} = \frac{1}{13}\begin{pmatrix} 1 & 1 & 1 \\ 1 & 5 & 1 \\ 1 & 1 & 1 \end{pmatrix},\quad
W_{sh} = \frac{1}{16}\begin{pmatrix} -2 & -2 & -2 \\ -2 & 32 & -2 \\ -2 & -2 & -2 \end{pmatrix}. \qquad (19)$$

In a spatial image processing task, we only need one specific filter mask. Therefore, after performing the quantum convolutional layer described above, we measure the ancillary register. If we obtain |0000⟩, our algorithm succeeds and the spatial filtering task is completed. The numerical simulation shows that the output images transformed by the classical and quantum convolutional layers are exactly the same, as shown in Fig. 3.

Fig. 3 Three types of image processing, edge detection, image smoothing, and sharpening, implemented on an image by the classical method and the quantum method, respectively.
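For reference, the sketch below applies the three masks of Eq. (19) classically; this is the output the quantum convolutional layer is designed to reproduce exactly on the interior pixels (Fig. 3). The 16 × 16 random grayscale image is an illustrative stand-in for a real photo.

```python
import numpy as np

W_de = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], dtype=float)           # edge detection
W_sm = np.array([[1, 1, 1], [1, 5, 1], [1, 1, 1]], dtype=float) / 13.0            # smoothing
W_sh = np.array([[-2, -2, -2], [-2, 32, -2], [-2, -2, -2]], dtype=float) / 16.0   # sharpening

def spatial_filter(F, W):
    """Classical reference: G_ij = sum_uv W_uv F_{i+u-2, j+v-2}; edge pixels unchanged."""
    M, N = F.shape
    G = F.copy()
    for i in range(1, M - 1):
        for j in range(1, N - 1):
            G[i, j] = np.sum(W * F[i - 1:i + 2, j - 1:j + 2])
    return G

rng = np.random.default_rng(4)
image = rng.random((16, 16))                   # toy grayscale image
for name, W in [("edge", W_de), ("smooth", W_sm), ("sharpen", W_sh)]:
    print(name, round(float(spatial_filter(image, W).mean()), 4))
```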
2.2.2 Handwritten number recognition
Here, we demonstrate a type of image recognition task on a real-world dataset, MNIST, a handwritten character dataset. In this case, we simulate a complete quantum convolutional neural network model, including a convolutional layer, a pooling layer, and a fully connected layer, as shown in Fig. 2. We consider the two-class image recognition task (recognizing the handwritten characters 1 and 8) and the ten-class image recognition task (recognizing the handwritten characters 0–9).

Meanwhile, considering the noise of NISQ quantum systems, we simulate two circumstances: the quantum gates Q_k are either perfect gates or gates with a certain noise. The noise is simulated by randomly acting with a single-qubit Pauli gate from [I, X, Y, Z], with a probability of 0.01, on the quantum circuit after each operation is implemented. In detail, a handwritten character image from MNIST has 28 × 28 pixels. For convenience, we pad the initial image with 0 at the edges until it is 32 × 32 pixels. Thus, the work register of the QCNN consists of 10 qubits, and the ancillary register needs 4 qubits. The convolutional layer is characterized by 9 learnable parameters in the matrix W, which is the same for the QCNN and the CNN. In the QCNN, we perform the pooling layer on the quantum circuit by abandoning the 4th and 9th qubits of the work register. In the CNN, we perform an average pooling layer directly. By measuring the expectation values of different Hamiltonians on the remaining work qubits, we obtain the measurement values. After putting them into an activation function, we get the final classification result. In the CNN, we use a two-layer fully connected neural network and an activation function. In the two-class problem, the QCNN's parametrized Hamiltonian has 37 learnable parameters, and the CNN's fully connected layer has 256 learnable parameters. A classification result close to 0 is classified as the handwritten character 1, and a result close to 1 is classified as the handwritten character 8. In the ten-class problem, the parametrized Hamiltonian has 10 × 37 learnable parameters and the CNN's fully connected layer has 10 × 256 learnable parameters. The result is a 10-dimensional vector, and the classification output is the index of the maximum element of the vector. Details of the parameters, accuracy, and gate complexity are listed in Table 1.

For the 2-class classification problem, the training set and the test set contain 5000 and 2100 images in total, respectively. For the 10-class classification problem, the training set and the test set contain 60000 and 10000 images in total, respectively. Because 100 images are randomly chosen in each epoch of a training process, with 50 epochs in total, the accuracy on the training set and on the test set fluctuates. We therefore repeatedly execute the noisy QCNN, the noise-free QCNN, and the CNN 100 times under the same construction. In this way, we obtain the average accuracy and the range of accuracy, as shown in Fig. 4. We can conclude from the numerical simulation results that the QCNN and the CNN provide similar performance, while the QCNN involves fewer parameters and has a smaller fluctuation range.

3 Algorithm complexity and trainability analysis
We analyze the computing resources in terms of gate complexity and qubit consumption. (1) Gate complexity. At the convolutional layer stage, we can prepare an initial state in O(poly(log2(M^2))) steps. In the case of preparing a particular input |f⟩, we employ the amplitude encoding method of [31–33]. It was shown that if the amplitudes c_k and P_k = Σ_k |c_k|^2 can be efficiently calculated by a classical algorithm, constructing the log2(M^2)-qubit state takes O(poly(log2 M^2)) steps. Alternatively, we can resort to quantum random access memory [34–36]. Quantum random access memory (qRAM) is an efficient method for state preparation, whose complexity is O(log2 M^2) once the quantum memory cells are established. Moreover, the controlled operations Q_k can be decomposed into O((log2 M)^6) basic gates (see details in Appendix B). In summary, our algorithm uses O((log2 M)^6) basic steps to realize the filter process in the convolutional layer. For the CNN, the complexity of implementing a classical convolutional layer is O(M^2); thus, our algorithm achieves an exponential speedup over the classical algorithm in gate complexity.
Table 1 The important parameters of the models

Model             Problem   Data set       Learnable parameters   Average accuracy   Gate complexity
Noisy QCNN        1 or 8    Training set   46                     0.948              O((log2 M)^6)
                            Test set                              0.960
                  0 ~ 9     Training set   379                    0.742
                            Test set                              0.740
Noise-free QCNN   1 or 8    Training set   46                     0.954
                            Test set                              0.963
                  0 ~ 9     Training set   379                    0.756
                            Test set                              0.743
CNN               1 or 8    Training set   265                    0.962              O(M^2)
                            Test set                              0.972
                  0 ~ 9     Training set   2569                   0.802
                            Test set                              0.804

Fig. 4 The performance of the QCNN on MNIST. The blue, red, and green curves denote the average accuracy of the noisy QCNN, the noise-free QCNN, and the CNN, respectively. The shaded areas of the corresponding color denote the accuracy fluctuation range over the 100 simulation runs. The insets are typical images from the MNIST set. a, b Results from the training set and the test set for the 2-class classification problem, respectively. c, d Results from the training set and the test set for the 10-class classification problem, respectively

 
The measurement complexity in the fully connected layer is O(e), where e is the number of parameters in the Hamiltonian.

(2) Memory consumption. The ancillary qubits used in the whole algorithm number O(log2 m^2), where m is the dimension of the filter mask, and the work qubits number O(log2 M^2). Thus, the total qubit resource needed is O(log2 m^2) + O(log2 M^2).

According to [27, 37–39], we can analyze the trainability of the parameters in our QCNN model by studying the scaling of the variance

$$\mathrm{Var}\!\left[\frac{\partial f(p)}{\partial w}\right] = \left\langle\left(\frac{\partial f(p)}{\partial w}\right)^{2}\right\rangle_S - \left\langle\frac{\partial f(p)}{\partial w}\right\rangle_S^{2}, \qquad (20)$$

where the expectation value ⟨···⟩_S is taken over the parameters in S [39, 40]. The cost will exhibit a barren plateau if the variance is exponentially small, which renders the circuit untrainable. In contrast, a large (polynomially small) variance indicates the absence of barren plateaus, so that the trainability of the parameters can be guaranteed.

The variance in our model is (see details in Appendix C)

$$\mathrm{Var}\!\left[\frac{\partial f(p)}{\partial w}\right] = \frac{1}{N_c^4}\left(\frac{1}{17}\Big(2\alpha_0^2 + \sum_i \alpha_i^2 + \sum_{ij}\alpha_{ij}^2\Big) - \alpha_0^2\right). \qquad (21)$$

If (2α_0^2 + Σ_i α_i^2 + Σ_{ij} α_{ij}^2 − α_0^2)/N_c^4 ∈ O(poly(log n)), then Var[∂f(p)/∂w] ∝ O(1/poly(log n)). This assumption is reasonable and easy to satisfy, because the parameters of a convolutional kernel, which is usually a 3 × 3 or 5 × 5 matrix, and hence N_c^4, are independent of the input image size. This implies that the cost function landscape does not present a barren plateau, and hence that this QCNN architecture is trainable under a convolutional kernel.
4 Discussion
In summary, we have designed a quantum neural network that provides exponential speed-ups over its classical counterpart in gate complexity. With fewer parameters, our model achieves performance similar to the classical algorithm in handwritten number recognition tasks. Therefore, this algorithm has significant advantages over classical algorithms for large data. We present two interesting and practical applications, image processing and handwritten number recognition, to demonstrate the validity of our method. The mapping relation between a specific classical convolutional kernel and a quantum circuit is given, which provides a bridge from CNN to QCNN. We analyze the trainability and the existence of barren plateaus in our QCNN model. It is a general algorithm and can be implemented on any programmable quantum computer, such as superconducting, trapped-ion, or photonic quantum computers. In the big data era, this algorithm has great potential to outperform its classical counterpart and to serve as an efficient solution.

Appendix A: The adjusted operator U′ provides enough information to remap the output image
Proof. The elements of the image matrix that differ after implementing the operator U′ instead of U lie at the edges of the image matrix. We prove that the evolution under the operator U′ provides enough information to remap the output image. The elements that differ between U and U′ are contained in

$$U'_{k,n} \neq U_{k,n} \ \text{for}\
\begin{cases}
1 \le k \le M; & 1 \le n \le 2M,\ \ M^2 - M \le n \le M^2, \\
M^2 - M \le k \le M^2; & 1 \le n \le M,\ \ M^2 - 3M \le n \le M^2 - M, \\
k = sM+1; & n = 1+(s-1)M,\ 2+(s-1)M,\ sM,\ sM+1,\ sM+2,\ (s+1)M,\ (s+1)M+1,\ (s+1)M+2,\ (s+2)M, \\
k = (s+1)M; & n = 1,\ sM-1,\ sM,\ sM+1,\ (s+1)M-2,\ (s+1)M-1,\ (s+1)M+1,\ (s+2)M-2,\ (s+2)M-1,
\end{cases} \qquad (22)$$

where 1 ≤ s ≤ M − 2.

After performing U and U′ on the quantum state |f⟩ respectively, the difference lies in the elements g′_k ≠ g_k (k = 1, 2, ..., M, sM+1, (s+1)M, M^2−M+1, ..., M^2), where 1 ≤ s ≤ M − 2. Since |g′⟩ can be remapped to G′, U′ will give the output image G′ = (G′_{i,j})_{M×M}. The elements of U′ that differ from U only affect the pixels with i, j ∉ {2, ..., M−1}. Thus, if and only if i, j ∈ {2, ..., M−1}, the matrix elements satisfy G′_{i,j} = G_{i,j}. Namely, the output image satisfies G′_{i,j} = G_{i,j} (2 ≤ i, j ≤ M − 1).

Appendix B: Decomposing the operator Q into basic gates
The nine operators Q_1, Q_2, ..., Q_9 constitute the filter operator U′. Each Q_k is the tensor product of two of the following three operators:

$$E_1 = \begin{pmatrix} & & & & 1 \\ 1 & & & & \\ & 1 & & & \\ & & \ddots & & \\ & & & 1 & \end{pmatrix}_{M\times M},\quad
E_2 = \begin{pmatrix} 1 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \end{pmatrix}_{M\times M},\quad
E_3 = \begin{pmatrix} & 1 & & & \\ & & 1 & & \\ & & & \ddots & \\ & & & & 1 \\ 1 & & & & \end{pmatrix}_{M\times M}. \qquad (23)$$

E_2 is an M × M identity matrix and does not need to be decomposed further. For convenience, consider the n-qubit operator E_1 with dimension M × M, where n = log2(M^2). It can be expressed as a combination of O(n^3) CNOT gates and Pauli X gates, as shown in Fig. 5. Consequently, E_3 can be decomposed into the inverse of the combination of basic gates shown in Fig. 5, because E_3 = E_1^†. Thus, Q_k can be implemented with no more than O(n^6) basic gates. In total, the controlled-Q_k operation can be implemented with no more than O(n^6) = O((log2 M)^6) basic gates (ignoring constant factors).

Fig. 5 Decomposition of the operator E1 in the form of basic gates
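To illustrate why E1 is cheap to implement, the sketch below builds E1 (the cyclic shift |k⟩ → |k+1 mod M⟩) from a cascade of multi-controlled X gates and checks it against the permutation matrix. This is one standard increment construction under an assumed qubit ordering; the paper's Fig. 5 further decomposes such multi-controlled gates into CNOT and X gates, which is not reproduced here.

```python
import numpy as np

def multi_controlled_x(m, target, controls):
    """Permutation matrix on m qubits (qubit 0 = least significant bit of the index)
    that flips `target` when all `controls` are 1."""
    dim = 2 ** m
    U = np.zeros((dim, dim))
    for x in range(dim):
        y = x ^ (1 << target) if all((x >> c) & 1 for c in controls) else x
        U[y, x] = 1.0
    return U

m = 3                                          # E1 acts on m = log2(M) qubits
M = 2 ** m

# Increment circuit: flip bit i controlled on all lower bits, from the top bit down,
# then flip the least significant bit unconditionally.
E1_circuit = np.eye(M)
for i in range(m - 1, 0, -1):
    E1_circuit = multi_controlled_x(m, i, list(range(i))) @ E1_circuit
E1_circuit = multi_controlled_x(m, 0, []) @ E1_circuit

E1 = np.roll(np.eye(M), 1, axis=0)             # |k> -> |k+1 mod M>, the shift of Eq. (23)
print(np.allclose(E1_circuit, E1))             # True: the gate cascade equals E1
print(np.allclose(E1.T, np.linalg.matrix_power(E1, M - 1)))   # E3 = E1^dagger is the inverse shift
```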
Appendix C: Trainability analysis of the QCNN model
Firstly, we recall the definition of a t-design. Consider a finite set S = {S_y}_{y∈Y} containing |Y| d-dimensional unitaries S_y, and let P_{t,t}(S) be a polynomial function of degree at most t in the matrix elements of S and of degree at most t in those of S†. Then we say that this finite set is a t-design if

$$\frac{1}{|Y|}\sum_{y\in Y} P_{t,t}(S_y) = \int d\mu(S)\, P_{t,t}(S), \qquad (24)$$

where the integral is over U(d) with respect to the Haar distribution. In our QCNN model, S forms a 2-design, and for any function F(S) and any unitary matrix A,

$$\int F(AS)\, d\mu(S) = \int F(S)\, d\mu(S). \qquad (25)$$
The average of the partial derivative of the cost function is

$$\left\langle \frac{\partial f(p)}{\partial w_k} \right\rangle_S = \frac{1}{N_c^2} \sum_{k'=1}^{9} \mathrm{Tr}\!\left[ H^T_{(ik)} H^T_{(ik')} S_{(k'1)} \left( Q_k^{\dagger} H Q_{k'} + Q_{k'}^{\dagger} H Q_k \right) \right] \mathrm{Tr}(|f\rangle\langle f|),$$

and Tr(|f⟩⟨f|) = 1. Consider the fact that H keeps the property of being constructed from Pauli product matrices under the transformation by Q_k, i.e., Σ_{k'=1}^{9} H^T_{(ik)} H^T_{(ik')} S_{(k'1)} (Q_k† H Q_{k'} + Q_{k'}† H Q_k) = H_new, where H_new = α_0 I + Σ_i α_i σ_z^i + Σ_{i,j} α_{ij} σ_z^i σ_z^j. Then we have Tr(H_new) = α_0, and

$$\left\langle \frac{\partial f(p)}{\partial w_k} \right\rangle_S = \frac{\alpha_0}{N_c^4}.$$

The expectation value of the squares of the gradients is

$$\begin{aligned}
\left\langle \left(\frac{\partial f(p)}{\partial w}\right)^2 \right\rangle_S &= \int d\mu(S)\, \frac{1}{N_c^4} \left[ \sum_{i=1}^{16}\sum_{k'=1}^{9} \mathrm{Tr}\!\left( H^T_{(ik)} H^T_{(ik')} S_{(k'1)} \left( Q_k^{\dagger} H Q_{k'} + Q_{k'}^{\dagger} H Q_k \right) \rho_i \right) \right]^2 \\
&= \frac{1}{N_c^4}\, \frac{1}{16^2 - 1} \Big[ \mathrm{Tr}(H_{\mathrm{new}})\,\mathrm{Tr}(|f\rangle\langle f|)\,\mathrm{Tr}(H_{\mathrm{new}})\,\mathrm{Tr}(|f\rangle\langle f|) + \mathrm{Tr}\!\big((H_{\mathrm{new}})^2\big)\,\mathrm{Tr}(|f\rangle\langle f|) \Big] \\
&\quad - \frac{1}{N_c^4}\, \frac{1}{16(16^2 - 1)} \Big[ \mathrm{Tr}\!\big((H_{\mathrm{new}})^2\big)\,\mathrm{Tr}(|f\rangle\langle f|)\,\mathrm{Tr}(|f\rangle\langle f|) + \mathrm{Tr}(H_{\mathrm{new}})\,\mathrm{Tr}(H_{\mathrm{new}})\,\mathrm{Tr}(|f\rangle\langle f|) \Big] \\
&= \frac{1}{N_c^4} \left[ \frac{1}{16^2-1}\big(\alpha_0^2 + \mathrm{Tr}((H_{\mathrm{new}})^2)\big) - \frac{1}{16(16^2-1)}\big(\alpha_0^2 + \mathrm{Tr}((H_{\mathrm{new}})^2)\big) \right] \\
&= \frac{1}{N_c^4}\, \frac{1}{17} \left( 2\alpha_0^2 + \sum_i \alpha_i^2 + \sum_{ij} \alpha_{ij}^2 \right).
\end{aligned}$$

Therefore, the variance is

$$\mathrm{Var}\!\left[\frac{\partial f(p)}{\partial w}\right] = \left\langle \left(\frac{\partial f(p)}{\partial w}\right)^2 \right\rangle_S - \left\langle \frac{\partial f(p)}{\partial w} \right\rangle_S^2 = \frac{1}{N_c^4}\left( \frac{1}{17}\Big(2\alpha_0^2 + \sum_i \alpha_i^2 + \sum_{ij} \alpha_{ij}^2\Big) - \alpha_0^2 \right). \qquad (26)$$

Acknowledgements
We thank X. Yao and X. Peng for inspiration and fruitful discussions.

Authors' contributions
S.W. formulated the theory. Y.C. and Z.Z. performed the calculation. All work was carried out under the supervision of G.L. All authors contributed to writing the manuscript. The authors read and approved the final manuscript.

Funding
This research was supported by the National Basic Research Program of China. S.W. acknowledges the China Postdoctoral Science Foundation (2020M670172) and the National Natural Science Foundation of China under Grant No. 12005015. We gratefully acknowledge support from the National Natural Science Foundation of China under Grants No. 11974205 and No. 11774197, the National Key Research and Development Program of China (2017YFA0303700), the Key Research and Development Program of Guangdong province (2018B030325002), and the Beijing Advanced Innovation Center for Future Chip (ICFC).

Availability of data and materials
The code used to generate the quantum circuit and implement the experiment is available on reasonable request.

Declarations

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Competing interests
The authors declare that they have no competing interests.

Author details
1 Beijing Academy of Quantum Information Sciences, Beijing 100193, China. 2 State Key Laboratory of Low-Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing 100084, China. 3 Institute of Information Photonics and Optical Communications, Beijing University of Posts and Telecommunications, Beijing 100876, China. 4 Beijing National Research Center for Information Science and Technology and School of Information, Tsinghua University, Beijing 100084, China. 5 Frontier Science Center for Quantum Information, Beijing 100084, China.

Received: 15 October 2021   Accepted: 6 December 2021

References
1. P. Benioff, The computer as a physical system: a microscopic quantum mechanical Hamiltonian model of computers as represented by Turing machines. J. Stat. Phys. 22(5), 563–591 (1980)
2. R. P. Feynman, Simulating physics with computers. Int. J. Theor. Phys. 21(6), 467–488 (1982)
3. P. W. Shor, Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM Rev. 41(2), 303–332 (1999)
4. L. K. Grover, Quantum mechanics helps in searching for a needle in a haystack. Phys. Rev. Lett. 79(2), 325 (1997)
5. G. L. Long, Grover algorithm with zero theoretical failure rate. Phys. Rev. A 64, 022307 (2001)
6. J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, S. Lloyd, Quantum machine learning. Nature 549(7671), 195–202 (2017)
7. V. Dunjko, J. M. Taylor, H. J. Briegel, Quantum-enhanced machine learning. Phys. Rev. Lett. 117(13), 130501 (2016)
8. N. Killoran, T. R. Bromley, J. M. Arrazola, M. Schuld, N. Quesada, S. Lloyd, Continuous-variable quantum neural networks. Phys. Rev. Res. 1(3), 033063 (2019)
9. J. Liu, K. H. Lim, K. L. Wood, W. Huang, C. Guo, H.-L. Huang, Hybrid quantum-classical convolutional neural networks. arXiv preprint arXiv:1911.02998 (2019)
10. F. Hu, B.-N. Wang, N. Wang, C. Wang, Quantum machine learning with d-wave quantum computer. Quantum Eng. 1(2), e12 (2019)
11. E. Farhi, H. Neven, Classification with quantum neural networks on near term processors. Quantum Rev. Lett. 1(2), 10–37686 (2020)
12. W. Huggins, P. Patil, B. Mitchell, K. B. Whaley, E. M. Stoudenmire, Towards quantum machine learning with tensor networks. Quantum Sci. Technol. 4(2), 024001 (2019)
13. X. Yuan, J. Sun, J. Liu, Q. Zhao, Y. Zhou, Quantum simulation with hybrid tensor networks. Phys. Rev. Lett. 127(4), 040501 (2021)
14. Y. Zhang, Q. Ni, Recent advances in quantum machine learning. Quantum Eng. 2(1), e34 (2020)
15. J.-G. Liu, L. Mao, P. Zhang, L. Wang, Solving quantum statistical mechanics with variational autoregressive networks and quantum circuits. Mach. Learn. Sci. Technol. 2(2), 025011 (2021)
16. E. Farhi, H. Neven, Classification with quantum neural networks on near term processors. arXiv preprint arXiv:1802.06002 (2018)

17. I. Cong, S. Choi, M. D. Lukin, Quantum convolutional neural networks. Nat. Phys. 15(12), 1273–1278 (2019)
18. B. C. Britt, Modeling viral diffusion using quantum computational network
simulation. Quantum Eng. 2(1), e29 (2020)
19. M. Schuld, N. Killoran, Quantum machine learning in feature hilbert
spaces. Phys. Rev. Lett. 122(4), 040504 (2019)
20. Y. Li, R.-G. Zhou, R. Xu, J. Luo, W. Hu, A quantum deep convolutional neural
network for image recognition. Quantum Sci. Technol. 5(4), 044003 (2020)
21. I. Goodfellow, Y. Bengio, A. Courville, Y. Bengio, Deep learning, volume 1.
(MIT press, Cambridge, 2016)
22. L. Gui-Lu, General quantum interference principle and duality computer.
Commun. Theor. Phys. 45(5), 825 (2006)
23. S. Gudder, Mathematical theory of duality quantum computers. Quantum
Inf. Process. 6(1), 37–48 (2007)
24. S.-J. Wei, G.-L. Long, Duality quantum computer and the efficient
quantum simulations. Quantum Inf. Process. 15(3), 1189–1212 (2016)
25. X.-W. Yao, H. Wang, Z. Liao, M.-C. Chen, J. Pan, J. Li, K. Zhang, X. Lin, Z.
Wang, Z. Luo, et al., Quantum image processing and its application to
edge detection: theory and experiment. Phys. Rev. X. 7(3), 031041 (2017)
26. T. Xin, S. Wei, J. Cui, J. Xiao, I. Arrazola, L. Lamata, X. Kong, D. Lu, E. Solano,
G. Long, Quantum algorithm for solving linear differential equations:
theory and experiment. Phys. Rev. A. 101(3), 032307 (2020)
27. M. Cerezo, A. Sone, T. Volkoff, L. Cincio, P. J. Coles, Cost function
dependent barren plateaus in shallow parametrized quantum circuits.
Nat. Comput. 12(1), 1–12 (2021)
28. F. Yan, A. M. Iliyasu, S. E. Venegas-Andraca, A survey of quantum image
representations. Quantum Inf. Process. 15(1), 1–35 (2016)
29. S. E. Venegas-Andraca, S. Bose, Storing, processing, and retrieving an
image using quantum mechanics. Inf. Comput. (2003)
30. P. Q. Le, F. Dong, K. Hirota, A flexible representation of quantum images
for polynomial preparation, image compression, and processing
operations. Quantum Inf. Process. 10(1), 63–84 (2011)
31. G.-L. Long, Y. Sun, Efficient scheme for initializing a quantum register with
an arbitrary superposed state. Phys. Rev. A. 64(1), 014303 (2001)
32. L. Grover, T. Rudolph, Creating superpositions that correspond to
efficiently integrable probability distributions. arXiv preprint
quant-ph/0208112 (2002)
33. A. N. Soklakov, R. Schack, Efficient state preparation for a register of
quantum bits. Phys. Rev. A. 73(1), 012307 (2006)
34. V. Giovannetti, S. Lloyd, L. Maccone, Quantum random access memory.
Phys. Rev. Lett. 100(16), 160501 (2008)
35. V. Giovannetti, S. Lloyd, L. Maccone, Architectures for a quantum random
access memory. Phys. Rev. A. 78(5), 052310 (2008)
36. S. Arunachalam, V. Gheorghiu, T. Jochym-O’Connor, M. Mosca, P. V.
Srinivasan, On the robustness of bucket brigade quantum ram. New J.
Phys. 17(12), 123010 (2015)
37. J. R. McClean, S. Boixo, V. N. Smelyanskiy, R. Babbush, H. Neven, Barren
plateaus in quantum neural network training landscapes. Nat. Commun.
9(1), 1–6 (2018)
38. K. Sharma, M. Cerezo, L. Cincio, P. J. Coles, Trainability of dissipative
perceptron-based quantum neural networks. arXiv preprint
arXiv:2005.12458 (2020)
39. A. Pesah, M. Cerezo, S. Wang, T. Volkoff, A. T. Sornborger, P. J. Coles,
Absence of barren plateaus in quantum convolutional neural networks.
Phys. Rev. X. 11(4), 041011 (2021)
40. B. Collins, P. Śniady, Integration with respect to the haar measure on
unitary, orthogonal and symplectic group. Commun. Math. Phys. 264(3),
773–795 (2006)

Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.
