
Quantum Natural Language Processing based Sentiment Analysis using lambeq Toolkit


arXiv:2305.19383v1 [quant-ph] 30 May 2023

Srinjoy Ganguly
Escuela Técnica Superior de Ingeniería de Sistemas Informáticos
Universidad Politécnica de Madrid
Madrid, Spain
srinjoyganguly@gmail.com

Sai Nandan Morapakula
Electrical and Electronics Engineering
Karunya Institute of Technology and Sciences
Coimbatore, India
sainandanm2002@gmail.com

Luis Miguel Pozo Coronado
Escuela Técnica Superior de Ingeniería de Sistemas Informáticos
Universidad Politécnica de Madrid
Madrid, Spain
lm.pozo@upm.es

Abstract—Sentiment classification is one of the best use cases of classical natural language processing (NLP), where we can witness its power in various daily-life domains such as banking, business and the marketing industry. We already know how classical AI and machine learning can change and improve technology. Quantum natural language processing (QNLP) is a young and gradually emerging technology which has the potential to provide quantum advantage for NLP tasks. In this paper we show the first application of QNLP for sentiment analysis and achieve perfect test set accuracy for three different kinds of simulations, and a decent accuracy for experiments run on a noisy quantum device. We utilize the lambeq QNLP toolkit and t|ket> by Cambridge Quantum (Quantinuum) to bring out the results.

Index Terms—Quantum Computing, Quantum Natural Language Processing, lambeq

I. INTRODUCTION

Taking computational speeds and performance into consideration, quantum computers promise exponential speedups over present-generation classical computers for certain problems. Until the last two decades, or the end of the 20th century, the quantum computer was a fictional story developed by great mathematicians and physicists such as Richard Feynman, Erwin Schrödinger, David Deutsch, etc. In the early 21st century quantum computing gained importance, and the fictional story was widely read, understood and theoretically well developed. Recently, the first quantum computers have been built, some of them have been made publicly available, and they have already shown their significance and power in various domains like machine learning, chemistry, natural language processing, biomedicine, etc. In this paper we will see how quantum computers can help us improve the domain of natural language processing.

Quantum computing is itself a nascent field, and so is Quantum Natural Language Processing (QNLP): we take the phenomena of superposition, entanglement and interference to our own advantage and run NLP models or language-related tasks on the hardware. At present we are in the Noisy Intermediate-Scale Quantum (NISQ) [1] era, where the error rate grows with the number of qubits and the information a qubit contains can easily be lost, which is why quantum computers are kept at very low temperatures and require great maintenance.

QNLP is different from classical NLP. QNLP has its origins in abstract mathematical theory, which includes category theory (especially monoidal categories), diagrammatic quantum theory and the ZX-calculus. To gain more understanding of the concepts of diagrammatic quantum theory, the reader can refer to [2], which explains the fundamentals of diagrammatic reasoning for quantum theory and is at the core of QNLP. Since a model of natural language turns out to be equivalent to a model which explains quantum mechanical phenomena, this approach makes QNLP quantum-native. In this way linguistic structure can be encoded easily, whereas encoding grammar classically is very costly.

In this paper we are going to see how accurately quantum computers can predict sentiments after being trained on around 130 sentences. We will also see how classical computers and quantum computers with embedded noise produce results, and compare them to get a better understanding of why we need quantum computers and how powerful they are.

The rest of the paper is ordered as follows: Section 2 gives an introduction to the related work and the research going on in this field; Section 3 gives a clear picture and brief intuition on QNLP and also explains the sentiment classification experiment; in Section 4 we discuss the results of the classical, quantum and noisy-quantum pipelines; Section 5 summarises the work and proposes some future lines of work in the domain of QNLP.

II. RELATED WORK

As researchers, scientists and enthusiasts identified the capability of quantum devices and the power they offer, more effort and time has been put into this domain. Despite QNLP being a new and emerging field, Noisy Intermediate-Scale Quantum (NISQ) devices have already led to some propitious results [3] and have also been applied to divergent fields such as quantum music [4].
As we all know, the applications of classical NLP are already seen in our day-to-day lives; voice assistants such as Siri and Alexa are the best examples. The problem and constraint with classical NLP is that it can only read and decipher bits but cannot deeply understand the meaning of the language, and that is where there is scope for quantum methods to do this in a meaning-aware manner.

In QNLP, sentences are depicted by variational quantum circuits, and every word in a sentence is transformed into quantum states by using parameterized quantum gates. With the help of the variational circuit technique, QNLP becomes NISQ-friendly [5].

Scientists and researchers at Cambridge Quantum developed the first high-level Python framework for QNLP, named lambeq. This unique toolkit is an open-source package which offers the functionality of converting sentences into quantum circuits [6].

The first effort at designing and executing natural language models on a real quantum computer was accomplished by Cambridge Quantum, where they used an intermediate dataset containing sentences of two different classes (Food or IT), and the results obtained were profound, as given in [7].

In this work, we present the first application of QNLP for sentiment analysis on an intermediate-level dataset, where we achieve successful results in a binary classification of sentiments: positive and negative sentences. We demonstrate that for both classical and quantum simulations we achieve proper convergence.

III. METHODOLOGY AND EXPERIMENTS

We have utilized the Distributional Compositional Categorical (DisCoCat) [8] framework for our task of sentiment classification. The DisCoCat framework provides a unique way of combining different constituent words to form the meaning of a whole sentence. It follows a compositional mechanism for the grammar to entangle word meanings which are distributed in a vector space. Lambek's pregroup grammar is used in DisCoCat to retrieve the meaning of a sentence from the meaning of the words.

A. Compositional Model of Meaning

The compositional model of meaning is inspired by Lambek's pregroup grammar. In this formalism, we assign atomic types 'n' for noun and 's' for sentence to the words present in a sentence, which assists in the composition of the word meanings. The grammatical rules to compose different types of sentences can be found in [9], and the string diagrams shown here have utilized those rules given by Lambek.

Fig. 1 shows the string diagram for the sentence "siva hates thrilling comics", where the words are represented by boxes (alternatively, triangle-shaped boxes) and the wires (cups and straight wires) represent the entangling effect which composes the words together to give the meaning of the sentence, which is the grammar. The juxtaposition of the atomic types reduces to 's', which signifies that the sentence is grammatically correct. This juxtaposition is solved in (1).

Fig. 1. String diagram of "siva hates thrilling comics".

n · nʳ · s · nˡ · n · nˡ · n → 1 · s · 1 · 1 → s    (1)

Due to today's NISQ devices, i.e. their small number of qubits and qubit decoherence, and since the string diagrams themselves are resource intensive, they need to be rewritten into a more NISQ-friendly version which consumes fewer qubits to represent the sentences. Since cups consume twice the number of qubits assigned to either 'n' or 's', removing the cups from the original string diagrams results in a diagram which consumes fewer resources and is better adapted to today's NISQ hardware. From Fig. 2 it can be seen that the diagram has been reduced in size by removing the cups.

Fig. 2. Rewritten string diagram of "siva hates thrilling comics".
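The reduction in (1) is mechanical enough to check in code. The sketch below (plain Python; a hypothetical illustration, not lambeq's API) encodes each atomic type as a base symbol with an optional adjoint marker and repeatedly cancels adjacent x · xʳ and xˡ · x pairs, reporting whether the sequence reduces to the sentence type 's':

```python
def parse_type(t):
    # "n" -> ("n", ""), "n.r" -> ("n", "r"), "n.l" -> ("n", "l")
    base, _, adj = t.partition(".")
    return (base, adj)

def reduces_to_sentence(types):
    """Cancel adjacent x x.r -> 1 and x.l x -> 1 until nothing changes;
    the sequence is grammatical iff only the sentence type 's' remains."""
    seq = [parse_type(t) for t in types]
    changed = True
    while changed:
        changed = False
        for i in range(len(seq) - 1):
            (b1, a1), (b2, a2) = seq[i], seq[i + 1]
            if b1 == b2 and ((a1 == "" and a2 == "r") or (a1 == "l" and a2 == "")):
                del seq[i:i + 2]
                changed = True
                break
    return seq == [("s", "")]

# "siva hates thrilling comics": n · (n.r s n.l) · (n n.l) · n, as in (1)
types = ["n", "n.r", "s", "n.l", "n", "n.l", "n"]
print(reduces_to_sentence(types))  # True: grammatically correct
```

Running the checker on the type sequence of Fig. 1 reproduces the reduction in (1): the noun cancels the verb's right adjoint, the adjective and object nouns cancel the left adjoints, and only 's' survives.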
B. Language into Quantum Circuits

After we have designed the string diagrams for the language, we have to transform them into quantum circuits in order to run them on simulators and quantum hardware. The compositional model of meaning described before follows a bottom-up approach, i.e. composing words to form the meaning of a sentence. On the contrary, language in the form of quantum circuits follows a top-down approach, i.e. inferring the meaning of words using the meaning of sentences. This top-down approach is valid because for training quantum circuits a dataset of sentences with labels is provided, and from that the meaning of words is inferred.

Fig. 3. Circuit diagram of "siva hates thrilling comics".

Fig. 3 refers to the quantum circuit of the string diagram shown in Fig. 1. The upper triangles (facing upwards) are called "states" or |states⟩, and the lower triangles (facing downwards) are known as "effects" or ⟨effects|. It can be seen that there are 7 states, i.e. 7 qubits, and 6 effects, i.e. 6 qubits for post-selected measurement. This means that 6 out of 7 qubits need to be used for measurement.

Fig. 3 shows that Hadamard gates and CNOT gates are used to create the entangling effects, or cups, present in the string diagram. The nouns such as "siva" and "comics" have been converted into circuit form using the parametrized quantum gates Rz(α) and Rx(α). Words such as "hates" and "thrilling" are denoted by parameterized controlled-Rz gates. This quantum circuit is an Instantaneous Quantum Polynomial (IQP) [10] circuit, which consists of fixed Hadamard gates, parametrized single-qubit gates and controlled two-qubit quantum gates. Since the IQP circuit consists of parametrized gates which can vary or modify their output based on input parameters, it is an example of a variational quantum circuit.

Fig. 4. Rewritten circuit diagram of "siva hates thrilling comics".

According to the quantum circuit of the original string diagram, we require 7-qubit quantum hardware to run the sentence on a quantum computer. Fig. 4 displays the quantum circuit of the diagram in Fig. 2, and it can be deciphered that there are 4 qubits (states) and 3 qubits for post-selected measurement. This is a great reduction in qubits: instead of 6 of 7, we get 3 of 4 after removing the cups by using the rewriting technique. Therefore the rewritten circuit can be utilized on NISQ devices.

C. Experimental Details

For conducting sentiment analysis on a quantum computer we have used a binary sentiment classification dataset which contains positive and negative sentiments of candidates on reading various book genres such as fiction, nonfiction, comics and classics. A label of 0 is assigned to positive sentiments and a label of 1 is assigned to negative sentiments. The dataset consists of 130 sentences, out of which 70 are in the training set, 30 in the development set and 30 in the test set. There are 7 nouns, 3 adjectives and 5 verbs in total for the sentences present in the dataset.

We have employed lambeq [6], the world's first QNLP toolkit, for carrying out our experiments. This toolkit provides a
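The bookkeeping of states, effects and post-selection can be made concrete with a toy state-vector simulation. The sketch below is plain Python with no quantum libraries, a stand-in for what lambeq and t|ket> do at scale, and all function names are hypothetical: it applies an IQP-style layer (Hadamards followed by a parameterized controlled-phase entangler) to two qubits, then post-selects one qubit on |0⟩ the way the circuit's effects do.

```python
import cmath
import math

H = [[1 / math.sqrt(2), 1 / math.sqrt(2)],
     [1 / math.sqrt(2), -1 / math.sqrt(2)]]

def apply_1q(state, gate, q, n):
    """Apply a 2x2 gate to qubit q (0-indexed from the left) of an n-qubit state."""
    out = state[:]
    mask = 1 << (n - 1 - q)
    for i in range(len(state)):
        if not i & mask:
            j = i | mask
            a0, a1 = state[i], state[j]
            out[i] = gate[0][0] * a0 + gate[0][1] * a1
            out[j] = gate[1][0] * a0 + gate[1][1] * a1
    return out

def apply_cphase(state, theta):
    """Parameterized entangler: phase e^{i theta} on the |11> amplitude (2 qubits)."""
    out = state[:]
    out[3] = state[3] * cmath.exp(1j * theta)
    return out

theta = 0.7                      # a trainable circuit parameter
state = [1, 0, 0, 0]             # |00>
state = apply_1q(state, H, 0, 2)
state = apply_1q(state, H, 1, 2)
state = apply_cphase(state, theta)

# Post-select qubit 1 on |0>: keep only the |00> and |10> amplitudes,
# renormalize, and read out the distribution over qubit 0.
p0 = abs(state[0]) ** 2          # |00>
p1 = abs(state[2]) ** 2          # |10>
norm = p0 + p1
print(round(p0 / norm, 3), round(p1 / norm, 3))
```

Note how post-selection discards half the amplitudes before renormalizing; this is why each effect in Fig. 3 ties up a measured qubit, and why removing cups (fewer effects) makes the circuit cheaper to run.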
convenient way of converting string diagrams into quantum circuits and then using those circuits, one per sentence, for training on a quantum computer. The toolkit itself is based on the Python programming language and has unique features: it is high level; open source, with code available on GitHub; modular, giving independent modules for greater flexibility; extensive, with an object-oriented design; and interoperable, allowing simple communication with other packages.

Fig. 5. lambeq QNLP Toolkit Pipeline. Fig. from [6].

The lambeq pipeline shown in Fig. 5 is the general process for QNLP training. A sentence is first parsed by a parser and then converted into a string diagram. Here lambeq uses the state-of-the-art DepCCGParser given in [11] to parse the sentences into CCG format and then converts them to string diagrams. The process of converting CCG to string diagrams and vice versa has been explained in [12] by considering CCG as a biclosed category.

After the sentence is converted into a string diagram, it is converted into a quantum circuit based on the ansatz chosen in lambeq. There are several ansätze which lambeq provides, such as SpiderAnsatz, TensorAnsatz, IQPAnsatz, etc. For each sentence present in the dataset a circuit is formed, and for all the sentences in the dataset those circuits are stored in a list. Based on the optimization scheme chosen, these circuits are sent to the simulator or quantum hardware for training.

The training process in QNLP is very similar to that of a classical machine learning method. The circuits are run one by one, and the measurements collected from each of the circuits are turned into prediction labels using classical post-processing. These prediction labels are compared with the true labels using a suitable cost function, and the output of the cost function is fed into a classical optimizer which calculates new parameters for the quantum gates. These modified parameters are fed back into the variational circuit, and the process repeats until convergence.

IV. RESULTS AND DISCUSSIONS

We apply QNLP using the lambeq toolkit to our sentiment analysis dataset, and we cover four simulation types: a classical pipeline, in which the sentences in our dataset are modeled as tensor networks; quantum pipeline simulation without noise; quantum pipeline simulation using JAX, a powerful scientific computing library used for automatic differentiation; and quantum pipeline simulation with noise, using IBM Qiskit's fake hardware simulator. We have used FakeVigo as the fake hardware backend simulator for this experiment; it can be changed according to one's requirements.

A. Classical Simulation

In all four experiments, we first convert each sentence present in our dataset into string diagrams using the DepCCGParser. Once the sentences are converted into string diagrams, we apply the Spider ansatz, by which the noun and sentence spaces each receive a dimension of 2. The PyTorch backend is used for the training, with the Adam optimizer.

Fig. 6. Results of classical pipeline simulation.

The plots for accuracy obtained on the training and development sets are shown in Fig. 6. We have obtained perfect accuracy on the test set for this case.

B. Noiseless Quantum Simulation

This experiment is not very different from the classical one. We use the same configuration as in the classical pipeline; however, we change the backend from PyTorch to IBM Qiskit's Aer simulator, which is accessible through pytket, t|ket>'s Python interface. We use a gradient-approximation technique called Simultaneous Perturbation Stochastic Approximation (SPSA) [13]. The reason for choosing this optimizer is that SPSA does not calculate the gradient of a quantum circuit but rather approximates it. Evaluating gradients on quantum hardware by differentiating quantum circuits is very resource intensive, and this is where SPSA comes to the rescue.

Even though there is a lot of instability during the early stages of training, as shown in Fig. 7, the model eventually converges to good accuracy. The test set accuracy using the quantum pipeline is also perfect, but the performance varies depending upon the number of iterations.

C. Quantum Simulation with JAX

The string diagrams are changed into variational quantum circuits using the IQP ansatz. The noun and sentence types get a single qubit each, and the number of IQP layers is set to 1. We use JAX because the prediction functions are compiled with great speed, so JAX takes a very short time to run and produce the results.

Results can be seen in Fig. 8. Even though we obtain perfect accuracy using JAX, this configuration needs 4 times
the number of iterations of the noiseless quantum pipeline to achieve this feat.

Fig. 7. Results of quantum noiseless pipeline simulation.

Fig. 8. Results of quantum pipeline with JAX simulation.

D. Noisy Quantum Simulation

Running circuits on real hardware, taking into account the 130 circuits needed, would be quite difficult and time consuming. So, to make this process a bit simpler, we have used Qiskit's fake quantum hardware backend called FakeVigo with the help of t|ket>. The FakeVigo hardware has 5 qubits and is easily able to run our circuits because of the diagram rewriting we have employed. Apart from the added noise, everything else is the same as in the noiseless pipeline.

The test accuracy is not perfect for the noisy quantum simulation, as can be seen in Fig. 9: we achieved 83.33% accuracy on the test set. Attaining perfect accuracy would require even more iterations; to compare and show how the noisy quantum simulation differs from the noiseless one, we have not rerun the experiment with an increased number of iterations.

Fig. 9. Results of quantum noisy pipeline simulation.

V. CONCLUSIONS AND FUTURE WORK

In this paper, we have shown the first application of QNLP to binary sentiment classification, using the lambeq QNLP toolkit on an intermediate dataset consisting of book-genre sentiments. We were able to achieve successful convergence for all the simulations carried out using the classical and quantum pipelines. Perfect accuracy on the test set was achieved for three simulations, and a decent accuracy was obtained for the noisy quantum pipeline case.

QNLP is a new field, and much work needs to be done in order to achieve quantum advantage. The current work can be extended by including more nouns, adjectives and verbs for each of the sentiments in the dataset; this will increase the parameter space. We have performed binary sentiment classification, so another direction would be multi-class sentiment classification, including neutral sentiments as well. It would also be a great direction of research to include random sentences (not following a particular pattern) in the dataset, which is of interest to us as it will provide intuition about the scalability aspects of QNLP.

ACKNOWLEDGMENT

S. G. is very grateful to L. M. P. C. for his guidance and suggestions for improvement of the experiments, and to Universidad Politécnica de Madrid for supporting this research. S. N. M. thanks Karunya Institute of Technology and Sciences for letting him explore research directions in the field of quantum technology. The authors acknowledge the use of Google Colab Pro for carrying out the experiments and the libraries lambeq & t|ket> from Cambridge Quantum (Quantinuum).

REFERENCES

[1] J. Preskill, "Quantum computing in the NISQ era and beyond," Quantum, vol. 2, p. 79, Aug 2018. [Online]. Available: http://dx.doi.org/10.22331/q-2018-08-06-79
[2] B. Coecke and A. Kissinger, Picturing Quantum Processes: A First Course in Quantum Theory and Diagrammatic Reasoning. Cambridge University Press, 2017.
[3] W. Zeng and B. Coecke, "Quantum algorithms for compositional natural language processing," Electronic Proceedings in Theoretical Computer Science, vol. 221, pp. 67–75, Aug 2016. [Online]. Available: http://dx.doi.org/10.4204/EPTCS.221.8
[4] E. R. Miranda, R. Yeung, A. Pearson, K. Meichanetzidis, and B. Coecke, "A quantum natural language processing approach to musical intelligence," 2021.
[5] B. Coecke, G. de Felice, K. Meichanetzidis, and A. Toumi, "Foundations for near-term quantum natural language processing," 2020.
[6] D. Kartsaklis, I. Fan, R. Yeung, A. Pearson, R. Lorenz, A. Toumi, G. de Felice, K. Meichanetzidis, S. Clark, and B. Coecke, "lambeq: An efficient high-level Python library for quantum NLP," 2021.
[7] R. Lorenz, A. Pearson, K. Meichanetzidis, D. Kartsaklis, and B. Coecke, "QNLP in practice: Running compositional models of meaning on a quantum computer," 2021.
[8] B. Coecke, M. Sadrzadeh, and S. Clark, "Mathematical foundations for a compositional distributional model of meaning," 2010.
[9] J. Lambek, From Word to Sentence: A Computational Algebraic Approach to Grammar, ser. Open Access Publications. Polimetrica, 2008. [Online]. Available: https://books.google.co.in/books?id=ZHgRaRaadJ4C
[10] V. Havlíček, A. D. Córcoles, K. Temme, A. W. Harrow, A. Kandala, J. M. Chow, and J. M. Gambetta, "Supervised learning with quantum-enhanced feature spaces," Nature, vol. 567, no. 7747, pp. 209–212, Mar 2019. [Online]. Available: http://dx.doi.org/10.1038/s41586-019-0980-2
[11] M. Yoshikawa, H. Noji, and Y. Matsumoto, "A* CCG parsing with a supertag and dependency factored model," in Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Vancouver, Canada: Association for Computational Linguistics, Jul. 2017, pp. 277–287. [Online]. Available: https://aclanthology.org/P17-1026
[12] R. Yeung and D. Kartsaklis, "A CCG-based version of the DisCoCat framework," 2021.
[13] J. Spall, "Multivariate stochastic approximation using a simultaneous perturbation gradient approximation," IEEE Transactions on Automatic Control, vol. 37, no. 3, pp. 332–341, 1992.
