What Is NLP?: Natural Language Processing in AI
What is NLP?
- NLP stands for Natural Language Processing, a field at the intersection of computer science, human
language, and artificial intelligence.
- It is the technology that machines use to understand, analyze, manipulate, and interpret
human languages.
- It helps developers to organize knowledge for performing tasks such as translation, automatic
summarization, Named Entity Recognition (NER), speech recognition, relationship
extraction, and topic segmentation.
- Natural Language Processing (NLP) is the computer analysis of input provided in a
human language (natural language) and the conversion of this input into a useful form of
representation.
- NLP is a field of AI that processes or analyzes written or spoken language.
- NLP involves the processing of speech, grammar, and meaning.
Advantages of NLP
o NLP lets users ask questions about any subject and get a direct response within seconds.
o NLP offers exact answers to questions; that is, it does not return unnecessary or unwanted
information.
o NLP helps computers communicate with humans in their own languages.
o It is very time efficient.
o Many companies use NLP to improve the efficiency and accuracy of documentation processes and to
identify information in large databases.
Disadvantages of NLP
A list of disadvantages of NLP is given below:
o NLP may not show context.
o NLP is unpredictable.
o NLP may require more keystrokes.
o An NLP system may be unable to adapt to a new domain and has limited functionality, which is why
such systems are built for a single, specific task only.
Components of NLP
There are three components of NLP:
1. Speech Recognition — The translation of spoken language into text.
Applications of NLP
NLP has the following applications:
1. Question Answering: Question Answering focuses on building systems that automatically answer the
questions asked by humans in a natural language.
2. Spam Detection: Spam detection identifies unwanted e-mails before they reach a user's inbox.
3. Sentiment Analysis: Sentiment analysis, also known as opinion mining, is used on the web to analyze
the attitude, behavior, and emotional state of the sender. It is implemented through a combination of
NLP and statistics: a value is assigned to the text (positive, negative, or neutral) and the mood of
the context is identified (happy, sad, angry, etc.).
4. Machine Translation: Machine translation is used to translate text or speech from one natural language to
another natural language.
5. Spelling correction: Software such as Microsoft Word and PowerPoint uses NLP to provide
spelling correction.
6. Speech Recognition: Speech recognition is used for converting spoken words into text. It is used in
applications, such as mobile, home automation, video recovery, dictating to Microsoft Word, voice
biometrics, voice user interface, and so on.
7. Chatbot: Implementing chatbots is one of the important applications of NLP. Many
companies use them to provide customer chat services.
8. Information extraction: Information extraction is one of the most important applications of NLP. It is used
for extracting structured information from unstructured or semi-structured machine-readable documents.
9. Natural Language Understanding (NLU): It converts large sets of text into more formal representations,
such as first-order logic structures, that are easier for computer programs to manipulate than
natural language itself.
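As a rough illustration of the sentiment analysis item above, the sketch below assigns a value (positive, negative, or neutral) to a text by counting words from two small word lists. The word lists and the function name `classify_sentiment` are assumptions made up for this example; real systems rely on much larger lexicons or on trained statistical models.

```python
import re

# Hypothetical, deliberately tiny word lists (assumptions for illustration only).
POSITIVE_WORDS = {"good", "great", "happy", "love", "excellent"}
NEGATIVE_WORDS = {"bad", "sad", "angry", "hate", "terrible"}


def classify_sentiment(text: str) -> str:
    """Label text as 'positive', 'negative', or 'neutral' by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(word in POSITIVE_WORDS for word in words)
    negative_hits = sum(word in NEGATIVE_WORDS for word in words)
    score = positive_hits - negative_hits
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(classify_sentiment("I love this phone, the camera is great"))
print(classify_sentiment("The delivery was terrible and I am angry"))
print(classify_sentiment("The parcel arrived on Tuesday"))
```

A plain word count like this ignores negation ("not good") and context, which is one reason practical systems combine NLP with statistics, as noted above.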
Phases of NLP
There are the following five phases of NLP:
1. Lexical Analysis: The first phase of NLP is lexical analysis. This phase
scans the input text as a stream of characters and converts it into meaningful
lexemes. It divides the whole text into paragraphs, sentences, and words.
4. Discourse Integration: In discourse integration, the meaning of a sentence depends upon the sentences
that precede it and also influences the meaning of the sentences that follow it.
5. Pragmatic Analysis: Pragmatic analysis is the fifth and last phase of NLP. It helps you to discover the
intended effect by applying a set of rules that characterize cooperative dialogues.
For example: "Open the door" is interpreted as a request instead of an order.
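The lexical analysis phase (phase 1 above) can be sketched with two regular expressions, one that splits text into sentences and one that splits a sentence into words. This is a minimal sketch: the function names `split_sentences` and `split_words` are assumptions for this example, and the sentence rule (a break after '.', '!' or '?') is a simplification that mishandles abbreviations such as "e.g.".

```python
import re


def split_sentences(text: str) -> list[str]:
    """Split text into sentences at '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def split_words(sentence: str) -> list[str]:
    """Return the word tokens of a sentence, dropping punctuation."""
    return re.findall(r"[A-Za-z0-9']+", sentence)


text = "The bird pecks the grains. Open the door!"
for sentence in split_sentences(text):
    print(sentence, "->", split_words(sentence))
```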
Context-Free Grammar
Top-Down Parser
Let us see them in detail −
Context-Free Grammar
It is a grammar in which every rewrite rule has a single symbol on its left-hand side.
A CFG consists of -
1. a set of non-terminal symbols
2. a set of terminal symbols
3. a set of rules (productions), where the LHS (mother) is a single non-terminal and the RHS is a
sequence of one or more non-terminal or terminal symbols
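The three parts listed above map naturally onto a small data structure. The sketch below is one possible encoding, not a standard API: the class name `Grammar`, its field names, and the toy grammar are all assumptions made for illustration. The `is_context_free` method checks the two conditions from item 3: each left-hand side is a single non-terminal, and each right-hand side is a non-empty sequence of known symbols.

```python
from dataclasses import dataclass


@dataclass
class Grammar:
    non_terminals: frozenset[str]
    terminals: frozenset[str]
    # Each key is one non-terminal (LHS); each value lists its alternative RHS sequences.
    rules: dict[str, list[tuple[str, ...]]]

    def is_context_free(self) -> bool:
        """Check that every LHS is a non-terminal and every RHS is a non-empty run of known symbols."""
        symbols = self.non_terminals | self.terminals
        for lhs, alternatives in self.rules.items():
            if lhs not in self.non_terminals:
                return False
            for rhs in alternatives:
                if not rhs or any(symbol not in symbols for symbol in rhs):
                    return False
        return True


# A toy grammar (hypothetical) that generates only "the bird".
toy = Grammar(
    non_terminals=frozenset({"S", "DET", "N"}),
    terminals=frozenset({"the", "bird"}),
    rules={"S": [("DET", "N")], "DET": [("the",)], "N": [("bird",)]},
)
print(toy.is_context_free())
```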
Let us create grammar to parse a sentence −
“The bird pecks the grains”
Articles (DET) − a | an | the
Nouns − bird | birds | grain | grains
Noun Phrase (NP) − Article + Noun | Article + Adjective + Noun = DET N | DET ADJ N
Verbs − pecks | pecking | pecked
Verb Phrase (VP) − NP V | V NP
Adjectives (ADJ) − beautiful | small | chirping
The parse tree breaks down the sentence into structured parts so that the computer can easily understand
and process it. For the parsing algorithm to construct this parse tree, a set of rewrite rules, which
describe what tree structures are legal, needs to be constructed.
These rules say that a certain symbol may be expanded in the tree by a sequence of other symbols. For
example, if there are two strings, a Noun Phrase (NP) and a Verb Phrase (VP), then the string
formed by NP followed by VP is a sentence. The rewrite rules for the sentence are as follows −
S → NP VP
NP → DET N | DET ADJ N
VP → V NP
DET → a | the
ADJ → beautiful | perching
N → bird | birds | grain | grains
V → peck | pecks | pecking
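The parse tree for "The bird pecks the grains" can be written in bracketed form as:
[S [NP [DET the] [N bird]] [VP [V pecks] [NP [DET the] [N grains]]]]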
Now consider the above rewrite rules. Since V can be replaced by either "peck" or "pecks", sentences such as
"The bird peck the grains" are wrongly permitted; that is, a subject-verb agreement error is accepted as
correct.
Merit − It is the simplest style of grammar and therefore a widely used one.
Demerits −
They are not highly precise. For example, “The grains peck the bird” is syntactically correct
according to the parser, so the parser accepts it as a correct sentence even though it makes no sense.
To achieve high precision, multiple sets of grammar rules need to be prepared. Completely
different sets of rules may be required for parsing singular and plural variations, passive sentences,
etc., which can lead to a huge, unmanageable set of rules.
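Both over-generation problems described in this section (the agreement error and the nonsensical sentence) can be checked mechanically. Because the rewrite rules given earlier contain no recursion, the set of sentences they generate is finite, so the sketch below simply enumerates all of them and tests membership. The names `RULES`, `expansions`, and `accepts` are assumptions for this example, and the enumeration approach would not terminate for a recursive grammar.

```python
from itertools import product

# The rewrite rules from the text, encoded as non-terminal -> list of alternatives.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["DET", "N"], ["DET", "ADJ", "N"]],
    "VP": [["V", "NP"]],
    "DET": [["a"], ["the"]],
    "ADJ": [["beautiful"], ["perching"]],
    "N": [["bird"], ["birds"], ["grain"], ["grains"]],
    "V": [["peck"], ["pecks"], ["pecking"]],
}


def expansions(symbol):
    """Return every word sequence (as a tuple) that `symbol` can be rewritten into."""
    if symbol not in RULES:  # terminal symbol: it stands for itself
        return [(symbol,)]
    results = []
    for rhs in RULES[symbol]:
        # Combine every expansion of each RHS symbol, keeping left-to-right order.
        for combo in product(*(expansions(part) for part in rhs)):
            results.append(tuple(word for piece in combo for word in piece))
    return results


LANGUAGE = set(expansions("S"))


def accepts(sentence):
    """True if the lower-cased sentence is one of the generated word sequences."""
    return tuple(sentence.lower().split()) in LANGUAGE


print(len(LANGUAGE))                          # number of sentences the grammar generates
print(accepts("The bird pecks the grains"))   # the intended sentence
print(accepts("The bird peck the grains"))    # agreement error
print(accepts("The grains peck the bird"))    # syntactically fine, semantically odd
```

All three sentences are accepted, which is exactly the imprecision noted above: the grammar encodes word classes and their order, but neither number agreement nor meaning.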
Top-Down Parser
Here, the parser starts with the S symbol and attempts to rewrite it into a sequence of terminal symbols that
matches the classes of the words in the input sentence until it consists entirely of terminal symbols.
These are then checked against the input sentence to see whether they match. If not, the process is started
over again with a different set of rules. This is repeated until a rule is found that describes the structure
of the sentence.
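One common way to realise this top-down idea is recursive descent with backtracking, sketched below for the rewrite rules given earlier. It differs slightly from the description above: instead of expanding S completely and then comparing, it checks each terminal against the input as soon as it is reached and abandons a rule the moment it fails. The function names are assumptions for this example, and the approach relies on the grammar having no left-recursive rules.

```python
# The rewrite rules from the text, encoded as non-terminal -> list of alternatives.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["DET", "N"], ["DET", "ADJ", "N"]],
    "VP": [["V", "NP"]],
    "DET": [["a"], ["the"]],
    "ADJ": [["beautiful"], ["perching"]],
    "N": [["bird"], ["birds"], ["grain"], ["grains"]],
    "V": [["peck"], ["pecks"], ["pecking"]],
}


def parse(symbol, words, start):
    """Yield (tree, next_position) for every way `symbol` can match words[start:]."""
    if symbol not in RULES:  # terminal: must equal the next input word
        if start < len(words) and words[start] == symbol:
            yield symbol, start + 1
        return
    for rhs in RULES[symbol]:  # try each alternative; failed ones yield nothing
        yield from match_sequence(symbol, rhs, words, start, [])


def match_sequence(label, rhs, words, pos, children):
    """Match the symbols of one right-hand side from left to right."""
    if not rhs:  # whole RHS matched: build the node for this rule
        yield (label, *children), pos
        return
    for subtree, next_pos in parse(rhs[0], words, pos):
        yield from match_sequence(label, rhs[1:], words, next_pos, children + [subtree])


def parse_sentence(sentence):
    """Return the first parse tree that covers the whole sentence, or None."""
    words = sentence.lower().split()
    for tree, end in parse("S", words, 0):
        if end == len(words):
            return tree
    return None


print(parse_sentence("The bird pecks the grains"))
print(parse_sentence("The bird the grains"))
```

Each tree node is a tuple whose first element is the symbol and whose remaining elements are its children, so the result for the first sentence has S at the root with an NP and a VP beneath it.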
Merit − It is simple to implement.
Demerits −