What Is NLP?: Natural Language Processing in AI


Natural Language Processing in AI

What is NLP?

- NLP stands for Natural Language Processing, a field at the intersection of computer science,
linguistics, and artificial intelligence.
- It is the technology that machines use to understand, analyze, manipulate, and interpret
human languages.
- It helps developers to organize knowledge for performing tasks such as translation, automatic
summarization, Named Entity Recognition (NER), speech recognition, relationship
extraction, and topic segmentation.
- Natural Language Processing (NLP) is the process of computer analysis of input provided in a
human language (natural language), and conversion of this input into a useful form of
representation.
- NLP is a field of AI that processes and analyzes written or spoken language.
- NLP involves the processing of speech, grammar, and meaning.

Why Natural Language Processing?


- Classify text into categories
- Index and search large texts
- Automatic translation
- Speech understanding
  - Understand phone conversations
- Information extraction
  - Extract useful information from documents
- Automatic summarization
- Question answering
- Text generation and dialog

Advantages of NLP
o NLP lets users ask questions about any subject and get a direct response within seconds.
o NLP aims to give exact answers to a question, without unnecessary or unwanted
information.
o NLP helps computers communicate with humans in their own languages.
o It is very time efficient.
o Many companies use NLP to improve the efficiency and accuracy of documentation processes and
to identify information in large databases.

Disadvantages of NLP
The disadvantages of NLP are listed below:
o NLP may not capture context.
o NLP can be unpredictable.
o NLP may require more keystrokes.
o An NLP system often cannot adapt to a new domain and has limited functionality, which is why such
systems are typically built for a single, specific task.

Components of NLP
There are three components of NLP:
1. Speech Recognition: The translation of spoken language into text.

2. Natural Language Understanding (NLU)


Natural Language Understanding (NLU) helps the machine to understand and analyze human language by
extracting metadata from content, such as concepts, entities, keywords, emotions, relations, and semantic
roles.
NLU is mainly used in business applications to understand the customer's problem in both spoken and written
language.
NLU involves the following tasks:
o Mapping the given input into a useful representation.
o Analyzing different aspects of the language.
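
As a rough illustration of mapping an input into a useful representation, the sketch below turns a customer message into a small dictionary. The intent names, the keyword rules, and the `understand` function are assumptions made up for this example; real NLU systems use far richer models.

```python
import re

def understand(utterance):
    """Map a customer utterance to a small structured representation.

    The intents and keyword rules are hypothetical, chosen only for illustration.
    """
    text = utterance.lower()
    # Entity extraction: look for an order number such as "order #4521" or "order 77".
    order = re.search(r"order\s+#?(\d+)", text)
    # Intent detection: simple keyword matching.
    if "refund" in text:
        intent = "request_refund"
    elif "where" in text or "track" in text:
        intent = "track_order"
    else:
        intent = "unknown"
    return {"intent": intent, "order_id": order.group(1) if order else None}

print(understand("Where is my order #4521?"))
print(understand("I want a refund for order 77"))
```

The output is a non-linguistic structure (an intent plus an entity) that a program can act on directly.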

3. Natural Language Generation (NLG)


Natural Language Generation (NLG) acts as a translator that converts computerized data into a natural
language representation.
It involves −
- Text planning − Retrieving the relevant content from the knowledge base.
- Sentence planning − Choosing the required words, forming meaningful phrases, and setting the tone
of the sentence.
- Text realization − Mapping the sentence plan into sentence structure.
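
The three NLG steps above can be sketched as a tiny template-based pipeline. The knowledge base contents, the field names, and the function names are hypothetical; this is a minimal sketch of the idea, not a description of any particular NLG system.

```python
# Hypothetical knowledge base; the cities and field names are assumptions.
KNOWLEDGE_BASE = {
    "Agra": {"temperature_c": 31, "condition": "sunny"},
    "Pune": {"temperature_c": 24, "condition": "rainy"},
}

def plan_text(city):
    """Text planning: retrieve the relevant content from the knowledge base."""
    return {"city": city, **KNOWLEDGE_BASE[city]}

def plan_sentence(content):
    """Sentence planning: choose words and phrases for the retrieved content."""
    return {
        "subject": f"the weather in {content['city']}",
        "verb": "is",
        "complement": (f"{content['condition']} with a temperature of "
                       f"{content['temperature_c']} degrees Celsius"),
    }

def realize(plan):
    """Text realization: map the sentence plan onto a sentence structure."""
    sentence = f"{plan['subject']} {plan['verb']} {plan['complement']}"
    return sentence[0].upper() + sentence[1:] + "."

def generate(city):
    return realize(plan_sentence(plan_text(city)))

print(generate("Agra"))
```

Keeping the three steps as separate functions mirrors the division described above: content selection, wording, and surface structure can each be changed independently.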

Difference between NLU and NLG


NLU | NLG
----|----
NLU is the process of reading and interpreting language. | NLG is the process of writing or generating language.
It produces non-linguistic outputs from natural language inputs. | It produces natural language outputs from non-linguistic inputs.

Applications of NLP
NLP has the following applications:

1. Question Answering: Question Answering focuses on building systems that automatically answer the
questions asked by humans in a natural language.

2. Spam Detection: Spam detection is used to identify unwanted e-mails and keep them out of a user's inbox.

3. Sentiment Analysis: Sentiment Analysis is also known as opinion mining. It is used on the web to analyze
the attitude, behavior, and emotional state of the sender. This application is implemented through a
combination of NLP (Natural Language Processing) and statistics: values are assigned to the text (positive,
negative, or neutral) and the mood of the context is identified (happy, sad, angry, etc.).

4. Machine Translation: Machine translation is used to translate text or speech from one natural language to
another natural language.

Example: Google Translate

5. Spelling correction: Software such as Microsoft Word and PowerPoint uses NLP to detect and correct
spelling errors.

6. Speech Recognition: Speech recognition is used for converting spoken words into text. It is used in
applications such as mobile devices, home automation, video recovery, dictation to Microsoft Word, voice
biometrics, and voice user interfaces.

7. Chatbot: Implementing chatbots is one of the important applications of NLP. Many companies use them
to provide customer chat services.

8. Information extraction: Information extraction is one of the most important applications of NLP. It is used
for extracting structured information from unstructured or semi-structured machine-readable documents.

9. Natural Language Understanding (NLU): It converts large bodies of text into more formal representations,
such as first-order logic structures, that computer programs can manipulate more easily than raw natural
language.
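
To make the sentiment analysis application (item 3 above) more concrete, here is a minimal lexicon-based sketch that assigns values to words and labels a text as positive, negative, or neutral. The word scores are invented for this example and the `sentiment` function is hypothetical; practical systems usually rely on much larger lexicons or trained statistical models.

```python
import re

# Hypothetical word scores; a real lexicon would be far larger.
LEXICON = {
    "good": 1, "great": 2, "happy": 1, "love": 2,
    "bad": -1, "terrible": -2, "sad": -1, "hate": -2,
}

def sentiment(text):
    """Sum the scores of known words and map the total to a label."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(LEXICON.get(word, 0) for word in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("I love this great phone"))
print(sentiment("The service was terrible"))
print(sentiment("The parcel arrived on Monday"))
```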

Phases of NLP
NLP has the following five phases:

1. Lexical Analysis: The first phase of NLP is Lexical Analysis. This phase
scans the input text as a stream of characters and converts it into meaningful
lexemes. It divides the whole text into paragraphs, sentences, and words.

2. Syntactic Analysis (Parsing): Syntactic Analysis is used to check grammar and word
arrangement, and to show the relationships among the words.
Example: Agra goes to the Poonam
In the real world, "Agra goes to the Poonam" does not make any sense, so this
sentence is rejected by the syntactic analyzer.

3. Semantic Analysis: Semantic analysis is concerned with meaning representation. It mainly focuses on
the literal meaning of words, phrases, and sentences.

4. Discourse Integration: In Discourse Integration, the meaning of a sentence depends on the sentences that
precede it and can also influence the meaning of the sentences that follow it.

5. Pragmatic Analysis: Pragmatic Analysis is the fifth and last phase of NLP. It helps you to discover the
intended effect by applying a set of rules that characterize cooperative dialogues.
For example, "Open the door" is interpreted as a request instead of an order.
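
The lexical analysis phase (phase 1 above) can be sketched with regular expressions that split a text into paragraphs, sentences, and word tokens. The splitting rules below are deliberately simple assumptions: blank lines separate paragraphs, and sentences end with `.`, `!`, or `?`. Real tokenizers also handle abbreviations, numbers, and many other cases.

```python
import re

def lexical_analysis(text):
    """Split text into paragraphs, each a list of sentences, each a list of tokens."""
    result = []
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        # A sentence boundary is whitespace that follows '.', '!' or '?'.
        sentences = re.split(r"(?<=[.!?])\s+", paragraph.strip())
        # A token is a run of word characters or a single punctuation mark.
        result.append([re.findall(r"\w+|[^\w\s]", s) for s in sentences if s])
    return result

text = "The bird pecks the grains. It is hungry!\n\nOpen the door."
for paragraph in lexical_analysis(text):
    print(paragraph)
```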

Why is NLP difficult?


NLP is difficult because ambiguity and uncertainty exist in language.

There are three types of ambiguity:


o Lexical Ambiguity
Lexical Ambiguity exists when a single word has two or more possible meanings.
Example:
Manya is looking for a match.
In the above example, the word "match" could mean that Manya is looking for a partner or that Manya is
looking for a game (a cricket match or some other match).
o Syntactic Ambiguity
Syntactic Ambiguity exists when a sentence can be parsed in two or more ways, each with a different meaning.
Example:
I saw the girl with the binoculars.
In the above example, did I have the binoculars? Or did the girl have the binoculars?
o Referential Ambiguity
Referential Ambiguity exists when it is unclear what a pronoun refers to.
Example: Kiran went to Sunita. She said, "I am hungry."
In the above example, you do not know who is hungry, Kiran or Sunita.
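
The syntactic ambiguity example above can be reproduced with a small parser that enumerates every parse tree of a sentence. The grammar and word classes below are assumptions written for this one sentence; the point is only that the same words receive two different trees, depending on where the phrase "with the binoculars" attaches.

```python
# Toy grammar (an assumption for this sketch): a prepositional phrase (PP)
# may attach either to the verb phrase or to the noun phrase.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["PRON"], ["DET", "N"], ["DET", "N", "PP"]],
    "VP": [["V", "NP"], ["V", "NP", "PP"]],
    "PP": [["P", "NP"]],
}
WORD_CLASSES = {"i": "PRON", "saw": "V", "the": "DET",
                "girl": "N", "binoculars": "N", "with": "P"}

def parse(symbol, words, start):
    """Yield (tree, end) for every way `symbol` can cover words[start:end]."""
    if symbol not in GRAMMAR:  # a word class: match one input word
        if start < len(words) and WORD_CLASSES.get(words[start]) == symbol:
            yield (symbol, words[start]), start + 1
        return
    for rhs in GRAMMAR[symbol]:
        yield from expand(rhs, words, start, (symbol,))

def expand(rhs, words, pos, partial):
    """Match the symbols of `rhs` one after another, collecting child trees."""
    if not rhs:
        yield partial, pos
        return
    for child, nxt in parse(rhs[0], words, pos):
        yield from expand(rhs[1:], words, nxt, partial + (child,))

def all_parses(sentence):
    words = sentence.lower().split()
    return [tree for tree, end in parse("S", words, 0) if end == len(words)]

trees = all_parses("I saw the girl with the binoculars")
print(len(trees))
for tree in trees:
    print(tree)
```

In one tree the PP is a direct child of VP (I used the binoculars); in the other it sits inside the object NP (the girl has the binoculars).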

Difference between Natural language and Computer Language


Natural Language | Computer Language
-----------------|------------------
Natural language has a very large vocabulary. | Computer language has a very limited vocabulary.
Natural language is easily understood by humans. | Computer language is easily understood by machines.
Natural language is ambiguous in nature. | Computer language is unambiguous.

Implementation Aspects of Syntactic Analysis
Researchers have developed a number of algorithms for syntactic analysis, but we consider only
the following simple methods −

- Context-Free Grammar
- Top-Down Parser
Let us see them in detail −

Context-Free Grammar
It is a grammar whose rewrite rules each have a single symbol on the left-hand side.
A CFG consists of:
1. a set of non-terminal symbols
2. a set of terminal symbols
3. a set of rules (productions), where the LHS (mother) is a single non-terminal and the RHS is a
sequence of one or more non-terminal or terminal symbols
Let us create a grammar to parse the sentence −
“The bird pecks the grains”
Articles (DET) − a | an | the
Nouns − bird | birds | grain | grains
Noun Phrase (NP) − Article + Noun | Article + Adjective + Noun = DET N | DET ADJ N
Verbs − pecks | pecking | pecked
Verb Phrase (VP) − NP V | V NP
Adjectives (ADJ) − beautiful | small | chirping
The parse tree breaks down the sentence into structured parts so that the computer can easily understand
and process it. For the parsing algorithm to construct this parse tree, a set of rewrite rules, which
describe what tree structures are legal, needs to be constructed.
These rules say that a certain symbol may be expanded in the tree by a sequence of other symbols. For
example, if there are two strings, a Noun Phrase (NP) and a Verb Phrase (VP), then the string
formed by NP followed by VP is a sentence. The rewrite rules for the sentence are as follows −
S → NP VP
NP → DET N | DET ADJ N
VP → V NP
DET → a | the
ADJ → beautiful | perching
N → bird | birds | grain | grains
V → peck | pecks | pecking
The parse tree can be created as shown −
[Figure: parse tree for “The bird pecks the grains”; image not available]
Now consider the above rewrite rules. Since V can be replaced by either "peck" or "pecks", sentences such as
"The bird peck the grains" are wrongly permitted, i.e. the subject-verb agreement error is accepted as
correct.
Merit − It is the simplest style of grammar and is therefore widely used.
Demerits −
- They are not highly precise. For example, “The grains peck the bird” is syntactically correct
according to the parser; even though it makes no sense, the parser takes it as a correct sentence.
- To achieve high precision, multiple sets of grammar rules need to be prepared. Completely
different sets of rules may be required for parsing singular and plural variations, passive sentences, etc.,
which can lead to a huge set of rules that is unmanageable.
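
The precision problems above can be checked directly by writing the rewrite rules as a Python dictionary and enumerating every sentence the grammar can produce. This is only a sketch: the names `RULES`, `expansions`, and `accepts` are made up for the example, and enumeration is feasible only because this toy grammar is finite.

```python
from itertools import product

# The rewrite rules from the text; words are the terminal symbols.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["DET", "N"], ["DET", "ADJ", "N"]],
    "VP": [["V", "NP"]],
    "DET": [["a"], ["the"]],
    "ADJ": [["beautiful"], ["perching"]],
    "N": [["bird"], ["birds"], ["grain"], ["grains"]],
    "V": [["peck"], ["pecks"], ["pecking"]],
}

def expansions(symbol):
    """Return every word sequence (as a tuple) derivable from `symbol`."""
    if symbol not in RULES:  # a terminal symbol, i.e. a word
        return [(symbol,)]
    results = []
    for rhs in RULES[symbol]:
        parts = [expansions(s) for s in rhs]
        for combo in product(*parts):
            results.append(tuple(word for part in combo for word in part))
    return results

LANGUAGE = set(expansions("S"))

def accepts(sentence):
    """True if the grammar can generate the sentence (case is ignored)."""
    return tuple(sentence.lower().split()) in LANGUAGE

print(len(LANGUAGE))
print(accepts("The bird pecks the grains"))
print(accepts("The bird peck the grains"))   # agreement error, still accepted
print(accepts("The grains peck the bird"))   # nonsensical, still accepted
```

Fixing the agreement error within a plain CFG would mean splitting N and V into singular and plural variants with separate rules, which is exactly the rule growth described in the second demerit.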

Top-Down Parser
Here, the parser starts with the S symbol and rewrites it step by step into a sequence of terminal symbols
(the classes of words) until no non-terminal symbol remains.
This sequence is then checked against the classes of the words in the input sentence. If it does not match,
the process is started over again with a different choice of rules. This is repeated until a sequence of rules
is found that describes the structure of the sentence.
Merit − It is simple to implement.
Demerits −
- It is inefficient, as the search process has to be repeated if an error occurs.
- Slow speed of working.
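
A top-down parser of this kind can be sketched as a depth-first search over sentential forms. The sketch below uses the bird grammar from the previous section, with word classes (DET, ADJ, N, V) treated as the terminal symbols. The function name, the pruning checks, and the returned derivation format are assumptions of this example rather than a standard algorithm description.

```python
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["DET", "N"], ["DET", "ADJ", "N"]],
    "VP": [["V", "NP"]],
}
WORD_CLASSES = {
    "a": "DET", "the": "DET",
    "beautiful": "ADJ", "perching": "ADJ",
    "bird": "N", "birds": "N", "grain": "N", "grains": "N",
    "peck": "V", "pecks": "V", "pecking": "V",
}

def top_down_parse(sentence):
    """Return the sentential forms of a successful derivation, or None."""
    target = [WORD_CLASSES.get(word) for word in sentence.lower().split()]

    def search(form, history):
        # Find the leftmost non-terminal that still has to be rewritten.
        for i, symbol in enumerate(form):
            if symbol in RULES:
                break
        else:
            # Only word classes remain: compare them with the input sentence.
            return history if form == target else None
        # Prune: classes left of position i must match, and the form must not be too long.
        if form[:i] != target[:i] or len(form) > len(target):
            return None
        for rhs in RULES[symbol]:
            new_form = form[:i] + rhs + form[i + 1:]
            result = search(new_form, history + [new_form])
            if result is not None:
                return result
        return None  # every rule failed here: backtrack to the caller

    return search(["S"], [["S"]])

for step in top_down_parse("The beautiful bird pecks the grains"):
    print(" ".join(step))
print(top_down_parse("Bird the pecks"))
```

For the second NP rule to be tried, the parser first attempts NP → DET N, fails on the adjective, and backtracks. This repeated work is the inefficiency noted in the demerits.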
