Unit I
TEXT BOOK
James Allen, Natural Language Understanding, 2nd
Edition, 2003, Pearson Education
PBR VITS(AUTONOMOUS)
III B.TECH-AI SEM-II
Subject Code:20A05702c
UNIT–I Introduction to Natural Language
1.1 Introduction to NLP:
NLP is a process in which input provided in a human
language is converted into a useful form of
representation. The field of NLP is primarily concerned
with getting computers to perform interesting and useful
tasks with human languages. It is secondarily concerned
with helping us come to a better understanding of
human language.
Work on machine translation started in the 1950s.
At first the languages involved were English and Russian,
but other languages such as Chinese also came into use
in the early 1960s.
In the 1960s, NLP got a new life when the idea of, and need for,
Artificial Intelligence emerged.
By 1978, W. A. Woods had developed LUNAR; it could analyze,
compare and evaluate the chemical data on lunar rock and soil
composition that was accumulating as a result of the Apollo moon
missions, and could answer related questions.
In the 1980s, computational grammar became a very
active field of research, linked with the science of
reasoning about meaning and with modelling the user's
beliefs and intentions.
In the 1990s, the pace of growth of NLP increased.
Grammars, tools and practical resources related to NLP
became available, along with parsers. Probabilistic and
data-driven models had become quite standard by then.
By 2000, engineers had large amounts of spoken and
textual data available for building systems.
Today, a large amount of work in NLP is done using
Machine Learning, and Deep Neural Networks in
particular, which produce state-of-the-art models for
text classification, question answering, sentiment
classification, etc.
1.2 GENERIC NLP SYSTEM
Natural Language Processing should start with some
input and end with effective and accurate output.
Pipeline view of the components of a generic NLP
system.
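The pipeline view can be sketched in code as a chain of stages, where each stage consumes the output of the previous one. This is only a minimal illustration: the stage names (tokenize, tag, interpret), the toy lexicon and the output format are assumptions made for this example, not the components of the original figure.

```python
import re

# Hypothetical toy lexicon mapping words to part-of-speech tags.
LEXICON = {"the": "DET", "my": "DET", "dog": "N", "homework": "N", "ate": "V"}

def tokenize(text):
    """Stage 1: split raw text into lowercase word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def tag(tokens):
    """Stage 2: attach a part-of-speech tag to each token (UNK if unknown)."""
    return [(tok, LEXICON.get(tok, "UNK")) for tok in tokens]

def interpret(tagged):
    """Stage 3: build a crude representation from the first verb and the
    nearest noun on each side of it."""
    verbs = [i for i, (_, t) in enumerate(tagged) if t == "V"]
    if not verbs:
        return None
    v = verbs[0]
    before = [w for w, t in tagged[:v] if t == "N"]
    after = [w for w, t in tagged[v + 1:] if t == "N"]
    return {
        "action": tagged[v][0],
        "agent": before[-1] if before else None,
        "object": after[-1] if after else None,
    }

def run_pipeline(text, stages=(tokenize, tag, interpret)):
    """Feed the input through each stage in order."""
    data = text
    for stage in stages:
        data = stage(data)
    return data

print(run_pipeline("The dog ate my homework."))
```

Because each stage only depends on the output of the one before it, a stage can be swapped for a better implementation without touching the rest of the chain.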
1.3 LEVELS OF NLP
There are two components of Natural Language Processing:
I. Natural Language Understanding (NLU)
• NLU takes a spoken or typed sentence and works out what
it means.
• Different levels of analysis are required here, such as
morphological analysis, syntactic analysis, semantic analysis,
discourse analysis, …
II. Natural Language Generation (NLG)
• NLG takes a formal representation of what you want to say
and works out a way to express it in a natural (human) language
(e.g., English).
• Different levels of synthesis are required here: deep planning
(what to say) and syntactic generation.
Difference between NLU & NLG
NLU | NLG
It is the process of reading and interpreting language. | It is the process of writing or generating language.
NLU explains the meaning behind the written text or speech in natural language. | NLG generates the natural language using machines.
NLU understands the human language and converts it into data. | NLG uses the structured data and generates meaningful narratives out of it.
NLP can broadly be divided into various levels, as
shown in the following figure.
1. Phonology:
It is concerned with the interpretation of speech sounds within and across words.
2. Morphology:
It deals with how words are constructed from more basic meaning units called
morphemes.
3. Syntax:
It concerns how words can be put together to form correct sentences and
determines what structural role each word plays in the sentence. For example,
in “the dog ate my homework”, “the dog” is the subject of “ate” and
“my homework” is its object.
4. Semantics:
It is the study of the meaning of words and how these meanings combine in
sentences to form sentence meaning. It is the study of context-independent meaning.
5. Reasoning:
To produce an answer to a question that is not explicitly stored in a database,
a Natural Language Interface to Database (NLIDB) carries out reasoning based on
data stored in the database.
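Of the levels listed above, morphology is the easiest to illustrate in a few lines. The sketch below splits a word into a stem and one suffix morpheme by simple suffix stripping. The suffix table, the stem list and the labels are hypothetical and far smaller than anything a real analyzer would use.

```python
# Hypothetical suffix table: surface suffix -> morpheme label.
SUFFIXES = [("ing", "PROG"), ("ed", "PAST"), ("s", "PLURAL/3SG")]

# Hypothetical list of known stems.
STEMS = {"walk", "talk", "play", "dog"}

def analyze(word):
    """Return (stem, label) if the word is a known stem, optionally followed
    by one known suffix; return None if no such split exists."""
    word = word.lower()
    if word in STEMS:
        return (word, None)
    for suffix, label in SUFFIXES:
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and stem in STEMS:
            return (stem, label)
    return None

for w in ["walked", "dogs", "playing", "dog", "cat"]:
    print(w, "->", analyze(w))
```

Plain suffix stripping ignores spelling changes (for example "running" or "flies"), which is one reason practical systems rely on richer morphological rules or lexicons.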
1.4 The Study of Language:
Language is studied in several different academic
disciplines. Each discipline defines its own set of
problems and has its own methods for addressing them.
1.6 Evaluating Language Understanding Systems
Natural Language Understanding Systems
1.7 The Different Levels of Language Analysis
A natural language system must use considerable
knowledge about the structure of the language itself,
including what the words are, how words combine to
form sentences, what the words mean, how word
meanings contribute to sentence meanings, and so on.
The following are some of the different forms of
knowledge relevant for natural language understanding:
Phonetic and phonological knowledge - concerns how
words are related to the sounds that realize them. Such
knowledge is crucial for speech-based systems.
Morphological knowledge - concerns how words are
constructed from more basic meaning units called morphemes.
Syntactic knowledge - concerns how words can be put together
to form correct sentences and determines what structural role
each word plays in the sentence and what phrases are subparts of
what other phrases.
Semantic knowledge - concerns what words mean and how
these meanings combine in sentences to form sentence meanings.
Pragmatic knowledge - concerns how sentences are used in
different situations and how use affects the interpretation of the
sentence.
Discourse knowledge - concerns how the immediately
preceding sentences affect the interpretation of the next
sentence.
World knowledge - includes the general knowledge
about the structure of the world that language users
must have in order to, for example, maintain a
conversation. It includes what each language user must
know about the other user’s beliefs and goals.
1.8 Representation and Understanding
Syntax: Representing Sentence Structure
The syntactic structure of a sentence indicates the way that
words in the sentence are related to each other.
CFG for NLP
1. S -> NP VP
2. NP -> ART N
3. NP -> ART ADJ N
4. VP -> V
5. VP -> V NP
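To see how these five rules assign structure to a sentence, here is a minimal top-down parser sketch. The grammar dictionary encodes exactly the rules above. The lexicon (which words are ART, N, ADJ or V) and the function names are assumptions added for illustration.

```python
# The five rules above, keyed by left-hand side.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["ART", "N"], ["ART", "ADJ", "N"]],
    "VP": [["V"], ["V", "NP"]],
}

# Hypothetical lexicon: word -> set of possible categories.
LEXICON = {
    "the": {"ART"}, "a": {"ART"},
    "dog": {"N"}, "cat": {"N"},
    "big": {"ADJ"},
    "ate": {"V"}, "saw": {"V"},
}

def parse(symbol, words, start):
    """Yield (tree, next_position) for every way `symbol` can cover
    a stretch of words beginning at `start`."""
    if symbol in GRAMMAR:
        for rhs in GRAMMAR[symbol]:
            for children, end in expand(rhs, words, start):
                yield (symbol, *children), end
    elif start < len(words) and symbol in LEXICON.get(words[start], ()):
        yield (symbol, words[start]), start + 1

def expand(rhs, words, start):
    """Yield (child_trees, next_position) for a sequence of symbols."""
    if not rhs:
        yield [], start
        return
    for tree, mid in parse(rhs[0], words, start):
        for rest, end in expand(rhs[1:], words, mid):
            yield [tree] + rest, end

def parse_sentence(sentence):
    """Return all trees rooted in S that cover the whole sentence."""
    words = sentence.lower().split()
    return [tree for tree, end in parse("S", words, 0) if end == len(words)]

print(parse_sentence("the dog saw a big cat"))
print(parse_sentence("dog saw"))  # no article, so rules 2 and 3 cannot apply
```

Plain recursion is enough here because none of the rules is left-recursive. A grammar with a rule such as NP -> NP PP would need a chart-based method instead.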
Rice flies like sand
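This sentence is a useful test case because it can be given more than one syntactic structure: either "rice flies" is a noun phrase and "like" is the verb, or "rice" is the subject, "flies" is the verb and "like sand" is a modifier. The sketch below enumerates both structures with a small span-based parser. The grammar and lexicon in it are assumptions chosen for this one sentence (they are not the five-rule grammar from the previous slide), and the function names are hypothetical.

```python
from functools import lru_cache

# Hypothetical grammar for this example (unary or binary rules only).
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["N"], ["N", "N"]],
    "VP": [["V", "NP"], ["V", "PP"]],
    "PP": [["P", "NP"]],
}

# Hypothetical lexicon: "flies" and "like" are each ambiguous.
LEXICON = {"rice": {"N"}, "flies": {"N", "V"}, "like": {"V", "P"}, "sand": {"N"}}

def all_parses(sentence):
    """Return every S tree that spans the whole sentence."""
    words = tuple(sentence.lower().split())

    @lru_cache(maxsize=None)
    def trees(symbol, i, j):
        """All trees for `symbol` spanning words[i:j]."""
        found = []
        if j - i == 1 and symbol in LEXICON.get(words[i], ()):
            found.append((symbol, words[i]))
        for rhs in GRAMMAR.get(symbol, []):
            if len(rhs) == 1:
                found += [(symbol, t) for t in trees(rhs[0], i, j)]
            else:
                left, right = rhs
                for k in range(i + 1, j):  # try every split point
                    for lt in trees(left, i, k):
                        for rt in trees(right, k, j):
                            found.append((symbol, lt, rt))
        return tuple(found)

    return trees("S", 0, len(words))

def show(tree):
    """Render a tree in bracketed form."""
    label, *rest = tree
    if len(rest) == 1 and isinstance(rest[0], str):
        return f"({label} {rest[0]})"
    return "(" + label + " " + " ".join(show(t) for t in rest) + ")"

for t in all_parses("Rice flies like sand"):
    print(show(t))
```

Syntax alone cannot choose between the two trees. Picking the intended reading needs the semantic, pragmatic or world knowledge described earlier.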