
unit 4 nlp

N-grams are contiguous sequences of n items from text or speech, used primarily in Natural Language Processing (NLP) for various tasks. They are classified into types based on the value of n, including unigrams (n=1), bigrams (n=2), trigrams (n=3), and so on. Language modeling, a key NLP task, predicts the next word in a sequence and can be approached through statistical methods like N-grams or neural networks, with applications in text generation, machine translation, and speech recognition.

Uploaded by Sai ganesh Ch

Unit 4

29 April 2025 23:12

31 Explain in detail the n-gram model and its types?

• N-grams are contiguous sequences of n items extracted from a given sample of text or
speech.
• N-grams are typically collected from a text or speech corpus.
• The items can be words, symbols, or tokens; an n-gram is a run of neighboring items in a
document.
• They are used mainly in NLP (Natural Language Processing) tasks that deal with text data.
• The co-occurring words are called "n-grams", where "n" is the number of words in each
sequence.
• Unigrams are single words, bigrams are two words, trigrams are three words, 4-grams are
four words, 5-grams are five words, etc.

TYPES:
N-grams are classified into different types depending on the value that n takes.
When n=1, it is said to be a unigram.
When n=2, it is said to be a bigram.
When n=3, it is said to be a trigram.
When n=4, it is said to be a 4-gram, and so on.

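• Unigrams: These are simply the individual words in the sentence. The examples here use the
sentence "Cowards die many times before their deaths; the valiant never taste of death but
once."
○ cowards, die, many, times, before, their, deaths, the, valiant, never, taste, of, death,
but, once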


• Bigrams: These are simply the pairs of co-occurring words in the sentence formed by sliding
one word at a time in the forward direction to generate the next bigram.
○ cowards die, die many, many times, times before, before their, their deaths, deaths the,
the valiant, valiant never, never taste, taste of, of death, death but, but once
• Trigrams: These are the sequences of three co-occurring words in the sentence formed by
sliding one word at a time in the forward direction to generate the next trigram.
○ cowards die many, die many times, many times before, times before their, before their
deaths, their deaths the, deaths the valiant, the valiant never, valiant never taste, never
taste of, taste of death, of death but, death but once
• 4-grams: Here the window covers four consecutive words at a time.
○ cowards die many times, die many times before, many times before their, times before
their deaths, before their deaths the, their deaths the valiant, deaths the valiant never,
the valiant never taste, valiant never taste of, never taste of death, taste of death but,
of death but once
• Similarly, we can pick n>4 and generate 5-grams, etc.
From <https://www.scaler.com/topics/nlp/n-gram-model-in-nlp/>
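
The sliding-window construction described above can be sketched in a few lines of Python. This is a minimal illustration and is not taken from the source: the function names `tokenize` and `ngrams` and the simple tokenizer (lowercase, strip ASCII punctuation, split on whitespace) are assumptions made for the example.

```python
import string


def tokenize(text):
    """Lowercase the text, drop ASCII punctuation, and split on whitespace."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return cleaned.split()


def ngrams(tokens, n):
    """Return all contiguous n-token windows, sliding one token at a time."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A sequence shorter than n produces an empty list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


sentence = ("Cowards die many times before their deaths; "
            "the valiant never taste of death but once.")
tokens = tokenize(sentence)

for n in (1, 2, 3, 4):
    grams = ngrams(tokens, n)
    # Print n, the number of n-grams, and the first n-gram as a sample.
    print(n, len(grams), " ".join(grams[0]))
```

A sequence of N tokens yields N - n + 1 n-grams, so the 15-word example sentence gives 15 unigrams, 14 bigrams, 13 trigrams, and 12 4-grams.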

33 Explain language modeling and the types of language modeling?

• Language modeling is a fundamental task in natural language processing (NLP) that
involves predicting the next word or sequence of words in a given context.
• It is a core component of many NLP applications, such as machine translation, speech
recognition, text generation, and more.
• Language models (LMs) are trained to capture the structure, grammar, and semantics of a
language, enabling them to generate coherent and contextually appropriate text.

TYPES (OR) Methods of Language Modeling

Two methods of Language Modeling:

1. Statistical Language Modeling: Statistical language modeling is the development of
probabilistic models that predict the next word in a sequence given the words that
precede it.
○ Example: N-gram language modeling.

2. Neural Language Modeling: Neural network methods are achieving better results than
classical methods, both as standalone language models and when incorporated into larger
models for challenging tasks like speech recognition and machine translation.
One way of building a neural language model is through word embeddings.
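
To make the statistical method (item 1) concrete, here is a minimal sketch of a bigram language model that estimates the probability of a word given the previous word as count(previous word, word) / count(previous word). It is not from the source: the function names, the `<s>` and `</s>` boundary markers, and the three-sentence toy corpus are all assumptions invented for illustration.

```python
from collections import Counter


def train_bigram_model(sentences):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigram_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        # <s> and </s> are hypothetical boundary markers added for this sketch.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigram_counts.update(tokens)
        bigram_counts.update(zip(tokens, tokens[1:]))
    return unigram_counts, bigram_counts


def bigram_probability(prev_word, word, unigram_counts, bigram_counts):
    """Maximum-likelihood estimate: count(prev_word word) / count(prev_word)."""
    if unigram_counts[prev_word] == 0:
        return 0.0  # previous word never seen in training
    return bigram_counts[(prev_word, word)] / unigram_counts[prev_word]


# Toy corpus invented for this example.
corpus = [
    "the valiant never taste of death",
    "the valiant die once",
    "the cowards die many times",
]
unigrams, bigrams = train_bigram_model(corpus)

print(bigram_probability("the", "valiant", unigrams, bigrams))
print(bigram_probability("valiant", "never", unigrams, bigrams))
```

In this toy corpus "the" occurs three times and is followed by "valiant" twice, so the first estimate is 2/3. Any word pair that never appears in training gets probability zero under this plain estimate, which is why practical n-gram models add some form of smoothing.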

Applications of Language Models


Language models have a wide range of applications, including:
• Text Generation: Generating plausible and contextually relevant text by predicting the next
word in a sequence iteratively.
• Machine Translation: Translating text from one language to another by understanding and
generating grammatically correct sentences in the target language.
• Speech Recognition: Converting spoken language into text by predicting the most likely word
sequences.
• Handwriting Recognition: Converting handwritten text into digital text by predicting the
most likely word sequences.
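
The text generation application, predicting the next word iteratively, can be sketched with simple bigram counts. This is an illustrative assumption rather than a method given in the source: the function names, the greedy "pick the most frequent follower" rule, and the toy corpus are all hypothetical.

```python
from collections import Counter, defaultdict


def build_next_word_table(sentences):
    """Map each word to a Counter of the words observed right after it."""
    table = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for prev_word, word in zip(tokens, tokens[1:]):
            table[prev_word][word] += 1
    return table


def generate(table, start_word, max_words=10):
    """Greedy generation: repeatedly append the most frequent next word."""
    words = [start_word]
    while len(words) < max_words:
        followers = table.get(words[-1])
        if not followers:
            break  # no observed continuation, so stop
        next_word, _count = followers.most_common(1)[0]
        words.append(next_word)
    return " ".join(words)


# Toy corpus invented for this example.
corpus = [
    "cowards die many times before their deaths",
    "the valiant never taste of death but once",
    "the valiant never taste of death",
]
table = build_next_word_table(corpus)

print(generate(table, "the"))
print(generate(table, "cowards", max_words=3))
```

Greedy selection always follows the single most frequent continuation, so it reproduces training text closely; sampling the next word in proportion to its count is a common alternative when more varied output is wanted.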
