IRS PPT Unit II
Retrieval Utilities
Utilities improve the results of a retrieval strategy. Most utilities add or remove
terms from the initial query in an attempt to refine the query.
Relevance Feedback
The final step in creating clusters is to determine when two objects (words) are in
the same cluster. Methods for doing so include:
• Cliques
• Single link
• Stars
• Connected components
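As an illustration of one of these criteria, the sketch below groups words by connected components: two words land in the same cluster whenever a chain of sufficiently similar pairs links them. The function name, the similarity values, and the 0.5 threshold are all hypothetical choices made for this example, not values from the slides.

```python
from itertools import combinations

def connected_components(terms, sim, threshold):
    """Cluster terms so that any pair with similarity >= threshold,
    directly or through a chain of such pairs, shares a cluster."""
    parent = {t: t for t in terms}

    def find(t):
        # Walk up to the cluster representative, shortening the path as we go.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for a, b in combinations(terms, 2):
        # Similarities are symmetric; missing pairs count as 0.
        s = sim.get((a, b), sim.get((b, a), 0.0))
        if s >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for t in terms:
        clusters.setdefault(find(t), set()).add(t)
    return sorted(sorted(c) for c in clusters.values())

# Hypothetical word-to-word similarities (illustrative only).
terms = ["car", "auto", "vehicle", "bank", "loan"]
sim = {("car", "auto"): 0.9, ("auto", "vehicle"): 0.7,
       ("car", "vehicle"): 0.2, ("bank", "loan"): 0.8}
clusters = connected_components(terms, sim, threshold=0.5)
```

Here "car" and "vehicle" end up together even though their direct similarity is below the threshold, because "auto" links them. A clique criterion would be stricter, requiring every pair in a cluster to meet the threshold.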
Cliques:
Single Link:
Star:
Damashek
N-Gram Models
• Estimate probability of each word given prior context.
– P(phone | Please turn off your cell)
• Number of parameters required grows exponentially with
the number of words of prior context.
• An N-gram model uses only N-1 words of prior context.
– Unigram: P(phone)
– Bigram: P(phone | cell)
– Trigram: P(phone | your cell)
• The Markov assumption is the presumption that the future
behavior of a dynamical system depends only on its recent
history. In particular, in a kth-order Markov model, the
next state depends only on the k most recent states, so an
N-gram model is an (N-1)th-order Markov model.
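A minimal sketch of the bigram case, assuming plain maximum-likelihood estimation with no smoothing: P(word | prev) is the count of the pair divided by the count of prev as a context. The function name and the toy sentence are hypothetical.

```python
from collections import Counter

def train_bigram(tokens):
    """Return a function estimating P(word | prev) from raw counts."""
    contexts = Counter(tokens[:-1])             # each word counted as a preceding context
    pairs = Counter(zip(tokens, tokens[1:]))    # adjacent (prev, word) pairs

    def prob(word, prev):
        if contexts[prev] == 0:
            return 0.0                          # unseen context: no estimate without smoothing
        return pairs[(prev, word)] / contexts[prev]

    return prob

# Toy corpus, for illustration only.
tokens = "turn off your cell phone and leave your cell at home".split()
prob = train_bigram(tokens)
p_phone_given_cell = prob("phone", "cell")      # "cell" is followed by "phone" once out of twice
```

A trigram model would key the counts on the two preceding words instead of one, which is why the number of parameters grows so quickly with the context length.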
Regression Analysis
For a given age, it is possible to find the related life expectancy. Now, if we
wish to predict the likelihood of a person having heart disease, we might
obtain the following data:
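To make the heart-disease case concrete, the sketch below fits a one-variable logistic regression by gradient descent, which models a probability rather than a numeric value such as life expectancy. The age and outcome pairs are invented purely for illustration and are not the data from the slides; the function name, learning rate, and step count are likewise assumptions.

```python
import math

def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Fit P(y=1 | x) = 1 / (1 + exp(-(b0 + b1*z))) by batch gradient descent,
    where z is x centered and rescaled to keep the updates stable."""
    mean = sum(xs) / len(xs)
    scale = (max(xs) - min(xs)) or 1.0
    zs = [(x - mean) / scale for x in xs]
    b0 = b1 = 0.0
    n = len(zs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for z, y in zip(zs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * z)))
            g0 += p - y              # gradient of the log loss w.r.t. b0
            g1 += (p - y) * z        # gradient of the log loss w.r.t. b1
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n

    def predict(x):
        z = (x - mean) / scale
        return 1.0 / (1.0 + math.exp(-(b0 + b1 * z)))

    return predict

# Hypothetical (age, has heart disease) observations.
ages = [25, 30, 35, 40, 45, 50, 55, 60, 65, 70]
disease = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
predict = fit_logistic(ages, disease)
```

Calling `predict(age)` returns an estimated probability between 0 and 1 that rises with age for this toy data.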
Stage 2:
The second stage of the staged logistic regression attempts to correct for
errors induced by the number of composite clues. As the number of
composite clues grows, the likelihood of error increases. For N composite
clues, the following logistic regression is computed:
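A sketch of the form this second-stage equation commonly takes in the staged logistic regression literature. The symbols are assumptions, since none are defined here: A_1, ..., A_N are the composite clues, O(R | ·) is the odds of relevance, Z sums the first-stage log-odds estimates, and d_0, d_1, d_2 are the fitted coefficients.

```latex
\log O(R \mid A_1, \ldots, A_N) = d_0 + d_1 Z + d_2 N,
\qquad
Z = \sum_{i=1}^{N} \log O(R \mid A_i)
```

The d_2 N term is what lets the fit adjust for the number of clues. Some formulations subtract the prior log-odds \log O(R) from each term of Z.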