7 Lexical Semantics

Introduction to Lexical semantics for computer text analysis.


Computational Linguistics
CSC 2501 / 485
Fall 2015

7. Lexical semantics

Kathleen C. Fraser
Department of Computer Science, University of Toronto
Reading: Jurafsky & Martin: 19.1–4, 20.8; Bird et al: 2.5

Copyright © 2015 Frank Rudzicz, Graeme Hirst, and Suzanne Stevenson. All rights reserved.

Lexical semantics
Word meanings and their internal structure.
The structure of the relations among words and
meanings.

Current CL research
Current focus in CL is on lexical semantics:
word senses;
detailed lexical representations;
organization of senses, or lexical entries more
generally.

Knowledge about words

Lexicon with entry for each word (or fixed phrase).
Senses (meanings). For each:
Surface form:
Orthography, phonology, …
Syntax:
Part-of-speech, morphology, subcategorization, …
Behaviour, usage, …:
Collocations, register and genre, …
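
As a rough illustration of the kinds of knowledge listed above, a lexicon entry could be sketched as a small data structure. The class names, field names, and the toy entry for bass are assumptions made for this example and do not come from any particular lexicon.

```python
from dataclasses import dataclass, field


@dataclass
class Sense:
    """One sense of a word, with the kinds of knowledge listed above."""
    gloss: str                     # informal description of the meaning
    orthography: str               # surface form: spelling
    phonology: str                 # surface form: pronunciation (rough IPA)
    part_of_speech: str            # syntax
    subcategorization: list = field(default_factory=list)  # syntax
    collocations: list = field(default_factory=list)       # usage
    register: str = "neutral"      # usage


@dataclass
class LexicalEntry:
    """A lexicon entry: one word (or fixed phrase) and its senses."""
    headword: str
    senses: list = field(default_factory=list)


# Toy lexicon with a single entry; the two senses differ in pronunciation.
lexicon = {
    "bass": LexicalEntry(
        headword="bass",
        senses=[
            Sense("a kind of fish", "bass", "bæs", "noun",
                  collocations=["sea bass"]),
            Sense("low-pitched instrument or voice", "bass", "beɪs", "noun",
                  collocations=["bass guitar"]),
        ],
    )
}

for sense in lexicon["bass"].senses:
    print(sense.phonology, "-", sense.gloss)
```

Storing the surface form on each sense (rather than once per entry) is a design choice that lets homographs with different pronunciations share one headword.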

Word senses
How are word senses defined?
Grounded in world knowledge?

Are they defined and fixed at all?


Or wholly context-dependent? (See also slide 9)

Constructional versus differential approaches.
Constructional: sense is built from elements of a set of universal primitives of meaning.
Differential: sense is distinguished from others by a set of (ad hoc) differentia.

Relating words and senses


Synonymy: Two (or more) words (synonyms)
having the same meaning.
What does this mean?

Homonymy, polysemy: Two (or more) meanings having the same word (homonym, polyseme).
Lexical ambiguity

Lexical ambiguity: Homonymy

Homonymy: meanings are unrelated.


[Etymology or history of word is not a deciding factor.]

Due to same spelling (homography):
bank for money, bank of river, bank of switches, …
bank → (French) banque or bord or rangée or …?
bass: /bæs/ fish, /beɪs/ guitar;
bow: /baʊ/ to the audience, tie a /boʊ/.

Due to same sound (homophony):


wood, would; weather, whether; you, ewe, yew;
bough, bow; sheet.
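
The two sources of homonymy can be made concrete with a tiny word list: group entries by spelling to find homographs and by pronunciation to find homophones. This is only a sketch; the entries, the rough IPA transcriptions, and the function name are assumptions chosen for illustration.

```python
from collections import defaultdict

# (spelling, pronunciation, gloss); pronunciations are rough IPA and the
# glosses are informal labels, both chosen for illustration.
ENTRIES = [
    ("bass", "bæs", "fish"),
    ("bass", "beɪs", "guitar"),
    ("bow", "baʊ", "bend to the audience"),
    ("bow", "boʊ", "knot"),
    ("bough", "baʊ", "tree branch"),
    ("wood", "wʊd", "material"),
    ("would", "wʊd", "modal verb"),
]


def ambiguous_groups(entries, key_index):
    """Group entries by one field and keep groups with several members."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry[key_index]].append(entry)
    return {key: group for key, group in groups.items() if len(group) > 1}


homographs = ambiguous_groups(ENTRIES, 0)   # same spelling
homophones = ambiguous_groups(ENTRIES, 1)   # same sound

print(sorted(homographs))
print(sorted(homophones))
```

Note that a word such as bank (same spelling and same sound for unrelated meanings) would show up in both groupings.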
7

Lexical ambiguity: Polysemy 1


Polysemy: meanings are related.
run: of humans, rivers, buses, bus routes, …
line: of people, of type, drawn on paper, transit route, …

Often, no clear line between polysemy and homonymy.

Lexical ambiguity: Polysemy 2


Sense modulation by context:
fast train, fast typist, fast road.

Systematic polysemy or sense extension:
bank as financial institution and as building;
window as hole in wall or that which fits in hole;
bottle, book, DVD, Toyota, lamb, …
Applies to most or all senses of certain semantic classes.

Relations between senses


Hyponymy, hypernymy: subtype, supertype:
sedan is a hyponym of car;
car is a hypernym of sedan.
[hypo- = under; hyper- = over]

The fundamental relation for creating a taxonomy: a tree-like structure that expresses classes and inheritance of properties.
[Terminology:
is-a relation in ontologies of (language-independent) concepts;
hyponymy relation in taxonomies of (language-dependent) senses.]
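
A taxonomy with property inheritance can be sketched in a few lines. The hierarchy above car and all the properties below are invented for illustration; only sedan and car come from the slide.

```python
# child -> parent (hypernym) links in a toy, strictly tree-shaped taxonomy
HYPERNYM = {"sedan": "car", "car": "vehicle", "vehicle": "artifact"}

# Properties stated directly at each node (illustrative assumptions)
PROPERTIES = {
    "vehicle": {"transports things"},
    "car": {"has wheels", "has an engine"},
    "sedan": {"has four doors"},
}


def hypernyms(sense):
    """Return all ancestors of a sense, nearest first."""
    chain = []
    while sense in HYPERNYM:
        sense = HYPERNYM[sense]
        chain.append(sense)
    return chain


def is_hyponym_of(sense, ancestor):
    """True if `ancestor` lies somewhere above `sense` in the taxonomy."""
    return ancestor in hypernyms(sense)


def inherited_properties(sense):
    """Collect the properties of a sense and of all its hypernyms."""
    props = set(PROPERTIES.get(sense, set()))
    for ancestor in hypernyms(sense):
        props |= PROPERTIES.get(ancestor, set())
    return props


print(hypernyms("sedan"))
print(sorted(inherited_properties("sedan")))
```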

10

Relations between senses

Meronymy, holonymy: part/whole, or membership:
leg is a meronym of chair;
chair is a holonym of leg and a meronym of dining-set.
Many subtypes of meronym relations.
Component-of: kitchen–apartment
Member-of: soldier–army
Portion-of: slice–pie

Examples of meronymy from Roxana Girju, Adriana Badulescu, and Dan I. Moldovan, Automatic discovery of part-whole relations, Computational Linguistics, 32(1), 2006, 83–135,
based on relations from Morton E. Winston, Roger Chaffin, and Douglas Herrmann, A taxonomy of part-whole relations, Cognitive Science, 11(4), 1987, 417–444.
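
The part/whole examples on this slide can be stored as typed relation triples, with holonymy read off as the inverse of meronymy. This is a minimal sketch; the class and function names are assumptions, and the subtype is left as None where the slide does not name one.

```python
from typing import NamedTuple, Optional


class PartWhole(NamedTuple):
    part: str
    whole: str
    subtype: Optional[str]  # None where the slide gives no subtype


RELATIONS = [
    PartWhole("leg", "chair", None),
    PartWhole("chair", "dining-set", None),
    PartWhole("kitchen", "apartment", "component-of"),
    PartWhole("soldier", "army", "member-of"),
    PartWhole("slice", "pie", "portion-of"),
]


def meronyms(whole):
    """Parts (or members, portions) recorded for a given whole."""
    return [r.part for r in RELATIONS if r.whole == whole]


def holonyms(part):
    """Wholes recorded for a given part: the inverse relation."""
    return [r.whole for r in RELATIONS if r.part == part]


print(meronyms("chair"), holonyms("chair"))
```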

11

Relations between senses 3


Entailment, implicature: various kinds:
snore entails sleep;
manage implies try.

12

Lexical acquisition

Problem: We need a complete lexicon for each natural language.
Dictionary as starting point? Limitations?
Text (corpus) as starting point? Limitations?
Build by hand (lexicographers) or automatically?
Limitations?

13

Lexical acquisition

Corpus-based machine learning methods.


Accurate, representative information.
Includes statistical information.

Extraction from online dictionary.


More knowledge-based.
Can treat dictionary as highly specialized corpus.

14

WordNet

WordNet: A hierarchical (taxonomic) lexicon and thesaurus of English.
Developed by lexicographers at Princeton, 1990s to present.

Graph structure:
Nodes are synsets (synonym sets) (≈ word senses).

http://wordnetweb.princeton.edu/perl/webwn
15

Noun slip
[Slide callouts: the word list of each entry gives the synonyms for that sense; the parenthesized text is the gloss; the quoted text is an example.]
faux pas#1, gaffe#1, solecism#1, slip#1, gaucherie#2 (a socially awkward or tactless act)
slip#2, slip-up#1, miscue#2, parapraxis#1 (a minor inadvertent mistake usually observed in speech or writing or in small accidents or memory lapses etc.)
slip#3 (potter's clay that is thinned and used for coating or decorating ceramics)
cutting#2, slip#4 (a part (sometimes a root or leaf or bud) removed from a plant to propagate a new plant through rooting or grafting)
slip#5 (a young and slender person) "he's a mere slip of a lad"
mooring#1, moorage#2, berth#2, slip#6 (a place where a craft can be made fast)
slip#7, trip#3 (an accidental misstep threatening (or causing) a fall) "he blamed his slip on the ice"; "the jolt caused many slips and a few spills"
slickness#3, slick#1, slipperiness#1, slip#8 (a slippery smoothness) "he could feel the slickness of the tiller"
strip#2, slip#9 (artifact consisting of a narrow flat piece of material)
slip#10, slip of paper#1 (a small sheet of paper) "a receipt slip"
chemise#1, shimmy#2, shift#9, slip#11, teddy#2 (a woman's sleeveless undergarment)

16

Noun slip: Hypernyms
[Indentation shows the hypernym chain; glosses are cut off at the slide edge.]

slip#10, slip of paper#1 (a small sheet of paper)
  sheet#2, piece of paper#1, sheet of paper#1 (paper used for writing or printing)
    paper#1 (a material made of cellulose pulp derived mainly from wood or rags or cer…
      material#1, stuff#1 (the tangible substance that goes into the makeup of a physic…
        substance#1 (the real physical matter of which a person or thing consists)
          matter#3 (that which has mass and occupies space)
            physical entity#1 (an entity that has physical existence)
              entity#1 (that which is perceived or known or inferred to have its own…
          part#1, portion#1, component part#1, component#2, constituent#3 (someth…
            relation#1 (an abstraction belonging to or characteristic of two entities or p…
              abstraction#6, abstract entity#1 (a general concept formed by extracting…
                entity#1 (that which is perceived or known or inferred to have its own…

17

Noun slip: Sister terms
[The sister terms are the other hyponyms of the shared hypernym sheet#2; glosses are cut off at the slide edge.]

sheet#2, piece of paper#1, sheet of paper#1 (paper used for writing or printing)
  slip#10, slip of paper#1 (a small sheet of paper)
  signature#5 (a sheet with several pages printed on it; it folds to page size and is bound w…
  leaf#2, folio#2 (a sheet of any written or printed material (especially in a manuscript or…
  tear sheet#1 (a sheet that can be easily torn out of a publication)
  foolscap#1 (a size of paper used especially in Britain)
  style sheet#1 (a sheet summarizing the editorial conventions to be followed in preparin…
  worksheet#1 (a sheet of paper with multiple columns; used by an accountant to assemb…
  revenue stamp#1, stamp#6 (a small piece of adhesive paper that is put on an object to s…

18

[Figure: eight senses of board in WordNet, and their hypernyms and hyponyms. Diagram from Ellen Voorhees, 1998.]
19

WordNet

Graph structure (cont.):


Edges from hyponymy relations: near-tree.
Edges from meronymy relations: network.

Index maps each word to all of its synsets.


Separate trees for nouns, verbs, adjectives,
adverbs (with derivational cross-connections).
Differential approach to meaning:
The hyponyms of a node are differentiations of its
meaning.
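
The graph structure described here (synset nodes, hyponymy edges, and a word-to-synset index) can be imitated with plain dictionaries. The sketch below uses a small excerpt of the slip data from the earlier slides. Labelling each synset by one of its members and the function names are assumptions of this example, not WordNet's own storage format or API.

```python
from collections import defaultdict

# synset label -> member words (a tiny excerpt of the "slip" slides)
SYNSETS = {
    "slip#10": ["slip", "slip of paper"],
    "slip#3": ["slip"],
    "sheet#2": ["sheet", "piece of paper", "sheet of paper"],
    "foolscap#1": ["foolscap"],
    "worksheet#1": ["worksheet"],
    "paper#1": ["paper"],
}

# hyponymy edges: synset -> its hypernym (only those shown on the slides)
HYPERNYM = {
    "slip#10": "sheet#2",
    "foolscap#1": "sheet#2",
    "worksheet#1": "sheet#2",
    "sheet#2": "paper#1",
}


def build_index(synsets):
    """Map each word to all of the synsets that contain it."""
    index = defaultdict(list)
    for label, words in synsets.items():
        for word in words:
            index[word].append(label)
    return dict(index)


def hyponyms(label):
    """Direct hyponyms of a synset, in alphabetical order."""
    return sorted(c for c, parent in HYPERNYM.items() if parent == label)


def sister_terms(label):
    """Other hyponyms of this synset's hypernym."""
    parent = HYPERNYM.get(label)
    if parent is None:
        return []
    return [s for s in hyponyms(parent) if s != label]


INDEX = build_index(SYNSETS)
print(INDEX["slip"])
print(sister_terms("slip#10"))
```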

20

WordNet

WordNets now available or under construction for many languages.
Afrikaans, Albanian, Arabic, Bantu, Basque, Bengali, Bulgarian, Catalan,
Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Farsi
(Persian), Finnish, French, German, Greek, Hebrew, Hindi, Hungarian,
Icelandic, Indonesian, Italian, Irish, Japanese, Kannada, Korean, Latin,
Latvian, Macedonian, Maltese, Marathi, Moldavian, Mongolian,
Myanmar, Nepali, Norwegian, Oriya, Polish, Portuguese, Romanian,
Russian, Sanskrit, Serbian, Slovenian, Spanish, Swedish, Tamil, Thai,
Turkish, Vietnamese
www.globalwordnet.org, July 2013

21

Building, updating WordNets


Problem: Need a complete lexicon and lexical
relations for each natural language.
Dictionary as starting point? Limitations?
Another WordNet as starting point? Limitations?
Build by hand (lexicographers) or automatically?
Limitations?

Text (corpus) as starting point? Limitations?

22

Hearst

Discovering lexical relations 1

Corpus-based method.
Makes suggestions for lexicographers.
Scan partially-parsed text looking for instances of patterns:
such NP_1 as {NP_i}* {or|and} NP_i
implies NP_1 is a hypernym of the NP_i

Hearst, Marti. Automated discovery of WordNet relations. In: Fellbaum, Christiane (editor), WordNet: An electronic lexical database, The MIT Press, 1998, pages 131–151.
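
The "such NP as ..." pattern can be approximated with a regular expression. This sketch is a strong simplification and not Hearst's implementation: it runs on raw text rather than partially parsed text, treats every noun phrase as a single word, and requires the final "and"/"or". The function name and test sentence are assumptions for illustration.

```python
import re

NP = r"[A-Za-z]+"  # crude stand-in for a noun phrase: one word

# such NP_1 as NP_i, NP_i, ... (and|or) NP_i
PATTERN = re.compile(
    rf"\bsuch ({NP}) as ((?:{NP}, )*{NP},? (?:and|or) {NP})"
)


def such_as_pairs(text):
    """Return (hypernym, hyponym) suggestions found by the pattern."""
    pairs = []
    for match in PATTERN.finditer(text):
        hypernym = match.group(1)
        # Split the coordinated list on commas and on the final and/or.
        items = re.split(r",? (?:and|or) |, ", match.group(2))
        pairs.extend((hypernym, item) for item in items)
    return pairs


text = "He admired such authors as Herrick, Goldsmith, and Shakespeare."
for pair in such_as_pairs(text):
    print(pair)
```

The output is a list of suggestions for a lexicographer to review, in line with the slide; no filtering of bad matches is attempted.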
23

24

Hearst

Discovering lexical relations 2

Develop patterns
by hand, or
by scanning for sentences containing known related
pairs.

25

Hearst

Results, good

1. Some relations already in WordNet:
fabric–silk, grain–barley, disorders–epilepsy, …

2. Some relations not already in WordNet (but the words were):
crops–milo, perishables–fruit, conditions–epilepsy, …

3. Some relations with words not yet in WordNet:
companies–Shell, institutions–Tufts, …

26

Hearst

Results, less good

4. Some too-general relations:
things–exercise, topics–nutrition, areas–Sacramento

5. Some too-context-specific relations:
others–Meadowbrook, classics–Gaslight, categories–drama, …

6. Some really bad relations (usually due to parsing errors, not detecting full NP):
children–Headstart, jobs–computer, companies–sports

27

Hearst

Limitations

Problems:
Which word is the hypernym?
A bearing is a structure that supports a rotating part of
a machine, such as a shaft, axle, spindle, or wheel.

Can't find good patterns for meronyms.
How to evaluate method quantitatively?

28

Since Hearst's paper

Methods that use syntactic (not just lexical) patterns, and which derive the patterns from corpora.
Methods that use senses, not words.
Methods for finding coordinate (sister) terms by distributional similarity in text.
Methods that combine the evidence from both of these to identify additional hyponym relations.
SISTER(X,Y) ∧ HYPONYM(Y,Z) ⇒ HYPONYM(X,Z)
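
Read as a hard logical rule, the inference on this slide says: if X is a sister of Y and Y is a hyponym of Z, then X is a hyponym of Z. The sketch below applies that rule once to sets of pairs. Treating it as all-or-nothing is an assumption of this example, since the methods described combine evidence rather than apply the rule blindly. The function name and sample pairs are illustrative.

```python
def infer_hyponyms(sisters, hyponyms):
    """Apply SISTER(X,Y) and HYPONYM(Y,Z) => HYPONYM(X,Z) once.

    sisters:  set of (x, y) pairs of coordinate terms
    hyponyms: set of (hyponym, hypernym) pairs already known
    Returns only the pairs that are new.
    """
    # Sisterhood is symmetric, so consider both directions.
    symmetric = set(sisters) | {(y, x) for x, y in sisters}
    inferred = set()
    for x, y in symmetric:
        for known_hyponym, z in hyponyms:
            if known_hyponym == y and (x, z) not in hyponyms:
                inferred.add((x, z))
    return inferred


sisters = {("barley", "milo")}      # e.g. found by distributional similarity
hyponyms = {("barley", "grain")}    # e.g. found by a lexical pattern
print(infer_hyponyms(sisters, hyponyms))
```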
29

Since Hearst's paper

Methods for meronymic relations.


Each subtype tends to have its own indicators.
These tend to have much more ambiguous patterns
than hyponymy.
Complex methods for learning additional semantic
constraints on the patterns.

30

Since Hearst's paper

Methods for causal relations.
Look especially for verbs such as give rise to, induce, generate, cause, …

Learning ontologies from text as important research topic.
Learning commonsense knowledge from text as new research topic.

31

Properties of verbs

Revision

Subcategorization of verbs:
VPs can include more than one NP, can include
clauses of various types.
Can classify verbs by kinds of VPs they permit.

Thematic roles of a verb; some common mappings:
Subject → Agent / Experiencer
Object → Theme
Object of preposition → Goal / Location / Recipient / Instrument

32

Lexical semantics of verbs


Verbs are more complex than nouns.
They are predicates that encode relations
between their arguments.
They place selectional restrictions on their
arguments.
E.g., agent of eat must be animate; theme must be
physical, edible.
Different senses of verb may impose different
selectional restrictions.
Hence argument types may indicate verb-sense
(see notes #8).
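
Selectional restrictions can be checked mechanically once arguments carry semantic features, and the senses that survive the check hint at which verb sense is intended. In this sketch the feature sets, the second ("corrode") sense of eat, and all names are assumptions made for illustration. Only the restrictions on the first sense come from the slide.

```python
# Semantic features of some possible arguments (illustrative assumptions)
FEATURES = {
    "Nadia": {"physical", "animate"},
    "apple": {"physical", "edible"},
    "rock": {"physical"},
    "acid": {"physical", "corrosive"},
    "metal": {"physical"},
}

# Each sense lists the features it requires of each thematic role.
SENSES = {
    "eat (consume food)": {"agent": {"animate"},
                           "theme": {"physical", "edible"}},
    "eat (corrode)": {"agent": {"corrosive"},
                      "theme": {"physical"}},
}


def satisfies(restrictions, arguments):
    """True if every role's filler has all the required features."""
    return all(
        required <= FEATURES.get(arguments.get(role), set())
        for role, required in restrictions.items()
    )


def compatible_senses(arguments):
    """Senses whose selectional restrictions the arguments meet."""
    return [name for name, r in SENSES.items() if satisfies(r, arguments)]


print(compatible_senses({"agent": "Nadia", "theme": "apple"}))
print(compatible_senses({"agent": "acid", "theme": "metal"}))
```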
33

Lexical semantics of verbs


Their taxonomy is more difficult to determine.
Grouping is not as intuitively clear.
Differentiating sister nodes is more complex.

34

Lexical semantics of verbs


WordNet for verbs is not very useful.
Only shallow hierarchy of troponymy and
hypernymy.
e.g., to saunter is to walk in a certain manner.

Insufficient information about thematic roles, selectional restrictions, and subcategorization.
No information about regularity in behaviour of
classes of verbs.

35

Verb
S: (v) spray (be discharged in sprays of liquid) "Water sprayed all over the floor"
S: (v) spray (scatter in a mass or jet of droplets) "spray water on someone"; "spray
paint on the wall"
S: (v) spray (cover by spraying with a liquid) "spray the wall with paint"

Verb
S: (v) spray (be discharged in sprays of liquid) "Water sprayed all over the floor"
direct hypernym / inherited hypernym / sister term
S: (v) scatter, sprinkle, dot, dust, disperse (distribute loosely) "He scattered
gun powder under the wagon"
S: (v) discharge (pour forth or release) "discharge liquids"
S: (v) spread, distribute (distribute or disperse widely) "The invaders
spread their language all over the country"
derivationally related form
sentence frame
Something ----s
Something is ----ing PP

36

Levin's verb classification

Groups (English) verbs by diathesis alternations: syntactic patterns of argument structure.
May be subtle semantic differences between alternations.

Shows mapping between semantics of verbs and their syntactic behaviour / subcategorization.

Levin, Beth. English Verb Classes and Alternations. University of Chicago Press, 1993.
Palmer, Martha; Gildea, Daniel; Xue, Nianwen. Semantic Role Labeling. Synthesis Lectures on Human
Language Technologies #6, Morgan & Claypool, 2010. www.morganclaypool.com/toc/hlt/1/1

37

Verb class behaviour

[Verb class 45.1] break, crack, rip, …
Jay broke Bill's finger.
*Jay broke Bill on the finger.
Jay broke the vase.
Vases break easily.

[Verb class 20] touch, stroke, tickle, …
Kay touched Bill's neck.
Kay touched Bill on the neck.
Kay touched the cat.
*Cats touch easily.

[* marks an ungrammatical sentence.]
Motion/contact required for body-part alternation.
Change of state required for middle construction.
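
The two generalizations on this slide can be written as a lookup from verb class to meaning components, from which the permitted constructions follow. This is a deliberately reduced sketch: the two boolean features and the function names are assumptions, and only the two classes and six verbs shown on the slide are covered.

```python
# Meaning components per verb class, reduced to the two that the
# slide's generalizations mention.
VERB_CLASSES = {
    "45.1": {"verbs": {"break", "crack", "rip"},
             "contact": False, "change_of_state": True},
    "20": {"verbs": {"touch", "stroke", "tickle"},
           "contact": True, "change_of_state": False},
}


def class_of(verb):
    """Return the class label of a verb in the toy lexicon."""
    for label, info in VERB_CLASSES.items():
        if verb in info["verbs"]:
            return label
    raise KeyError(f"verb not in toy lexicon: {verb}")


def allows_body_part_alternation(verb):
    """'Kay touched Bill on the neck' requires contact."""
    return VERB_CLASSES[class_of(verb)]["contact"]


def allows_middle(verb):
    """'Vases break easily' requires a change of state."""
    return VERB_CLASSES[class_of(verb)]["change_of_state"]


for verb in ("break", "touch"):
    print(verb, allows_body_part_alternation(verb), allows_middle(verb))
```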
38

Diathesis alternation
[Alternation 2.3.1]

The spray/load alternation
Nadia sprayed paint onto the wall.
Nadia sprayed the wall with paint. [Greater suggestion of completeness of action.]
Paint sprayed onto the wall.
The wall sprayed with paint.
Walls spray easily.

Other verbs that undergo this alternation:
brush, cram, crowd, dust, jam, load, scatter, splash, …
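
The two transitive variants of the alternation differ only in which argument becomes the direct object, so they can be generated from the same three participants. A minimal sketch follows. The function name and the choice to pass the past-tense form explicitly (to avoid writing a morphology component) are assumptions of this example.

```python
# Verbs listed on the slide, plus "spray" itself.
SPRAY_LOAD_VERBS = {
    "brush", "cram", "crowd", "dust", "jam", "load",
    "scatter", "splash", "spray",
}


def spray_load_variants(agent, verb, past_form, theme, location,
                        preposition="onto"):
    """Return both variants: theme as object, then location as object."""
    if verb not in SPRAY_LOAD_VERBS:
        raise ValueError(f"{verb!r} is not listed as a spray/load verb")
    theme_as_object = f"{agent} {past_form} {theme} {preposition} {location}."
    location_as_object = f"{agent} {past_form} {location} with {theme}."
    return theme_as_object, location_as_object


for sentence in spray_load_variants("Nadia", "spray", "sprayed",
                                    "paint", "the wall"):
    print(sentence)
```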
39

Levin's verb classification

~80 alternations, ~190 verb classes, ~3000 English verbs classified.
Subsequently extended by other researchers (Korhonen and Briscoe 2004).

Different senses of a verb may fall into different classes.
Used extensively in CL; basis for VerbNet.

Anna Korhonen and Ted Briscoe. Extended lexical-semantic classification of English verbs. HLT-NAACL Workshop on Computational Lexical Semantics, Boston, 2004.

40

VerbNet
Embeds Levin's classes in a computational lexicon.
Adds thematic roles and semantics.
Uses WordNet senses.

Karin Kipper, Hoa Trang Dang, Martha Palmer. Class-based construction of a verb lexicon. 17th National
Conference on Artificial Intelligence, 2000.
Karin Kipper Schuler. VerbNet: A Broad-Coverage Comprehensive Verb Lexicon. PhD thesis, University of
Pennsylvania, 2005.
41

Class Spray-9.7
[Screenshot of the VerbNet entry for class Spray-9.7. Callouts: thematic roles and restrictions on them; semantic form for the kind of event E the frame represents.]
http://verbs.colorado.edu/verbindex/vn/spray-9.7.php
42

Class Spray-9.7
[Screenshot of the VerbNet entry for class Spray-9.7. Callouts: restriction on preposition PREP; thematic roles and restrictions on them; unspecified argument; semantic form for the kind of event E the frame represents.]
http://verbs.colorado.edu/verbindex/vn/spray-9.7.php
43

Class Spray-9.7-1
[Screenshot of the VerbNet entry for subclass Spray-9.7-1. Callout: WordNet and FrameNet sense numbers.]

44

Class Spray-9.7-1-1

Class Spray-9.7-2

45

FrameNet
Semantics-first classification of verbs
(and nouns).
Frame: A conceptual structure that describes a
particular type of situation, object, or event
along with its participants and props.*
Groups of predicates in the same semantic
situation share case frames.
Includes both a lexicon and a corpus of annotated sentences to illustrate predicate usage.
*Josef Ruppenhofer et al. FrameNet II: Extended theory and practice. June 2010.

46

Example
Frame APPLY-HEAT:
bake, barbecue, blanch, boil, braise, broil, …, poach, roast, saute, scald, simmer, singe, steam, stew, toast

Nadia fried the sliced onions in a skillet.
Frame elements: Cook (Nadia); Food (the sliced onions); Heating instrument (in a skillet).

Josef Ruppenhofer et al. FrameNet II: Extended theory and practice. June 2010.
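
A frame, the words that evoke it, and an annotated sentence can be represented very simply. The sketch below is not FrameNet's data format or API: the class, the bracketing function, and the shortened list of lexical units are assumptions for illustration, built around the slide's example sentence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    name: str
    lexical_units: frozenset   # lemmas that evoke the frame
    frame_elements: frozenset  # participant and prop roles


# A shortened version of the slide's frame (only some lexical units).
APPLY_HEAT = Frame(
    name="Apply_heat",
    lexical_units=frozenset({"bake", "boil", "fry", "roast", "simmer",
                             "toast"}),
    frame_elements=frozenset({"Cook", "Food", "Heating_instrument"}),
)


def annotate(sentence, frame, target_lemma, spans):
    """Return the sentence with each frame-element span bracketed."""
    if target_lemma not in frame.lexical_units:
        raise ValueError(f"{target_lemma!r} does not evoke {frame.name}")
    for element, span in spans.items():
        if element not in frame.frame_elements:
            raise ValueError(f"unknown frame element: {element}")
        if span not in sentence:
            raise ValueError(f"span not found: {span!r}")
        # Bracket the first occurrence of the span only.
        sentence = sentence.replace(span, f"[{element} {span}]", 1)
    return sentence


print(annotate(
    "Nadia fried the sliced onions in a skillet.",
    APPLY_HEAT,
    "fry",
    {"Cook": "Nadia", "Food": "the sliced onions",
     "Heating_instrument": "in a skillet"},
))
```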

47

Apply_heat

Inherits From: Activity, Intentionally_affect


Is Inherited By:
Is Used By: Cooking_creation
Is Causative of: Absorb_heat

https://framenet.icsi.berkeley.edu/fndrupal/index.php?q=frameIndex

This frame differs from Cooking_creation in focusing on the process of handling the
ingredients, rather than the edible entity that results from the process.

48

Lexical entry for an Apply_heat word: bake


https://framenet.icsi.berkeley.edu/fndrupal/index.php?q=frameIndex

CNI = Constructional null instantiation
INI = Indefinite null instantiation

Grammatical functions: Dependent, External argument, Object

50

Lexical entry for an Apply_heat word: bake

https://framenet.icsi.berkeley.edu/fndrupal/index.php?q=frameIndex

Valence patterns

51

Text with FrameNet annotations 1

Subscripts: Frames
Italics: Unannotated words
Yellow: Named entities

https://framenet.icsi.berkeley.edu/fndrupal/index.php?q=fulltextIndex
The text is from the American National Corpus.

As capital of Europe's most explosive economy, Dublin seems to be changing before your very eyes.

52

Text with FrameNet annotations 2


https://framenet.icsi.berkeley.edu/fndrupal/index.php?q=fulltextIndex
The text is from the American National Corpus.

As capital of Europe's most explosive economy, Dublin seems to be changing before your very eyes.

53

FrameNet in other languages


FrameNets now available or under
construction for several other languages.
Brazilian Portuguese, Chinese, German, Japanese, Spanish, Swedish

https://framenet.icsi.berkeley.edu/fndrupal/framenets_in_other_languages, June 2014

54

FrameNet vs VerbNet

Complementary resources:
VerbNet:
Groups by syntactic behaviour (Levin classes).
Any resultant grouping by meaning is a side effect.

FrameNet:
Groups by meaning class (frame).
Not limited to verbs.
Any resultant grouping by syntactic behaviour is a side effect.

55

FrameNet vs VerbNet

Combine both with WordNet.


Algorithmic methods to map VerbNet entries to
FrameNet entries and vice versa.
Semi-automatic methods to map VerbNet constraints
into the WordNet hierarchy.

Lei Shi and Rada Mihalcea. Putting pieces together: Combining FrameNet, VerbNet and WordNet for robust semantic parsing. 6th International Conference on Intelligent Text Processing and Computational Linguistics (Springer Lecture Notes in Computer Science 3406), 2005, 100–111.
56
