This is a part-of-speech (POS) tagger written in Python. It implements a hidden Markov model (HMM) and the Viterbi algorithm. I recommend training on the 'training.txt' file and testing on the 'development.txt' file. Using any other files will require you to edit the code (only slightly). I was able to achieve ~95% accuracy doing this. The accuracy really depends on how large your corpus is. I wasn't able to get my hands on the Penn Treebank corpus, but I have read that it is the best for POS tagging.
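For readers new to the approach, here is a minimal sketch of how an HMM tagger with Viterbi decoding can work. It is not the code from this repository: the function names (`train_hmm`, `log_prob`, `viterbi`), the `<s>` start marker, the add-one smoothing, and the toy sentences are all assumptions made for illustration, and the real 'training.txt' format may differ.

```python
import math
from collections import Counter, defaultdict

START = "<s>"  # hypothetical start-of-sentence marker


def train_hmm(tagged_sentences):
    """Count tag transitions and word emissions from lists of (word, tag) pairs."""
    transitions = defaultdict(Counter)  # previous tag -> counts of next tag
    emissions = defaultdict(Counter)    # tag -> counts of emitted word
    vocab = set()
    for sentence in tagged_sentences:
        prev = START
        for word, tag in sentence:
            transitions[prev][tag] += 1
            emissions[tag][word] += 1
            vocab.add(word)
            prev = tag
    return transitions, emissions, vocab


def log_prob(counter, key, num_outcomes):
    """Add-one smoothed log probability (an assumed smoothing choice)."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + num_outcomes))


def viterbi(words, transitions, emissions, vocab):
    """Return the most likely tag sequence for words under the counted model."""
    if not words:
        return []
    tags = list(emissions)
    n_tags = len(tags)
    n_words = len(vocab) + 1  # one extra slot for unknown words

    # best[i][tag] = (log probability of the best path ending in tag, previous tag)
    best = [{}]
    for tag in tags:
        score = (log_prob(transitions[START], tag, n_tags)
                 + log_prob(emissions[tag], words[0], n_words))
        best[0][tag] = (score, None)

    for i in range(1, len(words)):
        best.append({})
        for tag in tags:
            emit = log_prob(emissions[tag], words[i], n_words)
            best[i][tag] = max(
                (best[i - 1][p][0] + log_prob(transitions[p], tag, n_tags) + emit, p)
                for p in tags
            )

    # Follow the back-pointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]


# Toy corpus, invented for this example only.
toy_corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]
model = train_hmm(toy_corpus)
print(viterbi(["the", "cat", "barks"], *model))
```

Working in log space avoids numeric underflow on long sentences, and the smoothing keeps unseen words or transitions from zeroing out an entire path. With a corpus this small the result is only illustrative; as noted above, accuracy depends heavily on corpus size.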
EthanBlackburn/PartsOfSpeech_Tagger
A part-of-speech tagger using an HMM and the Viterbi algorithm.