GitHub - rjrequina/Cebuano-Stemmer: Cebuano Stemmer Based on Krovetz Algorithm
Cebuano-Stemmer

A Cebuano stemmer based on the Krovetz algorithm.

Note: Only prefixes, suffixes, infixes, and reduplication are covered.
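
To make the four covered cases concrete, here is a toy sketch of dictionary-checked affix stripping in the spirit of a Krovetz-style stemmer, where a candidate root is accepted only if it is a known word. It is not the package's implementation: the function name `toy_stem`, the tiny root set, the affix lists, and the order in which the cases are tried are all assumptions made for illustration. The real package relies on cebdict for lookups.

```python
# Toy data, assumed for illustration only; not taken from cebstemmer or cebdict.
TOY_ROOTS = {"buang", "kaon", "sulat", "balay"}
TOY_PREFIXES = ("gi", "mag", "nag")
TOY_INFIXES = ("in", "um")
TOY_SUFFIXES = ("on", "an")


def toy_stem(word):
    """Return [root, prefix, infix, suffix], accepting a root only if it is in TOY_ROOTS."""
    word = word.lower()
    if word in TOY_ROOTS:
        return [word, None, None, None]

    # Full reduplication: "balaybalay" or "balay-balay" -> "balay".
    plain = word.replace("-", "")
    half = len(plain) // 2
    if len(plain) % 2 == 0 and plain[:half] == plain[half:] and plain[:half] in TOY_ROOTS:
        return [plain[:half], None, None, None]

    # Suffix: "buangon" -> "buang" + "on".
    for suffix in TOY_SUFFIXES:
        if word.endswith(suffix) and word[:-len(suffix)] in TOY_ROOTS:
            return [word[:-len(suffix)], None, None, suffix]

    # Prefix: "gikaon" -> "gi" + "kaon".
    for prefix in TOY_PREFIXES:
        if word.startswith(prefix) and word[len(prefix):] in TOY_ROOTS:
            return [word[len(prefix):], prefix, None, None]

    # Infix placed after the first letter: "sinulat" -> "s" + "in" + "ulat".
    for infix in TOY_INFIXES:
        if word[1:1 + len(infix)] == infix:
            candidate = word[0] + word[1 + len(infix):]
            if candidate in TOY_ROOTS:
                return [candidate, None, infix, None]

    # Nothing matched: leave the word unchanged.
    return [word, None, None, None]
```

The dictionary check is what keeps the stripping conservative: an affix is removed only when what remains is a known root, so unknown words pass through unchanged.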

Installation

  • pip install cebstemmer, or
  • from inside the repository folder, run python setup.py install

Requirements

  • cebdict>=2.1

Functions

  • stem_word(word='', as_object=False)
    • Accepts a Cebuano word and returns the morphemes of the word

    • Default Output: List of morphemes

        [root, prefix, infix, suffix]
      
    • OPTION: as_object - When True, a Word object is returned instead of the list.

       ```
       class Word:
           def __init__(self, text=None):
               # Input exactly as given
               self.orig_text = text
               # Lowercased copy of the input (None stays None)
               self.text = text.lower() if text is not None else text
               self.prefix = None
               self.infix = None
               self.suffix = None
               # Initially the whole lowercased word
               self.root = text.lower() if text is not None else text
               self.is_entry = False
       ```
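
The default list output is positional, so it can help to give the four slots names before passing them around. The sketch below works only on the documented `[root, prefix, infix, suffix]` shape and does not import the package. `Morphemes`, `label_morphemes`, and `present_affixes` are hypothetical helper names, not part of cebstemmer.

```python
from typing import Dict, List, NamedTuple, Optional


class Morphemes(NamedTuple):
    """Named view over the documented [root, prefix, infix, suffix] list."""
    root: Optional[str]
    prefix: Optional[str]
    infix: Optional[str]
    suffix: Optional[str]


def label_morphemes(parts: List[Optional[str]]) -> Morphemes:
    """Wrap a four-item morpheme list; reject any other length."""
    if len(parts) != 4:
        raise ValueError("expected [root, prefix, infix, suffix]")
    return Morphemes(*parts)


def present_affixes(morphemes: Morphemes) -> Dict[str, str]:
    """Return only the affixes that were found (those that are not None)."""
    return {
        name: value
        for name, value in morphemes._asdict().items()
        if name != "root" and value is not None
    }
```

With the package installed, `label_morphemes(stemmer.stem_word(word))` would be the expected call, assuming the output shape described above.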
      

How to Use

from cebstemmer import stemmer

stemmer.stem_word('buangon')

Output:
   ['buang', None, None, 'on']

word = stemmer.stem_word('buangon', as_object=True)
print(word.root)
print(word.suffix)

Output:
  buang
  on
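
A common next step is to stem many tokens and group them by root. The sketch below takes the stemming function as a parameter, so it runs without the package installed. `count_roots` and `fake_stem` are hypothetical names, and `fake_stem` is only a stand-in that mimics the documented list output for one suffix.

```python
from collections import Counter


def count_roots(tokens, stem_fn):
    """Count tokens by root; stem_fn must return [root, prefix, infix, suffix]."""
    counts = Counter()
    for token in tokens:
        root = stem_fn(token)[0]
        # Fall back to the lowercased token if no root came back.
        counts[root or token.lower()] += 1
    return counts


def fake_stem(word):
    """Stand-in for stemmer.stem_word: strips a trailing "on" and nothing else."""
    word = word.lower()
    if word.endswith("on"):
        return [word[:-2], None, None, "on"]
    return [word, None, None, None]


counts = count_roots(["buangon", "buang"], fake_stem)
```

With cebstemmer installed, passing `stemmer.stem_word` in place of `fake_stem` should work the same way, assuming it returns the list shape shown in the first example.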

Evaluation

[Figure: evaluation results; image not reproduced here]
