- Strip word endings to reduce related words to a common stem:
  - housing, houses, house → hous
- Heuristic approach, not linguistic
- May produce the same stem for different concepts:
  - organic → organ
Implemented in Snowball, a DSL for stemming algorithms. Example rules:
Match | Replace | Example |
---|---|---|
SSES | SS | caresses → caress |
IES | I | ponies → poni, ties → ti |
SS | SS | caress → caress |
S | empty | cats → cat |
… and so on
Purely algorithmic, with no linguistic knowledge, but usually good enough.
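
The suffix rules in the table can be sketched in a few lines of Python. This is only an illustration of the four rows shown, not a full stemmer and not Snowball code; the names `RULES` and `strip_plural` are made up for this example. Rules are tried top to bottom and the first match wins, which is why `SS` is listed before `S`.

```python
# Suffix rules from the table, tried in order; the first match wins.
# Only the four rows shown are covered; real stemmers have many more steps.
RULES = [
    ("sses", "ss"),  # caresses -> caress
    ("ies", "i"),    # ponies -> poni
    ("ss", "ss"),    # caress -> caress (protects "ss" from the next rule)
    ("s", ""),       # cats -> cat
]


def strip_plural(word: str) -> str:
    """Apply the first matching suffix rule; return the word unchanged otherwise."""
    lower = word.lower()
    for suffix, replacement in RULES:
        if lower.endswith(suffix):
            return lower[: len(lower) - len(suffix)] + replacement
    return lower


for w in ["caresses", "ponies", "ties", "caress", "cats"]:
    print(w, "→", strip_plural(w))
```

Note that the code never looks a word up anywhere: it only compares character sequences, which is what makes the approach heuristic.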
- Lemmatization: determine the root (lemma) based on linguistic rules
- Takes the type of word (part of speech) into account:
  - a saw → saw, I saw → see
- Can generate inflections, e.g., what's the plural of house?
- Benefits of lemmatization over stemming are doubtful
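
To make the contrast with stemming concrete, here is a toy sketch of a dictionary-based lemmatizer. It assumes a hand-written lexicon (`LEMMAS`, `PLURALS`) with a handful of illustrative entries; real lemmatizers rely on large lexicons and morphological rules, and none of the names below come from an actual library.

```python
# Toy lexicon keyed by (surface form, part of speech).
# Entries are illustrative only, not taken from a real resource.
LEMMAS = {
    ("saw", "NOUN"): "saw",
    ("saw", "VERB"): "see",
    ("houses", "NOUN"): "house",
}

# Reverse direction: generate an inflected form from a lemma.
PLURALS = {"house": "houses", "saw": "saws"}


def lemmatize(word: str, pos: str) -> str:
    """Return the dictionary form for (word, pos), or the word itself if unknown."""
    key = (word.lower(), pos)
    return LEMMAS.get(key, word.lower())


def pluralize(lemma: str) -> str:
    """Look up the plural; fall back to appending "s" for unknown lemmas."""
    return PLURALS.get(lemma, lemma + "s")


print(lemmatize("saw", "NOUN"))  # saw
print(lemmatize("saw", "VERB"))  # see
print(pluralize("house"))        # houses
```

Unlike the suffix rules used for stemming, the result depends on the part of speech, so the caller has to know or guess whether "saw" is a noun or a verb.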
Notes: