Stem
Description
Extracts the stem (available for various languages) from every string in an [OBJ,STRING] input.
Each string is expected to be a single word (see the Tokenize block).
Input
SOURCE [OBJ,STRING]: a 2-column input of object-string pairs. Typically obtained with the Extract string and Tokenize blocks.
Output
RESULT [OBJ,STRING]: the pairs from SOURCE, where the string has been modified
STRINGS [STRING]: the modified strings, without the object they were paired to
Parameters
Stemming: strings (single words) can be stemmed for a specific language or left as they are
Output scores can be aggregated and/or normalised.
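
The mapping from SOURCE to RESULT and STRINGS can be illustrated with a minimal Python sketch. This is not the block's actual implementation: the function names `toy_stem` and `stem_block`, the `language` argument, and the tiny suffix table are all hypothetical, and the suffix stripping is a deliberately crude stand-in for a real language-specific stemmer.

```python
from typing import List, Optional, Tuple

# Hypothetical toy suffix rules, for illustration only.
# A real stemmer applies far more thorough, language-specific rules.
SUFFIXES = {"english": ("ing", "ed", "es", "s")}


def toy_stem(word: str, language: Optional[str]) -> str:
    """Strip one known suffix, or return the word untouched if language is None."""
    if language is None:
        # Mirrors the "left as they are" option of the Stemming parameter.
        return word
    lowered = word.lower()
    for suffix in SUFFIXES.get(language, ()):
        # Keep at least three characters so very short words are not mangled.
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= 3:
            return lowered[: -len(suffix)]
    return lowered


def stem_block(
    source: List[Tuple[str, str]], language: Optional[str] = "english"
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Return (RESULT, STRINGS) for a list of (object, single-word string) pairs."""
    result = [(obj, toy_stem(s, language)) for obj, s in source]
    strings = [s for _, s in result]
    return result, strings


# SOURCE: one word per pair, as produced by a tokenizing step.
source = [("doc1", "Walking"), ("doc1", "dogs"), ("doc2", "jumped")]
result, strings = stem_block(source)
print(result)
print(strings)
```

In this sketch, RESULT keeps each object next to its stemmed word, while STRINGS is the same sequence of stemmed words with the objects dropped. Passing `language=None` returns the input strings unchanged.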