# Rank by Boolean Expr BM25

### Description

Ranks the objects in `SOURCE [OBJ,STRING]` by the relevance score of each `STRING` against the expression in `QUERY [STRING]`. Relevance is computed with the [Okapi BM25](https://en.wikipedia.org/wiki/Okapi_BM25) ranking method.

### Inputs

- `SOURCE [OBJ,STRING]`: a 2-column input of object-string pairs, typically obtained with the `Extract string` block

### Outputs

- `RESULT [OBJ]`: a list of ranked objects

### Parameters

- `Query`: a boolean query
  - Use `and`, `or` (case does not matter) to express conjunctions and disjunctions of terms
  - Use parentheses to group sub-expressions
  - Negations are not yet supported
  - Quotes to group terms into a phrase are not yet supported
  - Example: `apple AND (pear OR banana)`
- `Stemming`: tokens can be stemmed for a specific language or left as they are
- `Case-sensitive`: if set to `false`, upper/lower case is ignored
- `Normalize diacritics`: transliterates non-ASCII characters into their closest ASCII form
- `Tokenization`: the method used to split the input strings into tokens
  - `None`: perform no tokenization
  - `Spaces`: split on all valid Unicode space characters
  - `Spaces/Punctuation`: `Spaces` + all valid Unicode punctuation characters
  - `Spaces/Punctuation/Digits`: `Spaces/Punctuation` + all valid Unicode digit characters
  - `Spaces/Punctuation/Digits/Symbols`: `Spaces/Punctuation/Digits` + all valid Unicode symbol characters
  - `Custom Regular Expression`: any [regular expression](https://www.regular-expressions.info)
- `Min token length`: tokens shorter than this number of characters are discarded
- `All query terms must match`: if set to `true`, only candidates where all tokens in `QTERMS` match a string in `SOURCE` are considered a match
- `k1`: controls non-linear term-frequency normalisation (saturation). Lower values saturate sooner, so repeated occurrences of a term add less to the score
- `b`: degree of document-length normalisation applied: `0` = no normalisation, `1` = full normalisation
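
To make the roles of `k1` and `b` concrete, here is a minimal sketch of BM25 scoring in Python. It is not the block's actual implementation: the function name `bm25_scores`, the default values `k1=1.2` and `b=0.75`, and the non-negative IDF variant are assumptions chosen for illustration, and the input strings are assumed to be already tokenized.

```python
import math
from collections import Counter


def bm25_scores(docs, query_terms, k1=1.2, b=0.75):
    """Score each tokenized document against query_terms (hypothetical helper).

    docs: list of token lists; query_terms: list of tokens.
    The defaults for k1 and b are common choices, not this block's defaults.
    """
    if not docs:
        return []
    n_docs = len(docs)
    # Average document length; fall back to 1.0 if every document is empty.
    avg_len = sum(len(d) for d in docs) / n_docs or 1.0
    terms = set(query_terms)
    # Document frequency: number of documents containing each query term.
    df = {t: sum(1 for d in docs if t in d) for t in terms}

    scores = []
    for doc in docs:
        tf = Counter(doc)
        # b scales how strongly the document length affects the score.
        length_norm = 1 - b + b * len(doc) / avg_len
        score = 0.0
        for term in terms:
            if tf[term] == 0:
                continue
            # Assumed IDF variant that never goes negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # k1 controls how quickly repeated occurrences saturate.
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores
```

With `b=0` the document length is ignored, so a string containing a query term twice always outscores one containing it once. Lowering `k1` shrinks that gap, and at `k1=0` only the presence of the term matters.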
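
The `Query` syntax described above (terms joined by `and`/`or`, grouped with parentheses) can be sketched as a small recursive-descent parser in Python. This is an illustration only: `parse_query` and `matches` are hypothetical names, the choice that `and` binds tighter than `or` is an assumption the documentation does not state, and terms are lowercased as if `Case-sensitive` were `false`. How the block combines the boolean match with the BM25 score is not shown here.

```python
import re


def parse_query(query):
    """Parse a boolean query into a nested tuple tree (hypothetical helper)."""
    # Split into parentheses and whitespace-delimited words.
    tokens = re.findall(r"\(|\)|[^\s()]+", query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_or():
        node = parse_and()
        while peek() is not None and peek().lower() == "or":
            advance()
            node = ("or", node, parse_and())
        return node

    def parse_and():
        # Assumption: "and" binds tighter than "or".
        node = parse_atom()
        while peek() is not None and peek().lower() == "and":
            advance()
            node = ("and", node, parse_atom())
        return node

    def parse_atom():
        tok = peek()
        if tok is None or tok == ")" or tok.lower() in ("and", "or"):
            raise ValueError(f"unexpected token: {tok!r}")
        advance()
        if tok == "(":
            node = parse_or()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            advance()
            return node
        return ("term", tok.lower())

    tree = parse_or()
    if peek() is not None:
        raise ValueError(f"unexpected token: {peek()!r}")
    return tree


def matches(tree, doc_tokens):
    """Return True if the set of document tokens satisfies the parsed query."""
    kind = tree[0]
    if kind == "term":
        return tree[1] in doc_tokens
    left = matches(tree[1], doc_tokens)
    right = matches(tree[2], doc_tokens)
    return (left and right) if kind == "and" else (left or right)
```

For the documented example, `parse_query("apple AND (pear OR banana)")` yields a tree whose root is an `and` node, and `matches` accepts a token set such as `{"apple", "banana"}` while rejecting `{"apple"}` on its own.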