# Match by string

### Description

Finds matches between the `STRING`-columns of the two inputs. Several comparison functions are available, including `equal`, `contains`, `startsWith`, `endsWith` and edit distance. The result contains the matching items as well as the items from either input that did not produce a match.

The optional input `Cands [OBJ,OBJ]` limits the matching to the listed candidate pairs only.

- The first column corresponds to the first column of `A`.
- The second column corresponds to the first column of `B`.
- Scores are propagated to final matches.

### Input

- `A [OBJ,STRING]`: a list of candidates, in which the `STRING`-column is used for comparison and the `OBJ`-column is returned in the result
- `Cands [OBJ,OBJ]` (optional): candidate pairs; only `A`s and `B`s that are in `Cands` will be matched
- `B [OBJ,STRING]`: a list of candidates, in which the `STRING`-column is used for comparison and the `OBJ`-column is returned in the result

### Output

- `RESULT [OBJ,OBJ]`: the matched objects from `A` and `B`
- `NOTA [OBJ]`: the objects from `A` that did not match any item from `B`
- `NOTB [OBJ]`: the objects from `B` that did not match any item from `A`

### Parameters

- `Comparison`: comparison function to use
  - `equal`: the strings must be equal
  - `contains`: the string in `B` must be contained in `A`
  - `containsWholeWord`: the string in `B` must be contained in `A` as a whole word (only punctuation/spaces around it)
  - `startsWith`: the string in `A` must start with `B`
  - `endsWith`: the string in `A` must end with `B`
  - `prefix`: the strings in `A` and `B` share a prefix of a given length
  - `levenshtein`: the string in `A` may differ from the string in `B` by at most `Max edit-distance` edits (character insertions, deletions or substitutions)
  - `jaro-winkler`: the strings in `A` and `B` must have a Jaro-Winkler similarity score of at least `Min similarity`
- `Case-sensitive`: if set to `false`, upper/lower case is ignored
- `Exclude self-matches`: if set to `true`, no match is emitted when the objects in `A` and `B` are the same. Mostly useful when `A` and `B` come from the same source
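
### Example

The behaviour described above can be approximated in a few lines of Python. This is only an illustrative sketch and not the component's actual implementation: the function names (`compare`, `match_by_string`), the keyword arguments `max_edit_distance` and `prefix_length`, and the choice to report self-matches that were excluded as unmatched are all assumptions made for the example. `jaro-winkler` and score propagation are left out for brevity.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def compare(a, b, comparison, case_sensitive=True,
            max_edit_distance=1, prefix_length=3):
    """Return True if string a (from A) matches string b (from B)."""
    if not case_sensitive:
        a, b = a.lower(), b.lower()
    if comparison == "equal":
        return a == b
    if comparison == "contains":
        return b in a
    if comparison == "containsWholeWord":
        # Assumed reading of "whole word": no word character directly
        # before or after the occurrence of b inside a.
        pattern = r"(?<!\w)" + re.escape(b) + r"(?!\w)"
        return re.search(pattern, a) is not None
    if comparison == "startsWith":
        return a.startswith(b)
    if comparison == "endsWith":
        return a.endswith(b)
    if comparison == "prefix":
        n = prefix_length
        return len(a) >= n and len(b) >= n and a[:n] == b[:n]
    if comparison == "levenshtein":
        return levenshtein(a, b) <= max_edit_distance
    raise ValueError(f"unsupported comparison: {comparison}")


def match_by_string(A, B, comparison="equal", case_sensitive=True,
                    exclude_self_matches=False, cands=None,
                    max_edit_distance=1, prefix_length=3):
    """A and B are lists of (obj, string); cands is an optional set of (objA, objB)."""
    result, matched_a, matched_b = [], set(), set()
    for obj_a, str_a in A:
        for obj_b, str_b in B:
            if exclude_self_matches and obj_a == obj_b:
                continue
            if cands is not None and (obj_a, obj_b) not in cands:
                continue
            if compare(str_a, str_b, comparison, case_sensitive,
                       max_edit_distance, prefix_length):
                result.append((obj_a, obj_b))
                matched_a.add(obj_a)
                matched_b.add(obj_b)
    nota = [obj for obj, _ in A if obj not in matched_a]
    notb = [obj for obj, _ in B if obj not in matched_b]
    return result, nota, notb


A = [("a1", "New York"), ("a2", "Boston"), ("a3", "Chicago")]
B = [("b1", "new york"), ("b2", "Bostn"), ("b3", "Denver")]

RESULT, NOTA, NOTB = match_by_string(A, B, "equal", case_sensitive=False)
```

With `equal` and `Case-sensitive` set to `false`, only `a1` and `b1` are paired here; switching to `levenshtein` with a maximum edit distance of 1 (and case sensitivity on) pairs `a2` with `b2` instead, because `Bostn` is one deletion away from `Boston`.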