# Extract RegEx [Obj,String]

### Description

Extract sub-strings based on the given RegEx.

### Input

- `SOURCE [OBJ,STRING]`: a list of object-string pairs.

### Output

- `PAIR [OBJ,STRING]`: each result pair contains an object from the input source and a sub-string extracted from its string.
- `RESULT [STRING]`: the extracted sub-strings. Note that the reference to the object each sub-string came from is lost.

### Parameters

- `Pattern RegEx`: the regular expression to use for matching sub-strings.
- `Max matches`: extract up to this number of sub-strings. Use 0 (default) for unlimited.
- `Case-sensitive`: if set to `false`, upper/lower case is ignored.

### Regular expressions

Regular expressions are internally evaluated by a [PCRE](https://en.wikipedia.org/wiki/Perl_Compatible_Regular_Expressions) engine. For a syntax reference, see [this page](https://www.regular-expressions.info/refflavors.html). For a 1-page syntax reference, see this [cheat-sheet](https://www.debuggex.com/cheatsheet/regex/pcre).

#### Some of the most common questions/mistakes

- Regular expressions are different from [glob patterns](https://en.wikipedia.org/wiki/Glob_(programming)) using wildcards. In particular, `*` does NOT mean "anything", `.*` does.
- All special characters (`. * + ? | \ ( ) [ ] { } ^ $`) must be escaped (prefixed with `\`) in the `RegEx` when they are meant literally.
- `^` indicates the beginning of the input text, or negation when used as the first character inside a character class (e.g. `[^\d_-]`). `$` indicates the end of the input text.
- `\b` indicates a word boundary: the position between a word character (letter, digit or `_`) and a non-word character (space, punctuation, etc.) or the start/end of the text.

#### Examples

- Find names in the form of `Smith, John`:
  - `Pattern RegEx`: `\b[^,]+\s*,\s*\b\w+\b`
- Find any day of the week (with `Case-sensitive = false`):
  - `Pattern RegEx`: `\b(mon|tues|wednes|thurs|fri|satur|sun)day\b`

Output scores can be [aggregated](docs://score_aggregation) and/or [normalised](docs://score_normalisation).
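
The parameters and example patterns described above can be tried out with a short Python sketch. This is not the module's implementation: the function names `extract` and `result_strings` and their arguments are hypothetical, Python's `re` module stands in for the PCRE engine (the constructs used here behave the same in both), and `Max matches` is assumed to apply per input string. The day-of-week pattern is written with `tues` and `satur` so that "Tuesday" and "Saturday" match.

```python
import re


def extract(source, pattern, max_matches=0, case_sensitive=True):
    """Hypothetical stand-in for the module: returns the PAIR output.

    source: list of (object, string) pairs.
    max_matches: assumed to be a per-string limit; 0 means unlimited.
    """
    flags = 0 if case_sensitive else re.IGNORECASE
    regex = re.compile(pattern, flags)
    pairs = []
    for obj, text in source:
        for count, match in enumerate(regex.finditer(text), start=1):
            if max_matches and count > max_matches:
                break
            pairs.append((obj, match.group(0)))
    return pairs


def result_strings(pairs):
    """RESULT output: sub-strings only, the object reference is dropped."""
    return [substring for _, substring in pairs]


source = [
    ("doc1", "See you Monday or friday."),
    ("doc2", "Smith, John"),
]

DAY = r"\b(mon|tues|wednes|thurs|fri|satur|sun)day\b"
NAME = r"\b[^,]+\s*,\s*\b\w+\b"

days = extract(source, DAY, case_sensitive=False)
first_day_only = extract(source, DAY, max_matches=1, case_sensitive=False)
days_case_sensitive = extract(source, DAY, case_sensitive=True)
names = extract(source, NAME)

# Escaping: an unescaped "." matches any character, "\." only a literal dot.
unescaped = re.findall(r"3.14", "3.14 and 3x14")
escaped = re.findall(r"3\.14", "3.14 and 3x14")

print(days)
print(result_strings(days))
print(names)
```

With `Case-sensitive` left at `true`, the lower-case pattern only finds `friday` in the sample text, which is why the day-of-week example recommends setting it to `false`.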