Extract RegEx [Obj,String]¶
Description¶
Extract sub-strings based on the given RegEx.
Input¶
SOURCE [OBJ,STRING]
: a list of object-string pairs.
Output¶
PAIR [OBJ,STRING]
: a result pair contains an object from the input source and an extracted sub-string.
RESULT [STRING]
: the extracted sub-strings. Note that the reference to which object each sub-string came from is lost.
Parameters¶
Pattern RegEx
: the regular expression to use for matching sub-strings.
Max matches
: extract up to this number of sub-strings. Use 0 (default) for unlimited.
Case-sensitive
: if set to false, upper/lower case is ignored.
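The relationship between the two outputs and the three parameters can be sketched in Python. This is only an illustration, not the module's implementation: the function name `extract_regex` is hypothetical, Python's `re` module stands in for the PCRE engine, and the sketch assumes that Max matches applies to each input string separately, which this page does not specify.

```python
import re
from itertools import islice

def extract_regex(source, pattern, max_matches=0, case_sensitive=True):
    """Hypothetical stand-in for the module; returns (pairs, results)."""
    flags = 0 if case_sensitive else re.IGNORECASE
    compiled = re.compile(pattern, flags)
    pairs = []
    for obj, text in source:
        matches = compiled.finditer(text)
        if max_matches > 0:  # 0 means unlimited
            # Assumption: the limit is applied per input string.
            matches = islice(matches, max_matches)
        pairs.extend((obj, m.group(0)) for m in matches)
    # RESULT keeps only the sub-strings, so the link to the object is lost.
    results = [sub for _, sub in pairs]
    return pairs, results

source = [("doc1", "Error 404 and error 500"), ("doc2", "No ERROR here")]
pairs, results = extract_regex(source, r"error", case_sensitive=False)
limited_pairs, _ = extract_regex(source, r"error", max_matches=1, case_sensitive=False)
strict_pairs, _ = extract_regex(source, r"error")
```

Here `pairs` still records which object each sub-string came from, while `results` is a flat list of strings.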
Regular expressions¶
Regular expressions are internally evaluated by a PCRE engine. For a syntax reference, see this page. For a 1-page syntax reference, see this cheat-sheet.
Some of the most common questions/mistakes¶
- Regular expressions are different from [glob patterns](https://en.wikipedia.org/wiki/Glob_(programming)) using wildcards. In particular, * does NOT mean “anything”; .* does.
- All special characters (. * + ? | \ ( ) [ ] ^ $) must be escaped (prefixed with \) when they are meant literally in the RegEx.
- ^ indicates the beginning of an input text, or negation when used at the start of a multiple choice (e.g. [^\d-_]).
- $ indicates the end of an input text.
- \b indicates a word boundary, i.e. the position between a word character and a non-word character (space, punctuation, etc.) or the start/end of the text.
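The points above can be checked with a short Python sketch. Python's `re` module is used here as a stand-in for the PCRE engine (an assumption; the two agree on the constructs shown), and the sample strings are made up for illustration.

```python
import re

text = "report.txt costs $5 (approx.)"

# A glob such as "*.txt" is not a valid regex: "*" has nothing to repeat.
try:
    re.compile("*.txt")
    glob_is_valid_regex = True
except re.error:
    glob_is_valid_regex = False

# The regex way to say "anything, then a literal .txt".
txt_match = re.search(r".*\.txt", text).group(0)

# Unescaped "$" means end of text, so "$5" cannot match; "\$5" can.
unescaped = re.search(r"$5", text)
escaped = re.search(r"\$5", text).group(0)

# re.escape() prefixes special characters with a backslash for us.
literal = re.search(re.escape("(approx.)"), text).group(0)

# "^" anchors at the start of the text, "$" at the end.
starts_with_report = re.search(r"^report", text) is not None
starts_with_costs = re.search(r"^costs", text) is not None
ends_with_paren = re.search(r"\)$", text) is not None

# Inside brackets, a leading "^" negates the choice.
non_digits = re.findall(r"[^\d\s]+", "a1 b2")

# "\b" keeps "cat" from matching inside "concat".
whole_words = re.findall(r"\bcat\b", "cat concat cat.")
```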
Examples¶
Find names in the form of Smith, John:
Pattern RegEx
: \b[^,]+\s*,\s*\b\w+\b
Find any day of the week (with Case-sensitive = false):
Pattern RegEx
: \b(mon|tues|wednes|thurs|fri|satur|sun)day\b
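As a rough check, both examples can be tried in Python, with `re` standing in for the PCRE engine (an assumption) and made-up sample sentences. The weekday pattern below spells out the stems `tues` and `satur` so that Tuesday and Saturday match, and `re.IGNORECASE` plays the role of Case-sensitive = false.

```python
import re

name_pattern = re.compile(r"\b[^,]+\s*,\s*\b\w+\b")
day_pattern = re.compile(
    r"\b(mon|tues|wednes|thurs|fri|satur|sun)day\b",
    re.IGNORECASE,  # corresponds to Case-sensitive = false
)

# A bare "Last, First" string is matched in full.
names = [m.group(0) for m in name_pattern.finditer("Smith, John")]

# [^,]+ accepts any character except a comma, so text that precedes
# the surname on the same input is included in the match as well.
with_context = name_pattern.search("Contact Smith, John today").group(0)

# group(0) is the whole match; the parentheses only capture the stem.
days = [m.group(0) for m in day_pattern.finditer("Meet on Tuesday or SATURDAY, not Funday")]
```

Because the name pattern takes everything before the comma, it works best on inputs that contain little besides the name itself.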
Output scores can be aggregated and/or normalised.