# Extract RegEx [Obj,String]

### Description
Extracts the sub-strings of each input string that match the given regular expression (RegEx).

### Input
- `SOURCE [OBJ,STRING]`: a list of object-string pairs.

### Output
- `PAIR [OBJ,STRING]`: the result pairs. Each pair contains an object from the input source and one sub-string extracted from its string.
- `RESULT [STRING]`: the extracted sub-strings only. Note that this output does not record which object each sub-string came from.

### Parameters
- `Pattern RegEx`: the regular expression to use for matching sub-strings
- `Max matches`: extract up to this number of sub-strings. Use 0 (default) for unlimited.
- `Case-sensitive`: if set to `false`, upper/lower case is ignored
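
As a rough illustration of how the three parameters interact, here is a minimal sketch in Python. It is not the module's implementation: the function name `extract_regex` is hypothetical, Python's `re` engine stands in for PCRE (the two differ in some syntax details), and applying `Max matches` to each input string separately is an assumption.

```python
import re

def extract_regex(source, pattern, max_matches=0, case_sensitive=True):
    """Return (pairs, result) for a list of (object, string) pairs.

    Hypothetical helper: mimics the PAIR and RESULT outputs described above.
    """
    flags = 0 if case_sensitive else re.IGNORECASE
    compiled = re.compile(pattern, flags)
    pairs = []
    for obj, text in source:
        # Assumption: the limit applies per input string; 0 means unlimited.
        for count, match in enumerate(compiled.finditer(text), start=1):
            pairs.append((obj, match.group(0)))
            if max_matches and count >= max_matches:
                break
    # RESULT keeps only the sub-strings, so the link to the object is lost.
    result = [sub_string for _, sub_string in pairs]
    return pairs, result

source = [("doc1", "Cat cat CAT"), ("doc2", "no match"), ("doc3", "a cat")]
pairs, result = extract_regex(source, r"cat", max_matches=2, case_sensitive=False)
print(pairs)
print(result)
```

In this sketch, objects whose string contains no match (`doc2` here) simply do not appear in either output.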

### Regular expressions
Regular expressions are internally evaluated by a [PCRE](https://en.wikipedia.org/wiki/Perl_Compatible_Regular_Expressions) engine.
For a syntax reference, see [this page](https://www.regular-expressions.info/refflavors.html).
For a 1-page syntax reference, see this [cheat-sheet](https://www.debuggex.com/cheatsheet/regex/pcre).

#### Some of the most common questions/mistakes
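- Regular expressions are different from [glob patterns](https://en.wikipedia.org/wiki/Glob_(programming)), which use wildcards.
  In particular, `*` does NOT mean "anything": it means "zero or more repetitions of the preceding item". Use `.*` to match any sequence of characters.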
- All special characters (`. * + ? | \ ( ) [ ] { } ^ $`) must be escaped (prefixed with `\`) in the `RegEx` when they are meant literally.
- `^` indicates the beginning of an input text, or negation when it is the first character inside a character class (e.g. `[^\d_-]` matches any character that is not a digit, an underscore or a hyphen).
  `$` indicates the end of an input text.
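- `\b` indicates a word boundary: the zero-width position between a word character (letter, digit or `_`) and a non-word character (space, punctuation, etc.) or the start/end of the text. It does not consume any character.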
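
The points above can be checked with a few short experiments. The sketch below uses Python's `re` module as a stand-in for the PCRE engine; the constructs shown behave the same way in both, but this is an illustration and not the module's own code. All variable names are made up for the example.

```python
import re

# `*` is a quantifier, not a wildcard: `ab*` is "a" followed by zero or more "b".
star_only = re.fullmatch(r"ab*", "abXYZ")      # no match
dot_star = re.fullmatch(r"ab.*", "abXYZ")      # matches the whole string

# An unescaped `.` matches any character; `\.` matches a literal dot only.
unescaped = re.search(r"3.14", "3x14")         # matches "3x14"
escaped = re.search(r"3\.14", "3x14")          # no match

# re.escape() adds the backslashes for text that is meant literally.
literal = re.search(re.escape("1+1=2?"), "is 1+1=2? yes")

# `^` and `$` anchor the match to the beginning and end of the text.
anchored_ok = re.search(r"^abc$", "abc")
anchored_fail = re.search(r"^abc$", "xabc")

# `^` as the first character of a character class negates the class.
not_digit_underscore_hyphen = re.findall(r"[^\d_-]+", "ab-12_cd")

# `\b` matches a position, so "cat" inside "concat" is not a whole word.
whole_words = re.findall(r"\bcat\b", "cat concat cat.")

print(not_digit_underscore_hyphen, whole_words)
```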

#### Examples
- Find names in the form of `Smith, John`:
  - `Pattern RegEx`: `\b[^,]+\s*,\s*\b\w+\b`
- Find any day of the week (with `Case-sensitive = false`):
  - `Pattern RegEx`: `\b(mon|tues|wednes|thurs|fri|satur|sun)day\b`
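
To try patterns like these before using them in the module, a quick experiment such as the following can help. It is a sketch using Python's `re` module rather than the PCRE engine, the sample sentences are invented, and the day-of-week pattern is written so that every alternative followed by `day` spells a complete day name.

```python
import re

# Name pattern from the first example.
NAME = re.compile(r"\b[^,]+\s*,\s*\b\w+\b")
# Day-of-week pattern, compiled as if `Case-sensitive = false`.
DAY = re.compile(r"\b(mon|tues|wednes|thurs|fri|satur|sun)day\b", re.IGNORECASE)

# group(0) is the whole matched sub-string, regardless of capturing groups.
names = [m.group(0) for m in NAME.finditer("Smith, John")]
names_in_sentence = [m.group(0) for m in NAME.finditer("Contact Smith, John today")]
days = [m.group(0) for m in DAY.finditer("See you on Monday or SATURDAY, not tueday.")]

print(names)
print(names_in_sentence)
print(days)
```

Note that `[^,]+` accepts every character except a comma, so in running text the name match also takes in the words that precede the surname (the second call yields `Contact Smith, John`). A stricter alternative such as `\w+` in its place would limit the surname to a single word.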

Output scores can be [aggregated](docs://score_aggregation) and/or [normalised](docs://score_normalisation).