# Filter by RegEx

### Description

Selects objects by the value of a string property, using [regular expression](https://www.regular-expressions.info/) matching.

### Input

- `SOURCE [OBJ]`: the list of objects to filter

### Output

- `TRUE [OBJ]`: the objects for which the selection applies
- `FALSE [OBJ]`: the objects for which the selection does not apply

### Parameters

- `Property`: the string property to check. Use `*` to consider all properties.
- `Use sub-properties`: when set to `true`, the values of all sub-properties are also included. Sub-properties can be defined in the data with the `rdfs:subPropertyOf` relation.
- `Pattern RegEx`: the regular expression to use for the match.
- `Language`: when a language is selected, only the strings in this language are extracted. This uses the language tags that are defined in the data.
- `Case-sensitive`: if set to `false`, upper/lower case is ignored.

Output scores can be [aggregated](docs://score_aggregation) and/or [normalised](docs://score_normalisation).

### Regular expressions

Regular expressions are internally evaluated by a [PCRE](https://en.wikipedia.org/wiki/Perl_Compatible_Regular_Expressions) engine. For a syntax reference, see [this page](https://www.regular-expressions.info/refflavors.html). For a 1-page syntax reference, see this [cheat-sheet](https://www.debuggex.com/cheatsheet/regex/pcre).

#### Some of the most common questions/mistakes

- Regular expressions are different from [glob patterns](https://en.wikipedia.org/wiki/Glob_(programming)) that use wildcards. In particular, `*` does NOT mean "anything"; `.*` does.
- All special characters (`. * + ? | \ ( ) [ ] ^ $`) must be escaped (prefixed with `\`) in the `Pattern RegEx` when they are meant literally.
- `^` indicates the beginning of the input text, or negation when used at the start of a character class (e.g. `[^\d-_]`). `$` indicates the end of the input text.
- `\b` indicates a word boundary, i.e. the position between a word character and a non-word character (space, punctuation, etc.) or the start/end of the text.
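
The points above can be checked quickly in code. The sketch below uses Python's built-in `re` module purely for illustration; it is not the PCRE engine used by the filter, although the basic constructs shown here are common to both. The `matches` helper is hypothetical, and it assumes that a pattern may match anywhere in the text unless it is anchored with `^` or `$`.

```python
import re


def matches(pattern: str, text: str, case_sensitive: bool = True) -> bool:
    """Hypothetical helper: True if `pattern` is found anywhere in `text`."""
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.search(pattern, text, flags) is not None


# `*` is a quantifier, not a wildcard: `ab*` means "a, then zero or more b".
print(matches(r"ab*", "a"))                        # True
# `.*` is the regex counterpart of the glob wildcard `*`.
print(matches(r"report.*pdf", "report_2024.pdf"))  # True

# An unescaped `.` matches any character; `\.` matches only a literal dot.
print(matches(r"v1.0", "v1x0"))                    # True
print(matches(r"v1\.0", "v1x0"))                   # False
# re.escape() escapes every special character in a literal string.
print(matches(re.escape("v1.0"), "v1.0"))          # True

# `^` and `$` anchor the match to the start and end of the text.
print(matches(r"^Smith", "John Smith"))            # False
print(matches(r"Smith$", "John Smith"))            # True

# Inside a character class, a leading `^` negates the class.
# The hyphen is placed last here so that it is unambiguously literal.
print(matches(r"^[^\d_-]+$", "abc"))               # True
print(matches(r"^[^\d_-]+$", "a-1"))               # False

# `\b` only matches at the edge of a word.
print(matches(r"\bcat\b", "a cat sat"))            # True
print(matches(r"\bcat\b", "concatenate"))          # False
```
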
#### Examples

- Find names in the form of `Smith, John`:
  - `Pattern RegEx`: `\b[^,]+\s*,\s*\b\w+\b`
- Find any day of the week (with `Case-sensitive = false`):
  - `Pattern RegEx`: `\b(mon|tues|wednes|thurs|fri|satur|sun)day\b`
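
To see how such patterns split a list into the `TRUE` and `FALSE` outputs, here is a minimal sketch using Python's `re` module as a stand-in for the filter's PCRE engine. The `split_by_pattern` function and the sample strings are hypothetical and only mimic the filter on plain strings; partial (search) matching is an assumption. The weekday pattern is written with `tues` and `satur` so that it matches "Tuesday" and "Saturday".

```python
import re

NAME_PATTERN = r"\b[^,]+\s*,\s*\b\w+\b"
WEEKDAY_PATTERN = r"\b(mon|tues|wednes|thurs|fri|satur|sun)day\b"


def split_by_pattern(values, pattern, case_sensitive=True):
    """Hypothetical stand-in for the filter: returns (TRUE, FALSE) lists."""
    flags = 0 if case_sensitive else re.IGNORECASE
    regex = re.compile(pattern, flags)
    true_out, false_out = [], []
    for value in values:
        # Assumption: a value is selected if the pattern matches anywhere in it.
        (true_out if regex.search(value) else false_out).append(value)
    return true_out, false_out


names_true, names_false = split_by_pattern(
    ["Smith, John", "John Smith", "Doe,Jane"], NAME_PATTERN
)
print(names_true)   # ['Smith, John', 'Doe,Jane']
print(names_false)  # ['John Smith']

days_true, days_false = split_by_pattern(
    ["Closed on Tuesday", "open SATURDAY only", "a sunny day", "birthday"],
    WEEKDAY_PATTERN,
    case_sensitive=False,
)
print(days_true)    # ['Closed on Tuesday', 'open SATURDAY only']
print(days_false)   # ['a sunny day', 'birthday']
```

With `case_sensitive=True`, the lowercase weekday pattern would not match "Tuesday" or "SATURDAY", which is why the example sets `Case-sensitive = false`.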