Regex Match

 

Uses a regular expression (regex) pattern to search for a specific string or patterns of text in a string variable and branches accordingly.

 

For example, to search for all email addresses rather than for a particular email address in a string, you can use a regular expression pattern for an email address. In the following text string,

John lives in London. His work email address is jsmith01a@monumental.com and his personal email address is jrs124@teleworm.com.

The regular expression pattern:

[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}

will pick out the two email addresses from the paragraph and store them in the first two elements of a string array variable. You may then read one or both of these email addresses from the array using the Read From Array and other action cells.

It is recommended that you use an online regex tool such as regex101.com to test your matches before configuring the action cell.

Note: the action cell uses the .NET regex engine to perform matches.
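The email example above can be sketched outside the product as follows. This is only an illustration: it assumes Python's re module as a stand-in for the .NET engine that the action cell uses (the two engines differ in some details), and the variable names are hypothetical.

```python
import re

# The example text and pattern from this page.
search_string = (
    "John lives in London. His work email address is "
    "jsmith01a@monumental.com and his personal email address is "
    "jrs124@teleworm.com."
)
pattern = r"[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}"

# findall returns every non-overlapping match, left to right.
# The list plays the role of the string array in the Matches property.
matches = re.findall(pattern, search_string)
print(matches)  # ['jsmith01a@monumental.com', 'jrs124@teleworm.com']
```

Note that the full stop that ends the sentence is not included in the second match, because the pattern requires at least two letters after the last dot.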

Properties

Option

Description

Search String

The string to search (a literal value preceded by =, or a string variable).

Regex Pattern

Enter the regular expression pattern to search for (a literal value preceded by =, or a string variable).

Options

Select one or more of the following .NET options to modify the search.

Option

Meaning

Case-Insensitive Matching

Ignores character case when matching.

Multi-Line Mode

Matches a search string where the ^ and $ characters in the regular expression indicate the beginning and end of each line in the search string (instead of the beginning and end of the string).

Single-Line Mode

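Changes the meaning of the dot (.) in the regular expression so that it matches every character, including the newline character (\n). A search string containing multiple lines is therefore treated as a single continuous string.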

Explicit Captures Only

Use this for efficiency gains. Only groups that are explicitly named or numbered using the (?<name>...) syntax are captured (that is, written to memory); unnamed parentheses (...) behave as non-capturing groups, as if they had been written (?:...).

Ignore White Space

Ignores any whitespace in the regular expression.

Right-to-Left Mode

Matches the search string from right to left instead of from left to right.

For more information about the use of these options, see Regex Options below.

Matches

Enter an array variable of data type 'string' to store all occurrences of the strings that matched the regular expression.

Regex Options

This section provides information about the use of the options that are available in the Options property.

Case-Insensitive Matching

Multi-Line Mode

Single-Line Mode

Explicit Captures Only

Ignore White Space

Right-to-Left Mode

For further detail, see Microsoft .NET Regular Expression Options.

Case-Insensitive Matching

Ignores character case when matching.

Consider the following regular expression pattern for matching email addresses:

[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}

With the 'Case-Insensitive Matching' option selected, the search fully matches both email addresses in the following search string:

John lives in London. His work email address is jsmith01a@monumental.com and his personal email address is JRS124@teleworm.com.

Without the option selected, the 'JRS' part of the second email address is excluded from the match, so only '124@teleworm.com' is returned for that address.
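A minimal sketch of this comparison, assuming Python's re module in place of the .NET engine (re.IGNORECASE is Python's counterpart of this option; the variable names are hypothetical):

```python
import re

pattern = r"[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}"
search_string = (
    "John lives in London. His work email address is "
    "jsmith01a@monumental.com and his personal email address is "
    "JRS124@teleworm.com."
)

# Option selected: upper-case letters satisfy the a-z ranges.
with_option = re.findall(pattern, search_string, re.IGNORECASE)

# Option not selected: the match can only start at the first
# character that the pattern accepts, which is the digit '1'.
without_option = re.findall(pattern, search_string)

print(with_option)     # ['jsmith01a@monumental.com', 'JRS124@teleworm.com']
print(without_option)  # ['jsmith01a@monumental.com', '124@teleworm.com']
```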

Multi-Line Mode

Matches a search string where the ^ and $ characters in the regular expression indicate the beginning and end of each line in the search string (instead of the beginning and end of the string).

Consider the following regular expression pattern for matching email addresses with these two characters inserted at the start and end of the expression:

^([a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,})$

With the 'Multi-Line Mode' option selected, the search matches only the lines that start and end with an email address. In the following multi-line string, these are 'abc@monumental.com' and 'def@hotmail.co.uk'.

Work email address - xyz@sales.co.uk
abc@monumental.com
def@hotmail.co.uk

xyz02@gmail.com - home email address

If only the $ character is omitted from the expression, the fourth email address 'xyz02@gmail.com' is also included in the match. If only the ^ character is omitted, the first email address 'xyz@sales.co.uk' is also included in the match.

Without the option selected, no match is returned.
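The four cases described above can be checked with a short sketch. It assumes Python's re module as a stand-in for the .NET engine (re.MULTILINE corresponds to this option) and assumes that the lines of the search string are separated by "\n" only; the variable names are hypothetical.

```python
import re

search_string = (
    "Work email address - xyz@sales.co.uk\n"
    "abc@monumental.com\n"
    "def@hotmail.co.uk\n"
    "\n"
    "xyz02@gmail.com - home email address"
)
email = r"[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}"

# Option selected: ^ and $ anchor to each line.
both_anchors = re.findall("^(" + email + ")$", search_string, re.MULTILINE)

# $ omitted: lines that merely start with an email address also match.
no_dollar = re.findall("^(" + email + ")", search_string, re.MULTILINE)

# ^ omitted: lines that merely end with an email address also match.
no_caret = re.findall("(" + email + ")$", search_string, re.MULTILINE)

# Option not selected: ^ and $ anchor to the whole string only.
without_option = re.findall("^(" + email + ")$", search_string)

print(both_anchors)    # ['abc@monumental.com', 'def@hotmail.co.uk']
print(no_dollar)       # ['abc@monumental.com', 'def@hotmail.co.uk', 'xyz02@gmail.com']
print(no_caret)        # ['xyz@sales.co.uk', 'abc@monumental.com', 'def@hotmail.co.uk']
print(without_option)  # []
```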

Single-Line Mode

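Changes the meaning of the dot (.) in the regular expression so that it matches every character, including the newline character (\n). A search string containing multiple lines is therefore treated as a single continuous string.

Consider the following regular expression pattern, which matches one or more characters starting from the beginning of the search string: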

^.+

With the option selected, the search matches every character in the search string including newline (“\n”) characters. The entire search string is therefore matched.

abc@monumental.com
def@monumentalsales.com
xyz@hotmail.co.uk
xyz02@gmail.co.uk

Without the option selected, only the first line is matched because the new line character at the end of the first line is not matched.

Note: a new line used in .NET regular expressions is "\n" whereas the @New Line system variable contains a carriage return character followed by a newline character (that is, "\r\n").
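The behaviour above, including the effect of "\r\n" line endings, can be sketched as follows. The sketch assumes Python's re module in place of the .NET engine (re.DOTALL is Python's counterpart of this option); the variable names are hypothetical.

```python
import re

search_string = (
    "abc@monumental.com\n"
    "def@monumentalsales.com\n"
    "xyz@hotmail.co.uk\n"
    "xyz02@gmail.co.uk"
)

# Option selected: the dot also matches "\n", so the whole string is matched.
with_option = re.search(r"^.+", search_string, re.DOTALL).group(0)

# Option not selected: the dot stops at the first "\n".
without_option = re.search(r"^.+", search_string).group(0)

# With "\r\n" line endings the dot still matches "\r",
# so the first match ends with a carriage return.
crlf_string = search_string.replace("\n", "\r\n")
crlf_first_line = re.search(r"^.+", crlf_string).group(0)

print(with_option == search_string)  # True
print(repr(without_option))          # 'abc@monumental.com'
print(repr(crlf_first_line))         # 'abc@monumental.com\r'
```

If the search string was built with "\r\n" line endings, the trailing carriage return may need to be trimmed from the match before it is used.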

Explicit Captures Only

This option does not affect what is returned in a match. It improves efficiency by capturing (that is, writing to memory) only the groups that you explicitly name or number using the (?<name>...) syntax. Unnamed parentheses (...) are then treated as non-capturing groups, exactly as if they had been written with the (?:...) syntax.

Consider the following regular expression pattern for matching email addresses. The expression contains two groups (email user name and email domain name), where ?: marks the domain name group as non-capturing; the domain name is still part of the match but is not captured:

([a-z0-9._%+-]+)@(?:[a-z0-9.-]+\.[a-z]{2,})

With the option selected, both email addresses in the following search string are still matched, but neither group is captured, because the user name group is unnamed. To capture the user names 'jsmith01a' and 'jrs124' with the option selected, name the group, for example (?<user>[a-z0-9._%+-]+).

John lives in London. His work email address is jsmith01a@monumental.com and his personal email address is jrs124@teleworm.com.

Without the option selected, the user names 'jsmith01a' and 'jrs124' are captured whilst the domains 'monumental.com' and 'teleworm.com' are not.
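The difference between a match and a capture can be sketched as follows. Python's re module, used here as a stand-in, has no equivalent of this option, so the sketch only shows how capturing, non-capturing and named groups behave when the option is not selected. Python writes named groups as (?P<name>...) where .NET uses (?<name>...); the group name 'user' is hypothetical.

```python
import re

search_string = (
    "John lives in London. His work email address is "
    "jsmith01a@monumental.com and his personal email address is "
    "jrs124@teleworm.com."
)

# One capturing group (user name) and one non-capturing group (domain).
pattern = r"([a-z0-9._%+-]+)@(?:[a-z0-9.-]+\.[a-z]{2,})"

# group(0) is the whole match; groups() holds only what was captured.
results = [(m.group(0), m.groups()) for m in re.finditer(pattern, search_string)]
for full_match, captured in results:
    print(full_match, captured)

# A named group is captured explicitly and can be read back by name.
named = r"(?P<user>[a-z0-9._%+-]+)@(?:[a-z0-9.-]+\.[a-z]{2,})"
users = [m.group("user") for m in re.finditer(named, search_string)]
print(users)  # ['jsmith01a', 'jrs124']
```

In both cases the full email address is matched; only the user name is written to a group.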

Ignore White Space

Ignores any whitespace in the regular expression. This is particularly useful for complex regular expressions whose parts have been separated to aid readability.

Consider the following regular expression pattern for matching email addresses, where the patterns for each part of an email address are separated by spaces to aid readability of the expression:

[a-z0-9._%+-]+   @[a-z0-9.-]   +\.[a-z]{2,}

With the option selected, the search will fully match both email addresses in the following search string.

John lives in London. His work email address is jsmith01a@monumental.com and his personal email address is jrs124@teleworm.com.

Without the option selected, no match is returned.
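A minimal sketch of the same idea, assuming Python's re module in place of the .NET engine (re.VERBOSE is Python's counterpart of this option). The spacing used here is a variant of the pattern above that keeps each quantifier next to the part it repeats; the variable names are hypothetical.

```python
import re

search_string = (
    "John lives in London. His work email address is "
    "jsmith01a@monumental.com and his personal email address is "
    "jrs124@teleworm.com."
)

# Spaces separate the user name, the @ sign, the domain, the dot
# and the top-level domain purely for readability.
spaced_pattern = r"[a-z0-9._%+-]+ @ [a-z0-9.-]+ \. [a-z]{2,}"

# Option selected: the spaces in the pattern are ignored.
with_option = re.findall(spaced_pattern, search_string, re.VERBOSE)

# Option not selected: the spaces must appear literally in the text.
without_option = re.findall(spaced_pattern, search_string)

print(with_option)     # ['jsmith01a@monumental.com', 'jrs124@teleworm.com']
print(without_option)  # []
```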

Right-to-Left Mode

Matches the search string from right to left instead of from left to right. This is useful where, for example, you want the string array variable in the Matches property to be populated in reverse order without needing to change other action cells that use the returned matches. 

Consider the following regular expression pattern for matching email addresses:

[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}

With the 'Right-to-Left Mode' option selected, the search fully matches both email addresses in the following search string,

John lives in London. His work email address is jsmith01a@monumental.com and his personal email address is jrs124@teleworm.com.

and the string array is populated like this:

string array

jrs124@teleworm.com

jsmith01a@monumental.com

Without the option selected, the string array is populated like this:

string array

jsmith01a@monumental.com

jrs124@teleworm.com
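Python's re module, used here as a stand-in for the .NET engine, has no right-to-left option. For this particular pattern and search string, reversing the left-to-right list of matches gives the same array order as shown above; this is only an approximation, because a true right-to-left search can find different matches for other patterns. The variable names are hypothetical.

```python
import re

search_string = (
    "John lives in London. His work email address is "
    "jsmith01a@monumental.com and his personal email address is "
    "jrs124@teleworm.com."
)
pattern = r"[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}"

# Normal left-to-right order.
left_to_right = re.findall(pattern, search_string)

# Approximation of the array order produced with the option selected.
right_to_left_order = list(reversed(left_to_right))

print(left_to_right)        # ['jsmith01a@monumental.com', 'jrs124@teleworm.com']
print(right_to_left_order)  # ['jrs124@teleworm.com', 'jsmith01a@monumental.com']
```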

Exit Points

Exit Point

Description

Matched

This is taken if there was at least one occurrence of the regular expression pattern in the search string.

Not Matched

This is taken if there were no occurrences of the regular expression pattern in the search string.

Error

This is taken if an internal error occurred or if the regular expression did not finish executing within one second (some regular expressions can take a long time to execute).
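The branching described above can be sketched as a small function. This is an illustration only: regex_match is a hypothetical helper, Python's re module stands in for the .NET engine, treating an invalid pattern as the Error exit is an assumption, and the one-second time limit is not modelled because Python's re module has no built-in timeout.

```python
import re

def regex_match(search_string, pattern, flags=0):
    """Return (exit_point, matches) for a hypothetical Regex Match step."""
    try:
        matches = [m.group(0) for m in re.finditer(pattern, search_string, flags)]
    except re.error:
        # Assumption: an invalid pattern is reported through the Error exit.
        return "Error", []
    if matches:
        return "Matched", matches
    return "Not Matched", []

email = r"[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}"
print(regex_match("Contact jrs124@teleworm.com today", email))
print(regex_match("No address here", email))
print(regex_match("Anything", "("))  # unbalanced parenthesis
```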