A.2 Regular Expressions

MailMarshal uses regular expressions in header matching and rewriting rules. For more information about these rules, see “Content Analysis Policy”. MailMarshal also uses regular expressions in category scripts. For more information about category scripts, see the technical references “MailMarshal Anti-Spam Configuration” and “MailMarshal Advanced Anti-Spam Configuration,” available from the MailMarshal support page at www.trustwave.com.

MailMarshal implements a full-featured regular expression syntax. Full documentation of this syntax is beyond the scope of this manual. For additional documentation and links to further information, see Trustwave Knowledge Base article Q10520.

This appendix provides limited information about some commonly used features and some extensions specific to MailMarshal.

Note: MailMarshal also uses regular expressions in TextCensor scripts. That feature uses a different regular expression engine, with syntax very similar to the syntax described in this section. For more information and a link to detailed documentation, see “Anchored Regular Expressions”.

 

A.2.1 Shortcuts

The arrow to the right of each field on the Expressions tab of the header rule panel provides access to some commonly used Regular Expression features.

Table 31: Regular Expression shortcuts

Selection               Inserts       Usage
----------------------  ------------  -----
Any Character           .             Matches any single character.
Character in range      [ ]           Enter a range or set of characters to be matched within the brackets. For instance, to match lower case characters you could enter a-z between the brackets.
Character not in range  [^]           Enter a range or set of characters after the ^. Matches any character not in the set.
Beginning of line       ^             Text to the right of the ^ will only match if found at the beginning of the line.
End of line             $             Text to the left of the $ will only match if found at the end of the line.
Tagged expression       ( )           The content within the parentheses will be considered as a single expression for repeat purposes. This expression will be saved for use within the substitution field.
Or                      |             The field will be matched if it matches either the expression before the | or the expression after the |.
0 or more matches       *             The expression before the * will be matched if it is repeated any number of times, including zero.
1 or more matches       +             The expression before the + will be matched if it is repeated at least once.
Repeat                  { }           Enter a number or two numbers separated by a comma within the braces. The expression before the braces will be matched if it is repeated the number of times specified. See “Repeat Operators * + ? {}”.
Whitespace              [[:space:]]   Matches a single whitespace character (space, tab, and so on).
Alphanumeric character  [[:alnum:]]   Matches a single letter or number character.
Alphabetic character    [[:alpha:]]   Matches a single letter character.
Decimal digit           [[:digit:]]   Matches a single number character 0-9.

A.2.2 Reserved Characters

Some characters have special meanings within regular expressions.

A.2.2.1 Operators

The following characters are reserved as regular expression operators:

* . ? + ( ) { } [ ] $ \ | ^ <

To match any of these characters literally, precede it with a backslash (\).

For example, to match marshal.com, enter marshal\.com
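The effect of escaping the dot can be sketched in Python. This is an illustration only: Python's `re` module is not the engine MailMarshal uses, and the test string `marshalxcom` is a made-up value chosen to show the difference.

```python
import re

unescaped = r"marshal.com"   # the dot is a wildcard here
escaped = r"marshal\.com"    # the dot must be a literal dot

# Hypothetical input in which the dot position holds another character.
lookalike = "marshalxcom"

unescaped_hits = re.fullmatch(unescaped, lookalike) is not None
escaped_hits = re.fullmatch(escaped, lookalike) is not None
literal_hits = re.fullmatch(escaped, "marshal.com") is not None

print(unescaped_hits, escaped_hits, literal_hits)
```

The unescaped pattern accepts the lookalike string, while the escaped pattern accepts only the literal text.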

A.2.2.2 Wildcard Character .

The dot character (.) matches any single character.

A.2.2.3 Repeat Operators * + ? {}

A repeat is an expression that occurs an arbitrary number of times.

An expression followed by * can be present any number of times, including zero. An expression followed by + can be present any number of times, but must occur at least once. An expression followed by ? may occur zero times or once only. You can specify an exact number of occurrences as a single number within {}, or a range of occurrences as a comma-separated pair of numbers within {}. For instance:

   ba* will match b, ba, baaa, etc.

   ba+ will match ba or baaaa for example but not b.

   ba? will match b or ba.

   ba{2,4} will match baa, baaa and baaaa.
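The repeat examples above can be checked with a short Python sketch. Python's `re` module stands in for the MailMarshal engine here purely for illustration, and `fullmatch` is used so that the whole candidate string must match. The helper name `matching` and the candidate lists are assumptions made for this example.

```python
import re

# Each pattern is paired with candidate strings to test.
cases = {
    r"ba*": ["b", "ba", "baaa"],
    r"ba+": ["b", "ba", "baaaa"],
    r"ba?": ["b", "ba", "baa"],
    r"ba{2,4}": ["ba", "baa", "baaa", "baaaa", "baaaaa"],
}

def matching(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [s for s in candidates if re.fullmatch(pattern, s)]

results = {pattern: matching(pattern, cands) for pattern, cands in cases.items()}

for pattern, matched in results.items():
    print(pattern, "->", matched)
```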

A.2.2.4 Parentheses ( )

Parentheses serve two purposes:

To group items together into a sub-expression. You can apply repeat operators to sub-expressions in order to search for repeated text.

To mark a sub-expression that generated a match, so it can be used later for substitution.

For example, the expression (ab)* would match all of the string

   ababab

The expression “ab” would be available in a variable (tagged expression) with a name in the range $1...$9 (see the matching and substitution examples in following sections).
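A minimal Python sketch of the same idea follows. It assumes Python's `re` module as a stand-in engine; Python exposes tagged expressions as numbered groups rather than as `$1`...`$9` variables, so the access syntax differs from MailMarshal's.

```python
import re

# Group the two characters so the repeat operator applies to "ab" as a unit.
match = re.fullmatch(r"(ab)*", "ababab")

whole = match.group(0)    # the text matched by the entire expression
tagged = match.group(1)   # the text captured by the first pair of parentheses

print(whole, tagged)
```

In Python, a group that repeats keeps the text from its last repetition, which here is `ab`.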

A.2.2.5 Alternatives

Alternatives occur when the expression can match either one sub-expression or another. In this case, each alternative is separated by a |. Each alternative is the largest possible sub-expression on its side of the |. This is the opposite of repeat operators, which apply only to the smallest preceding sub-expression. Use parentheses to limit the scope of an alternative.

   a(b|c) could match ab or ac

   abc|def could match abc or def
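The scope of the | operator can be illustrated with a Python sketch. Python's `re` module is used as an assumed stand-in for the MailMarshal engine, and the helper `full` and the pattern `ab(c|d)ef` are additions for this example only.

```python
import re

def full(pattern, text):
    """True if the pattern matches the whole of text."""
    return re.fullmatch(pattern, text) is not None

# Parentheses limit the alternatives to b or c.
grouped = [s for s in ["ab", "ac", "abc"] if full(r"a(b|c)", s)]

# Without parentheses, the alternatives are the whole of abc and the whole of def.
ungrouped = [s for s in ["abc", "def", "abdef"] if full(r"abc|def", s)]

# To alternate only the middle character, group it explicitly.
middle = [s for s in ["abcef", "abdef", "abc"] if full(r"ab(c|d)ef", s)]

print(grouped, ungrouped, middle)
```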

A.2.3 Examples

The following sections show examples of matching and substitution strings.

A.2.3.1 Matching

The expression

   (.+)@(.+)\.ourcompany\.com$

will match a sequence of 1 or more characters followed by an @ followed by another sequence of 1 or more characters, followed by .ourcompany.com at the end of the field.

That is, it will match john@host.ourcompany.com and john.smith@host.subdomain.ourcompany.com but not peter@host.ourcompany.com.au
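The three addresses above can be tested against the expression with a short Python sketch. Python's `re` module is an assumed stand-in for the MailMarshal engine, and each address is treated as the full content of the field.

```python
import re

PATTERN = re.compile(r"(.+)@(.+)\.ourcompany\.com$")

addresses = [
    "john@host.ourcompany.com",
    "john.smith@host.subdomain.ourcompany.com",
    "peter@host.ourcompany.com.au",
]

# Record the tagged expressions for each address, or None when there is no match.
tagged = {}
for address in addresses:
    match = PATTERN.search(address)
    tagged[address] = match.groups() if match else None

for address, groups in tagged.items():
    print(address, "->", groups)
```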

A.2.3.2 Substitution

Using the example given in the preceding section, the substitution expression

   $1@$2.co.uk.eu

would yield john@host.co.uk.eu, john.smith@host.subdomain.co.uk.eu and peter@host.ourcompany.com.au respectively. The last result may be somewhat surprising: that address does not match the regular expression, and data that does not match is simply copied across unchanged.
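The substitution can be reproduced in a Python sketch, under the assumption that Python's `re` module behaves like the MailMarshal engine for this case. Python writes tagged expressions in the replacement string as `\1` and `\2` rather than `$1` and `$2`.

```python
import re

PATTERN = re.compile(r"(.+)@(.+)\.ourcompany\.com$")
REPLACEMENT = r"\1@\2.co.uk.eu"   # Python form of $1@$2.co.uk.eu

addresses = [
    "john@host.ourcompany.com",
    "john.smith@host.subdomain.ourcompany.com",
    "peter@host.ourcompany.com.au",
]

# sub() returns the input unchanged when the pattern does not match.
rewritten = [PATTERN.sub(REPLACEMENT, address) for address in addresses]

for before, after in zip(addresses, rewritten):
    print(before, "->", after)
```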

A.2.4 Map Files

MailMarshal allows substitution using regular expressions to search for an entry in a text file known as a map file. Each line in the map file contains two values separated by a comma. If the search expression matches the first value in a line, MailMarshal substitutes the second value. If the search expression does not match the first value in any line, MailMarshal substitutes the search expression.

A typical use of map files is to redirect incoming email to arbitrary addresses. The following simple example modifies email addresses using a map file.

A.2.4.1 Map file

   john@domain.co.uk, john@domain2.co.uk
   peter@domain.co.uk, peter@host1.domain.co.uk

A.2.4.2 Search expression

(.+)@domain\.co\.uk$

A.2.4.3 Lookup key

$1@domain.co.uk

A.2.4.4 Sample results

The following table shows the resulting addresses when the sample map file above is used.

Table 32: Map file example results

Input Email Address   Result
--------------------  ------------------------
john@domain.co.uk     john@domain2.co.uk
peter@domain.co.uk    peter@host1.domain.co.uk
alice@domain.co.uk    alice@domain.co.uk
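The map file lookup can be approximated with the following Python sketch. It is not MailMarshal's implementation. The function names `load_map` and `rewrite` are hypothetical, the map file is held in a string rather than read from disk, and returning the input unchanged when no key is found is an assumption based on the alice@domain.co.uk row of the sample results.

```python
import re

# Sample map file content, inlined for the example.
MAP_FILE_TEXT = """\
john@domain.co.uk, john@domain2.co.uk
peter@domain.co.uk, peter@host1.domain.co.uk
"""

SEARCH = re.compile(r"(.+)@domain\.co\.uk$")
LOOKUP_KEY = r"\1@domain.co.uk"   # Python form of $1@domain.co.uk

def load_map(text):
    """Parse 'key, value' lines into a dictionary, skipping blank lines."""
    mapping = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, value = line.split(",", 1)
        mapping[key.strip()] = value.strip()
    return mapping

def rewrite(address, mapping):
    """Look up the address in the map; return it unchanged if not found."""
    match = SEARCH.search(address)
    if match is None:
        return address
    key = match.expand(LOOKUP_KEY)   # build the lookup key from the tagged text
    return mapping.get(key, address)

mapping = load_map(MAP_FILE_TEXT)
inputs = ["john@domain.co.uk", "peter@domain.co.uk", "alice@domain.co.uk"]
outputs = [rewrite(address, mapping) for address in inputs]

for before, after in zip(inputs, outputs):
    print(before, "->", after)
```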

 

Trustwave MailMarshal 10.1.0 User Guide March 2024