Zero-width assertions

Most characters and metacharacters cause the matching engine to move forward in some way. Zero-width assertions do not. They don’t advance the engine through the string; they simply succeed or fail at the engine’s current position.
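
A minimal sketch of what "zero-width" means in practice. It assumes Python's re module, which is a choice made here for illustration because the text does not name an engine; other regex flavors may differ in detail. The sample strings are made up.

```python
import re

# \b succeeds at position 0 of "cat" but consumes nothing,
# so the match is empty: its start and end index are the same.
boundary = re.search(r"\b", "cat")
print(boundary.span())      # (0, 0)

# Combined with ordinary characters, only those characters are consumed.
word = re.search(r"\bcat", "a cat")
print(word.span())          # (2, 5)
```

The assertion only decides whether the match may proceed at that position; the length of the match comes entirely from the rest of the pattern.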


 

Simple zero-width assertions

The following table lists the simpler zero-width assertions from the regex language:

Zero-width assertions
|    OR operator (alternation). It has very low precedence, so Crow|Servo matches either Crow or Servo, not Cro, followed by a 'w' or an 'S', followed by ervo.
^    Matches at the beginning of lines in MULTILINE mode. Otherwise, matches only at the beginning of the string.
$    Matches at the end of lines in MULTILINE mode. Otherwise, matches only at the end of the string (and immediately before the newline – if any – at the end of the string).
\A    Matches only at the beginning of the string. When not in MULTILINE mode, \A and ^ are effectively the same.
\Z    Matches only at the end of the string.
\b    Word boundary: matches at the beginning or end of a word. A word is a sequence of alphanumeric characters, so the beginning or end of a word is indicated by whitespace, any other non-alphanumeric character, or the start or end of the string.
Attention! Inside a character class, \b denotes the backspace character, so [\b] matches a backspace!
\B    Matches if current position is NOT a word boundary.
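
The entries in the table can be tried out with a short sketch. It assumes Python's re module, whose MULTILINE flag and \A / \Z anchors match the descriptions above; the engine behind this text may differ, and the sample strings are made up.

```python
import re

text = "one\ntwo"

# ^ and $: string-level by default, line-level with re.MULTILINE.
print(re.findall(r"^\w+", text))                 # ['one']
print(re.findall(r"^\w+", text, re.MULTILINE))   # ['one', 'two']
print(re.findall(r"\w+$", text))                 # ['two']
print(re.findall(r"\w+$", text, re.MULTILINE))   # ['one', 'two']

# \A and \Z ignore MULTILINE: they match only at the ends of the whole string.
print(re.findall(r"\A\w+", text, re.MULTILINE))  # ['one']
print(re.findall(r"\w+\Z", text, re.MULTILINE))  # ['two']

# $ also matches just before a final newline; \Z does not.
print(bool(re.search(r"two$", "one\ntwo\n")))    # True
print(bool(re.search(r"two\Z", "one\ntwo\n")))   # False

# \b and \B: whole-word matches versus matches inside a word.
print(re.findall(r"\bcat\b", "cat concat cats cat."))  # ['cat', 'cat']
print(re.findall(r"\Bcat", "cat concat"))              # ['cat']

# Inside a character class, \b is the backspace character (0x08).
print(bool(re.search(r"[\b]", "a\x08c")))        # True
```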

 

Lookahead assertions

(?= … ) The positive lookahead assertion succeeds if the contained regular expression successfully matches at the current location. But once the expression has been tried, the matching engine doesn’t advance at all. The rest of the pattern is tried right where the assertion started.
 
(?! … ) The negative lookahead assertion succeeds if the contained regular expression doesn’t match at the current location.
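
As an illustration of both lookahead forms, here is a small sketch, again assuming Python's re module; the names and sample strings are invented for the example.

```python
import re

# Positive lookahead: match "Isaac" only when " Asimov" follows.
pos = re.search(r"Isaac(?= Asimov)", "Isaac Asimov")
print(pos.group())          # Isaac  (the lookahead text is checked, not consumed)

# Negative lookahead: match "Isaac" only when " Asimov" does NOT follow.
neg_hit = re.search(r"Isaac(?! Asimov)", "Isaac Newton")
neg_miss = re.search(r"Isaac(?! Asimov)", "Isaac Asimov")
print(neg_hit.group())      # Isaac
print(neg_miss)             # None

# The rest of the pattern starts where the assertion started:
# the lookahead scans ahead for a digit, then \w+ matches from position 0.
has_digit = re.search(r"(?=.*\d)\w+", "abc1")
print(has_digit.span())     # (0, 4)
```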
 

Lookahead assertions are especially useful for excluding unwanted cases, for example when matching filenames. The following example matches every filename that has an extension, except those whose extension is bat or exe:
.*[.](?!bat$|exe$)[^.]*$
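
The pattern above can be checked against a few names with a short sketch. It assumes Python's re module; the list of filenames is hypothetical and chosen only to show the interesting cases.

```python
import re

# Pattern copied from the text above.
PATTERN = re.compile(r".*[.](?!bat$|exe$)[^.]*$")

# Hypothetical sample names, for illustration only.
names = ["foo.bar", "autoexec.bat", "setup.exe",
         "archive.tar.gz", "run.batch", "README"]

accepted = [name for name in names if PATTERN.match(name)]
print(accepted)   # ['foo.bar', 'archive.tar.gz', 'run.batch']
```

run.batch is accepted because the $ inside the lookahead requires the extension to be exactly bat, not merely to start with it. README is rejected because the pattern requires a literal dot.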

 

Lookbehind assertions

Lookbehind has the same effect as lookahead, but works backwards. It tells the regex engine to temporarily step backwards in the string and check whether the contained expression can be matched there.
(?<= … ) The positive lookbehind assertion succeeds if the contained regular expression successfully matches immediately to the left of the current location.
 
(?<! … ) The negative lookbehind assertion succeeds if the contained regular expression doesn’t match immediately to the left of the current location.
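
A brief sketch of both lookbehind forms, assuming Python's re module; the sample strings are made up.

```python
import re

# Positive lookbehind: "def" only when "abc" sits immediately to its left.
after_abc = re.search(r"(?<=abc)def", "abcdef")
print(after_abc.group(), after_abc.span())   # def (3, 6)

# Negative lookbehind: "happy" only when "un" is NOT immediately to its left.
happy_starts = [m.start() for m in re.finditer(r"(?<!un)happy", "happy unhappy")]
print(happy_starts)                          # [0]
```

As with lookahead, the text inspected by the assertion is not part of the match. Note that Python's re module only accepts lookbehind expressions that match strings of a fixed length; other engines may be more or less permissive.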