Regular Expression features

Regular expressions are text patterns used to find and match specific sequences of characters in text.

Regular expression support in the filters and replacements of our service allows you to:

  • Add any text to the beginning or end of messages.
  • Filter messages that contain links.
  • Delete links, phone numbers, email addresses, and other contact information from copied messages.
  • Filter long messages or, conversely, only short ones.
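
As a rough illustration of how these tasks map to patterns, here is a minimal sketch using Python's re module. The service's own regular expression engine may behave differently, and the sample message, prefix, suffix, and length threshold are made up for the example.

```python
import re

message = "Big sale today"  # hypothetical sample message

# Add text to the beginning: ^ matches the empty position at the start.
with_prefix = re.sub(r"^", "[News] ", message)

# Add text to the end: $ matches the position at the end of the text.
with_suffix = re.sub(r"$", " (via bot)", message)

# Keep only long messages: here "long" means at least 10 characters.
is_long = re.fullmatch(r".{10,}", message, flags=re.DOTALL) is not None

# Detect whether the message contains a link.
has_link = re.search(r"https?://\S+", message) is not None

print(with_prefix)
print(with_suffix)
print(is_long, has_link)
```

The symbols used here (^, $, ., {10,} and others) are explained in the sections that follow.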

To enable regular expression syntax, turn on the corresponding option in the filter or replacement settings dialog.

Regular expression syntax

To start using regular expressions, you need to know the syntax, that is, what each special character means.

 

Main characters

  • ^ - the beginning of the text
  • $ - the end of the text
  • . - any character
  • \d - any digit
  • \w - any word character (Cyrillic and Latin letters, digits, "_", etc.)
  • \n - line break character (new line)
  • \t - tab character
  • \s - any whitespace character (space, tab, line break)
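
The characters listed above can be tried out with a short sketch in Python's re module. This is only an illustration: the sample text is invented, and the service's engine may differ in details (for example, in Python the dot does not match a line break unless a special flag is set).

```python
import re

text = "Order 42\tshipped\nThanks"  # hypothetical sample text

starts_with_order = re.search(r"^Order", text) is not None  # ^ anchors to the start
ends_with_thanks = re.search(r"Thanks$", text) is not None  # $ anchors to the end
first_three = re.findall(r"^...", text)   # three "any character" dots at the start
digits = re.findall(r"\d", text)          # every single digit
words = re.findall(r"\w+", text)          # runs of word characters
whitespace = re.findall(r"\s", text)      # space, tab, and line break
lines = re.split(r"\n", text)             # split on the line break character

# In Python 3, \w also covers Cyrillic letters for ordinary strings.
cyrillic_words = re.findall(r"\w+", "Привет, мир")

print(first_three, digits, words, whitespace, lines, cyrillic_words)
```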

 

Quantifiers

Quantifiers specify how many times the character or group immediately before them may repeat.

  • * - 0 or more times
  • + - 1 or more times
  • ? - 0 or 1 time
  • {5} - exactly 5 times
  • {1,5} - from 1 to 5 times
  • {1,} - 1 or more times
  • {,5} - up to 5 times

For example:

  • a+ - matches a, aa, aaa, etc.
  • .* - matches any text, even empty
  • ab{3} - matches abbb
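
The quantifier examples above can be checked with a small sketch in Python's re module. The helper name matches is hypothetical, and the service's engine may differ. In particular, the short form {,5} is not accepted by every engine, so {0,5} is the safer spelling if a pattern does not behave as expected.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

results = {
    "a+ vs aaa": matches(r"a+", "aaa"),               # one or more "a"
    "a+ vs empty": matches(r"a+", ""),                # + needs at least one
    ".* vs empty": matches(r".*", ""),                # * allows zero characters
    "ab{3} vs abbb": matches(r"ab{3}", "abbb"),       # {3} applies to "b" only
    "ab{3} vs ababab": matches(r"ab{3}", "ababab"),
    "colou?r vs color": matches(r"colou?r", "color"), # ? makes "u" optional
    r"\d{1,5} vs 12345": matches(r"\d{1,5}", "12345"),
    r"\d{1,5} vs 123456": matches(r"\d{1,5}", "123456"),
    r"\d{2,} vs 7": matches(r"\d{2,}", "7"),
}

for case, ok in results.items():
    print(case, ok)
```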

Groups

Parentheses ( ) combine characters into a group. Groups are most often used with the quantifiers from the previous section, or with the vertical bar | which denotes a logical OR.

For example:

  • ((Bob)|(Alice)) - matches either Bob or Alice
  • (abc){3} - matches abcabcabc
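
A minimal sketch of the group examples above, written with Python's re module. The sample sentence is invented, and the service's engine may differ in details.

```python
import re

# Alternation inside a group: matches either name wherever it appears.
name_pattern = r"((Bob)|(Alice))"
sentence = "Bob wrote to Alice, not to Carol"  # hypothetical sample text
names = [m.group(0) for m in re.finditer(name_pattern, sentence)]

# A quantifier after a group repeats the whole group.
repeated = re.fullmatch(r"(abc){3}", "abcabcabc") is not None
too_short = re.fullmatch(r"(abc){3}", "abcabc") is not None

# Without parentheses the quantifier applies only to the last character.
last_char_only = re.fullmatch(r"abc{3}", "abccc") is not None

print(names, repeated, too_short, last_char_only)
```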

 

Character sets

Square brackets [ ] define a character set. Characters can be listed inside them individually or as ranges:

  • [а-я] - any Cyrillic letter
  • [абвгд] - any of the characters а, б, в, г or д (the same as [а-д])
  • [a-z] - any Latin letter
  • [0-5] - any digit from 0 to 5

Inside square brackets, a leading ^ means a logical NOT (negation):

  • [^а-я] - any character EXCEPT a Cyrillic letter
  • [^a-z] - any character EXCEPT a Latin letter
  • [^\s] - any NON-whitespace character
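
The character sets above can be tried with a short sketch in Python's re module. Note that Python matching is case-sensitive by default, so [а-я] and [a-z] cover lowercase letters only here. Whether the service ignores case is not stated, so treat this as an assumption to verify. The sample strings are invented.

```python
import re

text = "привет, world 2025!"  # hypothetical sample text

cyrillic = re.findall(r"[а-я]+", text)     # runs of lowercase Cyrillic letters
latin = re.findall(r"[a-z]+", text)        # runs of lowercase Latin letters
low_digits = re.findall(r"[0-5]", text)    # single digits from 0 to 5

# A range and an explicit list describe the same set of characters.
by_range = re.findall(r"[а-д]", "багдад")
by_list = re.findall(r"[абвгд]", "багдад")

# ^ as the first character inside the brackets negates the set.
non_latin = re.findall(r"[^a-z]", "ab1c")
non_space = re.findall(r"[^\s]+", text)

print(cyrillic, latin, low_digits, by_range == by_list, non_latin, non_space)
```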

 

Escaping characters

If you want to use a special character as plain text, and not as part of the regular expression syntax, escape it with a backslash \ .

For example:

  • :\) - matches the string :) (the parenthesis does not close a group, it is just a parenthesis)
  • who\? - matches the string who? (the question mark here is not a quantifier, it is just a question mark)
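
The effect of escaping can be seen in a short sketch using Python's re module. The sample strings are invented, and re.escape is a Python helper shown only for convenience. The service itself may not offer anything similar, so in its settings dialog the backslashes would be typed by hand.

```python
import re

# Escaped parenthesis: matched as a literal ")".
smileys = re.findall(r":\)", "Hi :) bye :)")

# Escaped question mark: matched as a literal "?".
literal_question = re.search(r"who\?", "who? me") is not None

# Unescaped, the ? makes the preceding "o" optional, so "wh" matches too.
optional_o = re.fullmatch(r"who?", "wh") is not None

# re.escape adds the backslashes needed to match a string literally.
literal = "1+1=2?"
escaped_ok = re.fullmatch(re.escape(literal), literal) is not None

print(smileys, literal_question, optional_o, escaped_ok)
```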

Ready-made templates

Below are ready-made regular expression templates that you can copy and use for your tasks:

  • Any Telegram username: @[a-z0-9_]{4,}
  • Any link: https?:\/\/[^\s]*

The list will be expanded later. To suggest new templates, write to support at @value_maker.

To test your own expressions, use the site regexp.com.
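
The ready-made templates above can be tried locally with a minimal sketch in Python's re module. The sample message and the example.com address are invented for illustration, and the service's engine may treat some details differently.

```python
import re

# The two templates from the list above, unchanged.
USERNAME = r"@[a-z0-9_]{4,}"
LINK = r"https?:\/\/[^\s]*"

# Hypothetical sample message.
message = "Write to @value_maker or see https://example.com/page?id=1 for details"

usernames = re.findall(USERNAME, message)
links = re.findall(LINK, message)

# Remove both kinds of contact information in one pass.
without_contacts = re.sub(f"{USERNAME}|{LINK}", "", message)

print(usernames)
print(links)
print(without_contacts)
```

As written, the username template covers lowercase letters only. In Python, passing flags=re.IGNORECASE would also catch usernames typed with capital letters.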