What is a Regex?

Imagine you have a text and want to extract every year mentioned in it. Keyword search is not an option, because you do not know in advance which years appear. So what is there to do?


Regexes offer a solution, because they look for patterns rather than exact matches. It is possible to mix and match these patterns, which helps us extract a wide variety of elements from a text.

The above example (all years mentioned in a document) can be reframed as looking for a pattern of four consecutive digits. The regex “\\d”, for instance, matches any single digit. The regex “\\d{4}” matches every run of exactly four digits. The backslash is doubled because the pattern is written as it would appear inside an R string. This is an easy strategy for finding the years listed in a document, although it will also pick up any other four-digit number.


pattern <- "\\d{4}"
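As a minimal sketch of how this pattern could be applied, the snippet below uses base R's gregexpr() to locate the matches and regmatches() to pull them out. The sample sentence and the variable names text and years are made up for illustration, and base R is an assumption; other packages offer similar helpers.

```r
# Hypothetical sample text, made up for illustration
text <- "The wall fell in 1989, and the treaty followed in 1990."

pattern <- "\\d{4}"

# gregexpr() locates every match; regmatches() returns the matched substrings
years <- regmatches(text, gregexpr(pattern, text))[[1]]
print(years)
```

The result is a character vector with one element per match, so it can be converted with as.integer() if the years are needed as numbers.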

There are many other patterns, too. You can look for:

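  • “\\w” matches a single word character (a letter, digit, or underscore)
  • “\\w+” matches a run of one or more word characters, such as a whole word
  • “[A-Z]\\w+” matches capitalized words (an uppercase letter followed by one or more word characters)
  • “\\s” matches a single whitespace character (space, tab, or line break)
  • …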
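The patterns in this list can be tried out side by side. The sketch below is only an illustration: the sample sentence and the helper extract_all() are hypothetical and not part of the original text, and perl = TRUE is an assumed choice so that the range [A-Z] does not depend on the locale.

```r
# Hypothetical sample sentence, made up for illustration
text <- "Ada Lovelace wrote the first program in 1843."

# Small helper (not from the original text): return all matches of a pattern.
# perl = TRUE is an assumption; it makes [A-Z] behave the same in every locale.
extract_all <- function(pattern, x) {
  regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]
}

words       <- extract_all("\\w+", text)       # runs of word characters
capitalized <- extract_all("[A-Z]\\w+", text)  # uppercase letter + word characters
spaces      <- extract_all("\\s", text)        # single whitespace characters

print(words)
print(capitalized)
print(length(spaces))
```

Note that “\\w+” also returns the number at the end of the sentence, because digits count as word characters.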

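There are many more regex patterns, and mastering them takes practice. A great website to test regexes, find support, and try out different code is https://regexr.com. When testing there, write a single backslash (for example \d); the doubled form is only needed inside a string in code.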

Last update May 8, 2020.
