R comes with several regex functions. Some find matches, such as grep() and gregexpr(); others replace them, such as sub() and gsub(). The help page gives an overview of the search functions.
?grep
## Description - grep, grepl, regexpr, gregexpr and regexec search for matches to argument pattern within each element of a character vector: they differ in the format of and amount of detail in the results.
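To make the differences concrete, here is a minimal sketch that runs the main functions on a small character vector. The vector `x` and the replacement text "YEAR" are made up for illustration and do not come from the help page.

```r
# Hypothetical input, chosen only to illustrate the return values
x <- c("born 1939", "no digits here", "1945 and 2019")

idx      <- grep("\\d{4}", x)                # indices of matching elements
hits     <- grep("\\d{4}", x, value = TRUE)  # the matching elements themselves
flags    <- grepl("\\d{4}", x)               # one TRUE/FALSE per element
first    <- regexpr("\\d{4}", x)             # start of the first match, -1 if none
all_pos  <- gregexpr("\\d{4}", x)            # list: starts of all matches per element

one_sub  <- sub("\\d{4}", "YEAR", x)         # replaces only the first match
all_subs <- gsub("\\d{4}", "YEAR", x)        # replaces every match

print(flags)
print(all_subs)
```

grepl() is usually the easiest choice for filtering, while gregexpr() is the one to reach for when an element can contain more than one match.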
Let’s work with an example. Say you want to extract the years from the following sentence:
# Example text:
sample_text <- "World War II lasted from 1939 to 1945."
In an R string, every backslash that belongs to a regex has to be doubled, so the regex \d is written as "\\d". As mentioned previously, \d stands for a digit and {4} means exactly four of them in a row.
# So your regex would be "\\d{4}"
pattern <- "\\d{4}"
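The doubled backslash only exists in the source code; the string that reaches the regex engine contains a single one. The following sketch checks this and tries the pattern on two made-up inputs ("1939" and "193" are illustrative values).

```r
pattern <- "\\d{4}"

# writeLines() shows the string as the regex engine sees it: \d{4}
writeLines(pattern)

# The string holds 5 characters: \ d { 4 }
n_chars <- nchar(pattern)

# Four digits match, three digits do not
four_digits  <- grepl(pattern, "1939")
three_digits <- grepl(pattern, "193")

print(c(four_digits, three_digits))
```

R 4.0.0 and later also accept raw strings, so the same pattern can be written as r"(\d{4})" without doubling the backslash.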
To extract the matches, you first locate them with gregexpr(), which returns the start position and length of every match.
pattern_matching <- gregexpr(pattern, sample_text)
Second, you pull out the matched substrings with regmatches().
regmatches(sample_text, pattern_matching)[[1]]
## [1] "1939" "1945"
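The `[[1]]` is needed because gregexpr() and regmatches() work on whole character vectors and return a list with one entry per input string. As a rough sketch, the example below adds a second sentence (an illustrative addition, not part of the original example) and looks at what each step returns.

```r
texts <- c("World War II lasted from 1939 to 1945.",
           "The Berlin Wall fell in 1989.")  # second sentence added for illustration

m <- gregexpr("\\d{4}", texts)

# Each list entry holds the start positions, with match lengths as an attribute
starts  <- as.integer(m[[1]])           # 26 34
lengths <- attr(m[[1]], "match.length") # 4 4

# regmatches() returns one character vector per input string
years <- regmatches(texts, m)
print(years[[1]])   # "1939" "1945"
print(years[[2]])   # "1989"

# Flatten to a single numeric vector if the grouping is not needed
all_years <- as.integer(unlist(years))
print(all_years)
```

Keeping the list form preserves which year came from which sentence; unlist() is convenient when only the values matter.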
Last update September 3, 2019.