[regex] Extract a substring according to a pattern

Suppose I have a list of string:

string = c("G1:E001", "G2:E002", "G3:E003")

Now I hope to get a vector of string that contains only the parts after the colon ":", i.e substring = c(E001,E002,E003).

Is there a convenient way in R to do this? Using substr?

This question is related to regex r substr

The answer is


For example using gsub or sub

    gsub('.*:(.*)','\\1',string)
    [1] "E001" "E002" "E003"

If you are using data.table then tstrsplit() is a natural choice:

tstrsplit(string, ":")[[2]]
[1] "E001" "E002" "E003"

The unglue package provides an alternative, no knowledge about regular expressions is required for simple cases, here we'd do :

# install.packages("unglue")
library(unglue)
string = c("G1:E001", "G2:E002", "G3:E003")
unglue_vec(string,"{x}:{y}", var = "y")
#> [1] "E001" "E002" "E003"

Created on 2019-11-06 by the reprex package (v0.3.0)

More info : https://github.com/moodymudskipper/unglue/blob/master/README.md


This should do:

gsub("[A-Z][1-9]:", "", string)

gives

[1] "E001" "E002" "E003"

Late to the party, but for posterity, the stringr package (part of the popular "tidyverse" suite of packages) now provides functions with harmonised signatures for string handling:

string <- c("G1:E001", "G2:E002", "G3:E003")
# match string to keep
stringr::str_extract(string = string, pattern = "E[0-9]+")
# [1] "E001" "E002" "E003"

# replace leading string with ""
stringr::str_remove(string = string, pattern = "^.*:")
# [1] "E001" "E002" "E003"

Another method to extract a substring

library(stringr)
substring <- str_extract(string, regex("(?<=:).*"))
#[1] "E001" "E002" "E003
  • (?<=:): look behind the colon (:)

Here is another simple answer

gsub("^.*:","", string)

Examples related to regex

Why my regexp for hyphenated words doesn't work? grep's at sign caught as whitespace Preg_match backtrack error regex match any single character (one character only) re.sub erroring with "Expected string or bytes-like object" Only numbers. Input number in React Visual Studio Code Search and Replace with Regular Expressions Strip / trim all strings of a dataframe return string with first match Regex How to capture multiple repeated groups?

Examples related to r

How to get AIC from Conway–Maxwell-Poisson regression via COM-poisson package in R? R : how to simply repeat a command? session not created: This version of ChromeDriver only supports Chrome version 74 error with ChromeDriver Chrome using Selenium How to show code but hide output in RMarkdown? remove kernel on jupyter notebook Function to calculate R2 (R-squared) in R Center Plot title in ggplot2 R ggplot2: stat_count() must not be used with a y aesthetic error in Bar graph R multiple conditions in if statement What does "The following object is masked from 'package:xxx'" mean?

Examples related to substr

Get last field using awk substr Extract a substring according to a pattern How do I remove the last comma from a string using PHP? How to grab substring before a specified character jQuery or JavaScript How to trim a file extension from a String in JavaScript? What is the difference between String.slice and String.substring? What linux shell command returns a part of a string?