[r] Pattern matching using a wildcard

How do I identify a string using a wildcard?

I've found glob2rx, but I don't quite understand how to use it. I tried using the following code to pick the rows of the data frame that begin with the word blue:

# make data frame
a <- data.frame( x =  c('red','blue1','blue2', 'red2'))

# 1
result <- subset(a, x == glob2rx("blue*") )

# 2
test = ls(pattern = glob2rx("blue*"))
result2 <- subset(a, x == test )

# 3
result3 <- subset(a, x == pattern("blue*") )

However, none of these worked. I'm not sure whether I should be using a different function to do this.


The answer is


glob2rx() converts a pattern that includes a wildcard into the equivalent regular expression. You then need to pass this regular expression on to one of R's pattern matching tools.

If you want to match "blue*" where * has the usual wildcard meaning rather than its regular expression meaning, use glob2rx() to convert the wildcard pattern into a useful regular expression:

> glob2rx("blue*")
[1] "^blue"

The returned object is a regular expression.
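To make the conversion more concrete, here is a small sketch that runs glob2rx() on a few wildcard patterns. The extra patterns ("*.doc" and "a?b.*") are illustrative choices, not part of the question; the trim.tail argument controls whether a trailing ".*$" is dropped from the result.

```r
# glob2rx() lives in the utils package, which is attached by default.
rx_blue  <- glob2rx("blue*")                     # * at the end: trailing ".*$" is trimmed
rx_full  <- glob2rx("blue*", trim.tail = FALSE)  # keep the trailing ".*$"
rx_doc   <- glob2rx("*.doc")                     # the literal dot is escaped
rx_query <- glob2rx("a?b.*")                     # ? becomes a single-character "."

# print() shows the strings with escaped backslashes, as R stores them
print(c(rx_blue, rx_full, rx_doc, rx_query))
```

Note that the result is always anchored with "^", so the wildcard pattern has to match from the start of the string.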

Given your data:

x <- c('red','blue1','blue2', 'red2')

we can pattern match using grep() or similar tools:

> grx <- glob2rx("blue*")
> grep(grx, x)
[1] 2 3
> grep(grx, x, value = TRUE)
[1] "blue1" "blue2"
> grepl(grx, x)
[1] FALSE  TRUE  TRUE FALSE

As for the row selection problem you posted:

> a <- data.frame(x =  c('red','blue1','blue2', 'red2'))
> with(a, a[grepl(grx, x), ])
[1] blue1 blue2
Levels: blue1 blue2 red red2
> with(a, a[grep(grx, x), ])
[1] blue1 blue2
Levels: blue1 blue2 red red2

or via subset():

> with(a, subset(a, subset = grepl(grx, x)))
      x
2 blue1
3 blue2
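As a side note, the following sketch shows why the == comparison in the question selects nothing, and how to keep a data frame when indexing with [ ]. Because a has a single column, a[rows, ] drops to a vector; drop = FALSE is one way to avoid that. The name blue_rows is just an illustrative choice.

```r
a <- data.frame(x = c("red", "blue1", "blue2", "red2"))
grx <- glob2rx("blue*")

# == compares each value with the literal string "^blue",
# so no row is ever equal to it.
literal_match <- a$x == grx
print(literal_match)

# grepl() applies the regular expression and returns a logical vector;
# drop = FALSE keeps the result as a one-column data frame.
blue_rows <- a[grepl(grx, a$x), , drop = FALSE]
print(blue_rows)
```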

Hope that explains what glob2rx() does and how to use it.


If you really do want to use wildcards to identify specific variables, then you can use a combination of ls() and grep() as follows:

l = ls()
vars.with.result <- l[grep("result", l)]
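If you would rather keep the wildcard syntax, ls() also accepts a regular expression through its pattern argument, so glob2rx() can be passed to it directly. The sketch below uses a scratch environment and made-up variable names (result, result2, other) so that it does not depend on what happens to be in your workspace.

```r
# Scratch environment with hypothetical variables, for illustration only.
e <- new.env()
assign("result",  1, envir = e)
assign("result2", 2, envir = e)
assign("other",   3, envir = e)

# Names starting with "result", using wildcard syntax via glob2rx().
vars <- ls(envir = e, pattern = glob2rx("result*"))
print(vars)

# mget() fetches the matching objects as a named list.
vals <- mget(vars, envir = e)
print(vals)
```

Unlike grep("result", l), which matches "result" anywhere in the name, the glob2rx() version only matches names that begin with it.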


You can also use the data.table package and its like function (%like%); details are given in "How to select R data.table rows based on substring match (a la SQL like)".


You're on the right track - the keyword you should be googling is "regular expressions". R supports them directly, without the glob2rx() conversion step, through grep() and a few related functions.

Here's a detailed discussion: http://www.regular-expressions.info/rlanguage.html
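As a rough illustration of writing the regular expression directly, the sketch below applies base R functions to the vector from the question. startsWith() is not mentioned above; it is included on the assumption that a plain prefix test is all that is needed, and it requires R 3.3.0 or later.

```r
x <- c("red", "blue1", "blue2", "red2")

# Regular expression written by hand: "^" anchors the match at the start.
starts_blue <- grepl("^blue", x)

# Fixed-string prefix test, no regular expression involved.
prefix_blue <- startsWith(x, "blue")

# Another hand-written pattern: values that end in a digit.
ends_digit <- grepl("[0-9]$", x)

print(x[starts_blue])
print(x[ends_digit])
```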

