How do I identify a string using a wildcard?
I've found glob2rx, but I don't quite understand how to use it. I tried using the following code to pick the rows of the data frame that begin with the word blue:
# make data frame
a <- data.frame( x = c('red','blue1','blue2', 'red2'))
# 1
result <- subset(a, x == glob2rx("blue*") )
# 2
test = ls(pattern = glob2rx("blue*"))
result2 <- subset(a, x == test )
# 3
result3 <- subset(a, x == pattern("blue*") )
However, none of these worked. I'm not sure whether I should be using a different function to do this.
glob2rx() converts a pattern including a wildcard into the equivalent regular expression. You then need to pass this regular expression on to one of R's pattern matching tools.
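To see why the first attempt in the question returns no rows, here is a minimal sketch using the question's own data. It assumes nothing beyond base R (glob2rx() is in the utils package, which is attached by default): `==` compares each element with the literal string that glob2rx() returns, whereas grepl() treats that string as a pattern.

```r
x <- c("red", "blue1", "blue2", "red2")
rx <- glob2rx("blue*")   # a plain character string holding a regular expression

# == tests for exact equality with that string, so no element matches
literal_match <- x == rx

# grepl() interprets the string as a regular expression
pattern_match <- grepl(rx, x)

print(literal_match)
print(pattern_match)
```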
If you want to match "blue*", where * has its usual wildcard meaning rather than its regular expression meaning, use glob2rx() to convert the wildcard pattern into the equivalent regular expression:
> glob2rx("blue*")
[1] "^blue"
The returned object is a regular expression.
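A few more conversions may make the translation rules clearer. This is only an illustrative sketch; the extra patterns ("*.doc" and "a?b.*") are examples chosen here, not part of the question.

```r
# Each wildcard pattern and the regular expression glob2rx() produces for it
rx_prefix <- glob2rx("blue*")   # prints as "^blue": the trailing ".*$" is trimmed by default
rx_suffix <- glob2rx("*.doc")   # prints as "^.*\\.doc$": the literal dot is escaped
rx_single <- glob2rx("a?b.*")   # prints as "^a.b\\.": "?" becomes "." (any one character)

print(c(rx_prefix, rx_suffix, rx_single))
```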
Given your data:
x <- c('red','blue1','blue2', 'red2')
we can pattern match using grep() or similar tools:
> grx <- glob2rx("blue*")
> grep(grx, x)
[1] 2 3
> grep(grx, x, value = TRUE)
[1] "blue1" "blue2"
> grepl(grx, x)
[1] FALSE TRUE TRUE FALSE
As for the row selection problem you posted:
> a <- data.frame(x = c('red','blue1','blue2', 'red2'))
> with(a, a[grepl(grx, x), ])
[1] blue1 blue2
Levels: blue1 blue2 red red2
> with(a, a[grep(grx, x), ])
[1] blue1 blue2
Levels: blue1 blue2 red red2
or via subset():
> with(a, subset(a, subset = grepl(grx, x)))
x
2 blue1
3 blue2
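Note that indexing a one-column data frame with a[rows, ] returns a vector rather than a data frame, which is why the first two results above print as a vector. A hedged sketch of two variations follows: drop = FALSE keeps the data frame shape, and startsWith() (base R 3.3.0 or later) handles a fixed prefix without any regular expression. The column is created as character here, which is an assumption that differs from the factor column shown in the output above.

```r
a <- data.frame(x = c("red", "blue1", "blue2", "red2"), stringsAsFactors = FALSE)
grx <- glob2rx("blue*")

# drop = FALSE keeps the result as a data frame even with a single column
blue_df <- a[grepl(grx, a$x), , drop = FALSE]

# For a fixed prefix, startsWith() needs no regular expression at all
blue_df2 <- a[startsWith(a$x, "blue"), , drop = FALSE]

print(blue_df)
```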
Hope that explains what glob2rx() does and how to use it.
If you really do want to use wildcards to identify specific variables, then you can use a combination of ls() and grep() as follows:
l = ls()
vars.with.result <- l[grep("result", l)]
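Since ls() accepts a regular expression through its pattern argument, the same lookup can be sketched with glob2rx() directly. This also shows why the second attempt in the question fails: ls() lists the names of objects in an environment, not the values in a data frame column. The scratch environment and the object names below are made up for illustration.

```r
# Use a scratch environment so the example does not depend on the workspace
e <- new.env()
assign("result", 1, envir = e)
assign("result2", 2, envir = e)
assign("other", 3, envir = e)

# ls() takes a regular expression in `pattern`, so a converted glob works directly
vars_with_result <- ls(envir = e, pattern = glob2rx("result*"))
print(vars_with_result)
```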
You can also use the data.table package and its like function; details are given in "How to select R data.table rows based on substring match (a la SQL like)".
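As a rough sketch of that approach, assuming the data.table package is installed (the guard below skips the example otherwise): the %like% operator takes a regular expression on its right-hand side, so the wildcard still has to be written as a regular expression or converted with glob2rx().

```r
# Runs only if data.table is available; it is not part of base R
if (requireNamespace("data.table", quietly = TRUE)) {
  library(data.table)
  dt <- data.table(x = c("red", "blue1", "blue2", "red2"))

  # %like% matches the column against a regular expression
  blue_rows <- dt[x %like% glob2rx("blue*")]
  print(blue_rows)
}
```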
You're on the right track - the keyword you should be googling is Regular Expressions. R does support them in a more direct way than this using grep() and a few other alternatives.
Here's a detailed discussion: http://www.regular-expressions.info/rlanguage.html
Source: Stackoverflow.com