A little modification on @Andrie solution gives also the complement:
substrR <- function(x, n) {
if(n > 0) substr(x, (nchar(x)-n+1), nchar(x)) else substr(x, 1, (nchar(x)+n))
}
x <- "moSvmC20F.5.rda"
substrR(x,-4)
[1] "moSvmC20F.5"
That was what I was looking for. And it invites to the left side:
substrL <- function(x, n){
if(n > 0) substr(x, 1, n) else substr(x, -n+1, nchar(x))
}
substrL(substrR(x,-4),-2)
[1] "SvmC20F.5"
I use substr
too, but in a different way. I want to extract the last 6 characters of "Give me your food." Here are the steps:
(1) Split the characters
splits <- strsplit("Give me your food.", split = "")
(2) Extract the last 6 characters
tail(splits[[1]], n=6)
Output:
[1] " " "f" "o" "o" "d" "."
Each of the character can be accessed by splits[[1]][x]
, where x is 1 to 6.
Try this:
x <- "some text in a string"
n <- 5
substr(x, nchar(x)-n, nchar(x))
It shoudl give:
[1] "string"
I wrote some functions that can do all of these after failing my last exam on string manipulation in R programming. If you are coming from Excel, these functions will be similar to LEFT()
, RIGHT()
, and MID()
functions.
# This counts from the left and then extract n characters
str_left <- function(string, n) {
substr(string, 1, n)
}
# This counts from the right and then extract n characters
str_right <- function(string, n) {
substr(string, nchar(string) - (n - 1), nchar(string))
}
# This extract characters from the middle
str_mid <- function(string, from = 2, to = 5){
substr(string, from, to)
}
Examples:
x <- "some text in a string"
str_left(x, 4)
[1] "some"
str_right(x, 6)
[1] "string"
str_mid(x, 6, 9)
[1] "text"
Just in case if a range of characters need to be picked:
# For example, to get the date part from the string
substrRightRange <- function(x, m, n){substr(x, nchar(x)-m+1, nchar(x)-m+n)}
value <- "REGNDATE:20170526RN"
substrRightRange(value, 10, 8)
[1] "20170526"
Use stri_sub
function from stringi
package.
To get substring from the end, use negative numbers.
Look below for the examples:
stri_sub("abcde",1,3)
[1] "abc"
stri_sub("abcde",1,1)
[1] "a"
stri_sub("abcde",-3,-1)
[1] "cde"
You can install this package from github: https://github.com/Rexamine/stringi
It is available on CRAN now, simply type
install.packages("stringi")
to install this package.
I used the following code to get the last character of a string.
substr(output, nchar(stringOfInterest), nchar(stringOfInterest))
You can play with the nchar(stringOfInterest) to figure out how to get last few characters.
An alternative to substr
is to split the string into a list of single characters and process that:
N <- 2
sapply(strsplit(x, ""), function(x, n) paste(tail(x, n), collapse = ""), N)
someone before uses a similar solution to mine, but I find it easier to think as below:
> text<-"some text in a string" # we want to have only the last word "string" with 6 letter
> n<-5 #as the last character will be counted with nchar(), here we discount 1
> substr(x=text,start=nchar(text)-n,stop=nchar(text))
This will bring the last characters as desired.
A simple base R solution using the substring()
function (who knew this function even existed?):
RIGHT = function(x,n){
substring(x,nchar(x)-n+1)
}
This takes advantage of basically being substr()
underneath but has a default end value of 1,000,000.
Examples:
> RIGHT('Hello World!',2)
[1] "d!"
> RIGHT('Hello World!',8)
[1] "o World!"
UPDATE: as noted by mdsumner, the original code is already vectorised because substr is. Should have been more careful.
And if you want a vectorised version (based on Andrie's code)
substrRight <- function(x, n){
sapply(x, function(xx)
substr(xx, (nchar(xx)-n+1), nchar(xx))
)
}
> substrRight(c("12345","ABCDE"),2)
12345 ABCDE
"45" "DE"
Note that I have changed (nchar(x)-n)
to (nchar(x)-n+1)
to get n
characters.
If you don't mind using the stringr
package, str_sub
is handy because you can use negatives to count backward:
x <- "some text in a string"
str_sub(x,-6,-1)
[1] "string"
Or, as Max points out in a comment to this answer,
str_sub(x, start= -6)
[1] "string"
str = 'This is an example'
n = 7
result = substr(str,(nchar(str)+1)-n,nchar(str))
print(result)
> [1] "example"
>
Another reasonably straightforward way is to use regular expressions and sub
:
sub('.*(?=.$)', '', string, perl=T)
So, "get rid of everything followed by one character". To grab more characters off the end, add however many dots in the lookahead assertion:
sub('.*(?=.{2}$)', '', string, perl=T)
where .{2}
means ..
, or "any two characters", so meaning "get rid of everything followed by two characters".
sub('.*(?=.{3}$)', '', string, perl=T)
for three characters, etc. You can set the number of characters to grab with a variable, but you'll have to paste
the variable value into the regular expression string:
n = 3
sub(paste('.+(?=.{', n, '})', sep=''), '', string, perl=T)
Source: Stackoverflow.com