[r] Extracting the last n characters from a string in R

How can I get the last n characters from a string in R? Is there a function like SQL's RIGHT?


A small modification of @Andrie's solution also gives the complement:

substrR <- function(x, n) { 
  if(n > 0) substr(x, (nchar(x)-n+1), nchar(x)) else substr(x, 1, (nchar(x)+n))
}
x <- "moSvmC20F.5.rda"
substrR(x,-4)
[1] "moSvmC20F.5"

That was what I was looking for. It also suggests a counterpart for the left side:

substrL <- function(x, n){ 
  if(n > 0) substr(x, 1, n) else substr(x, -n+1, nchar(x))
}
substrL(substrR(x,-4),-2)
[1] "SvmC20F.5"

I take a different route that splits the string instead of calling substr. Suppose I want to extract the last 6 characters of "Give me your food." Here are the steps:

(1) Split the characters

splits <- strsplit("Give me your food.", split = "")

(2) Extract the last 6 characters

tail(splits[[1]], n=6)

Output:

[1] " " "f" "o" "o" "d" "."

Each of these characters can be accessed by indexing the result, i.e. tail(splits[[1]], n=6)[x], where x is 1 to 6.
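
If a single string is wanted rather than a vector of characters, the pieces can be pasted back together. A minimal sketch of that idea, where last_chars is a hypothetical helper name and not part of the answer above:

```r
# Hypothetical helper: last n characters via strsplit() + tail() + paste()
last_chars <- function(s, n) {
  chars <- strsplit(s, split = "")[[1]]  # one element per character
  paste(tail(chars, n), collapse = "")   # rejoin the last n elements
}

last_chars("Give me your food.", 6)
# [1] " food."
```

Because tail() simply returns everything when n exceeds the length, this version needs no special handling for short strings.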


Try this:

x <- "some text in a string"
n <- 6
substr(x, nchar(x)-n+1, nchar(x))

It should give:

[1] "string"
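
The index arithmetic is easy to get wrong by one, because substr() includes both the start and the stop position, so a range holds stop - start + 1 characters. A small sketch of the difference (the variable names len, too_many and exact are only for illustration):

```r
x <- "some text in a string"
len <- nchar(x)   # 21
n <- 6

too_many <- substr(x, len - n, len)      # start is one position too early
exact    <- substr(x, len - n + 1, len)  # exactly n characters

nchar(too_many)
# [1] 7
nchar(exact)
# [1] 6
exact
# [1] "string"
```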

I wrote some functions that cover all of these cases after failing my last exam on string manipulation in R. If you are coming from Excel, they behave like the LEFT(), RIGHT(), and MID() functions.


# This counts from the left and then extracts n characters

str_left <- function(string, n) {
  substr(string, 1, n)
}



# This counts from the right and then extracts n characters

str_right <- function(string, n) {
  substr(string, nchar(string) - (n - 1), nchar(string))
}


# This extracts characters from the middle

str_mid <- function(string, from = 2, to = 5) {
  substr(string, from, to)
}

Examples:

x <- "some text in a string"
str_left(x, 4)
[1] "some"

str_right(x, 6)
[1] "string"

str_mid(x, 6, 9)
[1] "text"
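
It may help to know how str_right behaves at the edges. The sketch below repeats the definition from above so that it runs on its own, and relies on base R's substr() treating a start position below 1 as 1 and returning "" when start is past stop:

```r
str_right <- function(string, n) {
  substr(string, nchar(string) - (n - 1), nchar(string))
}

str_right("abc", 10)            # n longer than the string: whole string
# [1] "abc"
str_right("abc", 0)             # start lands past stop: empty string
# [1] ""
str_right(c("alpha", "be"), 3)  # vectorised over the input
# [1] "pha" "be"
```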


In case a range of characters needs to be picked, starting m characters from the end and running for n characters:

# For example, to get the date part from the string

substrRightRange <- function(x, m, n){substr(x, nchar(x)-m+1, nchar(x)-m+n)}

value <- "REGNDATE:20170526RN" 
substrRightRange(value, 10, 8)

[1] "20170526"

Use the stri_sub function from the stringi package. To take a substring counted from the end, use negative indices. See the examples below:

library(stringi)
stri_sub("abcde",1,3)
[1] "abc"
stri_sub("abcde",1,1)
[1] "a"
stri_sub("abcde",-3,-1)
[1] "cde"

You can install this package from GitHub: https://github.com/Rexamine/stringi

It is now available on CRAN; simply type

install.packages("stringi")

to install this package.


I used the following code to get the last character of a string.

    substr(stringOfInterest, nchar(stringOfInterest), nchar(stringOfInterest))

Lower the start position, e.g. to nchar(stringOfInterest) - n + 1, to get the last n characters.


An alternative to substr is to split the string into a list of single characters and process that:

N <- 2
sapply(strsplit(x, ""), function(x, n) paste(tail(x, n), collapse = ""), N)

Someone above uses a similar solution to mine, but I find it easier to think of it as below:

> text<-"some text in a string" # we want only the last word, "string", which has 6 letters
> n<-5 # the character at nchar(text) is included too, so n is one less than the 6 letters wanted
> substr(x=text,start=nchar(text)-n,stop=nchar(text))

This returns the last 6 characters, as desired.


A simple base R solution using the substring() function (who knew this function even existed?):

RIGHT = function(x,n){
  substring(x,nchar(x)-n+1)
}

This takes advantage of substring() being basically substr() underneath, but with a default end value of 1,000,000.

Examples:

> RIGHT('Hello World!',2)
[1] "d!"
> RIGHT('Hello World!',8)
[1] "o World!"

UPDATE: as noted by mdsumner, the original code is already vectorised because substr is. Should have been more careful.

And if you want a vectorised version (based on Andrie's code)

substrRight <- function(x, n){
  sapply(x, function(xx)
         substr(xx, (nchar(xx)-n+1), nchar(xx))
         )
}

> substrRight(c("12345","ABCDE"),2)
12345 ABCDE
 "45"  "DE"

Note that I have changed (nchar(x)-n) to (nchar(x)-n+1) to get n characters.
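
As the update says, substr() is already vectorised over its first argument, so the sapply() wrapper mostly adds element names to the result. A minimal sketch of the plain version, using the hypothetical name substr_right_vec to keep it apart from the function above:

```r
# Hypothetical name; same arithmetic as substrRight, without sapply()
substr_right_vec <- function(x, n) {
  substr(x, nchar(x) - n + 1, nchar(x))
}

substr_right_vec(c("12345", "ABCDE"), 2)
# [1] "45" "DE"
```

The result is an unnamed character vector. If the names are wanted, wrapping the call in setNames(..., x) is one option.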


If you don't mind using the stringr package, str_sub is handy because you can use negatives to count backward:

library(stringr)
x <- "some text in a string"
str_sub(x,-6,-1)
[1] "string"

Or, as Max points out in a comment to this answer,

str_sub(x, start= -6)
[1] "string"

str = 'This is an example'
n = 7
result = substr(str,(nchar(str)+1)-n,nchar(str))
print(result)

[1] "example"

Another reasonably straightforward way is to use regular expressions and sub:

sub('.*(?=.$)', '', string, perl=T)

In other words, "get rid of everything that is followed by exactly one final character". To grab more characters off the end, put that many dots in the lookahead assertion:

sub('.*(?=.{2}$)', '', string, perl=T)

where .{2} means .., or "any two characters", so the pattern reads "get rid of everything that is followed by the last two characters". Similarly, use

sub('.*(?=.{3}$)', '', string, perl=T)

for three characters, etc. You can set the number of characters to grab with a variable, but you'll have to paste the variable value into the regular expression string:

n = 3
sub(paste('.*(?=.{', n, '}$)', sep=''), '', string, perl=TRUE)
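
The pattern building can be wrapped in a small function. This is only a sketch: last_n_regex is a hypothetical name, and sprintf() is used here in place of paste() to splice n into the pattern.

```r
# Hypothetical helper: keep the last n characters using a lookahead
last_n_regex <- function(string, n) {
  # ".*" is greedy, so it consumes everything up to the final n characters
  pattern <- sprintf(".*(?=.{%d}$)", as.integer(n))
  sub(pattern, "", string, perl = TRUE)
}

last_n_regex("some text in a string", 6)
# [1] "string"
last_n_regex("ab", 6)   # no match, so the string is returned unchanged
# [1] "ab"
```

Since sub() is vectorised, the helper also accepts a character vector for string.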
