[r] How to remove last n characters from every element in the R vector

I am very new to R, and I could not find a simple example online of how to remove the last n characters from every element of a vector (array?)

I come from a Java background, so what I would like to do is to iterate over every element of a$data and remove the last 3 characters from every element.

How would you go about it?

This question is related to r string

The answer is


Here is an example of what I would do. I hope it's what you're looking for.

char_array = c("foo_bar","bar_foo","apple","beer")
a = data.frame("data"=char_array,"data2"=1:4)
a$data = substr(a$data,1,nchar(a$data)-3)

a should now contain:

  data data2
1 foo_ 1
2 bar_ 2
3   ap 3
4    b 4

Similar to @Matthew_Plourde using gsub

However, using a pattern that will trim to zero characters i.e. return "" if the original string is shorter than the number of characters to cut:

cs <- c("foo_bar","bar_foo","apple","beer","so","a")
gsub('.{0,3}$', '', cs)
# [1] "foo_" "bar_" "ap"   "b"    ""    ""

Difference is, {0,3} quantifier indicates 0 to 3 matches, whereas {3} requires exactly 3 matches otherwise no match is found in which case gsub returns the original, unmodified string.

N.B. using {,3} would be equivalent to {0,3}, I simply prefer the latter notation.

See here for more information on regex quantifiers: https://www.regular-expressions.info/refrepeat.html


The same may be achieved with the stringi package:

library('stringi')
char_array <- c("foo_bar","bar_foo","apple","beer")
a <- data.frame("data"=char_array, "data2"=1:4)
(a$data <- stri_sub(a$data, 1, -4)) # from the first to the last but 4th char
## [1] "foo_" "bar_" "ap"   "b" 

Here's a way with gsub:

cs <- c("foo_bar","bar_foo","apple","beer")
gsub('.{3}$', '', cs)
# [1] "foo_" "bar_" "ap"   "b"

Although this is mostly the same with the answer by @nfmcclure, I prefer using stringr package as it provdies a set of functions whose names are most consistent and descriptive than those in base R (in fact I always google for "how to get the number of characters in R" as I can't remember the name nchar()).

library(stringr)
str_sub(iris$Species, end=-4)
#or 
str_sub(iris$Species, 1, str_length(iris$Species)-3)

This removes the last 3 characters from each value at Species column.