[r] How to access the last value in a vector?

Suppose I have a vector that is nested in a dataframe one or two levels. Is there a quick and dirty way to access the last value, without using the length() function? Something ala PERL's $# special var?

So I would like something like:

dat$vec1$vec2[$#]

instead of

dat$vec1$vec2[length(dat$vec1$vec2)]

This question is related to r dataframe vector

The answer is


I use the tail function:

tail(vector, n=1)

The nice thing with tail is that it works on dataframes too, unlike the x[length(x)] idiom.


I just benchmarked these two approaches on data frame with 663,552 rows using the following code:

system.time(
  resultsByLevel$subject <- sapply(resultsByLevel$variable, function(x) {
    s <- strsplit(x, ".", fixed=TRUE)[[1]]
    s[length(s)]
  })
  )

 user  system elapsed 
  3.722   0.000   3.594 

and

system.time(
  resultsByLevel$subject <- sapply(resultsByLevel$variable, function(x) {
    s <- strsplit(x, ".", fixed=TRUE)[[1]]
    tail(s, n=1)
  })
  )

   user  system elapsed 
 28.174   0.000  27.662 

So, assuming you're working with vectors, accessing the length position is significantly faster.


If you're looking for something as nice as Python's x[-1] notation, I think you're out of luck. The standard idiom is

x[length(x)]  

but it's easy enough to write a function to do this:

last <- function(x) { return( x[length(x)] ) }

This missing feature in R annoys me too!


I use the tail function:

tail(vector, n=1)

The nice thing with tail is that it works on dataframes too, unlike the x[length(x)] idiom.


Combining lindelof's and Gregg Lind's ideas:

last <- function(x) { tail(x, n = 1) }

Working at the prompt, I usually omit the n=, i.e. tail(x, 1).

Unlike last from the pastecs package, head and tail (from utils) work not only on vectors but also on data frames etc., and also can return data "without first/last n elements", e.g.

but.last <- function(x) { head(x, n = -1) }

(Note that you have to use head for this, instead of tail.)


If you're looking for something as nice as Python's x[-1] notation, I think you're out of luck. The standard idiom is

x[length(x)]  

but it's easy enough to write a function to do this:

last <- function(x) { return( x[length(x)] ) }

This missing feature in R annoys me too!


Combining lindelof's and Gregg Lind's ideas:

last <- function(x) { tail(x, n = 1) }

Working at the prompt, I usually omit the n=, i.e. tail(x, 1).

Unlike last from the pastecs package, head and tail (from utils) work not only on vectors but also on data frames etc., and also can return data "without first/last n elements", e.g.

but.last <- function(x) { head(x, n = -1) }

(Note that you have to use head for this, instead of tail.)


I use the tail function:

tail(vector, n=1)

The nice thing with tail is that it works on dataframes too, unlike the x[length(x)] idiom.


The dplyr package includes a function last():

last(mtcars$mpg)
# [1] 21.4

Package data.table includes last function

library(data.table)
last(c(1:10))
# [1] 10

The dplyr package includes a function last():

last(mtcars$mpg)
# [1] 21.4

The xts package provides a last function:

library(xts)
a <- 1:100
last(a)
[1] 100

I have another method for finding the last element in a vector. Say the vector is a.

> a<-c(1:100,555)
> end(a)      #Gives indices of last and first positions
[1] 101   1
> a[end(a)[1]]   #Gives last element in a vector
[1] 555

There you go!


Another way is to take the first element of the reversed vector:

rev(dat$vect1$vec2)[1]

If you're looking for something as nice as Python's x[-1] notation, I think you're out of luck. The standard idiom is

x[length(x)]  

but it's easy enough to write a function to do this:

last <- function(x) { return( x[length(x)] ) }

This missing feature in R annoys me too!


Whats about

> a <- c(1:100,555)
> a[NROW(a)]
[1] 555

To answer this not from an aesthetical but performance-oriented point of view, I've put all of the above suggestions through a benchmark. To be precise, I've considered the suggestions

  • x[length(x)]
  • mylast(x), where mylast is a C++ function implemented through Rcpp,
  • tail(x, n=1)
  • dplyr::last(x)
  • x[end(x)[1]]]
  • rev(x)[1]

and applied them to random vectors of various sizes (10^3, 10^4, 10^5, 10^6, and 10^7). Before we look at the numbers, I think it should be clear that anything that becomes noticeably slower with greater input size (i.e., anything that is not O(1)) is not an option. Here's the code that I used:

Rcpp::cppFunction('double mylast(NumericVector x) { int n = x.size(); return x[n-1]; }')
options(width=100)
for (n in c(1e3,1e4,1e5,1e6,1e7)) {
  x <- runif(n);
  print(microbenchmark::microbenchmark(x[length(x)],
                                       mylast(x),
                                       tail(x, n=1),
                                       dplyr::last(x),
                                       x[end(x)[1]],
                                       rev(x)[1]))}

It gives me

Unit: nanoseconds
           expr   min      lq     mean  median      uq   max neval
   x[length(x)]   171   291.5   388.91   337.5   390.0  3233   100
      mylast(x)  1291  1832.0  2329.11  2063.0  2276.0 19053   100
 tail(x, n = 1)  7718  9589.5 11236.27 10683.0 12149.0 32711   100
 dplyr::last(x) 16341 19049.5 22080.23 21673.0 23485.5 70047   100
   x[end(x)[1]]  7688 10434.0 13288.05 11889.5 13166.5 78536   100
      rev(x)[1]  7829  8951.5 10995.59  9883.0 10890.0 45763   100
Unit: nanoseconds
           expr   min      lq     mean  median      uq    max neval
   x[length(x)]   204   323.0   475.76   386.5   459.5   6029   100
      mylast(x)  1469  2102.5  2708.50  2462.0  2995.0   9723   100
 tail(x, n = 1)  7671  9504.5 12470.82 10986.5 12748.0  62320   100
 dplyr::last(x) 15703 19933.5 26352.66 22469.5 25356.5 126314   100
   x[end(x)[1]] 13766 18800.5 27137.17 21677.5 26207.5  95982   100
      rev(x)[1] 52785 58624.0 78640.93 60213.0 72778.0 851113   100
Unit: nanoseconds
           expr     min        lq       mean    median        uq     max neval
   x[length(x)]     214     346.0     583.40     529.5     720.0    1512   100
      mylast(x)    1393    2126.0    4872.60    4905.5    7338.0    9806   100
 tail(x, n = 1)    8343   10384.0   19558.05   18121.0   25417.0   69608   100
 dplyr::last(x)   16065   22960.0   36671.13   37212.0   48071.5   75946   100
   x[end(x)[1]]  360176  404965.5  432528.84  424798.0  450996.0  710501   100
      rev(x)[1] 1060547 1140149.0 1189297.38 1180997.5 1225849.0 1383479   100
Unit: nanoseconds
           expr     min        lq        mean    median         uq      max neval
   x[length(x)]     327     584.0     1150.75     996.5     1652.5     3974   100
      mylast(x)    2060    3128.5     7541.51    8899.0     9958.0    16175   100
 tail(x, n = 1)   10484   16936.0    30250.11   34030.0    39355.0    52689   100
 dplyr::last(x)   19133   47444.5    55280.09   61205.5    66312.5   105851   100
   x[end(x)[1]] 1110956 2298408.0  3670360.45 2334753.0  4475915.0 19235341   100
      rev(x)[1] 6536063 7969103.0 11004418.46 9973664.5 12340089.5 28447454   100
Unit: nanoseconds
           expr      min         lq         mean      median          uq       max neval
   x[length(x)]      327      722.0      1644.16      1133.5      2055.5     13724   100
      mylast(x)     1962     3727.5      9578.21      9951.5     12887.5     41773   100
 tail(x, n = 1)     9829    21038.0     36623.67     43710.0     48883.0     66289   100
 dplyr::last(x)    21832    35269.0     60523.40     63726.0     75539.5    200064   100
   x[end(x)[1]] 21008128 23004594.5  37356132.43  30006737.0  47839917.0 105430564   100
      rev(x)[1] 74317382 92985054.0 108618154.55 102328667.5 112443834.0 187925942   100

This immediately rules out anything involving rev or end since they're clearly not O(1) (and the resulting expressions are evaluated in a non-lazy fashion). tail and dplyr::last are not far from being O(1) but they're also considerably slower than mylast(x) and x[length(x)]. Since mylast(x) is slower than x[length(x)] and provides no benefits (rather, it's custom and does not handle an empty vector gracefully), I think the answer is clear: Please use x[length(x)].


I use the tail function:

tail(vector, n=1)

The nice thing with tail is that it works on dataframes too, unlike the x[length(x)] idiom.


I have another method for finding the last element in a vector. Say the vector is a.

> a<-c(1:100,555)
> end(a)      #Gives indices of last and first positions
[1] 101   1
> a[end(a)[1]]   #Gives last element in a vector
[1] 555

There you go!


Whats about

> a <- c(1:100,555)
> a[NROW(a)]
[1] 555

I just benchmarked these two approaches on data frame with 663,552 rows using the following code:

system.time(
  resultsByLevel$subject <- sapply(resultsByLevel$variable, function(x) {
    s <- strsplit(x, ".", fixed=TRUE)[[1]]
    s[length(s)]
  })
  )

 user  system elapsed 
  3.722   0.000   3.594 

and

system.time(
  resultsByLevel$subject <- sapply(resultsByLevel$variable, function(x) {
    s <- strsplit(x, ".", fixed=TRUE)[[1]]
    tail(s, n=1)
  })
  )

   user  system elapsed 
 28.174   0.000  27.662 

So, assuming you're working with vectors, accessing the length position is significantly faster.


The xts package provides a last function:

library(xts)
a <- 1:100
last(a)
[1] 100

Package data.table includes last function

library(data.table)
last(c(1:10))
# [1] 10

If you're looking for something as nice as Python's x[-1] notation, I think you're out of luck. The standard idiom is

x[length(x)]  

but it's easy enough to write a function to do this:

last <- function(x) { return( x[length(x)] ) }

This missing feature in R annoys me too!


Examples related to r

How to get AIC from Conway–Maxwell-Poisson regression via COM-poisson package in R? R : how to simply repeat a command? session not created: This version of ChromeDriver only supports Chrome version 74 error with ChromeDriver Chrome using Selenium How to show code but hide output in RMarkdown? remove kernel on jupyter notebook Function to calculate R2 (R-squared) in R Center Plot title in ggplot2 R ggplot2: stat_count() must not be used with a y aesthetic error in Bar graph R multiple conditions in if statement What does "The following object is masked from 'package:xxx'" mean?

Examples related to dataframe

Trying to merge 2 dataframes but get ValueError How to show all of columns name on pandas dataframe? Python Pandas - Find difference between two data frames Pandas get the most frequent values of a column Display all dataframe columns in a Jupyter Python Notebook How to convert column with string type to int form in pyspark data frame? Display/Print one column from a DataFrame of Series in Pandas Binning column with python pandas Selection with .loc in python Set value to an entire column of a pandas dataframe

Examples related to vector

How to plot vectors in python using matplotlib How can I get the size of an std::vector as an int? Convert Mat to Array/Vector in OpenCV Are vectors passed to functions by value or by reference in C++ Why is it OK to return a 'vector' from a function? Append value to empty vector in R? How to initialize a vector with fixed length in R How to initialize a vector of vectors on a struct? numpy matrix vector multiplication Using atan2 to find angle between two vectors