[r] How to sum data.frame column values?

I have a data frame with several columns; some numeric and some character. How to compute the sum of a specific column? I’ve googled for this and I see numerous functions (sum, cumsum, rowsum, rowSums, colSums, aggregate, apply) but I can’t make sense of it all.

For example suppose I have a data frame people with the following columns

people <- read(
  text = 
    "Name Height Weight
    Mary 65     110
    John 70     200
    Jane 64     115", 
  header = TRUE
)
…

How do I get the sum of all the weights?

This question is related to r dataframe sum aggregate-functions

The answer is


to order after the colsum :

order(colSums(people),decreasing=TRUE)

if more than 20+ columns

order(colSums(people[,c(5:25)],decreasing=TRUE) ##in case of keeping the first 4 columns remaining.

To sum values in data.frame you first need to extract them as a vector.

There are several way to do it:

# $ operatior
x <- people$Weight
x
# [1] 65 70 64

Or using [, ] similar to matrix:

x <- people[, 'Weight']
x
# [1] 65 70 64

Once you have the vector you can use any vector-to-scalar function to aggregate the result:

sum(people[, 'Weight'])
# [1] 199

If you have NA values in your data, you should specify na.rm parameter:

sum(people[, 'Weight'], na.rm = TRUE)

When you have 'NA' values in the column, then

sum(as.numeric(JuneData1$Account.Balance), na.rm = TRUE)

Examples related to r

How to get AIC from Conway–Maxwell-Poisson regression via COM-poisson package in R? R : how to simply repeat a command? session not created: This version of ChromeDriver only supports Chrome version 74 error with ChromeDriver Chrome using Selenium How to show code but hide output in RMarkdown? remove kernel on jupyter notebook Function to calculate R2 (R-squared) in R Center Plot title in ggplot2 R ggplot2: stat_count() must not be used with a y aesthetic error in Bar graph R multiple conditions in if statement What does "The following object is masked from 'package:xxx'" mean?

Examples related to dataframe

Trying to merge 2 dataframes but get ValueError How to show all of columns name on pandas dataframe? Python Pandas - Find difference between two data frames Pandas get the most frequent values of a column Display all dataframe columns in a Jupyter Python Notebook How to convert column with string type to int form in pyspark data frame? Display/Print one column from a DataFrame of Series in Pandas Binning column with python pandas Selection with .loc in python Set value to an entire column of a pandas dataframe

Examples related to sum

Iterating over arrays in Python 3 Get total of Pandas column Pandas: sum DataFrame rows for given columns How to find sum of several integers input by user using do/while, While statement or For statement SQL Sum Multiple rows into one Python Pandas counting and summing specific conditions SELECT query with CASE condition and SUM() Using SUMIFS with multiple AND OR conditions How to SUM parts of a column which have same text value in different column in the same row how to count the total number of lines in a text file using python

Examples related to aggregate-functions

Spark SQL: apply aggregate functions to a list of columns GROUP BY without aggregate function GROUP BY + CASE statement must appear in the GROUP BY clause or be used in an aggregate function Naming returned columns in Pandas aggregate function? Concatenate multiple result rows of one column into one, group by another column How to include "zero" / "0" results in COUNT aggregate? Apply multiple functions to multiple groupby columns Reason for Column is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause Optimal way to concatenate/aggregate strings