[r] R error "sum not meaningful for factors"

I have a file called rRna_RDP_taxonomy_phylum with the following data :

364  "Firmicutes"            39.31
244  "Proteobacteria"        26.35
218  "Actinobacteria"        23.54
65   "Bacteroidetes"         7.02
22   "Fusobacteria"          2.38
6    "Thermotogae"           0.65
3     unclassified_Bacteria  0.32
2    "Spirochaetes"          0.22
1    "Tenericutes"           0.11
1     Cyanobacteria          0.11

And I'm using this code for creating a pie chart in R:

if(file.exists("rRna_RDP_taxonomy_phylum")){
    family <- read.table ("rRna_RDP_taxonomy_phylum", sep="\t")
    piedat <- rbind(family[1:7, ],
                as.data.frame(t(c(sum(family[8:nrow(family),1]),
                                "Others",
                                sum(family[8:nrow(family),3])))))
    png(file="../graph/RDP_phylum_low.png", width=600, height=550, res=75)
    pie(as.numeric(piedat$V3), labels=piedat$V3, clockwise=TRUE, col=graph_col, main="More representative Phyliums")
    legend("topright", legend=piedat$V2, cex=0.8, fill=graph_col)
    dev.off()
    png(file="../graph/RDP_phylm_high.png", width=1300, height=850, res=75)
    pie(as.numeric(piedat$V3), labels=piedat$V3, clockwise=TRUE, col=graph_col, main="More representative Phyliums")
    legend("topright", legend=piedat$V2, cex=0.8, fill=graph_col)
    dev.off()
}

I've been using this code for different datafiles and it works fine, but with the file presented adobe it crash returning the following message:

Error in Summary.factor(c(6L, 2L, 1L), na.rm = FALSE) : 
  sum not meaningful for factors
Calls: rbind -> as.data.frame -> t -> Summary.factor
Execution halted

I need to understand why it crash with this file and if there's any way to prevent this kind of errors.

Thanks!

This question is related to r r-factor categorical-data

The answer is


The error comes when you try to call sum(x) and x is a factor.

What that means is that one of your columns, though they look like numbers are actually factors (what you are seeing is the text representation)

simple fix, convert to numeric. However, it needs an intermeidate step of converting to character first. Use the following:

family[, 1] <- as.numeric(as.character( family[, 1] ))
family[, 3] <- as.numeric(as.character( family[, 3] ))

For a detailed explanation of why the intermediate as.character step is needed, take a look at this question: How to convert a factor to integer\numeric without loss of information?


Examples related to r

How to get AIC from Conway–Maxwell-Poisson regression via COM-poisson package in R? R : how to simply repeat a command? session not created: This version of ChromeDriver only supports Chrome version 74 error with ChromeDriver Chrome using Selenium How to show code but hide output in RMarkdown? remove kernel on jupyter notebook Function to calculate R2 (R-squared) in R Center Plot title in ggplot2 R ggplot2: stat_count() must not be used with a y aesthetic error in Bar graph R multiple conditions in if statement What does "The following object is masked from 'package:xxx'" mean?

Examples related to r-factor

Coerce multiple columns to factors at once Plotting with ggplot2: "Error: Discrete value supplied to continuous scale" on categorical y-axis R error "sum not meaningful for factors" How do I convert certain columns of a data frame to become factors? Colouring plot by factor in R Converting a factor to numeric without losing information R (as.numeric() doesn't seem to work) Imported a csv-dataset to R but the values becomes factors Drop unused factor levels in a subsetted data frame

Examples related to categorical-data

pandas dataframe convert column type to string or categorical Plotting with ggplot2: "Error: Discrete value supplied to continuous scale" on categorical y-axis Make Frequency Histogram for Factor Variables R error "sum not meaningful for factors" How to force R to use a specified factor level as reference in a regression?