[r] Merge or combine by rownames

In the example below I have two datasets (Z and A). I want to merge or combine these sets by the ILMN numbers. If there is no match, fill in NA.

z <- matrix(c(0,0,1,1,0,0,1,1,0,0,0,0,1,0,1,1,0,1,1,1,1,0,0,0,"RND1","WDR", "PLAC8","TYBSA","GRA","TAF"), nrow=6,
    dimnames=list(c("ILMN_1651838","ILMN_1652371","ILMN_1652464","ILMN_1652952","ILMN_1653026","ILMN_1653103"),c("A","B","C","D","symbol")))

t<-matrix(c("GO:0002009", 8, 342, 1, 0.07, 0.679, 0, 0, 1, 0, 
        "GO:0030334", 6, 343, 1, 0.07, 0.065, 0, 0, 1, 0,
        "GO:0015674", 7, 350, 1, 0.07, 0.065, 1, 0, 0, 0), nrow=10, dimnames= list(c("GO.ID","LEVEL","Annotated","Significant","Expected","resultFisher","ILMN_1652464","ILMN_1651838","ILMN_1711311","ILMN_1653026")))

The result will be like this:

             [,1]         [,2]         [,3]         [,4]
GO.ID        "GO:0002009" "GO:0030334" "GO:0015674"  NA
LEVEL        "8"          "6"          "7"           NA
Annotated    "342"        "343"        "350"         NA
Significant  "1"          "1"          "1"           NA
Expected     "0.07"       "0.07"       "0.07"        NA
resultFisher "0.679"      "0.065"      "0.065"       NA
ILMN_1652464 "0"          "0"          "1"           PLAC8
ILMN_1651838 "0"          "0"          "0"           RND1
ILMN_1711311 "1"          "1"          "0"           NA
ILMN_1653026 "0"          "0"          "0"           GRA

This question is related to r merge dataframe

The answer is


you can wrap -Andrie answer into a generic function

mbind<-function(...){
 Reduce( function(x,y){cbind(x,y[match(row.names(x),row.names(y)),])}, list(...) )
}

Here, you can bind multiple frames with rownames as key


Not perfect but close:

newcol<-sapply(rownames(t), function(rn){z[match(rn, rownames(z)), 5]})
cbind(data.frame(t), newcol)

cbind.fill <- function(x, y){
  xrn <- rownames(x)
  yrn <- rownames(y)
  rn <- union(xrn, yrn)
  xcn <- colnames(x)
  ycn <- colnames(y)
  if(is.null(xrn) | is.null(yrn) | is.null(xcn) | is.null(ycn)) 
    stop("NULL rownames or colnames")
  z <- matrix(NA, nrow=length(rn), ncol=length(xcn)+length(ycn))
  rownames(z) <- rn
  colnames(z) <- c(xcn, ycn)
  idx <- match(rn, xrn)
  z[!is.na(idx), 1:length(xcn)] <- x[na.omit(idx),]
  idy <- match(rn, yrn)
  z[!is.na(idy), length(xcn)+(1:length(ycn))] <- y[na.omit(idy),]
  return(z)
}

Using merge and renaming your t vector as tt (see the PS of Andrie) :

merge(tt,z,by="row.names",all.x=TRUE)[,-(5:8)]

Now if you would work with dataframes instead of matrices, this would even become a whole lot easier :

z <- as.data.frame(z)
tt <- as.data.frame(tt)
merge(tt,z["symbol"],by="row.names",all.x=TRUE)

Examples related to r

How to get AIC from Conway–Maxwell-Poisson regression via COM-poisson package in R? R : how to simply repeat a command? session not created: This version of ChromeDriver only supports Chrome version 74 error with ChromeDriver Chrome using Selenium How to show code but hide output in RMarkdown? remove kernel on jupyter notebook Function to calculate R2 (R-squared) in R Center Plot title in ggplot2 R ggplot2: stat_count() must not be used with a y aesthetic error in Bar graph R multiple conditions in if statement What does "The following object is masked from 'package:xxx'" mean?

Examples related to merge

Pandas Merging 101 Python: pandas merge multiple dataframes Git merge with force overwrite Merge two dataframes by index Visual Studio Code how to resolve merge conflicts with git? merge one local branch into another local branch Merging dataframes on index with pandas Git merge is not possible because I have unmerged files Git merge develop into feature branch outputs "Already up-to-date" while it's not How merge two objects array in angularjs?

Examples related to dataframe

Trying to merge 2 dataframes but get ValueError How to show all of columns name on pandas dataframe? Python Pandas - Find difference between two data frames Pandas get the most frequent values of a column Display all dataframe columns in a Jupyter Python Notebook How to convert column with string type to int form in pyspark data frame? Display/Print one column from a DataFrame of Series in Pandas Binning column with python pandas Selection with .loc in python Set value to an entire column of a pandas dataframe