[r] How can I subset rows in a data frame in R based on a vector of values?

Per the comments to the original post, merges / joins are well-suited for this problem. In particular, an inner join will return only values that are present in both dataframes, making thesetdiff statement unnecessary.

Using the data from Dinre's example:

In base R:

cleanedA <- merge(data_A, data_B[, "index"], by = 1, sort = FALSE)
cleanedB <- merge(data_B, data_A[, "index"], by = 1, sort = FALSE)

Using the dplyr package:

library(dplyr)
cleanedA <- inner_join(data_A, data_B %>% select(index))
cleanedB <- inner_join(data_B, data_A %>% select(index))

To keep the data as two separate tables, each containing only its own variables, this subsets the unwanted table to only its index variable before joining. Then no new variables are added to the resulting table.

Examples related to r

How to get AIC from Conway–Maxwell-Poisson regression via COM-poisson package in R? R : how to simply repeat a command? session not created: This version of ChromeDriver only supports Chrome version 74 error with ChromeDriver Chrome using Selenium How to show code but hide output in RMarkdown? remove kernel on jupyter notebook Function to calculate R2 (R-squared) in R Center Plot title in ggplot2 R ggplot2: stat_count() must not be used with a y aesthetic error in Bar graph R multiple conditions in if statement What does "The following object is masked from 'package:xxx'" mean?

Examples related to subset

how to remove multiple columns in r dataframe? Using grep to help subset a data frame in R Subset a dataframe by multiple factor levels creating a new list with subset of list using index in python subsetting a Python DataFrame Undefined columns selected when subsetting data frame How to select some rows with specific rownames from a dataframe? Subset data to contain only columns whose names match a condition find all subsets that sum to a particular value Subset and ggplot2

Examples related to r-faq

What does "The following object is masked from 'package:xxx'" mean? What does "Error: object '<myvariable>' not found" mean? How do I deal with special characters like \^$.?*|+()[{ in my regex? What does %>% function mean in R? How to plot a function curve in R Use dynamic variable names in `dplyr` Error: unexpected symbol/input/string constant/numeric constant/SPECIAL in my code How should I deal with "package 'xxx' is not available (for R version x.y.z)" warning? How to select the row with the maximum value in each group R data formats: RData, Rda, Rds etc