
Filtering a dataset's columns by another dataset's column names

So you want the data from Dataset 1, but only for the columns whose
names also appear in Dataset 2?

How about:

  subset(DS1, select = names(DS1) %in% names(DS2) )

 > DS1 <- read.table(textConnection("Individual    SNP1    SNP2    SNP3    SNP4    SNP5
+ 1    A    G    T    C    A
+ 2    T    C    A    G    T
+ 3    A    C    T    C    A"), header=TRUE)
 > DS2 <- read.table(textConnection("Individual    SNP1    SNP3    SNP5    SNP6    SNP7
+ 4    A    T    T    G    C
+ 5    T    A    A    G    G
+ 6    A    A    T    C    G"), header=TRUE)

 > subset(DS1, select= names(DS1) %in% names(DS2) )
   Individual SNP1 SNP3 SNP5
1          1    A    T    A
2          2    T    A    T
3          3    A    T    A

Tested!
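
The same selection can probably also be written with plain bracket indexing, which some people find easier to read than a logical vector passed to subset(). The sketch below rebuilds the toy data above using the text= argument of read.table() (instead of textConnection), and the names `common` and `DS1_common` are just made up for illustration:

```r
# Same toy data as in the transcript above
DS1 <- read.table(text = "Individual SNP1 SNP2 SNP3 SNP4 SNP5
1 A G T C A
2 T C A G T
3 A C T C A", header = TRUE, stringsAsFactors = FALSE)

DS2 <- read.table(text = "Individual SNP1 SNP3 SNP5 SNP6 SNP7
4 A T T G C
5 T A A G G
6 A A T C G", header = TRUE, stringsAsFactors = FALSE)

# Column names present in both data frames, in DS1's column order
common <- intersect(names(DS1), names(DS2))

# drop = FALSE keeps the result a data frame even if only one column matches
DS1_common <- DS1[, common, drop = FALSE]
DS1_common
```

Either form keeps the columns in DS1's order. If you want DS2's order instead, swap the arguments to intersect().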