same value in column-->delete

5 messages · Duijvesteijn, Naomi, jim holtman, Gustaf Rydevik +2 more

#
Hi Readers,

I have a large dataset and want to drop every column that holds the same
value in all rows. I also want to know which columns those were.

For example:

> x <- data.frame(id=c(1,2,3), snp1=c("A","G","G"),
+                 snp2=c("G","G","G"), snp3=c("G","G","A"))
> x
  id snp1 snp2 snp3
1  1    A    G    G
2  2    G    G    G
3  3    G    G    A

Here I want to find out that snp2 is monomorphic (a single value throughout
the column) and, once I know which columns are monomorphic, remove them.

Thanks,
Naomi


   
   
   
#
Try this:
  id snp1 snp2 snp3
1  1    A    G    G
2  2    G    G    G
3  3    G    G    A
'data.frame':   3 obs. of  4 variables:
 $ id  : num  1 2 3
 $ snp1: Factor w/ 2 levels "A","G": 1 2 2
 $ snp2: Factor w/ 1 level "G": 1 1 1
 $ snp3: Factor w/ 2 levels "A","G": 2 2 1
   id  snp1  snp2  snp3
FALSE FALSE  TRUE FALSE
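
The commands behind this output are not shown. A minimal sketch that would reproduce it, assuming the test compares every value in a column against the first one (an assumption, not necessarily the code originally posted; the name `same` is illustrative):

```r
# Rebuild the example data. stringsAsFactors = TRUE matches the factor
# columns in the str() output (R >= 4.0 defaults to FALSE).
x <- data.frame(id = c(1, 2, 3),
                snp1 = c("A", "G", "G"),
                snp2 = c("G", "G", "G"),
                snp3 = c("G", "G", "A"),
                stringsAsFactors = TRUE)
x
str(x)

# Assumed test: a column is constant when every value equals its first value.
same <- sapply(x, function(col) all(col == col[1]))
same

# Names of the constant columns, then the data without them.
names(x)[same]
x[, !same, drop = FALSE]
```

Note that `all()` returns NA for a column that contains missing values and no mismatch, so this sketch only suits data without NAs.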
On Thu, Mar 26, 2009 at 7:15 AM, Duijvesteijn, Naomi
<Naomi.Duijvesteijn at ipg.nl> wrote:

  
    
#
On Thu, Mar 26, 2009 at 12:15 PM, Duijvesteijn, Naomi
<Naomi.Duijvesteijn at ipg.nl> wrote:
Another, perhaps slightly more intuitive solution than Jim's would be
the following:

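x <- data.frame(id=c(1,2,3), snp1=c("A","G","G"),
                snp2=c("G","G","G"), snp3=c("G","G","A"))

# TRUE for each column that holds a single distinct value
is.monovalued <- function(df) {
  sapply(df, function(col) {
    length(unique(col)) == 1
  })
}

monovaluedCols <- is.monovalued(x)
which(monovaluedCols)   # which columns are monovalued
x[!monovaluedCols]      # the data without them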
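
With real genotype data, missing calls may matter: `unique()` treats NA as a value of its own, so a column such as "G", NA, "G" is not flagged by `is.monovalued()`. Whether such a column should count as monomorphic is a judgment call. A possible variant, where the helper name `is.monovalued.na`, its `na.rm` argument, and the data `y` are all hypothetical:

```r
# Hypothetical helper (not from the thread): like is.monovalued(), but with
# an option to ignore missing values when counting distinct values.
is.monovalued.na <- function(df, na.rm = TRUE) {
  vapply(df, function(col) {
    if (na.rm) col <- col[!is.na(col)]
    length(unique(col)) <= 1   # an all-NA column also counts as constant
  }, logical(1))
}

# Example data with a missing call in snp2 (made up for illustration).
y <- data.frame(id = c(1, 2, 3),
                snp1 = c("A", "G", "G"),
                snp2 = c("G", NA, "G"))

is.monovalued.na(y)                 # NA ignored when counting
is.monovalued.na(y, na.rm = FALSE)  # NA counted as a distinct value
```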

/Gustaf
#
This works:

which.is.not.unique <- apply(x, 2, function(x) ifelse(length(unique(x)) == 1, FALSE, TRUE))
x[, which.is.not.unique]

patrizio

2009/3/26 Duijvesteijn, Naomi <Naomi.Duijvesteijn at ipg.nl>:
#
Patrizio Frederic wrote:
Or you can simplify that idea and write:

x[, apply(x, 2, function(x) length(unique(x)) > 1)]
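
One caveat with `apply()` on a data frame: it converts the data frame to a matrix first, so with mixed column types every value is compared as a character string. Also, if only one column survives, `x[, keep]` returns a plain vector rather than a data frame. A sketch that sidesteps both, using the example data from the question (the name `keep` is illustrative):

```r
x <- data.frame(id = c(1, 2, 3),
                snp1 = c("A", "G", "G"),
                snp2 = c("G", "G", "G"),
                snp3 = c("G", "G", "A"))

# vapply() visits each column with its own type; no matrix conversion.
keep <- vapply(x, function(col) length(unique(col)) > 1, logical(1))

names(x)[!keep]           # columns that will be dropped
x[, keep, drop = FALSE]   # stays a data frame even with one column left
```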

Uwe Ligges