Message-ID: <45f568c70903260451h3ed79d45qebf64e5ea66f3dc9@mail.gmail.com>
Date: 2009-03-26T11:51:52Z
From: Gustaf Rydevik
Subject: same value in column-->delete
In-Reply-To: <4CDCB6746FDEE34CA66F8B557440A60AC7F670CB45@ipgpost.ipg.lan>

On Thu, Mar 26, 2009 at 12:15 PM, Duijvesteijn, Naomi
<Naomi.Duijvesteijn at ipg.nl> wrote:
>
> Hi Readers,
>
> I have a question.
>
> I have a large dataset and want to throw away columns that have the same
> value in the column itself and I want to know which column this was.
>
> For example
>
> > x<-data.frame(id=c(1,2,3), snp1=c("A","G",
> "G"),snp2=c("G","G","G"),snp3=c("G","G","A"))
>
> > x
>   id snp1 snp2 snp3
> 1  1    A    G    G
> 2  2    G    G    G
> 3  3    G    G    A
>
> Now I want to know that snp2 is monomorphic (the same value for the column)
> and after I know which column it is I want to take these columns out.
>
> Thanks,
>
> Naomi
>


Another, perhaps slightly more intuitive solution than Jim's would be
the following:

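x <- data.frame(id=c(1,2,3), snp1=c("A","G","G"),
                snp2=c("G","G","G"), snp3=c("G","G","A"))

# TRUE for each column that contains a single distinct value
is.monovalued <- function(df) {
  sapply(df, function(col) length(unique(col)) == 1)
}

monovaluedCols <- is.monovalued(x)
which(monovaluedCols)   # names and positions of the monomorphic columns
x[!monovaluedCols]      # the data frame without those columns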
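
One caveat, assuming the real data contain missing genotypes coded as NA:
unique() counts NA as a value of its own, so a column such as
c("G", NA, "G") would not be flagged by the check above. A minimal sketch
of a variant that ignores NAs follows. The function name is.monovalued.na
and its na.rm argument are made up for illustration and do not come from
any package.

```r
# Hypothetical variant: na.rm is an illustrative argument, not an existing API
is.monovalued.na <- function(df, na.rm = TRUE) {
  vapply(df, function(col) {
    if (na.rm) col <- col[!is.na(col)]
    # a column that is entirely NA has zero distinct values and is flagged too
    length(unique(col)) <= 1
  }, logical(1))
}

# Example data with one missing genotype in snp2
y <- data.frame(id = c(1, 2, 3),
                snp1 = c("A", "G", "G"),
                snp2 = c("G", NA, "G"))

mono <- is.monovalued.na(y)
names(y)[mono]             # names of the columns that would be dropped
y[, !mono, drop = FALSE]   # drop = FALSE keeps a data frame even if one column is left
```

vapply() is used here instead of sapply() only so that the result is
always a logical vector, including for a data frame with no columns.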

/Gustaf
-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik