David Winsemius
On Jan 16, 2009, at 3:50 PM, Karl Healey wrote:
> Hi All,
>
> I want to take a matrix (or data frame) and winsorize each variable
> so that I can, for example, correlate the winsorized variables.
>
> The code below will winsorize a single vector, but when applied to
> several vectors, each ends up sorted independently in ascending
> order so that a given observation is no longer on the same row for
> each vector.
>
> So I need to winsorize each variable but then return it to its
> original order. Alternatively, I need another solution that will take
> a data frame, winsorize each variable, and return a new data frame
> with all the variables in the original order.
>
> Thanks for any help!
>
> -Karl
>
>
> #The function I'm working from
>
> win<-function(x,tr=.2,na.rm=F){
>
> if(na.rm)x<-x[!is.na(x)]
> y<-sort(x)
> n<-length(x)
> ibot<-floor(tr*n)+1
> itop<-length(x)-ibot+1
> xbot<-y[ibot]
> xtop<-y[itop]
> y<-ifelse(y<=xbot,xbot,y)
> y<-ifelse(y>=xtop,xtop,y)
> win<-y
> win
> }
>
> #Produces an example data frame: ss is the observation id, var1-var4
> #are the variables I want to winsorize.
>
> ss=c(1:5);var1=rnorm(5);var2=rnorm(5);var3=rnorm(5);var4=rnorm(5)
> as.data.frame(cbind(ss,var1,var2,var3,var4))->data
> data
>
> #Winsorizes each variable, but sorts them independently so the
> #observations no longer line up.
>
> sapply(data,win)
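
The reordering comes from win() returning the clamped values of the sorted copy y rather than of x itself. One possible way around it, offered only as a minimal sketch: use the sorted copy just to find the two cut-offs, then clamp the original vector with pmax() and pmin(), which work element by element and so leave every observation on its own row. The names win_keep_order, vars and wdata are made up for this illustration, the na.rm argument of win() is dropped, and the seed is arbitrary. Missing values should simply stay NA in place, since sort() drops them and pmax()/pmin() propagate them by default.

```r
# Hypothetical helper (not from the original post): winsorize x without
# reordering it. The cut-offs are computed as in win(), but the clamping
# is applied to x itself rather than to the sorted copy.
win_keep_order <- function(x, tr = 0.2) {
  y <- sort(x)                 # sorted copy, used only to find the cut-offs
  n <- length(y)               # number of non-missing values
  ibot <- floor(tr * n) + 1    # index of the lower cut-off
  itop <- n - ibot + 1         # index of the upper cut-off
  # Raise small values to the lower cut-off, then lower large values to
  # the upper cut-off; both operations preserve the order of x.
  pmin(pmax(x, y[ibot]), y[itop])
}

win_keep_order(c(5, 1, 4, 2, 3))   # returns 4 2 4 2 3

# Example data frame in the shape described in the post
set.seed(1)                        # arbitrary seed, for reproducibility only
data <- data.frame(ss = 1:5, var1 = rnorm(5), var2 = rnorm(5),
                   var3 = rnorm(5), var4 = rnorm(5))

# Winsorize the variables only, leaving the observation id untouched
vars <- c("var1", "var2", "var3", "var4")
wdata <- data
wdata[vars] <- lapply(data[vars], win_keep_order)

cor(wdata[vars])                   # correlations of the winsorized variables
```

Assigning the lapply() result back into wdata[vars] keeps the result a data frame with the same row order and the same ss column, whereas sapply(data, win) would also winsorize ss and return a matrix.
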
>
>
> ___________________________
> M. Karl Healey
> Ph.D. Student
>
> Department of Psychology
> University of Toronto
> Sidney Smith Hall
> 100 St. George Street
> Toronto, ON
> M5S 3G3
>
> karl at psych.utoronto.ca
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.