Hi folks,
I have a data frame df.vars with the following structure:
var1 var2 var3 group
Group is a factor.
Now I want to standardize the vars 1-3 (actually - there are many
more) by class, so I define
z.mean.sd <- function(data){
  return.values <- (data - mean(data)) / (sd(data))
  return(return.values)
}
Now I can call, for each var,
z.var1 <- by(df.vars$var1, df.vars$group, z.mean.sd)
which gives me the standardised data for each subgroup in a list with
the subgroups
z.var1 <- unlist(z.var1)
then gives me the z-standardised data for var1 in one vector. Great!
Now I would like to do this for the whole dataframe, but probably I am
not thinking vectorwise enough.
z.df.vars <- by(df.vars, df.vars$group, z.mean.sd)
does not work. I banged my head on other solutions, trying out sapply
and tapply, but did not succeed. Do I need to loop and put everything
together by hand? I would also like to keep the column names in the result.
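A possible reason the whole-data-frame call fails: by() passes each group's rows to the function as a data frame, group column included, so mean() and sd() no longer see a plain numeric vector. A minimal sketch of one workaround, on made-up data (the toy df.vars, num.cols and the anonymous helper are illustrative, not from the original post), standardizes column by column inside each group and then stacks the pieces:

```r
# Hypothetical toy data with two numeric variables and a grouping factor
df.vars <- data.frame(
  var1  = c(1, 2, 3, 10, 20, 30),
  var2  = c(5, 7, 9, 2, 4, 6),
  group = factor(rep(c("a", "b"), each = 3))
)

z.mean.sd <- function(data) (data - mean(data)) / sd(data)

# Names of the columns to standardize (assumed to be numeric)
num.cols <- c("var1", "var2")

# by() hands each group's rows to the function as a data frame,
# so z.mean.sd is applied to one numeric column at a time
pieces <- by(df.vars, df.vars$group, function(block) {
  block[num.cols] <- lapply(block[num.cols], z.mean.sd)
  block
})

# Stack the per-group data frames back into a single data frame
z.df.vars <- do.call(rbind, pieces)
```

The stacked result is in group order and keeps the column names of df.vars.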
-karsten
---------------------------------------------------------------------------------------------
Karsten D. Wolf
Didactical Design of Interactive
Learning Environments
Universität Bremen - Fachbereich 12
web: http://www.ifeb.uni-bremen.de/wolf/
How to z-standardize for subgroups?
Make.Z in the QuantPsyc package may already do it:
http://finzi.psych.upenn.edu/R/library/QuantPsyc/html/Make.Z.html
--- On Sun, 11/29/09, Karsten Wolf <wolf at uni-bremen.de> wrote:
From: Karsten Wolf <wolf at uni-bremen.de>
Subject: [R] How to z-standardize for subgroups?
To: r-help at r-project.org
Received: Sunday, November 29, 2009, 10:41 AM
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
On 11/29/2009 4:23 PM, John Kane wrote:
Make.Z in the QuantPsyc package may already do it:
http://finzi.psych.upenn.edu/R/library/QuantPsyc/html/Make.Z.html
For a single variable, you could use ave() and scale() together like this:
with(iris, ave(Sepal.Width, Species, FUN = scale))
To scale more than one variable in a concise call, consider something
along these lines:
apply(iris[,1:4], 2, function(x){ave(x, iris$Species, FUN = scale)})
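As a hedged variation on the call above: apply() returns a numeric matrix, so the grouping column is not part of the result. A sketch that keeps everything in one data frame, using the built-in iris data, could look like this (the names num.cols, z.iris, group.means and group.sds are illustrative):

```r
# Columns to standardize: all numeric columns of the built-in iris data
num.cols <- names(iris)[sapply(iris, is.numeric)]

# Work on a copy so Species and the column names are kept
z.iris <- iris

# Standardize each numeric column within Species; as.numeric() drops the
# matrix attributes that scale() attaches to each group's result
z.iris[num.cols] <- lapply(iris[num.cols], function(x) {
  ave(x, iris$Species, FUN = function(v) as.numeric(scale(v)))
})

# Sanity check: per-group means and standard deviations of the result
group.means <- aggregate(z.iris[num.cols], list(Species = z.iris$Species), mean)
group.sds   <- aggregate(z.iris[num.cols], list(Species = z.iris$Species), sd)
```

Within each species the means should be numerically zero and the standard deviations one, and the rows stay in their original order.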
hope this helps,
Chuck Cleland
Chuck Cleland, Ph.D. NDRI, Inc. (www.ndri.org) 71 West 23rd Street, 8th floor New York, NY 10010 tel: (212) 845-4495 (Tu, Th) tel: (732) 512-0171 (M, W, F) fax: (917) 438-0894
Hi Jorge, Chuck and Kane,
thanks for your input!
The following code based on Jorge's answer did the trick to
standardize for subgroups within multiple columns:
# define a standardize function; you could also define your own custom
# standardize function here
z.mean.sd <- function(data){
  return.values <- (data - mean(data, na.rm = TRUE)) / (sd(data, na.rm = TRUE))
  return(return.values)
}
# assume there is some data.frame called sole.data with a group factor
# sole.data$studie; here it is read into R
sole.data <- read.csv2("SoLe.dat")
# create a subset cor.vars of the data.frame with only the vars that
# need to be standardized
cor.vars <- sole.data[, c("var02", "var04", "var07", "var10", "var17", "var24", "var36")]
# standardize each column within each level of studie
z.cor.vars <- apply(cor.vars, 2, tapply, sole.data$studie, z.mean.sd)
# flatten the per-group results into one column per variable
z.cor.vars <- sapply(z.cor.vars, unlist, USE.NAMES = FALSE)
z.cor.vars
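One caveat that seems worth checking with the apply()/tapply()/unlist() route: unlist() concatenates the per-group results, so the rows of the result come back in group order rather than in the original row order. A sketch on the built-in iris data, shuffled so that the rows are not sorted by group (the seed and the object names are illustrative), compares it with ave(), which keeps the row order:

```r
z.mean.sd <- function(data) {
  (data - mean(data, na.rm = TRUE)) / sd(data, na.rm = TRUE)
}

# Shuffle iris so that the rows are no longer sorted by Species
set.seed(1)
shuffled <- iris[sample(nrow(iris)), ]
vars <- shuffled[, 1:4]

# tapply() route: per-group results, then flattened group by group
by.group  <- apply(vars, 2, tapply, shuffled$Species, z.mean.sd)
z.grouped <- sapply(by.group, unlist, USE.NAMES = FALSE)

# ave() route: same standardization, original row order preserved
z.rowwise <- apply(vars, 2, function(x) ave(x, shuffled$Species, FUN = z.mean.sd))

# Sorting the ave() result by group reproduces the tapply() result
ord <- order(shuffled$Species)
same.after.sort <- isTRUE(all.equal(unname(z.grouped), unname(z.rowwise[ord, ])))
same.as.is      <- isTRUE(all.equal(unname(z.grouped), unname(z.rowwise)))
```

If the standardized values are to be put back next to the original rows, the ave() result can be used directly, while the tapply() result first needs the data sorted by group.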
But then Chuck's answer was much more elegant than my first woodpecker
solution:
apply(iris[,1:4], 2, function(x){ave(x, iris$Species, FUN = scale)})
could be translated into
apply(sole.data[,c(2,4,7,10,17,24,36)], 2, function(x){ave(x, sole.data$studie, FUN = scale)})
Thanks for the beauty of this code with an anonymous function call :)
-karsten
On 29.11.2009 at 16:47, Jorge Ivan Velez wrote:
Hi Karsten,
Let me assume your data is called d. If I understood what you are
trying to do, the following might help:
# apply() needs numeric columns only, so leave out the group factor
res <- apply(d[names(d) != "group"], 2, tapply, d$group, scale)
res
See ?apply, ?tapply and ?scale for more information.
HTH,
Jorge