Difficult subset challenge

Sat, Dec 10, 2011 1:44 PM
Hi,

I'm having difficulty coming up with a good way to subset some data to generate statistics.

My data frame has multiple observations by group.

Here is an overly-simplified toy example of the data
==========================
code	v1	v2
G1	1.2	2.3
G1	0	2.4
G1	1.4	3.4
G2	2.9	2.3
G2	4.3	4.4
etc..
=========================

I want to normalize the data *by group* for certain variables.  But, I want to ignore 0 values when calculating the mean and standard deviation.

What I *want* to do is something like this:
=======================
	 for (code in unique (d$code) ){ 
		 mu <- mean( d[which(d[d$code==code,v1] !=0 ), v1] ) 
		 sig <- sd( d[which(d[d$code==code,v1] !=0 ), v1] ) 
		 d[which(d[d$code==code,v1] !=0 ), cname] <- (d[which(d[d$code==code,v1] !=0 ), v1] - mu) / sig
	 }
=======================

My goal, if it isn't apparent, is to replace each value with its normalized value.  (But, the statistics used for normalization are calculated skipping zero values.)

This doesn't work because the indices returned by which() are relative to the group subset (1, 2, 3, etc.), not to the rows of the full data frame.
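To make the problem concrete, here is a small sketch using only the five rows shown in the toy data. The first part reproduces the relative-index issue; the second part is one possible workaround I have been considering, based on a logical mask over the full data frame. The column name v1_norm is just a placeholder I made up for the output, and I am assuming zeros should be left as 0. I am not sure this is the best approach.

```r
# Toy data from the example above (only the rows shown; "etc.." omitted)
d <- data.frame(
  code = c("G1", "G1", "G1", "G2", "G2"),
  v1   = c(1.2, 0, 1.4, 2.9, 4.3),
  v2   = c(2.3, 2.4, 3.4, 2.3, 4.4)
)

# which() on the group subset returns positions *within that subset*...
rel <- which(d[d$code == "G2", "v1"] != 0)   # positions 1 and 2 of the G2 subset
# ...so using them against the full data frame picks rows 1 and 2, which are G1 rows
d[rel, "code"]

# Possible workaround: a logical mask with one entry per row of the full data frame
d$v1_norm <- d$v1                      # placeholder output column; zeros stay 0
for (g in unique(d$code)) {
  keep <- d$code == g & d$v1 != 0      # TRUE only for nonzero rows of this group
  mu   <- mean(d$v1[keep])             # group mean, zeros excluded
  sig  <- sd(d$v1[keep])               # group sd, zeros excluded
  d$v1_norm[keep] <- (d$v1[keep] - mu) / sig
}
```

Because keep has the same length as the data frame, the same mask can be used both to compute the statistics and to assign the results back to the right rows.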


Suggestions?



--
Noah Silverman
UCLA Department of Statistics
8208 Math Sciences Building
Los Angeles, CA 90095