dividing values of each column in a dataframe
5 messages · Abhishek Pratap, David Winsemius, Rolf Turner +1 more
On Feb 6, 2012, at 2:22 PM, Abhishek Pratap wrote:
Hey Guys
I want to divide (numerically) all the columns of a data frame by
different numbers. Here is what I am doing but getting a weird error.
The values in each column are not getting divided by the corresponding
value in the denominator vector, but instead by alternating values. I
am sure I am messing something up.
Appreciate your help
-Abhi

> head(counts)
     WT_CON WT_RB MU_CON MU_RB
row1    839   180    477   187
> head(counts/c(2,3,4,5))
        WT_CON   WT_RB   MU_CON   MU_RB
row1 419.50000 45.0000 238.5000 46.7500
The argument recycling is going to be done on a column basis. You appear to expect it to be done on a row basis. For that you need apply. Probably:

t( apply(counts, 1, "/", c(2,3,4,5)) )

But untested in absence of example.

---
David Winsemius, MD
West Hartford, CT
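A minimal sketch of the suggestion above, for anyone who wants to try it. Only row1 of `counts` comes from the original post; the other rows and the name `divisors` are made up for illustration. `sweep()` is not mentioned in the thread; it is shown here as one base R alternative that keeps the result as a data frame.

```r
# Hypothetical stand-in for `counts`: only row1 is taken from the post,
# the remaining rows are invented so that recycling has something to act on.
counts <- data.frame(
  WT_CON = c(839, 100, 200),
  WT_RB  = c(180, 300, 600),
  MU_CON = c(477, 400, 800),
  MU_RB  = c(187, 500, 1000),
  row.names = c("row1", "row2", "row3")
)
divisors <- c(2, 3, 4, 5)  # one divisor per column

# Suggestion from the reply: divide every row by the vector, then transpose
# back, because apply() returns its per-row results as columns.
by_apply <- t(apply(counts, 1, "/", divisors))

# Alternative (assumption, not from the thread): sweep() along margin 2
# divides column j by divisors[j] and returns a data frame.
by_sweep <- sweep(counts, 2, divisors, "/")

# Both approaches agree cell by cell.
stopifnot(isTRUE(all.equal(as.matrix(by_sweep), by_apply,
                           check.attributes = FALSE)))

print(by_sweep)  # row1 becomes 419.5, 60, 119.25, 37.4
```

Either form divides WT_CON by 2, WT_RB by 3, MU_CON by 4 and MU_RB by 5, whatever the number of rows.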
On 06-02-2012, at 20:22, Abhishek Pratap wrote:
> Here is what I am doing but getting a weird error.

I don't think you are getting an error. You are getting an unexpected result (you think).
I tried this

> df <- data.frame(a=runif(10),b=runif(10),c=runif(10))
> scal <- c(2,3,4)
> df
            a          b          c
1  0.47661685 0.73457617 0.90045279
2  0.06916502 0.76374600 0.07630196
3  0.17029174 0.29450289 0.07416969
4  0.03126839 0.09864740 0.08230353
5  0.23713816 0.06342224 0.91241698
6  0.21970595 0.86890690 0.47316101
7  0.46380324 0.26142304 0.87823277
8  0.81256517 0.76097474 0.98956553
9  0.74425369 0.29228545 0.27496707
10 0.65425285 0.40166967 0.12231213
> df/scal
            a          b          c
1  0.23830843 0.24485872 0.22511320
2  0.02305501 0.19093650 0.03815098
3  0.04257293 0.14725144 0.02472323
4  0.01563419 0.03288247 0.02057588
5  0.07904605 0.01585556 0.45620849
6  0.05492649 0.43445345 0.15772034
7  0.23190162 0.08714101 0.21955819
8  0.27085506 0.19024369 0.49478276
9  0.18606342 0.14614273 0.09165569
10 0.32712642 0.13388989 0.03057803

Checking:

> df[4,]/scal
           a          b          c
4 0.01563419 0.03288247 0.02057588

This is what you want guessing from your description. You haven't given us an example with desired output.

Berend
I believe your post is misleading. Your example "works"
purely by chance.
R uses "column ordering", so entries 1 to 3 of column 1 in your
example get divided by 2, 3, and 4 respectively. Then "scal" is
*recycled* and entries 4, 5, and 6, get divided by 2, 3, and 4
respectively, and so on.
It just *happens* that entry 4 of column 1 gets divided by 2,
entry 4 of column 2 gets divided by 3, and entry 4 of column 3
gets divided by 4, giving the impression that you are getting
what you want. But if you look at row 5 of your "df" you'll see
something different.
# The same (due to serendipity --- or its converse!):
> (df/scal)[4,]
a b c
4 0.0156342 0.03288247 0.02057588
> df[4,]/scal
a b c
4 0.0156342 0.03288247 0.02057588
# Not the same!
> (df/scal)[5,]
a b c
5 0.07904605 0.01585556 0.4562085
> df[5,]/scal
a b c
5 0.1185691 0.02114075 0.2281042
As has already been pointed out in this thread, to get what you think
you're getting, you need to use transpose t(t(df)/scal).
cheers,
Rolf Turner
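The recycling order described above can be worked out cell by cell. The following is only a sketch: the names `n_row`, `pos`, `divisor` and `df12` are invented for illustration, and a data frame filled with 12 is used because 12 divides exactly by 2, 3 and 4.

```r
n_row <- 10          # same number of rows as the example in the thread
scal  <- c(2, 3, 4)

# Position of each cell when the data frame is read column by column:
# cell (i, j) sits at position (j - 1) * n_row + i.
pos <- outer(seq_len(n_row), seq_len(3), function(i, j) (j - 1) * n_row + i)

# The recycled element of `scal` that lands on each position.
divisor <- matrix(scal[(pos - 1) %% length(scal) + 1], nrow = n_row,
                  dimnames = list(NULL, c("a", "b", "c")))
print(divisor)
# Row 4 is 2, 3, 4 (the lucky case); row 5 is 3, 4, 2.

# Check the prediction against what R actually does.
df12 <- as.data.frame(matrix(12, n_row, 3,
                             dimnames = list(NULL, c("a", "b", "c"))))
stopifnot(all(as.matrix(df12 / scal) == 12 / divisor))
```

With 10 rows and 3 divisors, rows 1, 4, 7 and 10 happen to line up with `scal`; every other row is divided by a rotated copy of it.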
On 07/02/12 08:54, Berend Hasselman wrote:
> This is what you want guessing from your description. You haven't given us an example with desired output.
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
On 07-02-2012, at 00:09, Rolf Turner wrote:
> I believe your post is misleading. Your example "works" purely by chance.
> [...]
> As has already been pointed out in this thread, to get what you think you're getting, you need to use transpose t(t(df)/scal).
You are completely correct. I should have checked more thoroughly. Using this dataframe makes the error glaring

df <- data.frame(a=rep(2,10),b=rep(3,10),c=rep(4,10))

My apologies to the OP and the list.

Berend Hasselman
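A short sketch of why this data frame makes the problem obvious: if each column were divided by its own entry of scal, every cell would come out as 1. The names `naive` and `fixed` are only for illustration.

```r
df   <- data.frame(a = rep(2, 10), b = rep(3, 10), c = rep(4, 10))
scal <- c(2, 3, 4)

naive <- df / scal          # scal is recycled down the columns
fixed <- t(t(df) / scal)    # transpose idiom given earlier in the thread

print(head(naive, 3))       # column a starts 1, 0.667, 0.5 instead of 1, 1, 1

all(as.matrix(naive) == 1)  # FALSE
all(fixed == 1)             # TRUE
```

Note that `fixed` is a matrix rather than a data frame; wrap it in as.data.frame() if a data frame is needed.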