column subtraction by row

3 messages · Vining, Kelly, Pete Brecknock, Rolf Turner

#
Dear UseRs,

I have a data frame that looks like this:

head(test2)
    attributes start   end StemExplant Callus RegenPlant
1  LTR_Unknown   120   535       3.198  1.931      1.927
3  LTR_Unknown  2955  3218       0.541  0.103      0.613
6  LTR_Unknown  6210  6423       6.080  4.650      9.081
9  LTR_Unknown  9658 10124       0.238  0.117      0.347
14 LTR_Unknown 14699 14894       3.545  3.625      2.116
25 LTR_Unknown 33201 33474       1.275  1.194      0.591


I need to subtract each value in the "start" column from its corresponding value in the "end" column, then append that difference as a new column in this data frame.

It seems like apply could be the way to approach this, but I can't see any easy way to designate "difference" as a function, like, say, sum or mean. Plus, all the apply/lapply examples I'm looking at seem to depend on a data frame being just the two columns on which to operate, without any way to designate which columns to use in the function(x,y) part of an lapply statement.  Another alternative would be a for loop, but when I try this:

for(i in 1:nrow(test2)) {
	testout[i] <- (test2$end[i] - test2$start[i])
	}

I get an error. So I'm stuck at the first step here. I think that once I can figure out how to get the differences, I can use cbind to append the data frame. But if there is a better way to do it, I'd like to know that as well.

Any help is appreciated.

--Kelly V.
#
Vining, Kelly wrote:
Is this what you are looking for?

# Data
lines <- "attributes start end StemExplant Callus RegenPlant
LTR_Unknown   120   535 3.198  1.931 1.927
LTR_Unknown  2955  3218 0.541  0.103 0.613
LTR_Unknown  6210  6423 6.080  4.650 9.081
LTR_Unknown  9658 10124 0.238  0.117 0.347
LTR_Unknown 14699 14894 3.545  3.625 2.116
LTR_Unknown 33201 33474 1.275  1.194 0.591"

d = read.table(textConnection(lines), header=TRUE) 

# Create new variable
d$new=d$end - d$start

print(d)

   attributes start   end StemExplant Callus RegenPlant new
1 LTR_Unknown   120   535       3.198  1.931      1.927 415
2 LTR_Unknown  2955  3218       0.541  0.103      0.613 263
3 LTR_Unknown  6210  6423       6.080  4.650      9.081 213
4 LTR_Unknown  9658 10124       0.238  0.117      0.347 466
5 LTR_Unknown 14699 14894       3.545  3.625      2.116 195
6 LTR_Unknown 33201 33474       1.275  1.194      0.591 273
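
The same column can also be added without repeating the data frame name on every term. What follows is only a sketch under a few assumptions: it reuses the six rows above, keeps the column name "new" from the example, reads the data with the text= argument of read.table() instead of textConnection(), and the names d2 and len are made up for illustration.

```r
# Same six rows as above, read through the text= argument of read.table()
lines <- "attributes start end StemExplant Callus RegenPlant
LTR_Unknown   120   535 3.198  1.931 1.927
LTR_Unknown  2955  3218 0.541  0.103 0.613
LTR_Unknown  6210  6423 6.080  4.650 9.081
LTR_Unknown  9658 10124 0.238  0.117 0.347
LTR_Unknown 14699 14894 3.545  3.625 2.116
LTR_Unknown 33201 33474 1.275  1.194 0.591"
d <- read.table(text = lines, header = TRUE)

# transform() looks up 'end' and 'start' among the columns of d
# and returns a copy of d with the extra column appended
d2 <- transform(d, new = end - start)

# with() does the same column lookup when only the vector is wanted
len <- with(d, end - start)

print(d2$new)
```

Both forms are vectorised in the same way as d$end - d$start; the choice between them is mostly a matter of taste.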

HTH

Pete

#
On 26/10/11 10:46, Vining, Kelly wrote:
Learn to think the R-ish way.

     test2$diff <- test2$end - test2$start

Simple as that.  Your unnecessary and inefficient for-loop approach probably would
have *worked* had you initialised "testout" before the for loop.  Like:

     testout <- numeric(nrow(test2))

It's hard to be sure since you didn't say *what* error was thrown.  But anyhow,
*don't* do it that way.
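
For comparison only, here is a sketch of the loop with the result vector initialised first, checked against the vectorised form. It assumes a cut-down stand-in for test2 holding just the start and end columns of the first three rows; the original error message was never posted, so this illustrates the likely fix rather than a confirmed diagnosis.

```r
# Cut-down stand-in for test2: only the two columns the loop touches
test2 <- data.frame(start = c(120, 2955, 6210),
                    end   = c(535, 3218, 6423))

# Create the result vector first, so that testout[i] has an object to assign into
testout <- numeric(nrow(test2))
for (i in seq_len(nrow(test2))) {
  testout[i] <- test2$end[i] - test2$start[i]
}

# The vectorised subtraction gives the same values in a single step
vec <- test2$end - test2$start
stopifnot(all(testout == vec))

test2$diff <- vec
print(test2)
```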

     cheers,

         Rolf Turner
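
On the apply and cbind parts of the original question: the apply family can be pointed at particular columns by name, and cbind can append the result. The sketch below assumes a three-row stand-in for test2, and the names by_apply, by_mapply and out are invented for illustration. It is shown for completeness; the plain vectorised subtraction remains the simpler route.

```r
# Three-row stand-in for test2 with one non-numeric column left in place
test2 <- data.frame(attributes = "LTR_Unknown",
                    start = c(120, 2955, 6210),
                    end   = c(535, 3218, 6423))

# apply() over rows (MARGIN = 1) of just the two columns picked out by name;
# each row arrives as a named numeric vector r
by_apply <- apply(test2[, c("start", "end")], 1,
                  function(r) r["end"] - r["start"])

# mapply() walks the two columns in parallel, one pair of values at a time
by_mapply <- mapply(function(s, e) e - s, test2$start, test2$end)

# cbind() appends the difference as a new column of the data frame
out <- cbind(test2, diff = test2$end - test2$start)
print(out)
```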