Dear UseRs,
I have a data frame that looks like this:
head(test2)
    attributes start   end StemExplant Callus RegenPlant
1  LTR_Unknown   120   535       3.198  1.931      1.927
3  LTR_Unknown  2955  3218       0.541  0.103      0.613
6  LTR_Unknown  6210  6423       6.080  4.650      9.081
9  LTR_Unknown  9658 10124       0.238  0.117      0.347
14 LTR_Unknown 14699 14894       3.545  3.625      2.116
25 LTR_Unknown 33201 33474       1.275  1.194      0.591
I need to subtract each value in the "start" column from its corresponding value in the "end" column, then append that difference as a new column in this data frame.
It seems like apply could be the way to approach this, but I can't see any easy way to designate "difference" as a function, like, say, sum or mean. Plus, all the apply/lapply examples I'm looking at seem to depend on a data frame being just the two columns on which to operate, without any way to designate which columns to use in the function(x,y) part of an lapply statement. Another alternative would be a for loop, but when I try this:
for(i in 1:nrow(test2)) {
testout[i] <- (test2$end[i] - test2$start[i])
}
I get an error. So I'm stuck at the first step here. I think that once I can figure out how to get the differences, I can use cbind to append the data frame. But if there is a better way to do it, I'd like to know that as well.
Any help is appreciated.
--Kelly V.
column subtraction by row
3 messages · Vining, Kelly, Pete Brecknock, Rolf Turner
Is this what you are looking for?
# Data
lines <- "attributes start end StemExplant Callus RegenPlant
LTR_Unknown 120 535 3.198 1.931 1.927
LTR_Unknown 2955 3218 0.541 0.103 0.613
LTR_Unknown 6210 6423 6.080 4.650 9.081
LTR_Unknown 9658 10124 0.238 0.117 0.347
LTR_Unknown 14699 14894 3.545 3.625 2.116
LTR_Unknown 33201 33474 1.275 1.194 0.591"
d = read.table(textConnection(lines), header=TRUE)
# Create new variable
d$new=d$end - d$start
print(d)
   attributes start   end StemExplant Callus RegenPlant new
1 LTR_Unknown   120   535       3.198  1.931      1.927 415
2 LTR_Unknown  2955  3218       0.541  0.103      0.613 263
3 LTR_Unknown  6210  6423       6.080  4.650      9.081 213
4 LTR_Unknown  9658 10124       0.238  0.117      0.347 466
5 LTR_Unknown 14699 14894       3.545  3.625      2.116 195
6 LTR_Unknown 33201 33474       1.275  1.194      0.591 273
HTH
Pete
--
View this message in context: http://r.789695.n4.nabble.com/column-subtraction-by-row-tp3938399p3938461.html
Sent from the R help mailing list archive at Nabble.com.
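On the question of how to tell R which columns to operate on: a minimal sketch in base R, assuming only the start and end values from the example rows (the expression columns are left out for brevity). The column names new2 and new_mapply are made up for illustration. Plain subtraction is all that is needed; the mapply() call is included only to show how two chosen columns can be passed to a function(x, y), not as a recommendation.

```r
# Rebuild the start/end values of the six example rows from the thread.
d <- data.frame(
  attributes = "LTR_Unknown",
  start = c(120, 2955, 6210, 9658, 14699, 33201),
  end   = c(535, 3218, 6423, 10124, 14894, 33474)
)

# Vectorised subtraction; with() lets the columns be named without "d$".
d$new <- with(d, end - start)

# transform() returns a copy of the data frame with the extra column added.
d2 <- transform(d, new2 = end - start)

# mapply() walks the two chosen columns in parallel, one pair at a time.
# It gives the same result as the subtraction above, just more slowly.
d$new_mapply <- mapply(function(s, e) e - s, d$start, d$end)

print(d)
```

Because the new column is created by assignment (d$new <- ...), no separate cbind step is needed.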
On 26/10/11 10:46, Vining, Kelly wrote:
for(i in 1:nrow(test2)) {
testout[i]<- (test2$end[i] - test2$start[i])
}
I get an error. So I'm stuck at the first step here. I think that once I can figure out how to get the differences, I can use cbind to append the data frame. But if there is a better way to do it, I'd like to know that as well.
Any help is appreciated.
Learn to think the R-ish way.
test2$diff <- test2$end - test2$start
Simple as that. Your unnecessary and inefficient for-loop approach probably would
have *worked* had you initialised "testout" before the for loop. Like:
testout <- numeric(nrow(test2))
It's hard to be sure since you didn't say *what* error was thrown. But anyhow,
*don't* do it that way.
cheers,
Rolf Turner
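To make the two points above concrete, here is a small sketch, assuming only the start and end values from the example rows. The original error message was never posted, so the first part is a guess at the cause: it shows that assigning into an element of a vector that does not exist yet fails, using the made-up name not_yet_created in place of "testout". The rest compares the pre-allocated loop with the vectorised one-liner.

```r
# Start/end values taken from the example rows in the thread.
test2 <- data.frame(
  start = c(120, 2955, 6210, 9658, 14699, 33201),
  end   = c(535, 3218, 6423, 10124, 14894, 33474)
)

# Element assignment into an object that has not been created raises an error.
# `not_yet_created` is a hypothetical name; the message is captured, not printed.
loop_error <- tryCatch({
  not_yet_created[1] <- test2$end[1] - test2$start[1]
  NULL
}, error = function(e) conditionMessage(e))

# Loop version with the result vector allocated before the loop starts.
testout <- numeric(nrow(test2))
for (i in seq_len(nrow(test2))) {
  testout[i] <- test2$end[i] - test2$start[i]
}

# Vectorised version: one expression, no loop, column appended directly.
test2$diff <- test2$end - test2$start

# Both approaches should give the same differences.
stopifnot(isTRUE(all.equal(testout, test2$diff)))
```

The loop does work once testout exists, but the vectorised form is shorter and avoids the allocation step entirely.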