Add a column to a dataframe based on multiple other column values

Tom,

Here is my solution. It assumes the columns are interleaved (x1, y1, x2, y2, ...) as you describe below. I'm sure others will have better replies.

Note that posting your data with dput() makes it much easier for others to help.

# From dput(mdat)
mdat<-structure(list(x1 = c(2L, 2L, 2L, 3L, 3L, 30L, 32L, 33L, 33L), 
    y1 = c(100L, 100L, 100L, 0L, 0L, 0L, 100L, 82L, 0L), x2 = c(190L, 
    192L, 192L, 195L, 198L, 198L, 868L, 870L, 871L), y2 = c(99L, 
    63L, 63L, 99L, 98L, 100L, 100L, 100L, 82L), x3 = c(1430L, 
    1431L, 1444L, 1499L, 1500L, 1451L, 1451L, 1490L, 1494L), 
    y3 = c(79L, 75L, 51L, 50L, 80L, 97L, 97L, 97L, 85L), output = c(89, 
    69, 57, 74.5, 89, 65.66666667, 99, 93, 55.66666667)), .Names = c("x1", 
"y1", "x2", "y2", "x3", "y3", "output"), class = "data.frame", row.names = c(NA, 
-9L))

# Drop the last column (the expected output), keeping only the x/y pairs
mdat.pure<-mdat[,-ncol(mdat)]

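# Function to apply to rows: mean of the y values whose paired x exceeds 10
theFunk<-function(x) {
  nxy<-length(x)/2         # number of x/y pairs in the row
  idx<-seq_len(nxy)
  xvec<-x[idx*2 - 1]       # odd positions: x1, x2, ...
  yvec<-x[idx*2]           # even positions: y1, y2, ...
  mean(yvec[xvec>10])
}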

# Apply the function to rows
output<-apply(mdat.pure, 1, theFunk)

Or 

mdat.pure$output<-apply(mdat.pure, 1, theFunk)

will put the calculated column at the end of mdat.pure.
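
As a quick check of the approach, here is a small self-contained sketch that uses only the first three rows of mdat (values copied from the dput output above) and compares the result with the matching entries of the output column. The names m3, expected, row1 and calc are just for illustration.

```r
# First three rows of mdat, values copied from the dput() output in this post
m3 <- data.frame(x1 = c(2L, 2L, 2L),          y1 = c(100L, 100L, 100L),
                 x2 = c(190L, 192L, 192L),    y2 = c(99L, 63L, 63L),
                 x3 = c(1430L, 1431L, 1444L), y3 = c(79L, 75L, 51L))
expected <- c(89, 69, 57)   # the matching entries of mdat$output

# Same function as in the post
theFunk <- function(x) {
  nxy  <- length(x) / 2
  idx  <- seq_len(nxy)
  xvec <- x[idx * 2 - 1]
  yvec <- x[idx * 2]
  mean(yvec[xvec > 10])
}

# Row 1 by hand: x = (2, 190, 1430), y = (100, 99, 79)
row1 <- unlist(m3[1, ])
xvec <- row1[c(1, 3, 5)]    # idx*2 - 1 for idx = 1:3
yvec <- row1[c(2, 4, 6)]    # idx*2     for idx = 1:3
yvec[xvec > 10]             # keeps y2 = 99 and y3 = 79, whose mean is 89

# All three rows at once, compared with the expected values
calc <- apply(m3, 1, theFunk)
all.equal(unname(calc), expected)
```

Only x1 = 2 fails the x > 10 test in these rows, so each result is simply the mean of y2 and y3.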

Note that I haven't accounted for missing values: an NA in any x value, or in a y value whose x exceeds 10, makes the result for that row NA.
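
If missing values do turn up, one possible adjustment is sketched below. This is only a guess at what you would want: it assumes an NA in x should exclude that pair and an NA in y should simply be dropped from the mean. The name theFunkNA and the toy data frame are made up for illustration.

```r
# Hypothetical NA-aware variant of theFunk (name and test data are illustrative)
theFunkNA <- function(x) {
  nxy  <- length(x) / 2
  idx  <- seq_len(nxy)
  xvec <- x[idx * 2 - 1]              # odd positions: x1, x2, ...
  yvec <- x[idx * 2]                  # even positions: y1, y2, ...
  keep <- !is.na(xvec) & xvec > 10    # assumption: an NA in x excludes that pair
  mean(yvec[keep], na.rm = TRUE)      # assumption: an NA in y is dropped
}

# Small made-up example with missing values
toy <- data.frame(x1 = c(2L, NA, 30L),      y1 = c(100L, 50L, NA),
                  x2 = c(190L, 192L, 198L), y2 = c(99L, 63L, 100L))
res <- apply(toy, 1, theFunkNA)
res   # 99, 63, 100
```

If no pair in a row survives the filter, mean() of an empty vector returns NaN, so you may want to decide separately how such rows should be reported.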

Hope this helps,
KW

--
On Jun 12, 2013, at 6:00 AM, r-help-request at r-project.org wrote: