How to replace a column in a data frame with another one with a different size
On Sun, Jul 8, 2012 at 12:22 PM, Stathis Kamperis <ekamperi at gmail.com> wrote:
2012/7/8 Michael Weylandt <michael.weylandt at gmail.com>:
On Jul 8, 2012, at 9:31 AM, Stathis Kamperis <ekamperi at gmail.com> wrote:
Hello everyone, I have a dataframe with 1 column and I'd like to replace that column with a moving average. Example:
library('zoo')
mydat <- seq_len(10)
mydat
[1] 1 2 3 4 5 6 7 8 9 10
df <- data.frame("V1" = mydat)
df
   V1
1   1
2   2
3   3
4   4
5   5
6   6
7   7
8   8
9   9
10 10
df[df$V1 <- rollapply(df$V1, 3, mean)]
Error in `$<-.data.frame`(`*tmp*`, "V1", value = c(2, 3, 4, 5, 6, 7, 8, : replacement has 8 rows, data has 10
I'm not sure you need the outer df[...] -- I think you just want

    df$V1 <- rollapply(df$V1, 3, mean)

However, this will still give you the error message you're seeing, because rollapply() only returns 8 values here (you don't get the "endpoints" by default). To get the right number of rows, you want

    rollapply(df$V1, 3, mean, fill = NA) # Change NA if desired

which will put NAs on each end and give you a length-10 result, as needed.
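As a quick check of the suggestion above, the following sketch compares the two calls on the same 1..10 data from the original post. It assumes the zoo package is installed; the names short and padded are only for illustration.

```r
library(zoo)

df <- data.frame(V1 = seq_len(10))

# Without fill, a width-3 window yields only the 8 interior means.
short <- rollapply(df$V1, 3, mean)
length(short)    # 8

# With fill = NA the result is padded to the input length,
# so it can be assigned back to the 10-row column.
padded <- rollapply(df$V1, 3, mean, fill = NA)
length(padded)   # 10

df$V1 <- padded
```

After the assignment, rows 1 and 10 of df$V1 hold NA and rows 2 to 9 hold the window means.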
Thanks Michael (and arun@)! If I did that, then (in my particular case) I'd need to eliminate the NAs, with something like:

    df$V1 <- df$V1[!is.na(df$V1)]

which would still fail with the same error message :-P
You're getting tripped up (again) by trying to sub-assign something that's too small. df is a rectangular array of data: on the RHS of that expression, you are selecting a subset of it with, say, 8 rows and telling R to replace the 10-row V1 column with those 8 elements. This cannot be done within the fixed rectangular structure, hence the error message.

What you want to do is something like this:

    df[!is.na(df$V1), ]

Let's walk through that:

    df$V1 -- take the V1 column of df
    is.na() -- get a logical vector saying where the NAs are
    !is.na() -- identify the rows where there _aren't_ NAs
    df[ !is.na(), ] -- (the important one) take the rows of df (all columns) where there aren't NAs

What you might be wanting to do is

    df <- df[!is.na(df$V1), ]

This is much better than what you are trying to do: you work on the whole array at a time and trust R to keep it all together, rather than trying to manipulate slices individually.

But even more idiomatic would be to index with complete.cases(df), i.e. df[complete.cases(df), ].

Take a look at some introductory material and try to wrap your head around indexing rows and columns together again: it's a fantastic paradigm and will be of much more use to you in the long run than trying to work on individual columns for subsetting/data-cleaning.

Best,
Michael
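The row-wise filtering described above can be sketched as follows. The sample column is made up to mimic the rollapply(..., fill = NA) result, and the names by_index, by_cases and by_omit are only illustrative. One detail not raised in the thread: with a single-column data frame, df[rows, ] simplifies to a plain vector by default, so the sketch passes drop = FALSE to keep a data frame.

```r
# Column with NAs at both ends, as rollapply(..., fill = NA) would produce.
df <- data.frame(V1 = c(NA, 2:9, NA))

# Row subsetting; drop = FALSE keeps a one-column result a data frame.
by_index <- df[!is.na(df$V1), , drop = FALSE]

# The same filter via complete.cases(), which checks every column.
by_cases <- df[complete.cases(df), , drop = FALSE]

# na.omit() also drops incomplete rows and returns a data frame.
by_omit <- na.omit(df)

nrow(by_index)   # 8
```

All three results keep the 8 rows without NAs; which form reads best is largely a matter of taste.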
Regards, Stathis
Best, Michael
I could use a temporary variable to store the results of rollapply() and then reconstruct the data frame, but I was wondering if there is a one-liner that can achieve the same thing.

Best regards,
Stathis

P.S. If you don't mind, cc me on your reply because I'm not subscribed to the list (but I will check the archive anyway).
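For the one-liner asked about here, one possible sketch (not something proposed in the thread itself) is to rebuild the data frame from the shorter result rather than assigning into the existing 10-row column. It assumes zoo is installed and that df has no other columns that need to be kept.

```r
library(zoo)

df <- data.frame(V1 = seq_len(10))

# Replace the whole data frame, so the new column may have any length.
df <- data.frame(V1 = rollapply(df$V1, 3, mean))

nrow(df)   # 8
```

If df had further columns, this would discard them, which is why filtering rows of the padded result is usually the safer route.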