Here is the data I'm working with:

http://r.789695.n4.nabble.com/file/n4530888/new.txt new.txt
http://r.789695.n4.nabble.com/file/n4530888/old.txt old.txt

My code is here: http://pastebin.com/9jjs6Ahr

I'm looking for a way to simply attach new.txt to the bottom of old.txt through R; otherwise I'll just throw it in Excel to do some preprocessing. I've looked into using merge, cbind, concatenate, and rbind. However, I'm running into problems where the 2012 data keeps ending up on top, before the 2010 and 2011 data, or the function just adds extra columns to the right side. Is there a simple method of doing this? Thanks.

-- View this message in context: http://r.789695.n4.nabble.com/Trying-to-merge-new-data-set-to-bottom-of-old-data-set-Both-are-zoo-objects-tp4530888p4530888.html Sent from the R help mailing list archive at Nabble.com.
Trying to merge new data set to bottom of old data set. Both are zoo objects.
5 messages · knavero, Ashish Agarwal, Gabor Grothendieck
Here's a case where it doesn't work. Again, the problem is that when I use the rbind or concatenate functions, the 2012 data set seems to go ahead of the 2010 and 2011 portions of the data set. The problem seems to depend on the text files I read in:

http://r.789695.n4.nabble.com/file/n4531011/old.txt old.txt
http://r.789695.n4.nabble.com/file/n4531011/new.txt new.txt

using this code: http://pastebin.com/8W6KaaPQ

In a case where it works, and the data seemed to be in the right order, I read in a different old.txt named old1.txt. Its contents and format were similar to those of new.txt: there were 18 columns with the same headers. Here are the files to use:

http://r.789695.n4.nabble.com/file/n4531011/old1.txt old1.txt
http://r.789695.n4.nabble.com/file/n4531011/new.txt new.txt

using this code: http://pastebin.com/6iNF5bPd

That should clarify the issue I'm having. Let me know if a dput is necessary here. However, all the vectors and vector modes seem to check out okay.
On Wed, Apr 4, 2012 at 1:47 AM, knavero <knavero at gmail.com> wrote:
> Here's a case where it doesn't work. Again, the problem is that when I use the rbind or concatenate functions, the 2012 data set seems to go ahead of the 2010 and 2011 portions of the data set.
> [...]
The problem is that the dates in the new file are of the form 2/23/12, but they are being read in using "%m/%d/%Y %H:%M". The %Y should be %y. For the old file the format is correct.

A few other points:

- It would be better to use library() than require() here. If there is some problem and the package can't be loaded, library() will fail with an error right at that point, which is what we want in order to best reveal where the problem is. require() will simply return FALSE and keep processing, so the error will show up later in the code, which is not as convenient for figuring out what went wrong. Alternatively, you can use stopifnot(require(...whatever...)).

- Please try to cut your data down as far as feasible. If each file had 3 lines, say, the same error would have been revealed and it would have been easier to manage. It would also have been possible to remove all the columns not used and still illustrate this error. The very process of reducing the data to the smallest set you can will often reveal the error.

- If you must post in this fashion, then note that read.zoo uses read.table, which can read directly off the net:

    new.txt <- "http://r.789695.n4.nabble.com/file/n4531011/new.txt"
    new <- read.zoo(new.txt, ...whatever...)

- It's better to write out TRUE and FALSE, since T and F are ordinary variables that a program can reassign, whereas TRUE and FALSE are reserved words and can't be overwritten.

- You may or may not prefer this style, but it would be possible to replace this:

    cls <- c("NULL", NA, "numeric", "NULL", "NULL", "NULL", "NULL", "NULL",
             "NULL", "NULL", "NULL", "NULL", "NULL", "NULL", "NULL", "NULL",
             "NULL", "NULL")

  with this:

    cls <- rep(c("NULL", NA, "numeric", "NULL"), c(1, 1, 1, 15))
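For readers who want to see why the 2012 rows ended up on top, here is a minimal base-R sketch of the date-format point and of the rep() shorthand. The two timestamps are made up for illustration (they are not taken from old.txt or new.txt), and tz = "UTC" is an assumption added only to keep the result independent of the local time zone.

```r
# Hypothetical timestamps in the two layouts described in this thread.
old_stamp <- "12/31/2011 23:00"  # four-digit year, as in the old file
new_stamp <- "2/23/12 13:00"     # two-digit year, as in the new file

fmt_Y <- "%m/%d/%Y %H:%M"  # format that was applied to both files
fmt_y <- "%m/%d/%y %H:%M"  # corrected format for the new file

old_time  <- as.POSIXct(old_stamp, format = fmt_Y, tz = "UTC")
new_wrong <- as.POSIXct(new_stamp, format = fmt_Y, tz = "UTC")
new_right <- as.POSIXct(new_stamp, format = fmt_y, tz = "UTC")

# With %Y the "12" is read as the literal year 12, not 2012.
year_wrong <- as.POSIXlt(new_wrong)$year + 1900
year_right <- as.POSIXlt(new_right)$year + 1900
print(c(wrong = year_wrong, right = year_right))

# The mis-parsed stamp sorts before the 2011 data; the corrected one after.
print(new_wrong < old_time)
print(new_right > old_time)

# The rep() form builds the same 18-element colClasses vector.
cls_long  <- c("NULL", NA, "numeric", rep("NULL", 15))
cls_short <- rep(c("NULL", NA, "numeric", "NULL"), c(1, 1, 1, 15))
print(identical(cls_long, cls_short))
```

Because a zoo object is always kept ordered by its index, an index in year 12 has to come first, which matches the symptom reported in the original question.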
Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
Okay, will do. Thanks for all the handy advice, Gabor. Ugh, it's such a stupid bug once I actually know what is going on. I need to go over my Unix date/time format specifiers, and I'll probably use the rep function to simplify and reduce the amount of code. A lot of that is definitely new to me.

As for shortening the data that gets read in, I do find it tricky sometimes, since you have to test incrementally: you want to shorten it only to the point where it still reproduces the problem. Anyway, I'll try to make the data significantly shorter in my next post if possible. Thanks again.
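As a closing illustration of the fix discussed in this thread, the following is a small sketch, not the poster's actual script. It assumes the zoo package is installed, uses made-up values and timestamps, and builds the series with zoo() directly rather than read.zoo() so that it runs without the original files. The names old_z and new_z are hypothetical.

```r
library(zoo)  # fails immediately if the package is missing

# Made-up indexes: four-digit years for the old data, two-digit for the new.
old_idx <- as.POSIXct(c("12/30/2011 23:00", "12/31/2011 23:00"),
                      format = "%m/%d/%Y %H:%M", tz = "UTC")
new_idx <- as.POSIXct(c("2/23/12 13:00", "2/23/12 14:00"),
                      format = "%m/%d/%y %H:%M", tz = "UTC")

# Made-up readings standing in for the single numeric column that is kept.
old_z <- zoo(c(10.5, 11.0), order.by = old_idx)
new_z <- zoo(c(12.1, 12.4), order.by = new_idx)

# rbind on zoo objects combines the rows and keeps them ordered by index,
# so correctly parsed 2012 rows end up after the 2011 rows.
combined <- rbind(old_z, new_z)
print(combined)
```

With both indexes parsed using the format that matches their file, no manual reordering should be needed.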