I am able to split my df into two like so:

dataset <- trainset
index <- 1:nrow(dataset)
testindex <- sample(index, trunc(length(index)*30/100))
trainset <- dataset[-testindex,]
testset <- dataset[testindex,-1]

So I have the index information, how could I re-combine the data using that back into a single df?

I tried what I thought might work, but failed with:

newdataset[testindex] = testset[testindex]
  object 'dataset' not found
newdataset[-testindex] = trainset[-testindex]
  object 'dataset' not found

Brian
How to re-combine values based on an index?
6 messages · Brian Feeny, William Dunlap, arun +1 more
newdataset[testindex] = testset[testindex] object 'dataset' not found
Is that really what R printed? I get
> newdataset[testindex] = testset[testindex]
Error in newdataset[testindex] = testset[testindex] :
object 'newdataset' not found
but perhaps you have a different problem. Copy and paste
(and read) the error message you got.
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
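The "object not found" error can be reproduced in isolation: an indexed assignment has to look up the object on its left-hand side before it can modify it. The following is a minimal sketch under stated assumptions, not the poster's actual session. The names `toy`, `idx`, `rebuilt` and `msg` are hypothetical and do not appear in the thread.

```r
# Toy data standing in for the poster's data frame (hypothetical names).
toy <- data.frame(ID = 1:5, value = c(10, 20, 30, 40, 50))
idx <- c(2, 4)

# Assigning into an object that does not exist yet raises an error,
# because `[<-` must first look up the object being modified.
msg <- tryCatch({
  rebuilt[idx, ] <- toy[idx, ]
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# Creating the target first makes the same kind of assignment valid.
# Indexing with NA row numbers yields all-NA rows with the same columns.
rebuilt <- toy[rep(NA_integer_, nrow(toy)), ]
rownames(rebuilt) <- NULL
rebuilt[idx, ] <- toy[idx, ]
rebuilt[-idx, ] <- toy[-idx, ]
print(rebuilt)
```

Note the comma in `rebuilt[idx, ]`: with a data frame, `x[i]` selects columns, while `x[i, ]` selects rows, so the row form is the one that matches a row index such as `testindex`.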
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Brian Feeny
Sent: Saturday, December 01, 2012 8:04 PM
To: r-help at r-project.org
Subject: [R] How to re-combine values based on an index?
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Thank you for your response, here is a better example of what I am trying to do:

data(iris)
index_setosa <- which(iris$Species == "setosa")
iris_setosa <- data.frame()
iris_setosa[index_setosa,] <-iris[index_setosa,]
iris_others <- data.frame()
iris_others[-index_setosa,] <- iris[-index_setosa,]

So the idea would be that iris_setosa is a dataframe of size 150, with 50 observations of setosa, using their original same indices, and 100 observations of NA. Likewise iris_others would be 100 observations of species besides setosa, using their original indices, and there would be 50 NA's.

The above doesn't work. When I execute it, I am left with iris_setosa having 0 columns, I wish it to have all the original columns of iris.

That said, once I get past the above (being able to split them out and keep original indices), I wish to be able to combine iris_setosa and iris_others so that iris_combined is a data frame with no NA's and all the original data.

Does this make sense? So I am basically taking a dataframe, splitting it based on some criteria, and working on the two split dataframes separately, and then I wish to recombine.

Brian

So at this point, I have iris_setosa a dataframe of size
On Dec 1, 2012, at 11:34 PM, William Dunlap wrote:
newdataset[testindex] = testset[testindex] object 'dataset' not found
Is that really what R printed? I get
newdataset[testindex] = testset[testindex]
Error in newdataset[testindex] = testset[testindex] :
  object 'newdataset' not found
but perhaps you have a different problem. Copy and paste
(and read) the error message you got.
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
Hi,
?merge(), ?rbind(), or ?join() from library(plyr)
set.seed(5)
trainset <- data.frame(ID=1:10, col2=runif(10,0,1))
dataset <- trainset
# index and testindex as defined in the original question
index <- 1:nrow(dataset)
testindex <- sample(index, trunc(length(index)*30/100))
trainset <- dataset[-testindex,]
testset <- dataset[testindex,]
merge(testset, trainset, by=c("ID","col2"), all=TRUE)
#   ID      col2
#1   1 0.2002145
#2   2 0.6852186
#3   3 0.9168758
#4   4 0.2843995
#5   5 0.1046501
#6   6 0.7010575
#7   7 0.5279600
#8   8 0.8079352
#9   9 0.9565001
#10 10 0.1104530
You can also do this as:
newdataset <- data.frame(ID=rep(NA,nrow(dataset)), col2=rep(NA,nrow(dataset)))
newdataset[testindex,] <- testset
newdataset[-testindex,] <- trainset
head(newdataset)
#  ID      col2
#1  1 0.2002145
#2  2 0.6852186
#3  3 0.9168758
#4  4 0.2843995
#5  5 0.1046501
#6  6 0.7010575
A.K.
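The reply also mentions ?rbind. One way to use it is to stack the two pieces and then sort them back into the original order. The sketch below assumes an explicit key column is added before splitting; the column name `orig_row` and the objects `combined` and `restored` are hypothetical, not from the thread. An explicit key avoids relying on row names surviving every later operation.

```r
set.seed(5)
dataset <- data.frame(ID = 1:10, col2 = runif(10, 0, 1))

# Hypothetical key column recording each row's original position.
dataset$orig_row <- seq_len(nrow(dataset))

# Split roughly 30% of the rows into a test set, as in the question.
testindex <- sample(dataset$orig_row, trunc(nrow(dataset) * 30 / 100))
trainset <- dataset[-testindex, ]
testset  <- dataset[testindex, ]

# Stack the pieces, then sort on the key to restore the original order.
combined <- rbind(trainset, testset)
restored <- combined[order(combined$orig_row), ]
rownames(restored) <- NULL

head(restored)
```

Because the key travels with each row, this keeps working if rows are filtered or reordered inside either piece before recombining, which the positional `newdataset[testindex,] <- testset` form would not tolerate.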
On 02-12-2012, at 06:06, Brian Feeny wrote:
Thank you for your response, here is a better example of what I am trying to do:

data(iris)
index_setosa <- which(iris$Species == "setosa")
iris_setosa <- data.frame()
iris_setosa[index_setosa,] <-iris[index_setosa,]
iris_others <- data.frame()
iris_others[-index_setosa,] <- iris[-index_setosa,]
Change your example to make it actually do something:

data(iris)
index_setosa <- which(iris$Species == "setosa")
iris_setosa <-iris[index_setosa,]
head(iris_setosa)
# iris_others <- data.frame()
# iris_others[-index_setosa,] <- iris[-index_setosa,]
iris_others <- iris[-index_setosa,]
head(iris_others)
tail(iris_others)

The head() and tail() calls are for checking.
Combine the two like this:

z <- rbind(iris_setosa,iris_others)
head(z)
tail(z)

Berend
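If the NA-padded layout described in the question is really wanted (150 rows in each piece, with NA where the other group lives), one possible approach is to start from copies of iris and blank out the rows that do not belong. This is a sketch under that assumption rather than a method proposed in the thread; only `iris_combined` and the two piece names come from the question.

```r
data(iris)
index_setosa <- which(iris$Species == "setosa")

# Copies of iris keep all columns, types and factor levels;
# rows that do not belong to each piece are set to NA.
iris_setosa <- iris
iris_setosa[-index_setosa, ] <- NA

iris_others <- iris
iris_others[index_setosa, ] <- NA

# Recombine by filling the NA rows of one piece from the other,
# using the same index that defined the split.
iris_combined <- iris_setosa
iris_combined[-index_setosa, ] <- iris_others[-index_setosa, ]

sum(complete.cases(iris_setosa))   # rows holding setosa data
sum(complete.cases(iris_others))   # rows holding the other species
anyNA(iris_combined)
```

Starting from `data.frame()` as in the question gives an object with no columns, which is why the assignment there left nothing to fill. Copying iris first supplies the column structure.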