Hi R users,
I have a question about filling a dataframe in R using a for loop.
I created an empty dataframe first and then filled it, using the code:
pre.mat = data.frame()
for(i in 1:10){
mat.temp = data.frame(some values filled in)
pre.mat = rbind(pre.mat, mat.temp)
}
However, the resulting dataframe does not have all the rows that I expected.
What is the problem, and how can I solve it? Thanks.
About populating a dataframe in a loop
7 messages · Richard M. Heiberger, jeremiah rounds, lily li +1 more
Hello,
Works for me:
set.seed(6574)
pre.mat = data.frame()
for(i in 1:10){
    mat.temp = data.frame(x = rnorm(5), A = sample(LETTERS, 5, TRUE))
    pre.mat = rbind(pre.mat, mat.temp)
}
nrow(pre.mat) # should be 50
Can you give us an example that doesn't work?
Rui Barradas
On 06-01-2017 18:00, lily li wrote:
[[alternative HTML version deleted]]
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Hi Rui,
Thanks for your reply. Yes, when I rbind two dataframes, it works. However, when there are more than 50, it gets stuck for hours. When I terminated the process and opened the csv file separately, it contained only one data frame. What is the problem? Thanks.
Incrementally increasing the size of an array is not efficient in R. The recommended technique is to allocate as much space as you will need, and then fill it.
> system.time({tmp <- 1:5 ; for (i in 1:1000) tmp <- rbind(tmp, 1:5)})
   user  system elapsed
  0.011   0.000   0.011
> dim(tmp)
[1] 1001    5
> system.time({tmp <- matrix(NA, 1001, 5); for (i in 1:1001) tmp[i,] <- 1:5})
   user  system elapsed
  0.001   0.000   0.001
> dim(tmp)
[1] 1001    5
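The same allocate-then-fill idea can be applied to a data frame. The following is only a minimal sketch, assuming the total number of rows is known before the loop starts; the names n_iter and chunk, and the columns x and A, are illustrative and not taken from the original poster's data.

```r
n_iter <- 10   # number of loop iterations (assumed known up front)
chunk  <- 5    # rows produced per iteration (illustrative value)

# Allocate the full data frame once, with the intended column types.
pre.mat <- data.frame(x = numeric(n_iter * chunk),
                      A = character(n_iter * chunk),
                      stringsAsFactors = FALSE)

set.seed(6574)
for (i in seq_len(n_iter)) {
  rows <- ((i - 1) * chunk + 1):(i * chunk)   # row slots owned by iteration i
  pre.mat$x[rows] <- rnorm(chunk)
  pre.mat$A[rows] <- sample(LETTERS, chunk, TRUE)
}

nrow(pre.mat)  # 50
```

Each iteration overwrites a fixed block of rows, so the data frame never has to grow inside the loop.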
As a rule, never rbind in a loop. It has O(n^2) run time because each rbind
can itself be O(n) (where n is the number of data.frames). Instead, put them
all into a list, built with lapply or preallocated with
vector("list", length = n), and then combine them once with
data.table::rbindlist, do.call(rbind, thelist), or the equivalent from
dplyr (bind_rows). All of these will be much more efficient.
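A minimal base R sketch of that pattern, reusing the toy data from Rui's example earlier in the thread; the names df_list, df_list2 and pre.mat2 are made up for illustration.

```r
n <- 10
set.seed(6574)

# Preallocate a list with one slot per data frame.
df_list <- vector("list", length = n)
for (i in seq_len(n)) {
  df_list[[i]] <- data.frame(x = rnorm(5), A = sample(LETTERS, 5, TRUE))
}

# Combine once, after the loop has finished.
pre.mat <- do.call(rbind, df_list)
nrow(pre.mat)  # 50

# The same thing without an explicit loop, using lapply.
df_list2 <- lapply(seq_len(n), function(i) {
  data.frame(x = rnorm(5), A = sample(LETTERS, 5, TRUE))
})
pre.mat2 <- do.call(rbind, df_list2)
```

If the data.table or dplyr package is installed, data.table::rbindlist(df_list) or dplyr::bind_rows(df_list) could be used in place of do.call(rbind, df_list).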
Thanks, Richard. But if the data does not fill the whole preallocated data frame, will there be NA values?
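A small sketch of what happens in that case, assuming the matrix(NA, ...) allocation from Richard's example: rows that are never assigned keep the NA they were allocated with, and they can be dropped afterwards. The sizes 10 and 7 and the name unfilled are arbitrary choices for illustration.

```r
# Allocate room for 10 rows but fill only 7 of them.
tmp <- matrix(NA, 10, 5)
for (i in 1:7) tmp[i, ] <- 1:5

# Rows 8 to 10 were never assigned, so they are still entirely NA.
unfilled <- rowSums(is.na(tmp)) == ncol(tmp)
sum(unfilled)   # 3

# Keep only the rows that were actually filled.
tmp <- tmp[!unfilled, , drop = FALSE]
dim(tmp)        # 7 5
```

This check treats a row as unfilled only when every cell is NA, so it would need adjusting if the real data can legitimately contain rows made up entirely of missing values.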
Hello,
I believe you should follow Jeremiah's suggestion to first read all csv files into a list and then rbind them. Something like the following.
file_list <- list.files(pattern = "\\.csv$")
df_list <- lapply(file_list, read.csv)
result <- do.call(rbind, df_list)
Hope this helps,
Rui Barradas
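For anyone who wants to try this without real data, here is a self-contained sketch. It assumes nothing about the original poster's files: the directory csv_demo, the file names part1.csv to part3.csv, and the columns id and x are all invented for the demonstration.

```r
# Write three small CSV files to a temporary directory (stand-ins for real data).
csv_dir <- file.path(tempdir(), "csv_demo")
dir.create(csv_dir, showWarnings = FALSE)
for (i in 1:3) {
  write.csv(data.frame(id = i, x = c(1, 2)),
            file.path(csv_dir, paste0("part", i, ".csv")),
            row.names = FALSE)
}

# list.files() interprets pattern as a regular expression;
# full.names = TRUE returns paths that read.csv can open from any working directory.
file_list <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)
df_list   <- lapply(file_list, read.csv)
result    <- do.call(rbind, df_list)

nrow(result)  # 6
```

Note that do.call(rbind, ...) expects every file to have the same column names; files with differing columns would need to be aligned first.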