importing large datasets in R

8 messages · gaurav singh, Wensui Liu, Duncan Murdoch +4 more

#
On 13-01-19 3:28 AM, gaurav singh wrote:
Specifying the type of each column with colClasses will speed up
read.table considerably on a big file.
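
As a rough illustration of the colClasses advice, here is a minimal sketch. The file, its column names (id, score, label) and their types are made up for the example; with real data the vector passed to colClasses has to match the actual columns.

```r
# Write a small CSV to a temporary file so the example is self-contained.
# The columns id, score and label are hypothetical.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:3,
                     score = c(1.5, 2.5, 3.5),
                     label = c("a", "b", "c")),
          tmp, row.names = FALSE)

# Declare each column's type up front so read.table does not have to
# guess the types by scanning the data.
dat <- read.table(tmp, header = TRUE, sep = ",",
                  colClasses = c("integer", "numeric", "character"))
str(dat)

# Remove the temporary file.
unlink(tmp)
```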

You have a lot of data, so having a lot of memory will help.  You may
want to work in 64-bit R, which can address far more memory than
32-bit R can.
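
One way to check which build is running, as a small sketch using only base R; the variable name is_64bit is just for illustration:

```r
# 8-byte pointers indicate a 64-bit build of R; 4-byte pointers a 32-bit build.
is_64bit <- .Machine$sizeof.pointer == 8

cat("Architecture:", R.version$arch, "\n")
cat("64-bit R:", is_64bit, "\n")
```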

Duncan Murdoch
#
I'm not sure I understand your question. It is always better to use an
example:
+     matrix(rnorm(6), 2, 3))), c(2, 3, 5))
             [,1]       [,2]       [,3]
[1,]  0.189669255  0.3368646 0.34261301
[2,] -0.009700353 -0.4676745 0.01974906

The result is a matrix, not a data frame, and certainly not "resulting
data frames." What are you trying to cbind?

----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352
#
Hi,

This could also be done by:
#Using Arun's example:
res <- Reduce('+', split(df, grp))/length(levels(grp))
#       V1    V2    V3    V4    V5    V6    V7    V8    V9   V10
#1   417.3 792.2 504.2 506.1 513.9 480.7 545.4 564.4 473.7 486.2
#2   585.8 416.6 409.5 417.8 480.1 586.4 436.1 615.1 449.8 501.2
#3   459.3 449.1 542.0 411.6 404.6 507.6 472.0 344.0 363.2 485.1
#4   591.1 448.4 482.6 464.0 554.0 374.1 567.9 450.0 477.9 488.0
#5   433.1 438.2 441.4 596.4 356.9 461.6 356.7 457.4 434.9 510.4
#6   425.7 498.3 452.0 489.4 302.8 538.1 270.6 418.6 564.1 545.8
#7   755.1 526.4 615.2 559.9 483.3 379.7 439.3 458.8 528.5 564.0
#8   599.7 579.4 473.2 585.1 508.3 643.7 432.1 587.2 547.6 506.2
#9   471.8 321.0 375.8 394.4 355.5 434.4 532.1 640.5 490.1 619.1
#10  356.6 434.3 403.9 445.0 416.2 532.8 570.9 548.9 697.9 488.8
library(plyr)

res1 <- aaply(laply(split(df, ((1:nrow(df)-1) %/% 10) + 1), as.matrix), c(2, 3), mean)

res1
#    X2
#X1      V1    V2    V3    V4    V5    V6    V7    V8    V9   V10
# 1   417.3 792.2 504.2 506.1 513.9 480.7 545.4 564.4 473.7 486.2
# 2   585.8 416.6 409.5 417.8 480.1 586.4 436.1 615.1 449.8 501.2
# 3   459.3 449.1 542.0 411.6 404.6 507.6 472.0 344.0 363.2 485.1
# 4   591.1 448.4 482.6 464.0 554.0 374.1 567.9 450.0 477.9 488.0
# 5   433.1 438.2 441.4 596.4 356.9 461.6 356.7 457.4 434.9 510.4
# 6   425.7 498.3 452.0 489.4 302.8 538.1 270.6 418.6 564.1 545.8
# 7   755.1 526.4 615.2 559.9 483.3 379.7 439.3 458.8 528.5 564.0
# 8   599.7 579.4 473.2 585.1 508.3 643.7 432.1 587.2 547.6 506.2
# 9   471.8 321.0 375.8 394.4 355.5 434.4 532.1 640.5 490.1 619.1
# 10  356.6 434.3 403.9 445.0 416.2 532.8 570.9 548.9 697.9 488.8
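
For readers without Arun's example data at hand, the Reduce idea can be tried on a toy case. This is only a sketch: the data frame df and the factor grp below are made up (two groups of three rows each) and are not the data used above.

```r
# Hypothetical toy data: 6 rows forming two blocks of 3 rows each.
df <- data.frame(V1 = 1:6, V2 = c(10, 20, 30, 40, 50, 60))
grp <- factor(rep(1:2, each = 3))

# split() yields one 3 x 2 data frame per level of grp; Reduce adds them
# element by element, and dividing by the number of levels gives the mean.
res <- Reduce("+", split(df, grp)) / nlevels(grp)
res
# V1 should come out as 2.5, 3.5, 4.5 and V2 as 25, 35, 45.
```

This relies on every group having the same number of rows, since adding data frames of different dimensions is an error.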
A.K.



----- Original Message -----
From: ya <xinxi813 at 126.com>
To: r-help <r-help at r-project.org>
Cc: 
Sent: Saturday, January 19, 2013 9:49 AM
Subject: [R] calculating mean matrix

Hi list,

Thank you very much for reading this post.

I have a data frame that I am trying to split into several data frames using one of its columns, say x. Once I have the data frames, I plan to treat them as matrices and calculate an element-by-element mean matrix. Could anyone advise me on how to do this?

So far, I know that if I have a couple of matrices, say data1,data2,data3,data4...dataN, I can do it like this:

data=array(cbind(data1,data2,data3,data4,....dataN), c(2, 3, N))
# 2 is the number of rows in each matrix, 3 is the number of columns, and N is the number of matrices to be averaged.
meanmtrx=apply(data,1:2,mean)

However, I do not know how to use the resulting data frames with cbind(). Perhaps there are better ways. Any advice is appreciated.
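
One possible way to carry the array approach over to split data frames, sketched with made-up data: the data frame df, its grouping column x and the value columns V1 and V2 are assumptions for illustration. Each piece is converted with as.matrix and stacked with simplify2array rather than cbind.

```r
# Hypothetical data: grouping column x, two value columns, 2 rows per group.
df <- data.frame(x = rep(c("a", "b", "c"), each = 2),
                 V1 = c(1, 2, 3, 4, 5, 6),
                 V2 = c(10, 20, 30, 40, 50, 60))

# Split on x (dropping x itself) and convert each piece to a numeric matrix.
pieces <- lapply(split(df[setdiff(names(df), "x")], df$x), as.matrix)

# Stack the equally sized matrices into a rows x cols x groups array.
arr <- simplify2array(pieces)

# Average over the third dimension, element by element.
meanmtrx <- apply(arr, 1:2, mean)
meanmtrx
```

As with the cbind version, this only works when every group has the same number of rows and columns.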

Thank you very much.

Have a nice day.

ya 

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.