Try this:
read.table(pipe("/Rtools/bin/gawk -f cut.awk bigdata.dat"))
where cut.awk contains the single line (assuming you
want fields 101 through 110 and no others):
{ for(i = 101; i <= 110; i++) printf("%s ", $i); printf "\n" }
or just use cut. I tried the gawk command above on Windows
Vista with an artificial file of 500,000 columns and 2 rows and it seemed
instantaneous.
On Windows the above uses gawk from Rtools available at:
http://www.murdoch-sutherland.com/Rtools/
or you can separately install gawk. Rtools also has cut if you
prefer that.
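If cut fits your case, here is a minimal sketch of the column-extraction
step. The 4GB bigdata.dat is not available here, so a tiny stand-in file is
used; note that cut defaults to a tab delimiter, so comma-separated data
needs -d, explicitly:

```shell
# Tiny stand-in file with five comma-separated fields per line
printf 'a,b,c,d,e\n1,2,3,4,5\n' > demo.csv

# Keep only fields 2 through 4 (for the real file this would be -f101-110)
cut -d, -f2-4 demo.csv
# prints:
#   b,c,d
#   2,3,4
```

From R, the same command can be wrapped exactly like the gawk call above,
e.g. read.table(pipe("cut -d, -f101-110 bigdata.dat"), sep = ","), so only
the selected columns ever reach R.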
On Mon, Sep 22, 2008 at 2:50 AM, José E. Lozano <lozalojo at jcyl.es> wrote:
Hello,
Recently I have been trying to open a huge database with no success.
It's a 4 GB plain-text CSV file with around 2,000 rows and over 500,000
columns/variables.
I have tried with The SAS System, but it reads only around 5,000 columns,
no more. R hangs when opening it.
Is there any way to work with "parts" (a subset of columns) of this database,
since it's impossible to manage it all at once?
Is there any way to establish a link to the csv file and to state the
columns you want to fetch every time you make an analysis?
I've been searching the net, but found little about this topic.
Best regards,
Jose Lozano
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.