
problem

9 messages · Erika Frigo, jim holtman, Roland Rau +4 more

#
Is it just a file with a million values, or is it some type of
structure with a million rows of indeterminate columns?  If it is just
a million numbers, you can easily read it with 'scan' or 'read.table'
with no problem.  I work with data structures that have several
million rows and 4-5 columns without any problems.  What is the format
of the input?
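The two readers Jim mentions can be sketched like this (a scaled-down demo: the temporary file and the row count n are stand-ins, not the real data):

```r
# Minimal sketch of scan() and read.table() on a plain numeric file;
# n is scaled down so the demo runs quickly
n <- 1e5
tmp <- tempfile(fileext = ".txt")
writeLines(as.character(rnorm(n)), tmp)

x <- scan(tmp, quiet = TRUE)   # fastest for a plain vector of numbers
length(x)                      # n values

# read.table() handles the rows-and-columns case
df <- read.table(tmp, header = FALSE)
nrow(df)                       # n rows
unlink(tmp)
```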
On 3/4/08, Erika Frigo <erika.frigo at unimi.it> wrote:

#
Hi,
Erika Frigo wrote:
Maybe the package SQLiteDF could be useful for you. 
http://cran.r-project.org/web/packages/SQLiteDF/index.html

But since you mention that the data has about 1 million values, I think 
it should be no problem to read the data set "conventionally".
> object.size(rnorm(1e06)) / (1024^2)
[1] 7.629417

Assuming that all data are numeric, the data-set should consume less 
than 8MB.
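The same arithmetic scales to a multi-column table (the column count here is illustrative, not from the original message):

```r
# Extending the estimate above: a table of 1e6 rows and 30 numeric
# columns needs roughly 30 times the memory of a single column
cols <- 30
mb <- as.numeric(object.size(rnorm(1e6)) * cols) / (1024^2)
mb   # roughly 229 MB
```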

I hope this helps,
Roland
#
On Tue, Mar 4, 2008 at 10:35 AM, Erika Frigo <erika.frigo at unimi.it> wrote:
A good place to start is the manual "R Data Import/Export" that comes
with every installed version of R.
#
Good morning Jim,
My file has not only more than a million values, but more than a million 
rows and more or less 30 columns (it is a production dataset for cows); in 
fact, with read.table I'm not able to import it.
It is an xls file.
How do you import your files with millions of rows and 4-5 columns?
thank you
regards

Dr.ssa Erika Frigo
Università degli Studi di Milano
Facoltà di Medicina Veterinaria
Dipartimento di Scienze e Tecnologie Veterinarie per la Sicurezza Alimentare 
(VSA)

Via Grasselli, 7
20137 Milano
Tel. 02/50318515
Fax 02/50318501
----- Original Message ----- 
From: "jim holtman" <jholtman a gmail.com>
To: "Erika Frigo" <erika.frigo a unimi.it>
Cc: <r-help a r-project.org>
Sent: Tuesday, March 04, 2008 6:13 PM
Subject: Re: [R] problem
#
On Wed, Mar 05, 2008 at 12:32:19PM +0100, Erika Frigo wrote:
read.table() expects plain text -- e.g. CSV, or tab-separated in the case
of read.delim(). If your file is in xls format, the simplest option would
be to export the data to CSV format from Excel.
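A minimal sketch of the CSV route, with a small temporary file standing in for the exported spreadsheet (the column names are invented):

```r
# Hypothetical demo: write a tiny CSV to stand in for the sheet
# exported from Excel, then read it back with read.csv()
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:5, milk = rnorm(5)), tmp, row.names = FALSE)

df <- read.csv(tmp)
# On large files, declaring colClasses up front skips type guessing
# and saves both time and memory
df2 <- read.csv(tmp, colClasses = c("integer", "numeric"))
unlink(tmp)
```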

If for some reason that is not an option please have a look at the "R
Data Import/Export" manual.

Of course, neither will solve the problem of not enough memory if your
file is simply too large. In that case you may want to put your
data into a database and have R connect to it and retrieve the data in
smaller chunks as required.
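The chunked-database approach can be sketched with the DBI and RSQLite packages (my substitution for the SQLiteDF package mentioned earlier; assumes RSQLite is installed, and the table and column names are invented):

```r
library(DBI)
library(RSQLite)

# An in-memory database stands in for a real database file
con <- dbConnect(SQLite(), ":memory:")
dbWriteTable(con, "cows", data.frame(id = 1:100, yield = rnorm(100)))

# Fetch the result set a chunk at a time instead of all at once
res <- dbSendQuery(con, "SELECT * FROM cows")
total <- 0
repeat {
  chunk <- dbFetch(res, n = 25)        # 25 rows per chunk
  if (nrow(chunk) == 0) break
  total <- total + nrow(chunk)         # process the chunk here
}
dbClearResult(res)
dbDisconnect(con)
```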

cu
	Philipp
#
Philipp Pagel <p.pagel at wzw.tum.de> wrote in
news:20080305120637.GA8181 at localhost:
There is something very wrong here. Even the most recent versions of 
Excel cannot handle files with a million rows. Heck, they can't even 
handle files with one-tenth that number. In earlier versions the limit 
was on the order of 36K.
#
On Thu, Mar 6, 2008 at 12:00 AM, David Winsemius <dwinsemius at comcast.net> wrote:
Excel 2007 can handle over 1 million rows:

http://office.microsoft.com/en-us/excel/HP100738491033.aspx#WorksheetWorkbook
#
"Gabor Grothendieck" <ggrothendieck at gmail.com> wrote in
news:971536df0803052116q6a91bd95ja50ed541330d8ff1 at mail.gmail.com:
Yes. I was going to correct myself. I saw another posting that said they 
had an Excel file with 200,000 rows, and I just got back from checking the 
2007 version: 1,048,576 rows. The 2003 version's limit was 65,536 rows.