Hi,

I have a medium-sized (19MB) CSV file that I'd like to read into R. The read.csv() function seems to be a bit inefficient to deal with it, and I remember that using scan() with "what" options is better.

However, I'm unable to understand how to use it. The first few lines of the data look like:

USAGE,MILEAGE,SEX,EXCESS,NCD,PRIMAGE,MINAGE,DRIVERS,DISTRICT,CARGROUP,CAR_AGE,WSCLMS,ADCLMS,FTCLMS,PDCLMS,PICLMS,ADINCUR,PDINCUR,WSINCUR,FTINCUR,PIINCUR,RECORD,DAYS,MINAGEN,PRIMAGEN
SC,7000,M,100,4,59,25,3,4,7,6,0,0,0,,,0,,0,0,,1,85,25,59
SC,7000,M,100,4,59,59,2,4,13,5,0,0,0,,,0,,0,0,,2,278,59,59
SC,7000,M,100,4,60,60,2,4,13,5,0,0,0,,,0,,0,0,,3,364,60,60
SB,10000,M,75,4,53,44,2,3,14,4,1,0,0,0,0,0,0,146.18,0,0,4,364,44,53
SB,10000,M,75,4,54,45,2,3,14,4,0,0,0,,,0,,0,0,,5,363,45,54

i.e. columns are separated by commas and may contain missing values, and the file has headers.

I'd really appreciate it if someone could tell me how to use the scan() command to read this data in.

Cheers,

Kevin

------------------------------------------------------------------------------
/* Time is the greatest teacher, unfortunately it kills its students */

Ko-Kang Kevin Wang
Master of Science (MSc) Student
Department of Statistics
University of Auckland
New Zealand
Homepage: http://www.stat.auckland.ac.nz/~kwan022
scan() with "what"
7 messages · Ko-Kang Kevin Wang, Ken Lee, Michel ARNAUD +3 more
On Sun, 15 Dec 2002, Ko-Kang Kevin Wang wrote:
Hi, I have a medium-sized (19MB) CSV file that I'd like to read into R. The read.csv() function seems to be a bit inefficient to deal with it, and I remember that using scan() with "what" options is better.
Unlikely if you specify colClasses, which sets up calls to scan() for you.
However, I'm unable to understand how to use it. The first few lines of the data look like:

USAGE,MILEAGE,SEX,EXCESS,NCD,PRIMAGE,MINAGE,DRIVERS,DISTRICT,CARGROUP,CAR_AGE,WSCLMS,ADCLMS,FTCLMS,PDCLMS,PICLMS,ADINCUR,PDINCUR,WSINCUR,FTINCUR,PIINCUR,RECORD,DAYS,MINAGEN,PRIMAGEN
SC,7000,M,100,4,59,25,3,4,7,6,0,0,0,,,0,,0,0,,1,85,25,59
SC,7000,M,100,4,59,59,2,4,13,5,0,0,0,,,0,,0,0,,2,278,59,59
SC,7000,M,100,4,60,60,2,4,13,5,0,0,0,,,0,,0,0,,3,364,60,60
SB,10000,M,75,4,53,44,2,3,14,4,1,0,0,0,0,0,0,146.18,0,0,4,364,44,53
SB,10000,M,75,4,54,45,2,3,14,4,0,0,0,,,0,,0,0,,5,363,45,54

i.e. columns are separated by commas and may contain missing values, and the file has headers. I'd really appreciate it if someone could tell me how to use the scan() command to read this data in.
Try colClasses first.
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
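A rough sketch of what the colClasses suggestion might look like, not taken from the thread itself. The column types are guessed from the sample rows (USAGE and SEX as text, everything else numeric), and the temporary file only stands in for the real 19MB file.

```r
# Stand-in for the real CSV file: the header plus two of the sample rows.
csv_lines <- c(
  paste0("USAGE,MILEAGE,SEX,EXCESS,NCD,PRIMAGE,MINAGE,DRIVERS,DISTRICT,",
         "CARGROUP,CAR_AGE,WSCLMS,ADCLMS,FTCLMS,PDCLMS,PICLMS,ADINCUR,",
         "PDINCUR,WSINCUR,FTINCUR,PIINCUR,RECORD,DAYS,MINAGEN,PRIMAGEN"),
  "SC,7000,M,100,4,59,25,3,4,7,6,0,0,0,,,0,,0,0,,1,85,25,59",
  "SB,10000,M,75,4,53,44,2,3,14,4,1,0,0,0,0,0,0,146.18,0,0,4,364,44,53"
)
tmp <- tempfile(fileext = ".csv")
writeLines(csv_lines, tmp)

# Assumed types: columns 1 and 3 are text, the other 23 are numeric.
col_types <- c("character", "numeric", "character", rep("numeric", 22))

# With colClasses given, read.csv() does not have to guess each column's type.
dat <- read.csv(tmp, colClasses = col_types)
str(dat[, 1:4])
```

Blank fields in the numeric columns come back as NA, which matches the "may contain missing values" description in the question.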
1 day later
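Dear Kevin,

# 'file' must hold the path to the CSV file
# read every one of the 25 columns as character
coltypes <- rep(list(""), 25)
x <- scan(file, what = coltypes, sep = ",", quiet = TRUE, skip = 1)
# take the column names from the header line
names(x) <- scan(file, what = "", nlines = 1, sep = ",", quiet = TRUE)
x <- as.data.frame(x)

I hope this helps.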
Ken
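The same scan() approach can also give each column its own type through "what". The following is only a sketch: the types are guessed from the sample rows, and the temporary file is a placeholder for the real data.

```r
# Placeholder file: the header plus two of the sample rows from the question.
csv_lines <- c(
  paste0("USAGE,MILEAGE,SEX,EXCESS,NCD,PRIMAGE,MINAGE,DRIVERS,DISTRICT,",
         "CARGROUP,CAR_AGE,WSCLMS,ADCLMS,FTCLMS,PDCLMS,PICLMS,ADINCUR,",
         "PDINCUR,WSINCUR,FTINCUR,PIINCUR,RECORD,DAYS,MINAGEN,PRIMAGEN"),
  "SC,7000,M,100,4,59,25,3,4,7,6,0,0,0,,,0,,0,0,,1,85,25,59",
  "SB,10000,M,75,4,53,44,2,3,14,4,1,0,0,0,0,0,0,146.18,0,0,4,364,44,53"
)
tmp <- tempfile(fileext = ".csv")
writeLines(csv_lines, tmp)

# "what" is a list with one prototype per column:
# "" means character, 0 means numeric (assumed types).
col_proto <- c(list("", 0, ""), rep(list(0), 22))

# Skip the header line and read the data fields column by column.
x <- scan(tmp, what = col_proto, sep = ",", skip = 1, quiet = TRUE)

# Read the header line on its own and use it for the column names.
names(x) <- scan(tmp, what = "", sep = ",", nlines = 1, quiet = TRUE)

dat <- as.data.frame(x, stringsAsFactors = FALSE)
```

Empty numeric fields are read as NA, so no extra handling of the blank entries is needed here.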
Hello

Does anybody know how to draw a plot with the same units on each axis?
For example, if on X the range of values is [1, 2] and on Y the range is [1, 10], I would like the length of Y to be 5 * the length of X.
Any suggestions?

--
Michel ARNAUD
CIRAD TA60/15
73, av. Jean François Breton
34938 MONTPELLIER CEDEX 5
tel : 04 67 59 38 34
Fax : 04 67 59 38 38
Michel ARNAUD wrote:
Hello Does anybody know how to draw a plot with the same units on each axis? For example, if on X the range of values is [1, 2] and on Y the range is [1, 10], I would like the length of Y to be 5 * the length of X. Any suggestions?
Does plot(...., asp = 1) help? For details see ?plot.default and ?plot.window.

Uwe Ligges
On Mon, 16 Dec 2002, Michel ARNAUD wrote:
Does anybody know how to draw a plot with the same units on each axis? For example, if on X the range of values is [1, 2] and on Y the range is [1, 10], I would like the length of Y to be 5 * the length of X. Any suggestions?
1) use eqscplot in package MASS
2) use argument `asp' (documented in ?plot.window)
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
"Michel" == Michel ARNAUD <michel.arnaud at cirad.fr>
on Mon, 16 Dec 2002 08:34:12 +0100 writes:
[by replying to another R-help message. ***Please don't do this!***
 It destroys sensible threading!
 Please search "Michel Arnaud" in the archive,
 https://www.stat.math.ethz.ch/pipermail/r-help/2002-December/thread.html
 to see the wrong thread your message is in!]

    Michel> Hello Does anybody know how to draw a plot with the same
    Michel> units on each axis? For example, if on X the range
    Michel> of values is [1, 2] and on Y the range is [1, 10], I
    Michel> would like the length of Y to be 5 * the length of X.
    Michel> Any suggestions?

Use
    plot(....., asp = 1)   ## asp = [aspect ratio] = 1

This is explained a bit in help(plot.default)

An older alternative that also works in S-PLUS is

    library(MASS)
    eqscplot(.....)

Martin Maechler <maechler at stat.math.ethz.ch>  http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum  LEO C16    Leonhardstr. 27
ETH (Federal Inst. Technology)  8092 Zurich     SWITZERLAND
phone: x-41-1-632-3408          fax: ...-1228                   <><
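To illustrate the asp = 1 suggestion, here is a small sketch with made-up data covering the ranges from the question (X in [1, 2], Y in [1, 10]). The data values and the scale check are illustrative assumptions, not part of the thread.

```r
# Made-up data spanning the ranges mentioned in the question.
x <- c(1, 1.5, 2)
y <- c(1, 4, 10)

pdf(file = NULL)          # null PDF device, so nothing is written to disk
plot(x, y, asp = 1)       # one x unit is drawn as long as one y unit

usr <- par("usr")         # axis limits actually used: x1, x2, y1, y2
pin <- par("pin")         # size of the plot region in inches: width, height

# Data units per inch along each axis; asp = 1 should make these equal.
x_units_per_inch <- (usr[2] - usr[1]) / pin[1]
y_units_per_inch <- (usr[4] - usr[3]) / pin[2]

invisible(dev.off())
```

Note that asp = 1 does not reshape the plotting device. R instead widens the limits of one axis (here the x axis, beyond [1, 2]) so that both axes share the same scale.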