Can I improve the efficiency of my scan() command?
On Sat, 12 Apr 2003, Ko-Kang Kevin Wang wrote:
Hi,

Suppose I use the following code to read in a data set.
###############################################
rating <- scan("../Data/Rating.csv",
               what = list(
                 usage = "", mileage = 0, sex = "", excess = "", ncd = "",
                 primage = "", minage = "", drivers = "", district = "",
                 cargroup = "", car.age = 0, wsclms = "", adclms = "",
                 ftclms = "", pdclms = "", piclms = "", adincur = 0,
                 pdincur = 0, wsincur = 0, ftincur = 0, piincur = 0,
                 record = 0, days = 0, minagen = 0, primagen = 0),
               sep = ",", quiet = TRUE, skip = 1)
rating.df <- as.data.frame(rating)
rating.df <- rating.df[, c(-6, -7, -22)]
attach(rating.df)
summary(rating.df)
<snip>
#########################################################################
It worked all right, but I'm just wondering if there is a more efficient way? It takes about 10 minutes to run the above script on my 300,000 x 25 CSV file.
It should be quicker not to convert to a data frame. You can just keep the data as a list of vectors and lapply() the summary() function.

-thomas
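
The suggestion above could look roughly like the sketch below. It is only an illustration: the three-column sample data and the temporary file are made up to stand in for Rating.csv, and the dropped field (position 3) is arbitrary. The list returned by scan() is kept as it is, unwanted fields are dropped with negative indices on the list, and summary() is applied to each vector with lapply().

```r
# Hypothetical three-column sample standing in for Rating.csv
csv_lines <- c("usage,mileage,sex",
               "private,12000,M",
               "business,30000,F",
               "private,8000,F")
tmp <- tempfile(fileext = ".csv")
writeLines(csv_lines, tmp)

# scan() with a named list in 'what' returns a named list of vectors
rating <- scan(tmp,
               what = list(usage = "", mileage = 0, sex = ""),
               sep = ",", quiet = TRUE, skip = 1)

# Drop unwanted fields by position directly on the list
# (here the third field, purely as an example)
rating <- rating[-3]

# Summarise each vector without ever building a data frame
summaries <- lapply(rating, summary)
print(summaries)

unlink(tmp)
```

One caveat with this approach: the character fields stay as character vectors, so summary() reports only their length, class and mode. If level counts are wanted for such a field, something like table(rating$usage) may be closer to what the data frame summary showed.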