different outcomes using read.table vs read.csv

3 messages · jatwood, Barry Rowlingson, Jason Rupert

#
Good Afternoon
I have noticed results similar to the following several times as I 
have used R over the past several years.
My .csv file has a header row and 3073 rows of data.

> rskreg <- read.table('D:/data/riskregions.csv', header=TRUE, sep=",")
> dim(rskreg)
[1] 2722   13
> rskreg <- read.csv('D:/data/riskregions.csv', header=TRUE)
> dim(rskreg)
[1] 3073   13

Does anyone know what could cause the read.table and read.csv
functions to give different results on some occasions?  The
riskregions.csv file was generated with and saved from MS Excel.

Joe A
#
2009/3/13 jatwood <jatwood at montana.edu>:
read.table defaults to comment.char="#", so everything from a # to the
end of a line is ignored, and a line that starts with # is dropped
entirely. read.csv defaults to comment.char="", which might explain why
read.csv returns more rows than read.table...

Do you have lines starting with #? Try read.table with
comment.char="" and see if you get the right number of rows. See
?read.table for more info.
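
A minimal sketch of that check, using a made-up four-line file rather than the real riskregions.csv (the file contents and variable names are assumptions for illustration). It also passes quote="\"", because read.table's default quote set includes the apostrophe while read.csv's does not; a stray apostrophe in a text field is another possible cause of a lower row count, though nothing in the thread confirms which one applies here.

```r
# Hypothetical stand-in for the real CSV: one data line starts with '#'.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "id,label,value",
  "1,alpha,10",
  "#2,beta,20",
  "3,gamma,30"
), tmp)

# The defaults that differ between the two functions.
str(formals(read.table)[c("quote", "comment.char")])
str(formals(read.csv)[c("quote", "comment.char")])

# read.table with its defaults treats the '#2,...' line as a comment.
via_table <- read.table(tmp, header = TRUE, sep = ",")

# read.csv keeps that line because its comment.char is "".
via_csv <- read.csv(tmp, header = TRUE)

# read.table told to behave like read.csv for quotes and comments.
via_table_fixed <- read.table(tmp, header = TRUE, sep = ",",
                              quote = "\"", comment.char = "")

# Row counts for the three reads.
print(c(read.table = nrow(via_table),
        read.csv = nrow(via_csv),
        read.table.fixed = nrow(via_table_fixed)))

unlink(tmp)
```

If the third count matches read.csv on the real file, the difference comes from one of these two defaults.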

 I'd not seen this before, hope it hasn't bitten me...

Barry
#
Without the data it is a bit difficult to diagnose.  However, you may want to check out the following:

library(prob)

That is from:
http://finzi.psych.upenn.edu/R/R-devel/archive/26683.html

It allows you to diff the data.frames, so you can find out which rows are missing.  Maybe some NA rows were automatically removed.
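
For reference, the same idea can be sketched in base R without the prob package. This is an illustration on made-up data, not the method from the linked post; the data frames and the row_key helper are hypothetical.

```r
# Hypothetical stand-ins for the larger and the smaller read of the same file.
full <- data.frame(id = 1:5,
                   region = c("a", "b", "c", "d", "e"),
                   stringsAsFactors = FALSE)
partial <- full[c(1, 2, 5), ]

# Build one key string per row by pasting all columns together
# (assumes no column is itself named 'sep' or 'collapse').
row_key <- function(df) do.call(paste, c(df, sep = "\r"))

# Rows of 'full' whose key does not appear in 'partial'.
missing_rows <- full[!(row_key(full) %in% row_key(partial)), , drop = FALSE]
print(missing_rows)
```

Because rows are compared as strings, this works even when the two reads assign different types to a column, but it will flag rows that differ only in formatting.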
--- On Fri, 3/13/09, jatwood <jatwood at montana.edu> wrote: