Dear R-users,
I have run into the following problem every now and then. Until now I was
dealing with very small datasets, so it wasn't a problem (I
just edited the dataset in an OpenOffice spreadsheet). This time I have to
deal with many large datasets containing commuting flow data. I
would appreciate it if anyone could give me a hint or clue to get out of this
problem.
I have a .dat file called "1081.dat": 1001 means Birmingham, AL.
I imported this .dat file using read.table like
tmp<-read.table('CTPP3_ANSI/MPO3441_ctpp3_sumlv944.dat',header=T)
Then I got this error message:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
line 9499 did not have 209 elements
Since I also got error messages saying that other rows did not have 209
elements, I added skip=c(205,9499,9294), hoping that R would take
care of the problem. But I got a similar error message:
tmp<-read.table('CTPP3_ANSI/MPO3441_ctpp3_sumlv944.dat',header=T,skip=c(205,9499,9294))
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
line 9294 did not have 209 elements
In addition: Warning message:
the condition has length > 1 and only the first element will be used
in: if (skip > 0) readLines(file, skip)
Is there any way to make R automatically skip problematic
rows? Thank you very much!
Taka
problems in read.table
4 messages · Gabor Grothendieck, Peter Dalgaard, Takatsugu Kobayashi
See ?count.fields to get a vector of how many fields are on each line. Also fill = TRUE on read.table() can be used to fill out short lines if that is appropriate.
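A minimal sketch of both suggestions, using a small made-up file in place of the real data. The file contents and the names path, n, expected, keep, good and filled are illustrative assumptions, not taken from the original post:

```r
# Write a small illustrative file: a header plus four rows, one of them short.
path <- tempfile(fileext = ".dat")
writeLines(c("a b c",
             "1 2 3",
             "4 5",        # short line: only 2 fields
             "7 8 9",
             "10 11 12"), path)

# count.fields() returns the number of fields found on each line.
# blank.lines.skip = FALSE keeps the result aligned with readLines().
n <- count.fields(path, blank.lines.skip = FALSE)

# Treat the most common field count as the expected one.
expected <- as.integer(names(which.max(table(n))))
bad <- which(n != expected)   # line numbers worth inspecting by hand

# Option 1: keep only the well-formed lines and read those.
keep <- !is.na(n) & n == expected
good <- read.table(text = readLines(path)[keep], header = TRUE)

# Option 2: read every line and pad the short ones with NA.
filled <- read.table(path, header = TRUE, fill = TRUE)
```

Dropping lines silently loses data, so it is probably worth printing readLines(path)[bad] first to see what is actually wrong with them.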
On 9/6/07, tkobayas at indiana.edu wrote:
[original message, quoted in full above]
______________________________________________ R-help at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
skip is the NUMBER of lines to skip before reading starts. It has to be a single number. You can use fill and flush to read lines with too few or too many elements, but it might be better to investigate the cause of the problem. What is in those lines? Quote and comment characters are common culprits.
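As a rough illustration of the comment-character case, here is a sketch with a made-up file in which one value contains a "#". The file contents and the names path, res, tmp and body_only are assumptions for illustration only; quote = "" is the analogous switch when stray quote characters are the cause.

```r
# Illustrative file: three fields per line, but one value contains a '#'.
path <- tempfile(fileext = ".dat")
writeLines(c("id name value",
             "1 Hall 10",
             "2 Gate#5 20",
             "3 Midway 30"), path)

# With the defaults, '#' starts a comment, so the rest of that line is
# dropped and the line comes up short.
count.fields(path)

# read.table() then fails with a "did not have 3 elements" error;
# tryCatch() captures the message instead of stopping.
res <- tryCatch(read.table(path, header = TRUE),
                error = function(e) conditionMessage(e))

# Turning comment handling off restores three fields on every line.
count.fields(path, comment.char = "")
tmp <- read.table(path, header = TRUE, comment.char = "")

# skip is a single number: how many lines to pass over before reading starts.
# Here it skips the header line, so the columns get default names V1, V2, V3.
body_only <- read.table(path, skip = 1, comment.char = "")
```

Looking at the offending lines directly, for example with readLines(path)[3], is usually the quickest way to find out which character is responsible.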
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907
Thank you very much for your help. I am learning R every day.
Taka
------------------------------------
Takatsugu Kobayashi
PhD Student
Indiana University, Dept. Geography