read.table and missing values

6 messages · Peter Breuer, Brian Ripley, Ragnar Beer

#
I'd like to share data files (tab-delimited text files)
that I read in via read.table with other applications.
Missing data are empty fields (two tabs following each other).
I couldn't find a way yet to 'convince' R to interpret these as
missing values.
The read.table option na.strings="" or na.strings='' didn't work.
Using a special character in the data file for missing values (like '#')
and declaring it with na.strings="#" worked,
so I was able to solve the problem by doing a global replace.
But I'd prefer to read in the data files as is, and wonder whether
there is a way to do that in R.
The Data Import/Export manual says that empty fields in numeric columns
are regarded as missing values,
but with my versions of R (Windows 1.2.3, Mac) it didn't work.
The documentation, FAQ and mailing list archive didn't have any further hints.
Thanks for any advice

Peter
#
On Tue, 19 Jun 2001, Peter Breuer wrote:

            
It's the default ....
It works as documented, and as documented does what you want (so no
further hint is needed).  For file foo.dat

a,b,c
1,2,3
4,,6
,8,9

I get
   a  b c
1  1  2 3
2  4 NA 6
3 NA  8 9

as advertised.

You have set sep="\t", haven't you?  The default separator is white space,
and that will swallow two tabs.  The following had tabs in originally,

a	b	c
1	2	3
4		6
	8	9

and gives
   a  b c
1  1  2 3
2  4 NA 6
3 NA  8 9

So, if you do that it should work for you too.  And if you want to claim
on R-help that things do not work, please include an example so we can see
what you are doing (or not doing).
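
For readers who want to reproduce this, here is a minimal sketch of the
difference between the two separators. The temporary file and the object
names (path, d, res) are made up for illustration; only read.table and its
header and sep arguments come from the discussion above.

```r
# Write a small tab-delimited file with empty fields (two consecutive tabs).
path <- tempfile(fileext = ".dat")
writeLines(c("a\tb\tc", "1\t2\t3", "4\t\t6", "\t8\t9"), path)

# With sep = "\t" every tab ends a field, so an empty field in a
# numeric column is read as NA.
d <- read.table(path, header = TRUE, sep = "\t")
print(d)

# With the default sep = "" any run of white space is one separator,
# so the two rows with an empty field appear to have too few fields.
# try() is used so the script keeps running if read.table stops.
res <- try(read.table(path, header = TRUE), silent = TRUE)
print(inherits(res, "try-error"))
```

A file without empty fields reads the same way under either separator,
which is why the problem only shows up once missing values are present.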
#
Thanks - everything is o.k. and working.
This did solve the problem. I hadn't set sep="\t". R read complete
tab-delimited data files nonetheless and complained only when it met
missing data.
This confused me.
Sorry for not having sent an example.

Peter
#
On Tue, 19 Jun 2001, Peter Breuer wrote:

            
[...]
Using read.delim (same help page) might be less confusing, although I
think fill = TRUE might also cause confusion.
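
As a rough illustration of that suggestion (the file contents and the names
path and d2 are invented for the example): read.delim uses sep = "\t" and
header = TRUE by default, so no separator has to be given. It also defaults
to fill = TRUE, which pads short rows with NA without complaint; the last
line of the sample file, which has only two fields, shows this.

```r
# Tab-delimited sample: one empty field and one row that is too short.
path <- tempfile(fileext = ".txt")
writeLines(c("a\tb\tc", "1\t2\t3", "4\t\t6", "7\t8"), path)

# read.delim defaults to sep = "\t", header = TRUE and fill = TRUE.
d2 <- read.delim(path)
print(d2)

# The empty field in row 2 and the missing last field in row 3
# both come back as NA.
print(is.na(d2))
```

Whether the silent padding is wanted depends on the data; passing
fill = FALSE to read.delim should make a short row an error instead.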
#
<snip>
<snip>

Cool! Thanks a lot! I had the same problem a while ago and solved it
by replacing two consecutive tabs with \t""\t. Now I see what the
problem was: When I'm not using R I'm using StatView and StatView
defaults to the tab as separator. (I don't remember what it was in
SPSS or Systat.) So I assumed that it's the same with R.
I didn't notice the sep = "" in the Usage section of the help page.
To make the default value more visible in this case it would be
nice if in the Arguments/sep section for read.table it would read
... If sep = "" (the default) the separator is ...
                 -------------

Cheers,

Ragnar

(Must've said "blimey" once too often -
  now I'm really blind every now and then ;)
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
#
<snip>

Oops, I meant:
... If sep = "" (the default for read.table) the separator is ...

Ragnar