Mysterious issues with reading text files from R in ArcGIS and Excel

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20130304/351cfcd4/attachment.pl>
It seems within the last ~3 months I've been having issues with writing text or csv files from an R data frame.  The problem is multifold, and it is hard to work out what is going on and where the problem lies.  So I'm hoping someone else has come across this and can provide insight.
I think you need to provide a simple example for us to try, either by 
putting a small example of one of your files online for us to download, 
or (better) by giving us self-contained code to duplicate the problem.

You might also get better help (especially about ArcGIS) on the 
R-sig-Geo mailing list: <https://stat.ethz.ch/mailman/listinfo/r-sig-geo>.

Duncan Murdoch

My current settings for R:
R version 2.15.2 (2012-10-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:

[1] LC_COLLATE=Swedish_Sweden.1252  LC_CTYPE=Swedish_Sweden.1252    LC_MONETARY=Swedish_Sweden.1252 LC_NUMERIC=C
[5] LC_TIME=Swedish_Sweden.1252

attached base packages:
[1] tcltk     stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] adehabitat_1.8.11 shapefiles_0.6    foreign_0.8-51    tkrplot_0.0-23    ade4_1.5-1

loaded via a namespace (and not attached):
[1] tools_2.15.2

I am using Microsoft Excel 2010 and ArcGIS 10.1sp1 for Desktop

Basically, no matter what data frame I am working on, problems arise when I export it to a text file to be used in Excel or ArcGIS.  I'm not sure if it is R or these other programs, and forums for ArcGIS might be more appropriate, but this problem only occurs when I use tables that have been produced from an R session.

When I try to open a text file in Excel, I sometimes get an error message stating:
"The file you are trying to open is in a different format than specified by the file extension.  Verify that the file is not corrupted and is from a trusted source."
followed by:
"Excel has detected that 'file.txt' is a SYLK file, but cannot load it.  Either the file has errors or is not a SYLK file format.  Click OK to open the file in a different format."
Then the file opens.
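
A side note that is not from the thread: Microsoft documents that Excel shows this SYLK message for a text or CSV file whose first two characters are the uppercase letters ID. Whether that is what happened to the files here is not established (the column in this thread is lowercase id, and write.table() quotes the header by default). The sketch below only checks the first two bytes of a file; the helper name starts_with_ID and the column name ID are made up for illustration.

```r
# Hypothetical helper: TRUE when a file's first two bytes are "ID" (uppercase).
starts_with_ID <- function(path) {
  identical(readBin(path, what = "raw", n = 2), charToRaw("ID"))
}

# Made-up data frame whose first column happens to be called ID.
d <- data.frame(ID = c("F07001", "F07002"), x = c(1482445L, 1481274L))

f_plain  <- tempfile(fileext = ".txt")
f_quoted <- tempfile(fileext = ".txt")

# Unquoted header and no row names: the file begins with the characters ID.
write.table(d, f_plain, quote = FALSE, row.names = FALSE)

# Defaults: the header is quoted, so the file begins with a double quote.
write.table(d, f_quoted)

print(starts_with_ID(f_plain))
print(starts_with_ID(f_quoted))
```

If the check comes back TRUE for a file that triggers the message, renaming the first column or keeping the header quoted would be one thing to try.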

Otherwise, the file opens "fine" the first time through and "looks" OK. I can't figure out what I'm doing differently between the two cases, as the commands are always written the same way:
write.csv(file, file = "D:/mylocations/fileofinterest.csv") or write.table(file, file = "D:/mylocations/fileofinterest.txt")
Sometimes I will try to add sep = "," or sep = ";" but these don't make a difference (which I didn't figure they would).

The other program I use is ArcGIS, and bringing in a txt file from R really messes things up: 2 new columns of information are typically added, and date/time data is usually lost with txt files, but not with csv files.

For instance - a text file that looks like this in Excel:
      id       x       y                date    R1dmed   R1dmean R1error R2error
1 F07001 1482445 6621768 2007-03-05 10:00:53 2498.2973 2498.2973   FALSE   FALSE
2 F07001 1481274 6619628 2007-03-05 12:00:41  657.1029  657.1029   FALSE   FALSE
3 F07001 1481279 6619630 2007-03-05 14:01:12  660.3569  660.3569   FALSE   FALSE
4 F07001 1481271 6619700 2007-03-05 16:00:39  620.1397  620.1397   FALSE   FALSE

in ArcGIS now looks like this:

Field1idid_Xid_YxydateR1dmedR1dmean R1errorR2errorOBJECTID *
1F07001118.0818119.485541e+01514824456621768NA2498.297272498.29727FALSEFALSE1
2F07001118.0818119.485541e+01514812746619628NA657.102922657.102922FALSEFALSE2
3F07001118.0818119.485541e+01514812796619630NA660.356911660.356911FALSEFALSE3
4F07001118.0818119.485541e+01514812716619700NA620.139702620.139702FALSEFALSE4
5F07001118.0818119.485541e+01514808496620321NA378.186792378.186792FALSEFALSE5

Where did id_X and id_Y come from?? What are they??
What happened to the Date column???  Why does the date column show up when I use write.csv but not write.table?
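
One way to see what a whitespace-delimited reader makes of such a file is to count the fields on each line. The sketch below is not from the thread: it uses a made-up two-row data frame standing in for the real data, writes it with the default write.table() settings, and applies count.fields() from the utils package. Per the write.table documentation, columns with a class (such as POSIXct) are converted with as.character() and left unquoted by default, so the blank between date and time acts as a separator.

```r
# Made-up stand-in for the data frame in this thread (two rows, three columns).
test <- data.frame(
  id   = c("F07001", "F07001"),
  x    = c(1482445L, 1481274L),
  date = as.POSIXct(c(1173085253, 1173092441), origin = "1970-01-01", tz = "UTC")
)

f <- tempfile(fileext = ".txt")
write.table(test, f)           # defaults: sep = " ", row names written

file_lines <- readLines(f)     # the text exactly as written
n_fields   <- count.fields(f)  # fields per line when splitting on whitespace

print(file_lines)
print(n_fields)
```

If the header line reports fewer fields than the data lines, the reading program has to guess how names map to columns. That could plausibly account for shifted or empty columns such as the NA dates, though what ArcGIS does internally is not confirmed here.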

Thank you for your help.

~K
	[[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Here's the first 5 lines of my dataset:

structure(list(id = structure(c(1L, 1L, 1L, 1L, 1L), .Label = c("F07001",
"F07002", "F07003", "F07004", "F07005", "F07006", "F07008", "F07009",
"F07010", "F07011", "F07014", "F07015", "F07017", "F07018", "F07019",
"F07020", "F07021", "F07022", "F07023", "F07024", "F10001", "F10002",
"F10004", "F10008", "F10009", "F10010", "F10012", "F10013", "F10014",
"F98015", "M07007", "M07012", "M07013", "M07016", "M10007", "M10011",
"M10015"), class = "factor"), x = c(1482445L, 1481274L, 1481279L,
1481271L, 1480849L), y = c(6621768L, 6619628L, 6619630L, 6619700L,
6620321L), date = structure(c(1173085253, 1173092441, 1173099672,
1173106839, 1173114055), class = c("POSIXct", "POSIXt"), tzone = ""),
    R1dmed = c(2498.29727014221, 657.102921923195, 660.356911071581,
    620.139702002702, 378.186792471657), R1dmean = c(2498.29727014221,
    657.102921923195, 660.356911071581, 620.139702002702, 378.186792471657
    ), R1error = c(FALSE, FALSE, FALSE, FALSE, FALSE), R2error = c(FALSE,
    FALSE, FALSE, FALSE, FALSE)), .Names = c("id", "x", "y",
"date", "R1dmed", "R1dmean", "R1error", "R2error"), row.names = c(NA,
5L), class = "data.frame")

and here's the code I wrote for this file:

write.table(test, "D:/MooseEncounters/locations/Individual/test.txt")
That's not a CSV file, it is being written with a blank as separator.   
Since it also has blanks in the formatted POSIXct column, you're very 
likely to run into problems reading it.

Use write.csv(test, "test.csv") and you'll have fewer problems.  If you 
want tab-delimited columns instead, you'll need to specify that in the 
write.table call.
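
A minimal sketch of both suggestions, using a made-up two-row stand-in for the data frame rather than the real file. The row.names = FALSE argument is an addition not mentioned above: by default both functions also write the row names as an extra first column with no name of its own.

```r
# Made-up stand-in for the data frame in this thread.
test <- data.frame(
  id   = c("F07001", "F07001"),
  x    = c(1482445L, 1481274L),
  date = as.POSIXct(c(1173085253, 1173092441), origin = "1970-01-01", tz = "UTC")
)

f_csv <- tempfile(fileext = ".csv")
f_tab <- tempfile(fileext = ".txt")

# Comma-separated: the blank inside the date-time is no longer a separator.
write.csv(test, f_csv, row.names = FALSE)

# Tab-delimited: the separator has to be given explicitly.
write.table(test, f_tab, sep = "\t", row.names = FALSE)

# Read both files back to check that the columns survive the round trip.
back_csv <- read.csv(f_csv)
back_tab <- read.delim(f_tab)

print(names(back_csv))
print(back_tab$date)
```

The unnamed row-name column might be what shows up as Field1 in ArcGIS, though that is a guess.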

Duncan Murdoch
~K

------------------------------------------------------------------------
*From:* Duncan Murdoch <murdoch.duncan at gmail.com>
*To:* Kerry <kernicholson at yahoo.com>
*Cc:* "r-help at r-project.org" <r-help at r-project.org>
*Sent:* Monday, March 4, 2013 4:48 PM
*Subject:* Re: [R] Mysterious issues with reading text files from R in 
ArcGIS and Excel

On 04/03/2013 10:09 AM, Kerry wrote:

I realize my command is not writing a CSV file; I already pointed 
out that the CSV file seems to work OK, but the TXT format does not.
Sorry.  In that case, I think you really do have to go to R-sig-Geo to 
find someone who knows about ArcGIS.

Duncan Murdoch
Regardless of that, there should be no problem reading the date 
column in ArcGIS - by default it will simply recognize it as a text 
field.  As I said in my initial posting, when I use other programs to 
create a text file (say TextPad, WordPad, Notepad or Excel) and bring 
the txt file into ArcGIS, no information is dropped - the column is 
not turned into NAs.  This only happens when I add text files that 
were generated with write.table or write.csv.

Any thoughts as to why I get 2 new columns of data in either the CSV 
format or the TXT format?
~K


Your description of the diagnosis relies on non-R software (off topic here). Please either describe the difference between the files (you may need a hex editor or a package such as hexView to detect the differences) or supply the files that behave differently (this may require some route other than this mailing list if odd characters are at fault).
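
For a first look, base R alone may be enough. The following is only a sketch with a made-up one-row data frame, not the files from this thread: readLines() shows the text exactly as written, and readBin() shows the leading raw bytes, where a byte-order mark or unexpected line endings would be visible.

```r
# Made-up one-row data frame written with the default write.table() settings.
f <- tempfile(fileext = ".txt")
write.table(data.frame(id = "F07001", x = 1482445L), f)

# First lines as text, with quotes and separators visible.
first_lines <- readLines(f, n = 2)
print(first_lines)

# First 16 bytes as raw values (printed in hexadecimal).
first_bytes <- readBin(f, what = "raw", n = 16)
print(first_bytes)
```

Running the same two calls on a file that ArcGIS reads correctly and on one that it does not would give a concrete difference to report.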

For what it is worth, TXT is not a clearly-defined format, so this could be more effectively addressed by using a more specific format for data exchange.
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--------------------------------------------------------------------------- 
Sent from my phone. Please excuse my brevity.
