Mysterious issues with reading text files from R in ArcGIS and Excel
9 messages · Duncan Murdoch, Kerry, Jeff Newmiller +1 more
On 04/03/2013 10:09 AM, Kerry wrote:
It seems that within the last ~3 months I've been having issues with writing text or CSV files from an R data frame. The problem is multifold, and it is hard to filter out what is going on and where the problem is. So I'm hoping someone else has come across this and can provide insight.
I think you need to provide a simple example for us to try, either by putting a small example of one of your files online for us to download, or (better) by giving us self-contained code to duplicate the problem. You might also get better help (especially about ArcGIS) on the R-sig-Geo mailing list: <https://stat.ethz.ch/mailman/listinfo/r-sig-geo>. Duncan Murdoch
My current settings for R:
R version 2.15.2 (2012-10-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=Swedish_Sweden.1252 LC_CTYPE=Swedish_Sweden.1252 LC_MONETARY=Swedish_Sweden.1252 LC_NUMERIC=C
[5] LC_TIME=Swedish_Sweden.1252
attached base packages:
[1] tcltk stats graphics grDevices utils datasets methods base
other attached packages:
[1] adehabitat_1.8.11 shapefiles_0.6 foreign_0.8-51 tkrplot_0.0-23 ade4_1.5-1
loaded via a namespace (and not attached):
[1] tools_2.15.2
I am using Microsoft Excel 2010 and ArcGIS 10.1sp1 for Desktop.
Basically, no matter what data frame I am working on, when I export it to a text file to be used in Excel or ArcGIS, problems arise. I'm not sure if it is R or these other programs (maybe forums for ArcGIS might be more appropriate), but this problem only occurs when I use tables that have been produced from an R session.
When I try to open a text file in Excel, either I get an error message stating
The file you are trying to open is in a different format than specified by the file extension. Verify that the file is not corrupted and is from a trusted source.
followed by
Excel has detected that 'file.txt' is a SYLK file, but cannot load it. Either the file has errors or is not a SYLK file format. Click OK to open the file in a different format
and then the file opens.
Otherwise, the file opens "fine" the first time through and "looks" OK. I can't figure out what I'm doing differently between the two cases, as the write.csv and write.table commands are always written the same way:
write.csv(file, file = "D:/mylocations/fileofinterest.csv") or write.table(file, file = "D:/mylocations/fileofinterest.txt")
Sometimes I will try to add sep = "," or sep = ";" but these don't make a difference (which I didn't figure they would).
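For reference, a minimal sketch of how `sep` behaves in these calls, using a small made-up data frame (`df` and the temporary file are illustrative, not the poster's data). `write.table` honours `sep`, while `write.csv` ignores an attempt to change it and warns instead; `write.csv2` writes semicolon-separated fields with a decimal comma, which is what Excel commonly expects under locales that use a decimal comma. Whether that applies to the Swedish-locale setup described above is an assumption, not something the thread confirms.

```r
# Illustrative data frame; not the data from this thread.
df <- data.frame(id = c("F07001", "F07002"), value = c(1.5, 2.25))
path <- tempfile(fileext = ".txt")

# write.table honours sep.
write.table(df, file = path, sep = ",", row.names = FALSE, quote = FALSE)
comma_lines <- readLines(path)

write.table(df, file = path, sep = ";", row.names = FALSE, quote = FALSE)
semi_lines <- readLines(path)

# write.csv ignores an attempt to set sep and issues a warning.
csv_warned <- tryCatch({
  write.csv(df, file = path, sep = ";", row.names = FALSE)
  FALSE
}, warning = function(w) TRUE)

# write.csv2 uses ";" as the separator and "," as the decimal mark.
write.csv2(df, file = path, row.names = FALSE, quote = FALSE)
csv2_lines <- readLines(path)
unlink(path)

print(comma_lines)
print(semi_lines)
print(csv_warned)
print(csv2_lines)
```

Comparing the printed lines shows which separator actually ended up in each file, which is a quicker check than opening the files in Excel.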
The other program I use is ArcGIS, and bringing in a txt file from R really messes things up: 2 new columns of information are typically added, and date/time data is usually lost with txt files, but not with csv files.
For instance - a text file that looks like this in Excel:
  id     x       y       date                R1dmed    R1dmean   R1error R2error
1 F07001 1482445 6621768 2007-03-05 10:00:53 2498.2973 2498.2973 FALSE   FALSE
2 F07001 1481274 6619628 2007-03-05 12:00:41 657.1029  657.1029  FALSE   FALSE
3 F07001 1481279 6619630 2007-03-05 14:01:12 660.3569  660.3569  FALSE   FALSE
4 F07001 1481271 6619700 2007-03-05 16:00:39 620.1397  620.1397  FALSE   FALSE
in ArcGIS now looks like this:
Field1 id id_X id_Y x y date R1dmed R1dmean R1error R2error OBJECTID *
1 F07001 118.0818119.485541e+015 1482445 6621768 NA 2498.29727 2498.29727 FALSE FALSE 1
2 F07001 118.0818119.485541e+015 1481274 6619628 NA 657.102922 657.102922 FALSE FALSE 2
3 F07001 118.0818119.485541e+015 1481279 6619630 NA 660.356911 660.356911 FALSE FALSE 3
4 F07001 118.0818119.485541e+015 1481271 6619700 NA 620.139702 620.139702 FALSE FALSE 4
5 F07001 118.0818119.485541e+015 1480849 6620321 NA 378.186792 378.186792 FALSE FALSE 5
Where did id_X and id_Y come from?? What are they??
What happened to the Date column??? Why does the date column show up when I use write.csv but not write.table?
Thank you for your help.
~K
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
On 04/03/2013 10:52 AM, Kerry wrote:
Here's the first 5 lines of my dataset:
structure(list(id = structure(c(1L, 1L, 1L, 1L, 1L), .Label = c("F07001",
"F07002", "F07003", "F07004", "F07005", "F07006", "F07008", "F07009",
"F07010", "F07011", "F07014", "F07015", "F07017", "F07018", "F07019",
"F07020", "F07021", "F07022", "F07023", "F07024", "F10001", "F10002",
"F10004", "F10008", "F10009", "F10010", "F10012", "F10013", "F10014",
"F98015", "M07007", "M07012", "M07013", "M07016", "M10007", "M10011",
"M10015"), class = "factor"), x = c(1482445L, 1481274L, 1481279L,
1481271L, 1480849L), y = c(6621768L, 6619628L, 6619630L, 6619700L,
6620321L), date = structure(c(1173085253, 1173092441, 1173099672,
1173106839, 1173114055), class = c("POSIXct", "POSIXt"), tzone = ""),
R1dmed = c(2498.29727014221, 657.102921923195, 660.356911071581,
620.139702002702, 378.186792471657), R1dmean = c(2498.29727014221,
657.102921923195, 660.356911071581, 620.139702002702, 378.186792471657
), R1error = c(FALSE, FALSE, FALSE, FALSE, FALSE), R2error = c(FALSE,
FALSE, FALSE, FALSE, FALSE)), .Names = c("id", "x", "y",
"date", "R1dmed", "R1dmean", "R1error", "R2error"), row.names = c(NA,
5L), class = "data.frame")
and here's the code I wrote for this file:
write.table(test, "D:/MooseEncounters/locations/Individual/test.txt")
That's not a CSV file, it is being written with a blank as separator. Since it also has blanks in the formatted POSIXct column, you're very likely to run into problems reading it. Use write.csv(test, "test.csv") and you'll have fewer problems. If you want tab-delimited columns instead, you'll need to specify that in the write.table call. Duncan Murdoch
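A minimal sketch of the point above, assuming a cut-down, made-up data frame (`demo`) with a date-time column like the one in the thread. With a blank as separator, the blank inside the formatted date-time splits that value into two fields, so the data lines no longer match the header; a tab-delimited file keeps one field per column. The use of `quote = FALSE`, `row.names = FALSE`, and explicit date formatting are choices made here for illustration, not something the thread prescribes.

```r
# Made-up data resembling the thread's columns (not the original file).
demo <- data.frame(
  id   = c("F07001", "F07001"),
  x    = c(1482445L, 1481274L),
  date = as.POSIXct(c("2007-03-05 10:00:53", "2007-03-05 12:00:41"), tz = "UTC")
)
# Format the date-time explicitly so the written text is predictable.
demo$date <- format(demo$date, "%Y-%m-%d %H:%M:%S")

space_file <- tempfile(fileext = ".txt")
tab_file   <- tempfile(fileext = ".txt")

# Blank-separated, as with write.table's default sep = " ".
write.table(demo, space_file, quote = FALSE)
# Tab-delimited, without the row-name column.
write.table(demo, tab_file, sep = "\t", quote = FALSE, row.names = FALSE)

# Number of fields on each line (header first, then data lines).
space_fields <- count.fields(space_file, sep = " ")
tab_fields   <- count.fields(tab_file, sep = "\t")
print(space_fields)  # header and data lines disagree
print(tab_fields)    # same count on every line

# Reading the tab-delimited file back keeps the date-time in one column.
back <- read.delim(tab_file, stringsAsFactors = FALSE)
print(back$date)
unlink(c(space_file, tab_file))
```

In the blank-separated file the data lines also carry a leading row-name field that has no header, which is another way for a reader to end up with unexpected extra columns.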
On 04/03/2013 11:15 AM, Kerry wrote:
I realize my command is not writing a CSV file. I already pointed out that the CSV file seems to work OK, but the TXT format does not.
Sorry. In that case, I think you really do have to go to R-sig-Geo to find someone who knows about ArcGIS. Duncan Murdoch
Regardless of that, there should be no problem reading the date column in ArcGIS; by default it will simply recognize it as a text field. As I said in my initial posting, when I use other programs to create a text file (say TextPad, WordPad, Notepad, or Excel) and bring the txt file into ArcGIS, no information is dropped and the column is not turned into NAs. This only happens when I add text files that were generated with write.table or write.csv. Any thoughts as to why I get 2 new columns of data in either the CSV format or the TXT format? ~K
Your description of the diagnosis relies on non-R software (off topic here). Please either describe the difference in the files (you may need a hex editor or the hexView package to detect the differences) or supply the files that behave differently (this may require some route other than this mailing list if odd characters are at fault).
For what it is worth, TXT is not a clearly-defined format, so this could be more effectively addressed by using a more specific format for data exchange.
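One way to follow this advice with base R alone is to read the first bytes of each file with `readBin` and compare them. The sketch below is only an assumption about what such a check could look like; the helper names (`first_bytes`, `has_utf8_bom`, `has_crlf`) and the two sample files are hypothetical and are not taken from the thread.

```r
# Return the first n bytes of a file as a raw vector.
first_bytes <- function(path, n = 64L) {
  readBin(path, what = "raw", n = n)
}

# TRUE if the bytes start with a UTF-8 byte order mark (EF BB BF).
has_utf8_bom <- function(bytes) {
  length(bytes) >= 3L && identical(bytes[1:3], as.raw(c(0xEF, 0xBB, 0xBF)))
}

# TRUE if the bytes contain a carriage return followed by a line feed.
has_crlf <- function(bytes) {
  n <- length(bytes)
  n >= 2L && any(bytes[-n] == as.raw(0x0D) & bytes[-1L] == as.raw(0x0A))
}

# Two hypothetical files that look alike in an editor but differ in bytes.
file_a <- tempfile(fileext = ".txt")
file_b <- tempfile(fileext = ".txt")
writeBin(charToRaw("id\tx\r\n1\t2\r\n"), file_a)
writeBin(c(as.raw(c(0xEF, 0xBB, 0xBF)), charToRaw("id\tx\n1\t2\n")), file_b)

bytes_a <- first_bytes(file_a)
bytes_b <- first_bytes(file_b)
print(bytes_a)  # raw vectors print as hex pairs
print(bytes_b)
print(c(bom_a = has_utf8_bom(bytes_a), bom_b = has_utf8_bom(bytes_b)))
print(c(crlf_a = has_crlf(bytes_a), crlf_b = has_crlf(bytes_b)))
unlink(c(file_a, file_b))
```

Running the same helpers on a file written by R and on one saved from another program gives a concrete difference to report, such as line endings or a leading byte order mark.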
Jeff Newmiller