Dear all,

I've been successfully reading Web of Science data from tab-delimited text files into a data.frame using an R script based on readLines(). With new data I just downloaded I suddenly get this warning:

incomplete final line found

I know this warning has been discussed numerous times, but unfortunately none of the previously suggested solutions worked for me, so please bear with me.

I suppressed the warning using "warn = FALSE", but the data still won't get read, so this seems to be more serious than a warning. Adding a blank line or two at the end of the file did NOT help, i.e. R still does not read the file. My old files still work properly, though.

So I opened the text files in Notepad++ and saw that the last lines of both the old text files (i.e. the working ones) and the new text files (i.e. the ones that don't work for some reason) always end with a tab stop followed by a line break. I couldn't tell any difference between the ways these files ended; their endings looked identical to me.

I was using R 2.14.0 (64-bit) on Windows when I discovered the problem. I upgraded to 2.15.0 (64-bit), but the problem persists.

You can see small examples of an old and a new file at https://www.dropbox.com/s/2joadjo9ce86rij/WoS-old.txt and https://www.dropbox.com/s/lp9l1exx4mfws1s/WoS-new.txt, respectively.

Does anybody have an idea of what could cause these problems?

Thank you very much for your consideration!
Could "incomplete final line found" be more serious than a warning?
3 messages · Michael Bärtl, Peter Dalgaard, Zhou Fang
(Original below)
Looks like someone had the bright idea of changing it to 16-bit UTF, so every 2nd byte is NUL. It works for me with
x <- readLines(file("~/Downloads/WoS-new.txt", encoding="UTF-16"))
(except that for some reason, x won't print properly although each individual line prints fine. Never mind, who cares as long as it reads...)
-pd
PS: The reason the printing is wacky is that one line has 148934 characters in it and the print routines pad all lines to the maximum length. Not sure what the point is in that.
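A minimal sketch of the approach above, assuming the file is UTF-16LE with a byte-order mark. The temporary file `tmp` and its contents are made-up stand-ins so the example runs without the Dropbox files. It opens the connection explicitly so it can be closed afterwards, and looks at line lengths with nchar() rather than printing the whole vector.

```r
# Build a small UTF-16LE file with a BOM to stand in for WoS-new.txt
# (hypothetical sample data, not the real export).
tmp <- tempfile(fileext = ".txt")
txt <- "PT\tAU\tTI\t\r\nJ\tDoe, J\tA title\t\r\n"
bytes <- c(as.raw(c(0xff, 0xfe)),
           iconv(txt, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]])
writeBin(bytes, tmp)

# Read with an explicit connection so it can be closed afterwards.
con <- file(tmp, encoding = "UTF-16")
x <- readLines(con)
close(con)

# Inspect line lengths instead of printing everything at once;
# a single very long line shows up immediately.
line_lengths <- nchar(x)

# Preview only the first 60 characters of each line.
preview <- substr(x, 1, 60)
```

With the real file, which(line_lengths == max(line_lengths)) would point at the line that makes print(x) unwieldy.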
On May 22, 2012, at 18:26, Michael Bärtl wrote:
[quoted original message snipped]
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
If you look at the new file in raw mode, you'll see that it's chock full of ASCII NULs, while the old file has none. This is probably what's causing your problems, because R does not allow strings containing embedded NUL characters. (I believe this is because NULs in strings are fairly dangerous in programming: they are often used to mark the end of a string, so allowing them to be read in directly could open the door to various code injection exploits.)

To read the new data files, you need some way of treating the file as a raw stream and stripping out all the NUL characters before converting back to character. Investigate ?readBin...

Zhou
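A rough sketch of the raw-mode inspection described above, using readBin(). The temporary file and its contents are hypothetical stand-ins for the new export, and the NUL-stripping step assumes the text is plain ASCII stored as UTF-16LE; it would mangle any character outside that range.

```r
# Hypothetical stand-in for the new file: UTF-16LE text with a BOM.
tmp <- tempfile(fileext = ".txt")
txt <- "PT\tAU\t\r\nJ\tDoe, J\t\r\n"
writeBin(c(as.raw(c(0xff, 0xfe)),
           iconv(txt, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]),
         tmp)

# Read the whole file as raw bytes and count the NULs.
raw_bytes <- readBin(tmp, what = "raw", n = file.info(tmp)$size)
n_nul <- sum(raw_bytes == as.raw(0))

# A leading FF FE (or FE FF) byte-order mark suggests UTF-16.
has_bom <- length(raw_bytes) >= 2 &&
  (identical(raw_bytes[1:2], as.raw(c(0xff, 0xfe))) ||
   identical(raw_bytes[1:2], as.raw(c(0xfe, 0xff))))

# Crude fix: drop the BOM and every NUL, then convert to character.
# Only safe when all text is plain ASCII (an assumption here).
kept <- raw_bytes[-(1:2)]
kept <- kept[kept != as.raw(0)]
wos_lines <- strsplit(rawToChar(kept), "\r\n", fixed = TRUE)[[1]]
```

If the data may contain non-ASCII characters, re-encoding on read with file(..., encoding="UTF-16"), as in Peter's reply, is likely the safer route than stripping bytes.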