
How can I find nonstandard or control characters in a large file?

On Mon, 09 Dec 2013, andrewH <ahoerner at rprogress.org> writes:
You could process your file in chunks:

  f <- file("myfile.csv", open = "r")
  repeat {
    lines <- readLines(f, n = 10000)
    if (length(lines) == 0) break   ## end of file reached
    ## do something with lines
  }
  close(f)

To find 'non-standard characters' you will need to define what
'non-standard characters' are.  But perhaps ?tools::showNonASCII, which
uses ?iconv, can help you.  (Please note the warnings and caveats on the
functions' help pages.)
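
As a rough sketch of how the two ideas combine, the function below reads a file in chunks and collects the numbers of the lines that contain suspicious bytes. It assumes that 'non-standard' means any byte other than printable ASCII (0x20 to 0x7E) and tab; adjust the pattern to your own definition. The name find_odd_lines is a hypothetical helper, not part of any package, and it uses a byte-wise regular expression rather than tools::showNonASCII.

```r
## Sketch: return the line numbers that contain bytes outside printable ASCII.
## Assumption: "non-standard" = anything other than bytes 0x20-0x7E and tab.
## find_odd_lines() is a hypothetical helper, not part of any package.
find_odd_lines <- function(path, chunk_size = 10000L) {
  con <- file(path, open = "r")
  on.exit(close(con))               # close the connection even on error
  hits <- integer(0)
  offset <- 0L                      # lines read before the current chunk
  repeat {
    lines <- readLines(con, n = chunk_size, warn = FALSE)
    if (length(lines) == 0L) break  # end of file reached
    ## useBytes = TRUE matches byte-wise, so invalid encodings do not error
    bad <- grepl("[^\\x20-\\x7E\\t]", lines, perl = TRUE, useBytes = TRUE)
    hits <- c(hits, offset + which(bad))
    offset <- offset + length(lines)
  }
  hits
}
```

Calling find_odd_lines("myfile.csv") gives the line numbers to inspect more closely, for instance by passing those lines to tools::showNonASCII.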