Hello!

Is there a way to read the first line of a CSV file, then skip 4 lines, then continue reading, please?

I know you can skip from the top, but I don't know if you can read and then skip.

Thanks,
Erin

Erin Hodgess, PhD
mailto: erinm.hodgess at gmail.com
Reading a CSV file
10 messages · Uwe Ligges, Jim Lemon, Erin Hodgess +2 more
csv <- readLines(filename)
read.csv(text = csv[-(2:5)]

Best,
Uwe Ligges
On 07.08.2022 02:15, Erin Hodgess wrote:
Hello! Is there a way to read the first line of a CSV file, then skip 4 lines, then continue reading, please? I know you can skip from the top, but I don't know if you can read and then skip. Thanks, Erin Erin Hodgess, PhD mailto: erinm.hodgess at gmail.com [[alternative HTML version deleted]]
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Awesome! Thanks so much!

On Sat, Aug 6, 2022 at 8:51 PM Uwe Ligges <ligges at statistik.tu-dortmund.de> wrote:
csv <- readLines(filename)
read.csv(text = csv[-(2:5)]
Best,
Uwe Ligges
Erin Hodgess, PhD
mailto: erinm.hodgess at gmail.com
Hi Erin,
Just index out the lines in the result vector.
Jim
There is an example here [1] that illustrates one way to accomplish this.

[1] https://jdnewmil.github.io/blog/post/functions-talk/
Sent from my phone. Please excuse my brevity.
Nice! (Needs just one more closing paren...)
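With the missing parenthesis added, the suggestion runs as in the minimal sketch below. The temporary file and its contents are made up purely for illustration; only the `readLines()` and `read.csv(text = ...)` calls come from the thread.

```r
# Hypothetical sample file: a header, four lines to skip, then the data.
filename <- tempfile(fileext = ".csv")
writeLines(c("head1,head2",
             "skip a", "skip b", "skip c", "skip d",
             "1,2",
             "3,2"),
           filename)

csv <- readLines(filename)           # every line of the file as a string
dat <- read.csv(text = csv[-(2:5)])  # drop lines 2 to 5, then parse the rest

str(dat)
```

Because the unwanted lines are removed before parsing, `read.csv()` only ever sees the header and the real data when it guesses the column types.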
On August 6, 2022 5:51:12 PM PDT, Uwe Ligges <ligges at statistik.tu-dortmund.de> wrote:
csv <- readLines(filename)
read.csv(text = csv[-(2:5)]
Best,
Uwe Ligges
All of these are so great! Thanks so much, particularly on a Saturday night!

Sincerely,
Erin

On Sat, Aug 6, 2022 at 9:02 PM Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
Nice! (Needs just one more closing paren...)
Erin Hodgess, PhD
mailto: erinm.hodgess at gmail.com
Unfortunately, this can mess with your data types. Uwe's (and my more cluttered) approach doesn't.
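The data-type concern can be seen in a small sketch. It assumes that "index out the lines" means dropping rows from the data frame after `read.csv()` has already parsed the whole file, and the sample lines are invented for illustration.

```r
lines <- c("head1,head2",
           "skip a,x", "skip b,x", "skip c,x", "skip d,x",
           "1,2",
           "3,2")

# Parse everything, then drop rows 1 to 4 of the data frame
# (file lines 2 to 5, because the header is not a data row).
after <- read.csv(text = lines)[-(1:4), ]

# Drop the unwanted lines before parsing.
before <- read.csv(text = lines[-(2:5)])

sapply(after, class)   # types were guessed with the junk rows present
sapply(before, class)  # types were guessed from the real data only
```

In R 4.0 or later the columns of `after` stay character even though only numbers remain, while the columns of `before` come out as integer.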
On August 6, 2022 5:53:33 PM PDT, Jim Lemon <drjimlemon at gmail.com> wrote:
Hi Erin,
Just index out the lines in the result vector.
Jim
Erin,

You gave no explanation of why you want to do that. Are there comments on those lines, for example?

I see others have replied with suggestions that boil down to reading the entire file into a data structure of lines, then using indexing of some sort to eliminate the lines you want to skip, and building the data from that structure rather than from the file. That works, albeit for large files, ...

But there are, as usual, many ways to do things. Some people read in files on their own and do the comma separation, type checking of columns and so on, and are free to make their own structure, perhaps the hard way. Clearly, skipping lines then becomes trivial.

Then there is the concept of reading the file twice: the first pass trivially picks up just the one comma-separated line that you can make into a series of headers, and the second calls something like read.csv, telling it to skip your N lines and use these names for the columns.

Let me suggest an alternate solution IF you can arrange for the lines you do not want (and only those) to begin with some comment character like "#", as in my example below:

text <- 'head1, head2
# ignore
# ignore 2
# ignore 3
# ignore 4
1,2
3,2'

hi <- read.csv(text=text, comment.char="#")

The above returns:

  head1 head2
1     1     2
2     3     2

It will ignore any number of lines. If the data has anything special like that, this might be a way to get what you want.

There are other packages in the tidyverse, such as readr, with related but sometimes different functionality, and the same effect can be had with a variation of my technique using read_csv() [note underscore, not period]:

hi <- text %>% read_csv(comment="#")

Or use the new pipe symbol if you prefer:

hi <- text |> read_csv(comment="#")

What this gives you, perhaps, is more options, such as a skip_empty_rows=TRUE option that would remove the lines if blank.
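The two-pass idea described above (pick up the header line first, then call read.csv with a skip count and explicit column names) is not spelled out in the thread. One possible base R sketch follows; the temporary file, its contents, and the choice of `scan()` for the first pass are assumptions made for illustration.

```r
# Hypothetical sample file: a header, four lines to skip, then the data.
filename <- tempfile(fileext = ".csv")
writeLines(c("head1,head2",
             "junk 1", "junk 2", "junk 3", "junk 4",
             "1,2",
             "3,2"),
           filename)

# Pass 1: read only the first line and split it into column names.
header <- scan(filename, what = "", sep = ",", nlines = 1,
               strip.white = TRUE, quiet = TRUE)

# Pass 2: skip the header plus the 4 unwanted lines and supply the names.
dat <- read.csv(filename, header = FALSE, skip = 5, col.names = header)

str(dat)
```

This never loads the unwanted lines into the parser, so the column types are guessed from the data rows alone. The cost is opening the file twice.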
And I tried some rather weird ideas like this:

text <- 'head1, head2
# ignore
# ignore 2 more stuff
# ignore 3, more stuff
# ignore 4
1,2
3,2'

hi <- text |> read_csv(col_types=c(col_integer(), col_integer()))

The idea was to TELL it what type to expect and hope the bad lines become NA. Well, not quite. It made everything character, given that the above call was no longer suppressing comments:

hi
# A tibble: 6 × 2
  head1                 head2
  <chr>                 <chr>
1 # ignore              NA
2 # ignore 2 more stuff NA
3 # ignore 3            more stuff
4 # ignore 4            NA
5 1                     2
6 3                     2
typeof(hi$head1)
[1] "character"
typeof(hi$head2)
[1] "character"

But although this seems bad, it opens a door to consider. As long as whatever is on those 4 lines does not mess things up by, say, making additional columns, you can probably read the darn thing into a tibble or data.frame, then remove the rows you do not want and convert the columns from character to whatever you want, such as integer or numeric.

Finally, if you have any control over the file contents, guess what happens if you place the header line AFTER the four skipped lines like this?

text <- '# ignore
# ignore 2 more stuff
# ignore 3, more stuff
# ignore 4
head1, head2
1,2
3,2'

You now tell it to skip 4 lines AND use a header and it works for me!

read.csv(text=text, header=TRUE, skip=4)

There seem to be many ways to consider, and I would not be shocked if some program that does this kind of data import even allowed you to specify which rows to ignore more dynamically. But perhaps the first solution you got is more dynamic, as it allows you to process the text as a series of lines in all kinds of ways, such as removing any rows that contain the number 666, or even editing it in some way, combining data from multiple files, and so on.
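The "read it all in, drop the rows, then convert" route mentioned above could look roughly like this in base R. It is a sketch under the stated caveat that the unwanted lines do not create extra columns. The sample text is invented, and using `colClasses` and `type.convert()` is one choice among several rather than something proposed in the thread.

```r
text <- 'head1,head2
# ignore,
# ignore 2,more stuff
# ignore 3,more stuff
# ignore 4,
1,2
3,2'

# Read every column as character so nothing is guessed from the junk rows.
# comment.char = "" (the read.csv default) keeps the "#" lines at this stage.
raw <- read.csv(text = text, colClasses = "character", comment.char = "")

# Drop the first four data rows, then let R re-guess each column's type.
dat <- type.convert(raw[-(1:4), ], as.is = TRUE)
rownames(dat) <- NULL

str(dat)
```

Here the conversion step is explicit, so a column could equally be converted by hand with `as.integer()` or `as.numeric()` when the automatic guess is not what is wanted.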
Jim,

The resulting column vectors may not come out right: the lines will not be read into the same number of columns unless they have the right number of commas/separators and, worse, the result will likely be columns changed to the same type, such as text or float, even if you intended integer.