Hello,

I need to download flow data for Scottish river catchments. The data are available from the Scottish Environment Protection Agency (SEPA), and that doesn't present a problem. For example, the API call below accesses the 96 flow recordings on the River Tweed on 1 January 2020 at one station:

https://timeseries.sepa.org.uk/KiWIS/KiWIS?service=kisters&type=queryServices&datasource=0&request=getTimeseriesValues&ts_path=1/14972/Q/15m.Cmd&from=2020-01-01&to=2020-01-07&returnfields=Timestamp,Value,Quality%20Code

But the data come back as HTML. I can copy and paste them into a text document, which can then be read into R, but that is slow and time-consuming. I have tried using the "rvest" package to import the HTML into R, but I have got nowhere.

Can anyone give me any pointers on how to do this?

Thanks,
Nick Wray
html into R
4 messages · Nick Wray, Thierry Onkelinx, Rui Barradas
Dear Nick,

A better solution is to add "&format=json" to the URL. The query then returns the data in JSON format.

Best regards,

ir. Thierry Onkelinx
Statistician, Team Biometrics & Quality Assurance
Research Institute for Nature and Forest (INBO), Government of Flanders

(Reply to Nick Wray's message of Fri 26 Aug 2022, 10:44.)
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
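The "&format=json" suggestion above could be used along the following lines. This is only a minimal sketch: the helper names add_json_format() and parse_kiwis_json() are made up for illustration, and the layout of the JSON reply (an array with one object per time series, holding a comma-separated "columns" string and a "data" array of rows) is an assumption that should be checked against a real response. The sample string and its values are invented so that the code runs without network access.

```r
library(jsonlite)

# Append format=json to a KiWIS query URL, as suggested in the thread.
add_json_format <- function(url) paste0(url, "&format=json")

# ASSUMPTION: the reply is a JSON array with one object per time series,
# each with a comma-separated "columns" string and a "data" array of rows.
parse_kiwis_json <- function(txt) {
  res <- fromJSON(txt, simplifyVector = FALSE)[[1]]
  cols <- strsplit(res$columns, ",", fixed = TRUE)[[1]]
  rows <- lapply(res$data, function(r) {
    # JSON null arrives as NULL; map it to NA so all rows have equal length
    vapply(r, function(v) if (is.null(v)) NA_character_ else as.character(v),
           character(1))
  })
  out <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
  names(out) <- cols
  out$Value <- as.numeric(out$Value)
  out
}

# HYPOTHETICAL sample reply with invented values, for an offline check.
sample_txt <- paste0(
  '[{"ts_id":"0","rows":"2","columns":"Timestamp,Value,Quality Code",',
  '"data":[["2020-01-01T00:00:00.000Z",1.5,254],',
  '["2020-01-01T00:15:00.000Z",null,255]]}]'
)
sample_flows <- parse_kiwis_json(sample_txt)
print(sample_flows)
```

With network access, the same parser should accept the query URL directly, since fromJSON() also reads from URLs: parse_kiwis_json(add_json_format(link)), where link is the URL from the original question.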
Hello,

You can try the following. It worked for me.
Read from the link, extract the "table" element from the HTML, and then convert it to a table. The table has 3 rows before the actual data, so the lapply below returns the data with its header.

library(httr)
library(rvest)

link <- "https://timeseries.sepa.org.uk/KiWIS/KiWIS?service=kisters&type=queryServices&datasource=0&request=getTimeseriesValues&ts_path=1/14972/Q/15m.Cmd&from=2020-01-01&to=2020-01-07&returnfields=Timestamp,Value,Quality%20Code"

page <- read_html(link)

page |>
  html_elements("table") |>
  html_table(header = TRUE) |>
  lapply(\(x) {
    hdr <- unlist(x[3, ])   # row 3 holds the column names
    y <- x[-(1:3), ]        # drop the rows above the data
    names(y) <- hdr
    y
  })

Hope this helps,

Rui Barradas

(Reply to Nick Wray's message of 26/08/2022, 09:43.)
Sorry, there's simpler code. I used html_elements (plural) and the
result is a list. Use html_element (singular) and the output is a tibble.
page |>
  html_element("table") |>
  html_table(header = TRUE) |>
  (\(x) {
    hdr <- unlist(x[3, ])
    y <- x[-(1:3), ]
    names(y) <- hdr
    y
  })()
Hope this helps,
Rui Barradas
(Reply to Rui Barradas's message of 26/08/2022, 11:53.)
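Because the rvest code in this thread takes the column names from a row inside the table, the resulting columns are likely to be character vectors. A possible follow-up step, sketched here in base R, converts them to usable types. The function name fix_flow_types() is made up for illustration, the sample data frame and its values are invented, and the timestamp layout (ISO 8601 strings in UTC) is an assumption to verify against the real output.

```r
# Convert the scraped columns to proper types.
# ASSUMPTION: Timestamp looks like "2020-01-01T00:15:00.000Z" (UTC);
# anything after the seconds is ignored by the format string below.
fix_flow_types <- function(x) {
  x$Timestamp <- as.POSIXct(x$Timestamp,
                            format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")
  x$Value <- as.numeric(x$Value)
  x
}

# HYPOTHETICAL stand-in for the scraped table; the values are invented.
raw_flows <- data.frame(
  Timestamp = c("2020-01-01T00:00:00.000Z", "2020-01-01T00:15:00.000Z"),
  Value = c("1.5", "1.7"),
  `Quality Code` = c("254", "254"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

flows <- fix_flow_types(raw_flows)
str(flows)
```

The same function could be applied to the tibble returned by the html_element() version, provided its column names match those requested in the returnfields part of the URL.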