Using Rvest to scrape pages

Dear All,

I am just learning R. I want to extract the reviews from a page and
loop until I have extracted them from every page:

#load the rvest package (provides read_html, html_nodes, html_attr, html_text)
library(rvest)

#specify the first page URL
fpURL <- 'https://wordpress.org/support/plugin/easyrecipe/reviews/'

#read the HTML contents in the first page URL
contentfpURL <- read_html(fpURL)

#identify the anchor tags in the first page URL
fpAnchors <- html_nodes(contentfpURL, css='a.bbp-topic-permalink')

#extract the HREF attribute value of each anchor tag
fpHREF <- html_attr(fpAnchors, 'href')

#create empty vectors to store the titles & contents found at the HREF
#attribute value of each anchor tag
titles = c()
contents = c()

#loop over each HREF found on the first page
for (u in fpHREF) {

    #read the HTML content of the review page
    #(stored under its own name so the fpURL string is not overwritten)
    reviewPage = read_html(u)

    #identify the title heading and read the title text
    fpreviewT = html_text(html_nodes(reviewPage, css='h1.page-title'))

    #identify the content block and read the content text
    fpreviewC = html_text(html_nodes(reviewPage, css='div.bbp-topic-content'))

    #store the review titles and contents in the vectors created above
    titles = c(titles, fpreviewT)
    contents = c(contents, fpreviewC)
}
#identify the anchor tag pointing to the next summary page
#(assuming the anchor carries both classes, they are joined with a dot)
npAnchor <- html_node(contentfpURL, css='a.next.page-numbers')

#extract the HREF attribute value of the anchor tag pointing to the
#next summary page
npHREF <- html_attr(npAnchor, 'href')
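
To cover every summary page rather than only the first, one possibility is to keep following the "next" link until none is found. The following is only a rough sketch under some assumptions: the helper name collect_review_links and its arguments read_page and max_pages are made up for illustration, the "next" anchor is assumed to carry both the next and page-numbers classes, and the HREF values are assumed to be absolute URLs. The max_pages cap is there only as a safety stop.

```r
library(rvest)

# Hypothetical helper: gather the review links from every summary page by
# following the "next" anchor until a page has none.
collect_review_links <- function(start_url, read_page = read_html, max_pages = 100) {
  links <- character(0)
  url <- start_url
  pages_read <- 0

  while (!is.na(url) && pages_read < max_pages) {
    page <- read_page(url)

    # review links on the current summary page
    links <- c(links, html_attr(html_nodes(page, css = 'a.bbp-topic-permalink'), 'href'))

    # html_node() gives a missing node when nothing matches, and html_attr()
    # then returns NA, which ends the loop
    url <- html_attr(html_node(page, css = 'a.next.page-numbers'), 'href')
    pages_read <- pages_read + 1
  }

  links
}
```

The result could then replace fpHREF in the loop over review pages, for example allHREF <- collect_review_links(fpURL).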
