RSelenium: Unable to Extract Hrefs from Page After Button Click

I’m trying to automate web scraping using RSelenium in R. I’ve successfully located and clicked a button on a webpage using RSelenium, but I’m having trouble extracting href attributes from the page after the button click.

Here’s the code I’m using:

library(RSelenium)
remDr <- remoteDriver(
  remoteServerAddr = "localhost",
  port = 4445L,
  browserName = "firefox"
)

remDr$open()

remDr$navigate("https://ser-sid.org/")

# Locate the search form container
webElem <- remDr$findElement(using = "class", "flex")

# Find the input field and button within webElem
input_element <- webElem$findChildElement(using = "css selector", value = "input[type='text']")
button_element <- webElem$findChildElement(using = "css selector", value = "button")

# Enter species name into the input field

input_element$sendKeysToElement(list("Abies balsamea"))

# Click the button to submit the form
button_element$clickElement()

# Wait for the results to load
Sys.sleep(5)

# Find all <a> elements with species information
species_links <- remDr$findElements(using = "css selector", value = "a[href^='/species/']")

# Extract the href attributes from the species links
hrefs <- unlist(lapply(species_links, function(link) {
  link$getElementAttribute("href")
}))

# Drop missing values (in case some links don't have href attributes)
hrefs <- hrefs[!is.na(hrefs)]

# Print the extracted hrefs
print(hrefs)

The code runs without errors, but species_links comes back empty, so the elements containing the species information are never located.
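As a quick diagnostic, I also dumped the rendered page source after the click to check whether the /species/ links exist in the main document at all, or whether the results might be sitting inside an iframe (this is just a sketch of what I ran):

```r
library(RSelenium)

# Dump the rendered HTML after the click and look for '/species/' hrefs
page_src <- remDr$getPageSource()[[1]]
any_species <- grepl("/species/", page_src, fixed = TRUE)
print(any_species)

# Check whether the results could live inside an iframe
frames <- remDr$findElements(using = "css selector", value = "iframe")
print(length(frames))
```

If any_species is FALSE and there are no iframes, that would suggest the results are rendered client-side after my Sys.sleep() has already elapsed, or that the click never actually submitted the search.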

I’ve tried waiting for the page to load after clicking the button, but the expected content never seems to appear in the page source, so either it isn’t fully loading or the page isn’t structured the way I assumed.
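For reference, this is the explicit polling wait I tried in place of the fixed Sys.sleep(5) (wait_for_links is just a helper name I made up; the selector is the same one as above):

```r
# Hypothetical helper: poll for matching elements instead of sleeping blindly.
# Retries every `interval` seconds until `timeout` seconds have passed.
wait_for_links <- function(remDr, css, timeout = 20, interval = 0.5) {
  deadline <- Sys.time() + timeout
  repeat {
    links <- remDr$findElements(using = "css selector", value = css)
    if (length(links) > 0) return(links)
    if (Sys.time() > deadline) return(list())  # give up after the timeout
    Sys.sleep(interval)
  }
}

species_links <- wait_for_links(remDr, "a[href^='/species/']")
```

Even with the 20-second timeout, the helper returns an empty list, which is why I suspect the content isn't loading the way I expect.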

How can I troubleshoot this issue and ensure that I can extract hrefs from the page after clicking the button?

Thank you for your help!
