Scraping Dynamic Information

May 29, 2024

I recently started coding, using Python and PyCharm. I installed and imported the needed add-ons, such as Selenium. For my first project I tried to get the 'address' information

Solution 1:

If you want the element's address, just get the element and print its text.

driver.get("https://randomstreetview.com/")
wait = WebDriverWait(driver, 10)
elem = wait.until(EC.presence_of_element_located((By.ID, "address")))
print(elem.text)

Element

<div id="address">Nordre Ringvej 97, 2600 Glostrup, Dänemark</div>

Outputs

Nordre Ringvej 97, 2600 Glostrup, Dänemark

Imports

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

Solution 2:

To print the text value, you can use any of the following locator strategies:

Using id and get_attribute("textContent"):

driver.get("https://randomstreetview.com/#fullscreen")
print(driver.find_element(By.ID, "address").get_attribute("textContent"))

Using css_selector and get_attribute("innerHTML"):

driver.get("https://randomstreetview.com/#fullscreen")
print(driver.find_element(By.CSS_SELECTOR, "div#address").get_attribute("innerHTML"))

Using xpath and text attribute:

driver.get("https://randomstreetview.com/#fullscreen")
print(driver.find_element(By.XPATH, "//div[@id='address']").text)

Ideally, you should use WebDriverWait with presence_of_element_located() so that the element is in the DOM before you read it. You can use any of the following locator strategies:

Using ID and get_attribute("textContent"):

print(WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.ID, "address"))).get_attribute("textContent"))

Using CSS_SELECTOR and text attribute:

print(WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.CSS_SELECTOR, "div#address"))).text)

Using XPATH and get_attribute():

print(WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.XPATH, "//div[@id='address']"))).get_attribute("innerHTML"))

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

Console Output:

Ciudad Pérdida 10, La Sabana, 39799 Acapulco, Guerrero, Mexico

You can find a relevant discussion in "How to retrieve the text of a WebElement using Selenium - Python".
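presence_of_element_located() only confirms that the element exists in the DOM; on a dynamic page its text may still be empty at that moment. As a minimal sketch, a custom wait condition can keep polling until textContent is non-empty. The helper name text_content_present is hypothetical, and the sketch assumes the driver exposes find_element() and the element exposes get_attribute(), as Selenium's do.

```python
def text_content_present(locator):
    """Build a wait condition that returns the element's non-empty textContent.

    `locator` is a (strategy, value) tuple such as (By.ID, "address").
    The returned callable yields False while the text is missing, which
    makes WebDriverWait.until() keep polling until the timeout.
    """
    def _condition(driver):
        element = driver.find_element(*locator)
        text = element.get_attribute("textContent")
        if text and text.strip():
            return text.strip()
        return False

    return _condition


# Usage with Selenium (assumes `driver` has already been created):
#   from selenium.webdriver.common.by import By
#   from selenium.webdriver.support.ui import WebDriverWait
#   address = WebDriverWait(driver, 20).until(
#       text_content_present((By.ID, "address"))
#   )
#   print(address)
```

WebDriverWait.until() accepts any callable that takes the driver, so the condition can be passed in place of the EC.* helpers.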

References

Links to useful documentation:

get_attribute() method: gets the given attribute or property of the element.
text attribute: returns the text of the element.
Difference between text and innerHTML using Selenium
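To make the innerHTML versus textContent distinction concrete without a browser, the sketch below approximates textContent by keeping only the text nodes of an innerHTML string, using Python's standard html.parser. The sample markup and the helper name approximate_text_content are made up for illustration; a real browser computes textContent from the DOM, so treat this as a rough model only.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects text nodes and discards the tags around them."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def approximate_text_content(inner_html):
    """Return the concatenated text nodes of an HTML fragment."""
    collector = _TextCollector()
    collector.feed(inner_html)
    collector.close()
    return "".join(collector.parts)


# Hypothetical innerHTML value: markup is included as-is.
inner_html = "Nordre Ringvej 97, <b>2600 Glostrup</b>, Dänemark"
print(inner_html)
# The textContent-like view has the same text with the tags removed.
print(approximate_text_content(inner_html))
# prints: Nordre Ringvej 97, 2600 Glostrup, Dänemark
```

For an element that contains only plain text, such as the address div shown earlier, both values come out the same.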

Solution 3:

To piggyback on Arundeep Chohan's answer: the reason you are unable to get the address is that it is a hidden element.

Check out this post: "Python Selenium: Finds h1 element but returns empty text string".

TL;DR: "text property allow you to get text from only visible elements while textContent attribute also allow to get text of hidden one..."

This code also works, using a CSS selector:

element = WebDriverWait(driver,10).until(EC.presence_of_element_located((By.CSS_SELECTOR, 'div#address')))

print(element.get_attribute('textContent'))
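If you are not sure in advance whether the element is visible, one option is to try the text property first and fall back to textContent when it comes back empty. This is a small sketch, not part of the answers above; the helper name element_text is hypothetical, and it assumes the object passed in behaves like a Selenium WebElement (a text property and a get_attribute() method).

```python
def element_text(element):
    """Return the element's visible text, or its textContent if that is empty.

    `element` is expected to behave like a Selenium WebElement.
    """
    visible = element.text
    if visible and visible.strip():
        return visible.strip()
    # Hidden elements yield an empty .text, so read the DOM property instead.
    raw = element.get_attribute("textContent")
    return raw.strip() if raw else ""


# Usage with the `element` located above:
#   print(element_text(element))
```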
