Hidden Phone Number Can't Be Scraped

June 25, 2024

I've been having trouble extracting the phone number that appears after clicking the 'llamar' ("call") button. So far I've tried XPath with Selenium and also Beautiful Soup.

Solution 1:

The phone number is stored inside the page's JavaScript. You can use the re module to extract it:

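import re
import requests
from bs4 import BeautifulSoup

url = "https://www.milanuncios.com/venta-de-pisos-en-malaga-malaga/portada-alta-carlos-de-haya-carranque-386352344.htm"
# Contact-details endpoint; {} is filled with the numeric ad ID.
phone_url = "https://www.milanuncios.com/datos-contacto/?usePhoneProxy=0&from=detail&includeEmail=false&id={}"

# The ad ID is the run of digits just before ".htm" in the listing URL.
ad_id = re.search(r"(\d+)\.htm", url).group(1)

html_text = requests.get(phone_url.format(ad_id)).text

soup = BeautifulSoup(html_text, "html.parser")
# The number is the argument of a getTrackingPhone(...) call in the returned JavaScript.
phone = re.search(r"getTrackingPhone\((.*?)\)", html_text).group(1)

# .texto holds the contact text shown next to the number.
print(soup.select_one(".texto").get_text(strip=True), phone)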

Prints:

ana (Particular) 639....
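Both re.search calls above return None when the pattern is not found, so .group(1) raises AttributeError if the page layout changes. A minimal sketch of a more defensive variant follows. The helper names extract_ad_id and extract_phone, the example.com URL and the sample HTML are hypothetical and used only for illustration. Stripping quotes around the argument is an assumption about how the number may be written in the script.

```python
import re
from typing import Optional

# Same patterns as in the answer above, compiled once.
AD_ID_RE = re.compile(r"(\d+)\.htm")
PHONE_RE = re.compile(r"getTrackingPhone\((.*?)\)")


def extract_ad_id(url: str) -> Optional[str]:
    """Return the digits just before '.htm', or None if absent."""
    match = AD_ID_RE.search(url)
    return match.group(1) if match else None


def extract_phone(html_text: str) -> Optional[str]:
    """Return the argument of getTrackingPhone(...), or None if absent."""
    match = PHONE_RE.search(html_text)
    if match is None:
        return None
    # Assumption: the argument may be wrapped in quotes, so strip them.
    return match.group(1).strip().strip("'\"")


# Hypothetical inputs, not real site data.
sample_url = "https://example.com/some-listing-123456789.htm"
sample_html = "<script>getTrackingPhone('600000000');</script>"

print(extract_ad_id(sample_url))
print(extract_phone(sample_html))
print(extract_phone("<p>no phone here</p>"))
```

Returning None instead of raising lets the caller decide whether a missing number should be skipped, logged or retried.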

Solution 2:

With Selenium, you need to click the button and then switch to the iframe that contains the number.

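from selenium import webdriver
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

url = "https://www.milanuncios.com/venta-de-pisos-en-malaga-malaga/portada-alta-carlos-de-haya-carranque-386352344.htm"
driver = webdriver.Chrome()  # assumption: any WebDriver works here
driver.get(url)
wait = WebDriverWait(driver, 10)  # the 10-second timeout is an arbitrary choice

# Wait until the phone button is clickable, then click it.
tel_button = wait.until(EC.element_to_be_clickable(
            (By.CSS_SELECTOR, ".def-btn.phone-btn")))
tel_button.click()
# The number is rendered inside an iframe, so switch into it first.
wait.until(EC.frame_to_be_available_and_switch_to_it((By.ID, "ifrw")))
tel_number = wait.until(EC.visibility_of_element_located(
            (By.CSS_SELECTOR, ".texto>.telefonos"))).text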

Note that I used more stable locators.
