Remove 'urllib.error.httperror: Http Error 302:' From Urlreq(url)

Hey guys what's up? :) I'm trying to scrape a website with some url parameters. If I use url1, url2, url3 it WORKS properly and it prints me the regular output I want (html) ->

Solution 1:

If you use the requests package and add a User-Agent to the headers, all 4 of those links appear to return a 200 response. So try adding the User-Agent header:

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36'}

import requests
from bs4 import BeautifulSoup as soup

# create urls
url1 = 'https://en.titolo.ch/sale'
url2 = 'https://en.titolo.ch/sale?limit=108'
url3 = 'https://en.titolo.ch/sale?category_styles=29838_21212'
url4 = 'https://en.titolo.ch/sale?category_styles=31066&limit=108'

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36'}

url_list = [url1, url2, url3, url4]

for url in url_list:
    # open a connection to each url and grab the page
    response = requests.get(url, headers=headers)
    print(response.status_code)

Output:

200
200
200
200

So:

import requests

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36'}

url = 'https://en.titolo.ch/sale?category_styles=31066&limit=108'

r = requests.get(url, headers=headers)
html = r.text
print(html)
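
If you'd rather stay with urllib (the title suggests the original code called urlopen as uReq), the same idea can probably be applied there by passing the header through a Request object. This is only a minimal sketch: build_request and fetch_html are made-up helper names, and I haven't verified that the site stops answering with the 302 once urllib sends this header.

```python
from urllib.request import Request, urlopen

# Same browser-like User-Agent string as in the requests example.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36"
)


def build_request(url, user_agent=USER_AGENT):
    """Hypothetical helper: wrap the url in a Request carrying a User-Agent."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch_html(url):
    """Hypothetical helper: download the page and return it as text.

    Needs network access; raises urllib.error.HTTPError on a failing status.
    """
    with urlopen(build_request(url)) as response:
        # Fall back to utf-8 if the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Usage (not executed here because it hits the network):
# html = fetch_html("https://en.titolo.ch/sale?category_styles=31066&limit=108")
# print(html)
```

Without a header like this, urllib sends its default "Python-urllib/x.y" User-Agent, which some sites treat differently from a browser.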
