
How To Get Contents Of HTML Script Tag

I'm trying to scrape the geo data from a URL as scraping practice, but I'm having trouble handling the contents of a script tag. The contents of the script tag are as follows:

Solution 1:

I don't understand what you're trying to do with the repeated XPath queries //item/title/text(). Note that XPath is meant for selecting nodes in HTML or XML. It can select the <script> element itself, but the text inside the <script> tag in your question is JSON, not HTML, so XPath cannot query inside it.

As a first step, you can get the content of the <script> tag as a string:

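# Assumes tree is a Scrapy Selector; with lxml, xpath() already returns a list of strings, so omit .extract()
content = tree.xpath('//script/text()').extract()[0]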

Then you can use the json package to load the JSON content into a Python dictionary:

import json
d = json.loads(content)
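For reference, the two steps can be sketched end to end. This is only a minimal sketch under stated assumptions: it swaps XPath for the standard library's html.parser so it runs without third-party packages, and the page in SAMPLE_HTML, the ScriptCollector class, and the script_json helper are all hypothetical, since the real script contents are not shown in the question.

```python
import json
from html.parser import HTMLParser


class ScriptCollector(HTMLParser):
    """Collect the raw text inside every <script> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Script text may arrive in several chunks, so accumulate it.
        if self.in_script:
            self.scripts[-1] += data


# Hypothetical page: the field names and values are made up for illustration.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"geo": {"latitude": 12.5, "longitude": 77.25}}
</script>
</head><body></body></html>
"""


def script_json(html):
    """Return the first <script> body parsed as JSON."""
    parser = ScriptCollector()
    parser.feed(html)
    parser.close()
    return json.loads(parser.scripts[0])


d = script_json(SAMPLE_HTML)
print(d["geo"]["latitude"])
```

Once the content is a dictionary, the geo values are reached with ordinary key lookups rather than further queries against the page.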

Also note that the JSON in the <script> in your example is not valid: it is missing a closing brace. The above method only works with valid JSON.
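If the script content might be malformed, one option is to catch the parse error instead of letting it stop the scraper. The following is a small sketch, not a definitive fix: parse_script_json is a hypothetical helper name, and both sample strings are made up to show a valid input and one with a missing closing brace.

```python
import json


def parse_script_json(content):
    """Return the parsed object, or None if content is not valid JSON."""
    try:
        return json.loads(content)
    except json.JSONDecodeError as err:
        # err.lineno and err.colno point at where parsing failed.
        print(f"Invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}")
        return None


# Hypothetical inputs for illustration.
valid = '{"geo": {"lat": 1.5}}'
broken = '{"geo": {"lat": 1.5}'  # missing the final closing brace

print(parse_script_json(valid))
print(parse_script_json(broken))
```

This only reports the problem; it does not repair the JSON, so a page that really serves broken content still needs to be handled case by case.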

