How To Get Contents Of HTML Script Tag
I'm trying to scrape geo data from a URL as scraping practice, but I'm having trouble handling the contents of a script tag. These are the contents of the script tag:
Solution 1:
I don't understand what you're trying to do with the repeated xpath queries //item/title/text(). Note that XPath is meant for navigating HTML or XML structure. The content of the <script> tag in your question is JSON, not HTML, so XPath can select the tag's text but cannot query anything inside it.
As a first step, you can get the content of the <script> tag (this assumes tree is a Scrapy-style selector, where .extract() returns a list of strings):
content = tree.xpath('//script/text()').extract()[0]
You can then use the json package to load the JSON content into a Python dictionary:
import json
d = json.loads(content)
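The two steps above can be sketched end to end as follows. This is only an illustration under stated assumptions: it swaps the XPath query for the standard library's html.parser so that it runs without third-party packages, and the ScriptCollector class, the sample page, and its field names are all made up for the example rather than taken from the question.

```python
import json
from html.parser import HTMLParser


class ScriptCollector(HTMLParser):
    """Collect the raw text of every <script> element in a page."""

    def __init__(self):
        super().__init__()
        self._in_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Text inside <script> is delivered here unparsed.
        if self._in_script:
            self.scripts[-1] += data


# Hypothetical sample page; the field names are invented for illustration.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"name": "Example Place", "geo": {"latitude": 52.52, "longitude": 13.405}}
</script>
</head><body></body></html>
"""

collector = ScriptCollector()
collector.feed(SAMPLE_HTML)
collector.close()

# Step 1: take the text of the first <script> tag.
content = collector.scripts[0]
# Step 2: parse that text as JSON into a Python dictionary.
d = json.loads(content)
print(d["geo"]["latitude"], d["geo"]["longitude"])
```

Once the content is a dictionary, the geo values are reached with ordinary key lookups instead of further XPath queries.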
Also note that the JSON in the <script> in your example is not valid: it's missing a closing brace. The above method only works with valid content.
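Since json.loads raises an exception on malformed input, it can help to catch that error and report where parsing stopped. The sketch below assumes nothing about the real page; parse_script_json is a hypothetical helper, and both input strings are invented, with the second one missing its final closing brace.

```python
import json


def parse_script_json(content):
    """Return the parsed JSON value, or None if content is not valid JSON."""
    try:
        return json.loads(content)
    except json.JSONDecodeError as err:
        # err.lineno, err.colno and err.msg describe where parsing failed.
        print(f"Invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}")
        return None


# Hypothetical inputs: the second one lacks its final closing brace.
valid = '{"geo": {"latitude": 52.52, "longitude": 13.405}}'
truncated = '{"geo": {"latitude": 52.52, "longitude": 13.405}'

print(parse_script_json(valid))
print(parse_script_json(truncated))
```

Returning None here is just one design choice; re-raising the error or logging the raw content may suit a real scraper better.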