
lxml/Python: Reading XML With a CDATA Section

In my xml I have a CDATA section. I want to keep the CDATA part, and then strip it. Can someone help with the following? Default does not work: $ from io import StringIO $ from lxm

Solution 1:

As you have noticed, the text property of an element returns only the character content, without the `<![CDATA[ ... ]]>` wrapper, even if the XML was parsed with strip_cdata=False. See https://lxml.de/api.html#cdata.

If the content was parsed with strip_cdata=False, the CDATA sections are preserved in the serialized output in these cases:

  1. When serializing with tostring():

    print(etree.tostring(tree.getroot(), encoding="UTF-8").decode())
    
  2. When writing to a file:

    tree.write("subject.xml", encoding="UTF-8")
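
The behaviour described above can be sketched end to end as follows. This is a minimal illustration, not the asker's actual code: the sample document, the `subject` element name, and the variable names are assumptions made up for the example.

```python
from io import StringIO
from lxml import etree

# Hypothetical sample document containing a CDATA section.
xml = "<root><subject><![CDATA[a < b & c]]></subject></root>"

# Parse with strip_cdata=False so the parser keeps CDATA nodes in the tree.
parser = etree.XMLParser(strip_cdata=False)
tree = etree.parse(StringIO(xml), parser)

# .text gives the plain character content; the CDATA wrapper is not part of it.
text = tree.getroot().find("subject").text

# Serialization keeps the CDATA section as written.
serialized = etree.tostring(tree.getroot(), encoding="UTF-8").decode()

# For comparison: the default parser turns CDATA into ordinary text,
# so the special characters come back escaped instead.
default_tree = etree.parse(StringIO(xml))
default_serialized = etree.tostring(default_tree.getroot(), encoding="UTF-8").decode()

print(text)
print(serialized)
print(default_serialized)
```

So "keeping" the CDATA is a property of the serialized output, while "stripping" it amounts to reading `.text`. If a CDATA wrapper is needed around text that is set from code, lxml provides `etree.CDATA(...)`, which can be assigned to an element's `text`.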
    
