Suppressing the default namespace in ElementTree

Monday, July 7th, 2014

This solution removes the default namespace from the input XML, which makes ElementTree XPath expressions much easier to type. The trick is to remove the xmlns="…" attribute from the input XML before parsing it. Most other workarounds suggest traversing the tree of elements after parsing and modifying the tag of each element.
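For comparison, the post-parse workaround mentioned above can be sketched roughly as follows. The sample document, its element names and the `strip_namespaces` helper are assumptions made up for illustration; only the `xml.etree.ElementTree` calls come from the standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the namespace URI is a placeholder.
SAMPLE = (
    '<Root xmlns="http://some/namespace">'
    '<Coverage><Extent><x>10</x></Extent></Coverage>'
    '</Root>'
)

def strip_namespaces(root):
    """Remove the '{uri}' prefix from every element tag, in place."""
    for elem in root.iter():
        # Only string tags that start with '{' carry a namespace prefix.
        if isinstance(elem.tag, str) and elem.tag.startswith("{"):
            elem.tag = elem.tag.split("}", 1)[1]
    return root

root = strip_namespaces(ET.fromstring(SAMPLE))
extent_x = root.find(".//Coverage/Extent/x")
print(extent_x.text)
```

This works, but it visits every element after parsing, which is the extra step the approach in this post avoids.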

Replace:

import xml.etree.ElementTree as ET

# Parse the contents of the XML file
tree = ET.parse(xmlFile)

# Get the root element
root = tree.getroot()

# Use fully qualified tag names in path expressions
# (every unprefixed element inherits the default namespace, including x)
extentX = root.find(".//{http://some/namespace}Coverage/{http://some/namespace}Extent/{http://some/namespace}x")

with:

import xml.etree.ElementTree as ET
import re # regular expression module

# Read the contents of the XML file into xmlstring
with open(xmlFile) as f:
    xmlstring = f.read()

# Remove the default namespace definition (xmlns="http://some/namespace")
# count=1 limits the substitution to the first match
xmlstring = re.sub(r'\sxmlns="[^"]+"', '', xmlstring, count=1)

# Parse the XML string
root = ET.fromstring(xmlstring)

# Use much simpler path expressions
extentX = root.find(".//Coverage/Extent/x")
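To try this without a file on disk, here is a small self-contained sketch of the same technique. The sample XML and the helper name `parse_without_default_namespace` are made up for illustration. Note that the pattern only matches a double-quoted xmlns attribute; a single-quoted one would be left in place, and prefixed declarations such as xmlns:foo are not touched.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample document; the namespace URI is a placeholder.
SAMPLE = """<Root xmlns="http://some/namespace">
  <Coverage>
    <Extent>
      <x>10</x>
    </Extent>
  </Coverage>
</Root>"""

def parse_without_default_namespace(xmlstring):
    """Strip the first xmlns="..." attribute, then parse the string."""
    cleaned = re.sub(r'\sxmlns="[^"]+"', '', xmlstring, count=1)
    return ET.fromstring(cleaned)

# With the namespace still present, the short path finds nothing.
namespaced_root = ET.fromstring(SAMPLE)
not_found = namespaced_root.find(".//Coverage/Extent/x")

# After stripping the declaration, the same path matches.
root = parse_without_default_namespace(SAMPLE)
extent_x = root.find(".//Coverage/Extent/x")
print(not_found, extent_x.text)
```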

Thanks to this post on the always useful StackOverflow for suggesting the solution.