Introduction
Python provides several libraries for creating XML files. The built-in xml.etree.ElementTree module is the most common choice for simple XML generation. For more complex needs, lxml offers better performance and XPath support, while xml.dom.minidom provides DOM-style manipulation.
XML Structure Basics
Root Element: The single top-level tag that contains all the other elements
Child Elements: Tags nested within the root element representing data or structure
Attributes: Key-value pairs associated with an element, e.g., id="1"
Method 1: xml.etree.ElementTree (Built-in)
1import xml.etree.ElementTree as ET
2
3# Create root element
4root = ET.Element("catalog")
5
6# Add child elements
7book1 = ET.SubElement(root, "book", id="1")
8ET.SubElement(book1, "title").text = "Python Basics"
9ET.SubElement(book1, "author").text = "Alice Johnson"
10ET.SubElement(book1, "price").text = "29.99"
11
12book2 = ET.SubElement(root, "book", id="2")
13ET.SubElement(book2, "title").text = "Data Science"
14ET.SubElement(book2, "author").text = "Bob Smith"
15ET.SubElement(book2, "price").text = "39.99"
16
17# Write to file
18tree = ET.ElementTree(root)
19ET.indent(tree, space=" ") # Python 3.9+ for pretty printing
20tree.write("catalog.xml", encoding="utf-8", xml_declaration=True)
Output (catalog.xml):
1<?xml version='1.0' encoding='utf-8'?>
2<catalog>
3 <book id="1">
4 <title>Python Basics</title>
5 <author>Alice Johnson</author>
6 <price>29.99</price>
7 </book>
8 <book id="2">
9 <title>Data Science</title>
10 <author>Bob Smith</author>
11 <price>39.99</price>
12 </book>
13</catalog>
Method 2: Pretty Printing (Python < 3.9)
For Python versions before 3.9 where ET.indent() is not available:
1import xml.etree.ElementTree as ET
2from xml.dom import minidom
3
4root = ET.Element("catalog")
5book = ET.SubElement(root, "book", id="1")
6ET.SubElement(book, "title").text = "Python Basics"
7
8# Convert to pretty-printed string
9rough_string = ET.tostring(root, encoding="unicode")
10reparsed = minidom.parseString(rough_string)
11pretty_xml = reparsed.toprettyxml(indent=" ")
12
13with open("catalog.xml", "w") as f:
14 f.write(pretty_xml)
Method 3: Using lxml
lxml offers better performance and more features:
1from lxml import etree
2
3root = etree.Element("catalog")
4book = etree.SubElement(root, "book", id="1")
5etree.SubElement(book, "title").text = "Python Basics"
6etree.SubElement(book, "author").text = "Alice Johnson"
7etree.SubElement(book, "price").text = "29.99"
8
9# Write with pretty printing
10tree = etree.ElementTree(root)
11tree.write("catalog.xml", pretty_print=True,
12 xml_declaration=True, encoding="utf-8")
Install with: pip install lxml
Method 4: Using xml.dom.minidom
DOM-style creation for those familiar with the DOM API:
1from xml.dom.minidom import Document
2
3doc = Document()
4
5# Create root
6catalog = doc.createElement("catalog")
7doc.appendChild(catalog)
8
9# Create a book
10book = doc.createElement("book")
11book.setAttribute("id", "1")
12catalog.appendChild(book)
13
14title = doc.createElement("title")
15title.appendChild(doc.createTextNode("Python Basics"))
16book.appendChild(title)
17
18author = doc.createElement("author")
19author.appendChild(doc.createTextNode("Alice Johnson"))
20book.appendChild(author)
21
22# Write to file
23with open("catalog.xml", "w") as f:
24 doc.writexml(f, indent="", addindent=" ", newl="\n", encoding="utf-8")
Method 5: String Templating (Quick and Simple)
For very simple XML, string formatting works:
1books = [
2 {"id": "1", "title": "Python Basics", "author": "Alice", "price": "29.99"},
3 {"id": "2", "title": "Data Science", "author": "Bob", "price": "39.99"},
4]
5
6xml = '<?xml version="1.0" encoding="utf-8"?>\n<catalog>\n'
7for book in books:
8 xml += f' <book id="{book["id"]}">\n'
9 xml += f' <title>{book["title"]}</title>\n'
10 xml += f' <author>{book["author"]}</author>\n'
11 xml += f' <price>{book["price"]}</price>\n'
12 xml += ' </book>\n'
13xml += '</catalog>'
14
15with open("catalog.xml", "w") as f:
16 f.write(xml)
Warning: String templating does not escape special characters. Use proper XML libraries for data that may contain <, >, &, or quotes.
Building XML from Dictionaries
1import xml.etree.ElementTree as ET
2
3def dict_to_xml(tag, data):
4 elem = ET.Element(tag)
5 for key, val in data.items():
6 child = ET.SubElement(elem, key)
7 if isinstance(val, dict):
8 child.extend(dict_to_xml(key, val))
9 else:
10 child.text = str(val)
11 return elem
12
13data = {"name": "Alice", "age": "30", "city": "NYC"}
14root = dict_to_xml("user", data)
15print(ET.tostring(root, encoding="unicode"))
16# <user><name>Alice</name><age>30</age><city>NYC</city></user>
Common Pitfalls
Special characters: XML requires escaping <, >, &, ", and '. ElementTree handles this automatically when you set .text. String templating does not.
Encoding declaration: Use xml_declaration=True and encoding="utf-8" in tree.write() to include the XML declaration header.
Namespace handling: If your XML needs namespaces, register them with ET.register_namespace("prefix", "uri") before creating elements.
Empty elements: ET.SubElement(root, "empty") without setting .text creates <empty /> (self-closing). Set .text = "" for <empty></empty>.
Large files: For very large XML files, use ET.TreeBuilder or lxml.etree.xmlfile for incremental writing instead of building the entire tree in memory.
Summary
Use xml.etree.ElementTree for simple XML creation (built-in, no dependencies)
Use ET.indent() (Python 3.9+) or minidom.toprettyxml() for pretty printing
Use lxml for better performance, XPath support, and pretty_print=True
Always use XML libraries (not string formatting) when data may contain special characters
Set xml_declaration=True and encoding="utf-8" when writing to files