Python
XML
Programming
Tutorial
File Handling

Creating a simple XML file using python

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Introduction

Python provides several libraries for creating XML files. The built-in xml.etree.ElementTree module is the most common choice for simple XML generation. For more complex needs, lxml offers better performance and XPath support, while xml.dom.minidom provides DOM-style manipulation.

XML Structure Basics

  • Root Element: The single top-level tag that contains all the other elements
  • Child Elements: Tags nested within the root element representing data or structure
  • Attributes: Key-value pairs associated with an element, e.g., id="1"

Method 1: xml.etree.ElementTree (Built-in)

python
1import xml.etree.ElementTree as ET
2
3# Create root element
4root = ET.Element("catalog")
5
6# Add child elements
7book1 = ET.SubElement(root, "book", id="1")
8ET.SubElement(book1, "title").text = "Python Basics"
9ET.SubElement(book1, "author").text = "Alice Johnson"
10ET.SubElement(book1, "price").text = "29.99"
11
12book2 = ET.SubElement(root, "book", id="2")
13ET.SubElement(book2, "title").text = "Data Science"
14ET.SubElement(book2, "author").text = "Bob Smith"
15ET.SubElement(book2, "price").text = "39.99"
16
17# Write to file
18tree = ET.ElementTree(root)
19ET.indent(tree, space="  ")  # Python 3.9+ for pretty printing
20tree.write("catalog.xml", encoding="utf-8", xml_declaration=True)

Output (catalog.xml):

xml
1<?xml version='1.0' encoding='utf-8'?>
2<catalog>
3  <book id="1">
4    <title>Python Basics</title>
5    <author>Alice Johnson</author>
6    <price>29.99</price>
7  </book>
8  <book id="2">
9    <title>Data Science</title>
10    <author>Bob Smith</author>
11    <price>39.99</price>
12  </book>
13</catalog>

Method 2: Pretty Printing (Python < 3.9)

For Python versions before 3.9 where ET.indent() is not available:

python
1import xml.etree.ElementTree as ET
2from xml.dom import minidom
3
4root = ET.Element("catalog")
5book = ET.SubElement(root, "book", id="1")
6ET.SubElement(book, "title").text = "Python Basics"
7
8# Convert to pretty-printed string
9rough_string = ET.tostring(root, encoding="unicode")
10reparsed = minidom.parseString(rough_string)
11pretty_xml = reparsed.toprettyxml(indent="  ")
12
13with open("catalog.xml", "w") as f:
14    f.write(pretty_xml)

Method 3: Using lxml

lxml offers better performance and more features:

python
1from lxml import etree
2
3root = etree.Element("catalog")
4book = etree.SubElement(root, "book", id="1")
5etree.SubElement(book, "title").text = "Python Basics"
6etree.SubElement(book, "author").text = "Alice Johnson"
7etree.SubElement(book, "price").text = "29.99"
8
9# Write with pretty printing
10tree = etree.ElementTree(root)
11tree.write("catalog.xml", pretty_print=True,
12           xml_declaration=True, encoding="utf-8")

Install with: pip install lxml

Method 4: Using xml.dom.minidom

DOM-style creation for those familiar with the DOM API:

python
1from xml.dom.minidom import Document
2
3doc = Document()
4
5# Create root
6catalog = doc.createElement("catalog")
7doc.appendChild(catalog)
8
9# Create a book
10book = doc.createElement("book")
11book.setAttribute("id", "1")
12catalog.appendChild(book)
13
14title = doc.createElement("title")
15title.appendChild(doc.createTextNode("Python Basics"))
16book.appendChild(title)
17
18author = doc.createElement("author")
19author.appendChild(doc.createTextNode("Alice Johnson"))
20book.appendChild(author)
21
22# Write to file
23with open("catalog.xml", "w") as f:
24    doc.writexml(f, indent="", addindent="  ", newl="\n", encoding="utf-8")

Method 5: String Templating (Quick and Simple)

For very simple XML, string formatting works:

python
1books = [
2    {"id": "1", "title": "Python Basics", "author": "Alice", "price": "29.99"},
3    {"id": "2", "title": "Data Science", "author": "Bob", "price": "39.99"},
4]
5
6xml = '<?xml version="1.0" encoding="utf-8"?>\n<catalog>\n'
7for book in books:
8    xml += f'  <book id="{book["id"]}">\n'
9    xml += f'    <title>{book["title"]}</title>\n'
10    xml += f'    <author>{book["author"]}</author>\n'
11    xml += f'    <price>{book["price"]}</price>\n'
12    xml += '  </book>\n'
13xml += '</catalog>'
14
15with open("catalog.xml", "w") as f:
16    f.write(xml)

Warning: String templating does not escape special characters. Use proper XML libraries for data that may contain <, >, &, or quotes.

Building XML from Dictionaries

python
1import xml.etree.ElementTree as ET
2
3def dict_to_xml(tag, data):
4    elem = ET.Element(tag)
5    for key, val in data.items():
6        child = ET.SubElement(elem, key)
7        if isinstance(val, dict):
8            child.extend(dict_to_xml(key, val))
9        else:
10            child.text = str(val)
11    return elem
12
13data = {"name": "Alice", "age": "30", "city": "NYC"}
14root = dict_to_xml("user", data)
15print(ET.tostring(root, encoding="unicode"))
16# <user><name>Alice</name><age>30</age><city>NYC</city></user>

Common Pitfalls

  • Special characters: XML requires escaping <, >, &, ", and '. ElementTree handles this automatically when you set .text. String templating does not.
  • Encoding declaration: Use xml_declaration=True and encoding="utf-8" in tree.write() to include the XML declaration header.
  • Namespace handling: If your XML needs namespaces, register them with ET.register_namespace("prefix", "uri") before creating elements.
  • Empty elements: ET.SubElement(root, "empty") without setting .text creates <empty /> (self-closing). Set .text = "" for <empty></empty>.
  • Large files: For very large XML files, use ET.TreeBuilder or lxml.etree.xmlfile for incremental writing instead of building the entire tree in memory.

Summary

  • Use xml.etree.ElementTree for simple XML creation (built-in, no dependencies)
  • Use ET.indent() (Python 3.9+) or minidom.toprettyxml() for pretty printing
  • Use lxml for better performance, XPath support, and pretty_print=True
  • Always use XML libraries (not string formatting) when data may contain special characters
  • Set xml_declaration=True and encoding="utf-8" when writing to files

Course illustration
Course illustration

All Rights Reserved.