Get HTML source of WebElement in Selenium WebDriver using Python

Introduction

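If you need the HTML for one specific DOM node in Selenium, the usual answer is to locate the element and read its outerHTML or innerHTML. Strictly speaking these are DOM properties rather than HTML attributes, but Selenium's get_attribute checks for a property of that name first, so get_attribute("outerHTML") works. The important part is not the call itself, but making sure you read it after the page, frame, and dynamic content are in the state you expect.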

Use outerHTML for the full element

outerHTML returns the element's start tag, its contents, and its end tag. That makes it the closest equivalent to "give me this element's HTML source."

python
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://example.com")

# Locate the element, then read its full markup (tag plus children).
card = driver.find_element(By.CSS_SELECTOR, "div.example")
html = card.get_attribute("outerHTML")

print(html)
driver.quit()

If you only want the markup inside the element, use innerHTML instead:

python
inner = card.get_attribute("innerHTML")
print(inner)

That distinction matters. outerHTML includes the element itself, while innerHTML includes only the children.
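
To keep the choice between the two explicit in your own code, you could wrap it in a small helper. This is only a sketch: element_html and its include_self parameter are names made up for this example, not part of Selenium.

```python
def element_html(element, include_self=True):
    """Return an element's markup through Selenium's get_attribute.

    include_self=True  -> outerHTML (the element's own tag plus its children)
    include_self=False -> innerHTML (the children only)

    `element` is expected to be a Selenium WebElement; this helper is a
    hypothetical convenience wrapper, not a Selenium API.
    """
    name = "outerHTML" if include_self else "innerHTML"
    return element.get_attribute(name)
```

For an element whose markup is `<div class="example"><p>Hi</p></div>`, element_html(card) would give the whole string, while element_html(card, include_self=False) would give only `<p>Hi</p>`.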

Wait for the right DOM state first

Many failures happen because the element exists but has not been populated yet by JavaScript. In that case, Selenium will return HTML, but not the final HTML you expected.

Use WebDriverWait before reading the attribute:

python
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get("https://example.com")

# Wait up to 10 seconds for the element to be present and visible.
wait = WebDriverWait(driver, 10)
card = wait.until(
    EC.visibility_of_element_located((By.CSS_SELECTOR, "div.example"))
)

html = card.get_attribute("outerHTML")
print(html)
driver.quit()

If the page updates after the first render, you may need to wait for a text value, a class name, or some other condition that tells you the element is truly ready.
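
One way to express such a condition is a custom callable passed to WebDriverWait.until, which accepts any function that takes the driver and returns a truthy value when it is done. The sketch below waits until the element's outerHTML contains a marker string. The name html_contains and the marker you pass in are assumptions made for illustration.

```python
def html_contains(locator, marker):
    """Build a wait condition for WebDriverWait.until.

    `locator` is a (by, value) tuple such as (By.CSS_SELECTOR, "div.example").
    `marker` is any substring that signals the element is ready; which
    substring to use depends entirely on your page (hypothetical here).
    """
    def _condition(driver):
        element = driver.find_element(*locator)
        html = element.get_attribute("outerHTML")
        # Return the HTML once the marker is present; returning False
        # tells WebDriverWait to keep polling until the timeout.
        return html if html and marker in html else False

    return _condition
```

With a real driver this could be used as html = WebDriverWait(driver, 10).until(html_contains((By.CSS_SELECTOR, "div.example"), "loaded")), where "loaded" stands in for whatever text or class name your page adds when it finishes rendering. If the page replaces the element while you are polling, you may also need to handle stale element errors.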

JavaScript execution is a useful fallback

In most cases, get_attribute("outerHTML") is enough. If you want to be explicit or compare behavior, you can also ask the browser directly through JavaScript:

python
html = driver.execute_script(
    "return arguments[0].outerHTML;",
    card,
)
print(html)

This can help in debugging when you want to confirm whether the issue is in Selenium's attribute access or in your timing and locator logic. In normal usage, either approach is fine.
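
For that kind of debugging, it can be handy to read the markup both ways in one step and compare. The helper below is a rough sketch, and its name is invented for this example. If the two values match and the markup is still not what you expected, the cause is more likely timing or the locator than the way you read the HTML.

```python
def read_outer_html_both_ways(driver, element):
    """Read outerHTML through get_attribute and through JavaScript.

    Returns (via_attribute, via_script, same), where `same` is True when
    both reads produced identical strings. Hypothetical debugging helper.
    """
    via_attribute = element.get_attribute("outerHTML")
    via_script = driver.execute_script("return arguments[0].outerHTML;", element)
    return via_attribute, via_script, via_attribute == via_script
```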

Frames and shadow DOM change the workflow

If the target element is inside an iframe, you must switch into that frame before locating the element.

python
from selenium.webdriver.common.by import By

# Switch into the iframe before looking for elements inside it.
frame = driver.find_element(By.CSS_SELECTOR, "iframe")
driver.switch_to.frame(frame)

element = driver.find_element(By.ID, "target")
print(element.get_attribute("outerHTML"))

# Return to the top-level document when done.
driver.switch_to.default_content()
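
If an exception is raised while you are inside the frame, the driver stays switched into it and later lookups fail in confusing ways. One possible guard is a small context manager that always switches back. This is a sketch: inside_frame is a made-up name, and it assumes you want to return to the top-level document rather than to a parent frame.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame):
    """Switch into `frame` for the duration of a with-block.

    `frame` can be anything driver.switch_to.frame accepts, such as the
    iframe WebElement. Hypothetical helper, not part of Selenium.
    """
    driver.switch_to.frame(frame)
    try:
        yield driver
    finally:
        # Runs even if the body raises, so the driver never stays
        # stuck inside the iframe.
        driver.switch_to.default_content()
```

Inside a with inside_frame(driver, frame): block you would locate the element and read its outerHTML as usual. For nested frames, switching to the parent frame instead of the default content may be the better fit.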

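If the element is inside a shadow DOM, a CSS selector run from the top-level document will not match it, because selectors do not cross shadow boundaries. You need to reach the host element's shadow root first, either through JavaScript or through Selenium's own shadow root support (the shadow_root property on a WebElement in recent Selenium 4 releases), depending on your driver and browser version.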
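
As a sketch of the JavaScript route, the helper below reads the outerHTML of a node inside a host element's shadow root. It assumes the shadow root is open, since a closed root is not exposed through shadowRoot. The function name and the selector you pass in are illustrative.

```python
# Script run in the browser: arguments[0] is the shadow host element,
# arguments[1] is a CSS selector evaluated inside its shadow root.
_SHADOW_OUTER_HTML_JS = """
const root = arguments[0].shadowRoot;
if (!root) { return null; }
const node = root.querySelector(arguments[1]);
return node ? node.outerHTML : null;
"""


def shadow_outer_html(driver, host, selector):
    """Return the outerHTML of the first match inside host's shadow root.

    Returns None when the host has no open shadow root or when nothing
    matches. Hypothetical helper, not part of Selenium.
    """
    return driver.execute_script(_SHADOW_OUTER_HTML_JS, host, selector)
```

Here host is the element that owns the shadow root, located with a normal find_element call from the document (or from inside the right frame).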

Know when raw HTML is the wrong tool

Sometimes people ask for a single element's HTML because they are trying to debug a test. If that is the case, you may not need to extract HTML at all. It can be simpler to assert on visible text, class names, or specific child nodes. Reading and comparing large chunks of HTML is useful, but it can also make tests brittle if harmless markup changes occur.
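
For example, a class-name check can replace a comparison against a whole HTML string. The helper below is a minimal sketch with an invented name. It splits the class attribute on whitespace so that a class such as "active" does not accidentally match "inactive".

```python
def has_class(element, class_name):
    """Return True if `class_name` is one of the element's CSS classes.

    Reads the class attribute through get_attribute and compares whole
    class tokens rather than substrings. Hypothetical helper.
    """
    classes = (element.get_attribute("class") or "").split()
    return class_name in classes
```

A test could then assert has_class(card, "active") or check card.text for an expected string, and both keep passing when unrelated markup around the element changes.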

Still, for scraping, debugging, or snapshotting a DOM fragment, outerHTML is the right primitive.

Common Pitfalls

  • Using page_source when the real goal is the HTML for one specific element.
  • Reading outerHTML too early, before JavaScript has finished updating the DOM.
  • Forgetting to switch into an iframe before locating the element.
  • Confusing innerHTML with outerHTML and getting more or less markup than intended.
  • Building assertions around large raw HTML strings when a smaller, more stable DOM check would be better.

Summary

  • Use element.get_attribute("outerHTML") to get the full HTML for a Selenium WebElement.
  • Use innerHTML only when you want the content inside the element, not the element wrapper.
  • Wait for the correct DOM state before reading HTML from dynamic pages.
  • Switch into frames first if the element lives inside an iframe.
  • Extracting raw HTML is useful for debugging and scraping, but it is not always the best assertion strategy for tests.
