Get HTML source of WebElement in Selenium WebDriver using Python
Introduction
If you need the HTML for one specific DOM node in Selenium, the usual answer is to locate the element and read its outerHTML or innerHTML through get_attribute. The harder part is timing: read the value only after the page, frame, and dynamic content have reached the state you expect.
Use outerHTML for the full element
outerHTML returns the element's start tag, its contents, and its end tag. That makes it the closest equivalent to "give me this element's HTML source."
If you only want the markup inside the element, use innerHTML instead:
That distinction matters. outerHTML includes the element itself, while innerHTML includes only the children.
Wait for the right DOM state first
Many failures happen because the element exists but has not been populated yet by JavaScript. In that case, Selenium will return HTML, but not the final HTML you expected.
Use WebDriverWait before reading the attribute:
If the page updates after the first render, you may need to wait for a text value, a class name, or some other condition that tells you the element is truly ready.
JavaScript execution is a useful fallback
In most cases, get_attribute("outerHTML") is enough. If you want to be explicit or compare behavior, you can also ask the browser directly through JavaScript:
This can help in debugging when you want to confirm whether the issue is in Selenium's attribute access or in your timing and locator logic. In normal usage, either approach is fine.
Frames and shadow DOM change the workflow
If the target element is inside an iframe, you must switch into that frame before locating the element.
If the element is inside shadow DOM, a CSS selector run from the top-level document will not reach it, because selectors do not cross the shadow boundary. You need to access the shadow root first, either through JavaScript or through Selenium's shadow root support, depending on your driver and browser version.
Know when raw HTML is the wrong tool
Sometimes people ask for a single element's HTML because they are trying to debug a test. If that is the case, you may not need to extract HTML at all. It can be simpler to assert on visible text, class names, or specific child nodes. Reading and comparing large chunks of HTML is useful, but it can also make tests brittle if harmless markup changes occur.
Still, for scraping, debugging, or snapshotting a DOM fragment, outerHTML is the right primitive.
Common Pitfalls
- Using page_source when the real goal is the HTML for one specific element.
- Reading outerHTML too early, before JavaScript has finished updating the DOM.
- Forgetting to switch into an iframe before locating the element.
- Confusing innerHTML with outerHTML and getting more or less markup than intended.
- Building assertions around large raw HTML strings when a smaller, more stable DOM check would be better.
Summary
- Use element.get_attribute("outerHTML") to get the full HTML for a Selenium WebElement.
- Use innerHTML only when you want the content inside the element, not the element wrapper.
- Wait for the correct DOM state before reading HTML from dynamic pages.
- Switch into frames first if the element lives inside an iframe.
- Extracting raw HTML is useful for debugging and scraping, but it is not always the best assertion strategy for tests.

