regex
regular expressions
capturing groups
nested groups
regex numbering

How are nested capturing groups numbered in regular expressions?

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Introduction

Nested capturing groups are numbered by the position of their opening parenthesis as the regex is read from left to right. The nesting depth does not create a separate numbering scheme; every capturing group simply gets the next number when its opening parenthesis appears.

The Core Rule

Regex engines that support numbered capture groups follow a simple rule:

  • scan the pattern from left to right
  • assign group 1, 2, 3, and so on whenever a capturing ( is opened

That means outer and inner groups are mixed into one numbering sequence based purely on opening-parenthesis order.

Example With Nesting

Take this pattern:

regex
((a)(b(c)))

The group numbers are:

  1. ((a)(b(c)))
  2. (a)
  3. (b(c))
  4. (c)

Notice that group 2 is not "the next outer group." It is just the second opening capturing parenthesis encountered while scanning left to right.

Python Demonstration

Python exposes the numbered groups clearly.

python
1import re
2
3pattern = re.compile(r"((a)(b(c)))")
4match = pattern.match("abc")
5
6for i in range(0, 5):
7    print(i, match.group(i))

Expected output:

text
10 abc
21 abc
32 a
43 bc
54 c

Group 0 is always the full match. The actual capturing groups start at 1.

Non-Capturing Groups Do Not Count

A non-capturing group uses (?:...). It groups part of the pattern for precedence or repetition but does not receive a capture number.

python
1import re
2
3pattern = re.compile(r"(?:x)(y(z))")
4match = pattern.match("xyz")
5
6print(match.group(1))
7print(match.group(2))

Here, the (?:x) group does not consume a group number. The numbering starts at (y(z)).

This is one of the easiest ways to keep numbering stable when you need grouping but do not need extraction.

Named Groups Still Have Numbers

Named groups improve readability, but in most regex flavors they still occupy numbered positions.

python
1import re
2
3pattern = re.compile(r"(?P<outer>(a)(?P<inner>b))")
4match = pattern.match("ab")
5
6print(match.group(1))
7print(match.group(2))
8print(match.group(3))
9print(match.group("outer"))
10print(match.group("inner"))

The groups have both names and numeric indices. Naming them does not remove them from the numbering sequence.

Backreferences Depend On Numbering

Numbering matters because backreferences use it.

python
1import re
2
3pattern = re.compile(r"(ha)+-(ha)")
4print(bool(pattern.fullmatch("haha-ha")))

In more complex regexes, introducing a new capturing group earlier in the pattern can shift every later group number. That is one reason non-capturing groups are useful in maintenance-heavy patterns.

Common Pitfalls

The most common mistake is assuming nested groups are numbered by depth. They are not. Only the left-to-right order of opening parentheses matters.

Another mistake is forgetting that non-capturing groups do not count. If one pattern uses (...) and another uses (?:...), later group numbers can differ even when the visible structure looks similar.

A third issue is editing a regex and adding a capturing group near the front without updating later backreferences or code that reads group(3), group(4), and so on.

Summary

  • Capturing groups are numbered by opening parenthesis order from left to right.
  • Nested groups do not have a separate numbering scheme.
  • Group 0 is the full match, and numbered captures start at 1.
  • Non-capturing groups (?:...) do not consume a group number.
  • Named groups usually still have numeric positions as well.
  • Backreferences such as \1 and \2 depend directly on this numbering, so adding one capture earlier in the pattern can shift later references.
  • When numbering stability matters, prefer non-capturing groups for structural parentheses you do not need to extract.
  • Test complex patterns in the target regex engine, because replacement syntax and named-group support vary even when capture numbering rules are similar.

Course illustration
Course illustration

All Rights Reserved.