Regular Expressions
Programming Concepts
Computer Science
Syntax Understanding
Lazy and Greedy Quantifiers

What do 'lazy' and 'greedy' mean in the context of regular expressions?

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Regular expressions (regex) are a powerful tool for matching patterns within text. Two essential concepts in the context of regex are "lazy" and "greedy" quantifiers, which control how much text a regex pattern should attempt to consume. Understanding the difference between these two can greatly affect the performance of regex-based operations and the results they produce.

Greedy Quantifiers

Greedy quantifiers try to match as much text as possible. They essentially "consume" as much of the input as they can, consistent with the rest of the regular expression still allowing a match. The most common greedy quantifiers are:

  • * (match 0 or more times)
  • + (match 1 or more times)
  • ? (match 0 or 1 time)
  • {n,} (match at least 'n' times)

Example of Greedy Matching

Consider the regex ".*" applied to the string "abcde". This regex uses .*, which is a greedy quantifier aiming to match any character (.) zero or more times (*). In this case, .* will match the entire string "abcde" because it consumes as much as possible.

In more complex strings, such as "abc <tag>def</tag> ghi", using a greedy quantifier with the regex <.*> to find tags would result in a match of <tag>def</tag> — this happens because the quantifier tries to match the longest string that starts with < and ends with >.

Lazy (Non-greedy) Quantifiers

Lazy quantifiers, in contrast, try to match as little text as possible, they are "reluctant" to consume characters. To make a quantifier lazy, a ? is added after it. The most common lazy quantifiers are:

  • *? (match 0 or more times, but as few as possible)
  • +? (match 1 or more times, but as few as possible)
  • ?? (match 0 or 1 time, but prefer 0)
  • &#123;n,&#125;? (match at least 'n' times, but as few as possible after that)

Example of Lazy Matching

Using the previous example of the string "<tag>def</tag>", applying the regex "<.*?>" with a lazy quantifier *?, results in matching just "<tag>". It stops consuming characters as soon as it fulfills the pattern — reaching the first > after <.

Use Cases and Performance Considerations

Lazy quantifiers are particularly useful when you want to avoid "over-matching." They are helpful when parsing or scraping data from nested or hierarchical structures (like XML or HTML) or when specific, minimal matches are desired within larger blocks of text.

Greedy quantifiers can lead to issues like catastrophic backtracking, especially in complex regex patterns with many possibilities for matching. In contrast, lazy quantifiers may result in a higher number of steps to validate a match, potentially impacting performance. Selecting between greedy or lazy depends on the specific requirements and the structure of the text being analyzed.

Summary Table

Here is a quick reference on the behavior of both types of quantifiers:

QuantifierTypeDescriptionExampleAgainst "abcde"Matches
*GreedyMatch 0 or more timesa*aaabcdeaaa
*?LazyMatch 0 or more times, as few as possiblea*?aaabcdeempty
+GreedyMatch 1 or more timesa+aaabcdeaaa
+?LazyMatch 1 or more times, as few as possiblea+?aaabcdea
?GreedyMatch 0 or 1 timea?aaabcdea
??LazyMatch 0 or 1 time, prefer 0a??aaabcdeempty

In summary, the choice between using greedy or lazy quantifiers in regular expressions depends on the specific requirement: whether you need to match as much as possible or to keep the match minimal. Understanding their behavior and effect can help in crafting more efficient and purpose-driven regex operations.


Course illustration
Course illustration

All Rights Reserved.