Categories
non-greedy regex regex-greedy

What do ‘lazy’ and ‘greedy’ mean in the context of regular expressions?

711

What are these two terms in an understandable way?

1

832

Greedy will consume as much as possible. From http://www.regular-expressions.info/repeat.html we see the example of trying to match HTML tags with <.+>. Suppose you have the following:

<em>Hello World</em>

You may think that <.+> (. means any non newline character and + means one or more) would only match the <em> and the </em>, when in reality it will be very greedy, and go from the first < to the last >. This means it will match <em>Hello World</em> instead of what you wanted.

Making it lazy (<.+?>) will prevent this. By adding the ? after the +, we tell it to repeat as few times as possible, so the first > it comes across, is where we want to stop the matching.

I’d encourage you to download RegExr, a great tool that will help you explore Regular Expressions – I use it all the time.

7

  • 2

    so if you use greedy will u have 3 (1 element + 2 tags) matches or just 1 match (1 element)?

    – ajsie

    Feb 20, 2010 at 6:27

  • 12

    It would match only 1 time, starting from the first < and ending with the last >.

    – Sampson

    Feb 20, 2010 at 6:28

  • 4

    But making it lazy would match twice, giving us both the opening and closing tag, ignoring the text in between (since it doesn’t fit the expression).

    – Sampson

    Feb 20, 2010 at 6:29

  • 12

    Just to add that there is a greedy way to go about it, too: <[^>]+> regex101.com/r/lW0cY6/1

    Jun 15, 2015 at 12:57

  • 1

    For the record, about using regex with HTML stackoverflow.com/questions/1732348/…

    – qwr

    Jun 11, 2021 at 22:00

405

‘Greedy’ means match longest possible string.

‘Lazy’ means match shortest possible string.

For example, the greedy h.+l matches 'hell' in 'hello' but the lazy h.+?l matches 'hel'.

6

  • 148

    Brilliant, so lazy will stop as soon as the condition l is satisfied, but greedy means it will stop only once the condition l is not satisfied any more?

    – Andrew S

    Feb 23, 2014 at 21:27

  • 4

    For all people reading the post: greedy or lazy quantifiers by themselves won’t match the longest/shortest possible substring. You would have to use either a tempered greedy token, or use non-regex approaches.

    Oct 15, 2016 at 21:29

  • 6

    @AndrewS Don’t be confused by the double ll in the example. It’s rather lazy will match the shortest possible substring while greedy will match the longest possible. Greedy h.+l matches 'helol' in 'helolo' but the lazy h.+?l matches 'hel'.

    Mar 21, 2017 at 16:38


  • 6

    @FloatingRock: No. x? means x is optional but +? is a different syntax. It means stop looking after you find something that matches – lazy matching.

    – slebetman

    Apr 14, 2017 at 12:56

  • 2

    @FloatingRock: As for how you differentiate the different syntax, simple: ? means optional and +? means lazy. Therefore \+? means + is optional.

    – slebetman

    Apr 14, 2017 at 12:57


226

Greedy quantifierLazy quantifierDescription
**?Star Quantifier: 0 or more
++?Plus Quantifier: 1 or more
???Optional Quantifier: 0 or 1
{n}{n}?Quantifier: exactly n
{n,}{n,}?Quantifier: n or more
{n,m}{n,m}?Quantifier: between n and m

Add a ? to a quantifier to make it ungreedy i.e lazy.

Example:
test string : stackoverflow
greedy reg expression : s.*o output: stackoverflow
lazy reg expression : s.*?o output: stackoverflow

3

  • 4

    is not ?? equivalent to ? . Similarly , isn’t {n}? equivalen to {n}

    – Number945

    Sep 2, 2016 at 8:07


  • 10

    @BreakingBenjamin: no ?? is not equivalent to ?, when it has a choice to either return 0 or 1 occurrence, it will pick the 0 (lazy) alternative. To see the difference, compare re.match('(f)?(.*)', 'food').groups() to re.match('(f)??(.*)', 'food').groups(). In the latter, (f)?? will not match the leading ‘f’ even though it could. Hence the ‘f’ will get matched by the second ‘.*’ capture group. I’m sure you can construct an example with ‘{n}?’ too. Admittedly these two are very-rarely-used.

    – smci

    Nov 16, 2017 at 0:42


  • 2

    @Number945 Yes, {n}? is equivalent to {n}. See stackoverflow.com/questions/18006093/how-do-an-and-an-differ

    Apr 1, 2021 at 15:13