[regex] OR condition in Regex

Let's say I have

1 ABC Street
1 A ABC Street

With \d it matches 1 (as expected), and with \d \w it matches 1 A (as expected). When I combine the patterns as \d|\d \w, it matches only the first alternative and ignores the second.

My question is: how do I use an "or" condition correctly in this particular case?

PS: The condition is to match the number alone when it is not followed by a single letter; otherwise, match the number together with that single letter.

Example: for 1 ABC Street, match only the number 1; for 1 A ABC Street, match 1 A.

This question is related to regex

The answer is


A classic "or" would be |. For example, ab|de would match either side of the expression.
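As a quick illustration of plain alternation, here is a minimal Python sketch (the helper name find_all and the sample string are made up for the example). Note that | has the lowest precedence of all regex operators, so ab|de means "ab or de", not "a, then b or d, then e".

```python
import re

# "ab|de" matches either the literal "ab" or the literal "de".
EITHER = re.compile(r"ab|de")

def find_all(text):
    """Return every non-overlapping match of ab|de in text (hypothetical helper)."""
    return EITHER.findall(text)

print(find_all("xxabyydezz"))
```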

However, for something like your case you might want to use the ? quantifier, which matches the previous expression 0 or 1 times (1 time preferred; i.e. it's a "greedy" match). Another (probably more reliable) alternative would be using a custom character class:

\d+\s+[A-Z\s]+\s+[A-Z][A-Za-z]+

This pattern will match:

  • \d+: One or more digits.
  • \s+: One or more whitespace characters.
  • [A-Z\s]+: One or more uppercase letters or whitespace characters.
  • \s+: One or more whitespace characters.
  • [A-Z][A-Za-z]+: An uppercase letter followed by at least one more letter (uppercase or lowercase).
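To check the pattern against the two addresses from the question, something like the following Python sketch should do. The helper is_address is a name invented for this example, and it uses fullmatch so the whole string has to fit the pattern.

```python
import re

# The character-class pattern from above, unchanged.
ADDRESS = re.compile(r"\d+\s+[A-Z\s]+\s+[A-Z][A-Za-z]+")

def is_address(text):
    """True if the entire string fits the pattern (hypothetical helper)."""
    return ADDRESS.fullmatch(text) is not None

for sample in ["1 ABC Street", "1 A ABC Street", "ABC Street", "1 abc Street"]:
    print(sample, "->", is_address(sample))
```

The first two samples fit the pattern. The last two do not, because one lacks the leading number and the other has lowercase letters where the pattern demands uppercase.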

If you'd like a more static check, e.g. to match only ABC and A ABC, then you can use a non-capturing group and define the alternatives inside it (to limit the scope of the |):

\d (?:ABC|A ABC) Street

Or another alternative using a quantifier:

\d (?:A )?ABC Street
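Both static variants should accept exactly the same strings. A small Python sketch to compare them (the names ALTERNATION, OPTIONAL and matches are only for this example):

```python
import re

# Alternatives scoped by a non-capturing group.
ALTERNATION = re.compile(r"\d (?:ABC|A ABC) Street")
# The same idea written with an optional non-capturing group.
OPTIONAL = re.compile(r"\d (?:A )?ABC Street")

def matches(pattern, text):
    """True if the whole string fits the compiled pattern (hypothetical helper)."""
    return pattern.fullmatch(text) is not None

for sample in ["1 ABC Street", "1 A ABC Street", "1 B ABC Street"]:
    print(sample, matches(ALTERNATION, sample), matches(OPTIONAL, sample))
```

Without the group, \d ABC|A ABC Street would be read as "\d ABC" or "A ABC Street", which is why the group is needed to limit the alternation.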

I think what you need might be simply:

\d( \w)?
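One thing worth checking before relying on this: \w matches any single word character, so it also matches the first letter of a longer word such as ABC. The Python sketch below compares the pattern with a variant that adds a word boundary (\b) after the letter. That variant is my own assumption about what the question wants, not part of the suggestion above, and house_number is a made-up helper name.

```python
import re

SIMPLE = r"\d( \w)?"     # the pattern suggested above
BOUNDED = r"\d( \w\b)?"  # assumed variant: the letter must stand alone

def house_number(pattern, text):
    """Return the first match of pattern in text, or None (hypothetical helper)."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

for sample in ["1 ABC Street", "1 A ABC Street"]:
    print(sample, "|", house_number(SIMPLE, sample), "|", house_number(BOUNDED, sample))
```

With the simple pattern, both inputs give 1 A, because the A of ABC counts as a \w. With the boundary, 1 ABC Street gives just 1 and 1 A ABC Street still gives 1 A.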

Note that your regex would have worked too if it was written as \d \w|\d instead of \d|\d \w.

This is because alternatives are tried from left to right: once the first option, \d, matches, the regex engine accepts it and never tries the second option at that position.
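The effect of the ordering is easy to see in Python, whose re module tries alternatives left to right (engines with leftmost-longest semantics, such as POSIX ones, may behave differently). The helper first_match is a name invented for this sketch.

```python
import re

TEXT = "1 A ABC Street"

def first_match(pattern, text=TEXT):
    """Return the first match of pattern in text, or None (hypothetical helper)."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

# Shorter alternative first: \d succeeds, so \d \w is never tried.
print(repr(first_match(r"\d|\d \w")))
# Longer alternative first: \d \w gets its chance before falling back to \d.
print(repr(first_match(r"\d \w|\d")))
```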