Regex to match URL end-of-line or "/" character

Question

I have a URL, and I'm trying to match it to a regular expression to pull out some groups. The problem I'm having is that the URL can either end or continue with a "/" and more URL text. I'd like to match URLs like this:

http://server/xyz/2008-10-08-4
http://server/xyz/2008-10-08-4/
http://server/xyz/2008-10-08-4/123/more

But not match something like this:

http://server/xyz/2008-10-08-4-1

So, I thought my best bet was something like this:

/(.+)/(\d{4}-\d{2}-\d{2})-(\d+)[/$]

where the character class at the end contained either the "/" or the end-of-line. The character class doesn't seem to be happy with the "$" in there, though. How can I best discriminate between these URLs while still pulling back the correct groups?

User · Answer

In Ruby and Bash, you can use $ inside parentheses:

/(\S+?)/(\d{4}-\d{2}-\d{2})-(\d+)(/|$)

(This solution is similar to Pete Boughton's, but preserves the usage of $, which means end of line, rather than using \z, which means end of string.)
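
To make the $ versus \z distinction concrete, here is a minimal Ruby sketch. The helper name trailing_numbers and the sample list are made up for illustration; only the two patterns come from the answers on this page.

```ruby
# "/" or end of line after the trailing number, as in the answer above.
EOL_PATTERN = %r{/(\S+?)/(\d{4}-\d{2}-\d{2})-(\d+)(/|$)}
# Same pattern, but with \z (end of string) in place of $.
EOS_PATTERN = %r{/(\S+?)/(\d{4}-\d{2}-\d{2})-(\d+)(/|\z)}

# Hypothetical helper: returns the trailing number of every match in text.
def trailing_numbers(text, pattern)
  text.scan(pattern).map { |groups| groups[2] }
end

# Made-up, newline-delimited input; the last URL should never match.
URL_LIST = <<~LIST
  http://server/xyz/2008-10-08-4
  http://server/xyz/2008-10-08-5/
  http://server/xyz/2008-10-08-6-1
LIST

p trailing_numbers(URL_LIST, EOL_PATTERN)  # => ["4", "5"]
p trailing_numbers(URL_LIST, EOS_PATTERN)  # => ["5"]
```

With $, the first URL matches because $ is satisfied just before the newline. With \z it does not, since the string continues after that line.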

User · Answer

/(.+)/(\d{4}-\d{2}-\d{2})-(\d+)(/.*)?$

1st Capturing Group (.+)
    .+ matches any character (except for line terminators)
    + Quantifier: Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
2nd Capturing Group (\d{4}-\d{2}-\d{2})
    \d{4} matches a digit (equal to [0-9])
    {4} Quantifier: Matches exactly 4 times
    - matches the character - literally (case sensitive)
    \d{2} matches a digit (equal to [0-9])
    {2} Quantifier: Matches exactly 2 times
    - matches the character - literally (case sensitive)
    \d{2} matches a digit (equal to [0-9])
    {2} Quantifier: Matches exactly 2 times
- matches the character - literally (case sensitive)
3rd Capturing Group (\d+)
    \d+ matches a digit (equal to [0-9])
    + Quantifier: Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
4th Capturing Group (/.*)?
    ? Quantifier: Matches between zero and one times, as many times as possible, giving back as needed (greedy)
    / matches the character / literally
    .* matches any character (except for line terminators)
    * Quantifier: Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
$ asserts position at the end of the string
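
As a rough check of the breakdown above, this Ruby sketch prints what each group captures for the URLs from the question. The helper name url_groups is hypothetical, and %r{} is used only to avoid escaping every "/".

```ruby
# The pattern broken down above.
FULL_PATTERN = %r{/(.+)/(\d{4}-\d{2}-\d{2})-(\d+)(/.*)?$}

# Hypothetical helper: returns the four captures, or nil when nothing matches.
def url_groups(url)
  match = FULL_PATTERN.match(url)
  match && match.captures
end

p url_groups("http://server/xyz/2008-10-08-4")
# => ["/server/xyz", "2008-10-08", "4", nil]
p url_groups("http://server/xyz/2008-10-08-4/123/more")
# => ["/server/xyz", "2008-10-08", "4", "/123/more"]
p url_groups("http://server/xyz/2008-10-08-4-1")
# => nil
```

Note that the first group starts with a slash: the leading / of the pattern matches the first slash of "//", so the second slash lands inside the capture. The optional fourth group is nil when the URL simply ends after the number.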

User · Answer

To match either / or end of content, use (/|\z)

This only applies if you are not using multi-line matching (i.e. you're matching a single URL, not a newline-delimited list of URLs).

To put that with an updated version of what you had:

/(\S+?)/(\d{4}-\d{2}-\d{2})-(\d+)(/|\z)

Note that I've changed the start to be a non-greedy match for non-whitespace ( \S+? ) rather than matching anything and everything ( .+ ).
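
A small Ruby sketch of how this pattern separates the URLs from the question, one URL per call. The helper name wanted_url? is made up for illustration, and the last two lines assume input that arrives with a trailing newline, such as a line read from a file.

```ruby
# Pattern from the answer above: "/" or end of string after the number.
SINGLE_URL_PATTERN = %r{/(\S+?)/(\d{4}-\d{2}-\d{2})-(\d+)(/|\z)}

# Hypothetical helper: true when the URL has the wanted shape.
def wanted_url?(url)
  SINGLE_URL_PATTERN.match?(url)
end

p wanted_url?("http://server/xyz/2008-10-08-4")           # => true
p wanted_url?("http://server/xyz/2008-10-08-4/")          # => true
p wanted_url?("http://server/xyz/2008-10-08-4/123/more")  # => true
p wanted_url?("http://server/xyz/2008-10-08-4-1")         # => false

# \z is strict about end of string, so a trailing newline defeats it.
p wanted_url?("http://server/xyz/2008-10-08-4\n")         # => false
p wanted_url?("http://server/xyz/2008-10-08-4\n".chomp)   # => true
```

Stripping the newline first (chomp here) is one way to keep \z usable when each URL is handled on its own.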

User · Answer

You've got a couple of regexes now which will do what you want, so that's adequately covered.

What hasn't been mentioned is why your attempt won't work: inside a character class, $ (as well as ., /, and ^ unless it comes first, where it negates the class) has no special meaning, so [/$] matches either a literal / or a literal $ rather than terminating the regex (/) or matching end-of-line ($).
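
This is easy to confirm in Ruby. The sketch below runs the pattern from the question against three inputs; the helper name attempt_matches? and the URL ending in a literal "$" are made up for the demonstration.

```ruby
# The original attempt: [/$] is a character class holding "/" and "$".
ATTEMPT = %r{/(.+)/(\d{4}-\d{2}-\d{2})-(\d+)[/$]}

# Hypothetical helper: true when the attempted pattern matches the URL.
def attempt_matches?(url)
  ATTEMPT.match?(url)
end

p attempt_matches?("http://server/xyz/2008-10-08-4")   # => false
p attempt_matches?("http://server/xyz/2008-10-08-4/")  # => true
p attempt_matches?("http://server/xyz/2008-10-08-4$")  # => true
```

The class always consumes one character, so a URL that ends right after the number cannot match, while one followed by a literal dollar sign does.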
