Difference between \b and \B in regex

Question

I am reading a book on regular expressions and I came across this example for \b:

The cat scattered his food all over the room.

Using the regex \bcat\b will match the word cat, but not the cat in scattered.

For \B the author uses the following example:

Please enter the nine-digit id as it appears on your color - coded pass-key.

Using the regex \B-\B matches the - between the words color - coded. Using \b-\b, on the other hand, matches the - in nine-digit and pass-key.

How come in the first example we use \b to separate cat, and in the second we use \B to separate -? Using \b in the second example does the opposite of what it did earlier.

Please explain the difference to me.

EDIT: Also, can anyone please explain with a new example?
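
The behaviour described in the two book examples can be reproduced with a short Python sketch. The variable names are illustrative only, and Python's re module is assumed as the engine; other engines should behave the same way for this ASCII text.

```python
import re

sentence = "The cat scattered his food all over the room."
# \bcat\b matches "cat" only as a whole word, not the "cat" inside "scattered".
cat_spans = [m.span() for m in re.finditer(r"\bcat\b", sentence)]

pass_text = ("Please enter the nine-digit id as it appears "
             "on your color - coded pass-key.")
# \b-\b needs a word character directly on each side of the hyphen.
# For each such hyphen, record the word immediately before it.
tight = [pass_text[:m.start()].split()[-1]
         for m in re.finditer(r"\b-\b", pass_text)]
# \B-\B needs a non-word character (here a space) on each side of the hyphen.
loose = [m.start() for m in re.finditer(r"\B-\B", pass_text)]

print(cat_spans)   # spans of the whole-word matches
print(tight)       # words directly followed by a \b-\b hyphen
print(len(loose))  # number of hyphens matched by \B-\B
```

The hyphen itself is never a word character, so which of the two patterns matches depends entirely on what sits next to it.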

User · Answer

The metacharacter \b is an anchor like the caret and the dollar sign. It matches at a position that is called a "word boundary". This match is zero-length.

There are three different positions that qualify as word boundaries:

1. Before the first character in the string, if the first character is a word character.
2. After the last character in the string, if the last character is a word character.
3. Between two characters in the string, where one is a word character and the other is not a word character.

\B is the negated version of \b. \B matches at every position where \b does not. Effectively, \B matches at any position between two word characters as well as at any position between two non-word characters.

Source: http://www.regular-expressions.info/wordboundaries.html
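
To see these positions concretely, here is a minimal Python sketch that lists the indexes where \b and \B match. The helper names boundary_positions and non_boundary_positions are made up for this illustration.

```python
import re

def boundary_positions(text):
    # Indexes at which the zero-length word-boundary assertion matches.
    return [m.start() for m in re.finditer(r"\b", text)]

def non_boundary_positions(text):
    # Indexes at which the negated assertion matches.
    return [m.start() for m in re.finditer(r"\B", text)]

sample = "hi, yo"
print(boundary_positions(sample))      # [0, 2, 4, 6]
print(non_boundary_positions(sample))  # [1, 3, 5]
```

Index 0 and index 6 are the start and end of the string (both next to a word character), 2 and 4 sit between a word and a non-word character, and 3 sits between the comma and the space, two non-word characters. Together the two lists cover every position exactly once.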

User · Answer

\b is a zero-width word boundary. Specifically:

Matches at the position between a word character (anything matched by \w) and a non-word character (anything matched by [^\w] or \W), as well as at the start and/or end of the string if the first and/or last characters in the string are word characters.

Example: .\b matches c in abc

\B is a zero-width non-word boundary. Specifically:

Matches at the position between two word characters (i.e. the position between \w\w) as well as at the position between two non-word characters (i.e. \W\W).

Example: \B.\B matches b in abc

See regular-expressions.info for more great regex info.
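
Both examples can be checked quickly in Python. This is only a sketch using re.search, which returns the first match; the variable names are illustrative.

```python
import re

# .\b : any character that is immediately followed by a word boundary.
# In "abc" only the position after "c" (end of string) is a boundary.
last = re.search(r".\b", "abc").group()

# \B.\B : any character with a non-boundary on both sides.
# "a" fails (boundary before it), "c" fails (boundary after it).
middle = re.search(r"\B.\B", "abc").group()

print(last)    # c
print(middle)  # b
```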

User · Answer

The confusion stems from your thinking that \b matches spaces (probably because "b" suggests "blank").

\b matches the empty string at the beginning or end of a word. \B matches the empty string not at the beginning or end of a word.

The key here is that "-" is not part of a word. So <left>-<right> matches \b-\b because there are word boundaries on either side of the -. On the other hand, for <left> - <right> (note the spaces), there are no word boundaries on either side of the dash. The word boundaries are one space further left and right.

By contrast, when searching for \bcat\b, word boundaries behave more intuitively, and it matches the cat in " cat " as expected.
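
A small Python sketch makes the "one space further" point visible by printing where the boundaries actually fall. The strings left-right and left - right stand in for the placeholders above, and the helper name boundaries is made up.

```python
import re

def boundaries(text):
    # Indexes where \b matches (zero-length positions between characters).
    return [m.start() for m in re.finditer(r"\b", text)]

tight = "left-right"     # hyphen at index 4
spaced = "left - right"  # hyphen at index 5, spaces at 4 and 6

tight_bounds = boundaries(tight)    # [0, 4, 5, 10]
spaced_bounds = boundaries(spaced)  # [0, 4, 7, 12]

# \b-\b needs a boundary directly before and after the hyphen.
tight_hit = re.search(r"\b-\b", tight) is not None
spaced_hit = re.search(r"\b-\b", spaced) is not None
# \B-\B needs a non-boundary on both sides instead.
spaced_nonboundary_hit = re.search(r"\B-\B", spaced) is not None

print(tight_bounds, tight_hit)
print(spaced_bounds, spaced_hit, spaced_nonboundary_hit)
```

In the tight string the boundaries at 4 and 5 hug the hyphen. In the spaced string the hyphen occupies 5 to 6, but the nearest boundaries are at 4 and 7, on the far side of each space.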

User · Answer

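\B is the negation of \b. In color - coded there is no word boundary on either side of the - (it sits between two spaces), so it matches \B-\B. In your first example there are word boundaries on both sides of cat, so it matches \bcat\b. Similar rules apply to the other shorthand classes too: \W is the negation of \w. In general, the UPPER CASE form is the negation of the lower case form.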
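
The upper-case/lower-case rule can be illustrated with the character-class shorthands in Python. This is a sketch; the sample string and the helper name split_by_class are arbitrary choices for the example.

```python
import re

sample = "a1 _-"

def split_by_class(lower, upper, text):
    # Characters matched by the lower-case class and by its upper-case negation.
    return re.findall(lower, text), re.findall(upper, text)

word, non_word = split_by_class(r"\w", r"\W", sample)
digit, non_digit = split_by_class(r"\d", r"\D", sample)
space, non_space = split_by_class(r"\s", r"\S", sample)

print(word, non_word)    # ['a', '1', '_'] [' ', '-']
print(digit, non_digit)  # ['1'] ['a', ' ', '_', '-']
print(space, non_space)  # [' '] ['a', '1', '_', '-']
```

Each pair splits the characters of the string into two groups with no overlap, just as \b and \B split the positions between characters.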

User · Answer

Source: (Copyright RexEgg.com)

Word Boundary: \b

The word boundary \b matches positions where one side is a word character (usually a letter, digit or underscore, but see below for variations across engines) and the other side is not a word character (for instance, it may be the beginning of the string or a space character).

The regex \bcat\b would, therefore, match cat in a black cat, but it wouldn't match it in catatonic, tomcat or certificate. Removing one of the boundaries, \bcat would match cat in catfish, and cat\b would match cat in tomcat, but not vice-versa. Both, of course, would match cat on its own.

Not-a-word-boundary: \B

\B matches all positions where \b doesn't match. Therefore, it matches:

* When neither side is a word character, for instance at any position in a string made up entirely of non-word characters such as punctuation (including the beginning and end of the string).
* When both sides are a word character, for instance between the H and the i in Hi!

This may not seem very useful, but sometimes \B is just what you want. For instance:

* \Bcat\B will find cat fully surrounded by word characters, as in certificate, but neither on its own nor at the beginning or end of words.
* cat\B will find cat both in certificate and catfish, but neither in tomcat nor on its own.
* \Bcat will find cat both in certificate and tomcat, but neither in catfish nor on its own.
* \Bcat|cat\B will find cat in embedded situations, e.g. in certificate, catfish or tomcat, but not on its own.
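
The combinations above can be tabulated with a short Python sketch. The word list is taken from the quoted examples; the helper name words_matching is made up, and Python's re module is assumed as the engine.

```python
import re

words = ["cat", "catfish", "tomcat", "certificate"]

patterns = [
    r"\bcat\b",      # cat as a whole word
    r"\bcat",        # cat at the start of a word
    r"cat\b",        # cat at the end of a word
    r"\Bcat\B",      # cat strictly inside a word
    r"cat\B",        # cat not at the end of a word
    r"\Bcat",        # cat not at the start of a word
    r"\Bcat|cat\B",  # cat embedded in a longer word
]

def words_matching(pattern):
    # Words from the list in which the pattern matches somewhere.
    return [w for w in words if re.search(pattern, w)]

results = {p: words_matching(p) for p in patterns}
for p, found in results.items():
    print(f"{p:12} {found}")
```

Reading the printed table row by row should reproduce each claim in the quoted text.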

User · Answer

\b is used as a word boundary.

    import re

    word = "categorical cat"

Find all "cat" in the above string.

Without \b:

    re.findall(r'cat', word)
    # ['cat', 'cat']

With \b:

    re.findall(r'\bcat\b', word)
    # ['cat']

User · Answer

With a different example:

Consider this is the string, and the pattern to be searched for is 'cat':

    text = "catmania thiscat thiscatmaina";

Now the definitions:

'\b' matches the pattern at the beginning or end of each word.

'\B' matches the pattern only where it is not at the beginning or end of a word.

Different cases:

Case 1: At the beginning of each word

    result = text.replace(/\bcat/g, "ct");

Now, result is "ctmania thiscat thiscatmaina"

Case 2: At the end of each word

    result = text.replace(/cat\b/g, "ct");

Now, result is "catmania thisct thiscatmaina"

Case 3: Not in the beginning

    result = text.replace(/\Bcat/g, "ct");

Now, result is "catmania thisct thisctmaina"

Case 4: Not in the end

    result = text.replace(/cat\B/g, "ct");

Now, result is "ctmania thiscat thisctmaina"

Case 5: Neither beginning nor end

    result = text.replace(/\Bcat\B/g, "ct");

Now, result is "catmania thiscat thisctmaina"

Hope this helps.

User · Answer

Let's take a string like:

    XIX IXI XX X I II IIXX XXII I-I X-X -X X- X-I I-X -X- -I-X -X-I I-X- X-I- X_X _X-

Note: Underscore ( _ ) is not considered a special character in this case; it counts as a word character. The start and end of the string behave like a special character.

Applied to the string above, each pattern matches an X under the following condition:

    /\bX\b/g   X is preceded and followed by a special character or whitespace
    /\bX/g     X is preceded by a special character or whitespace
    /X\b/g     X is followed by a special character or whitespace
    /\BX\B/g   X is neither preceded nor followed by a special character or whitespace
    /\BX/g     X is not preceded by a special character or whitespace
    /X\B/g     X is not followed by a special character or whitespace
    /\bX\B/g   X is preceded, but not followed, by a special character or whitespace
    /\BX\b/g   X is followed, but not preceded, by a special character or whitespace
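
To see which X each pattern picks out, a Python sketch can wrap every matched X in square brackets. Two assumptions are made here: the last two tokens of the test string are taken to be X_X and _X- (as the note about underscores implies), and the helper name mark is invented for this example.

```python
import re

text = ("XIX IXI XX X I II IIXX XXII I-I X-X -X X- X-I I-X "
        "-X- -I-X -X-I I-X- X-I- X_X _X-")

patterns = [r"\bX\b", r"\bX", r"X\b", r"\BX\B",
            r"\BX", r"X\B", r"\bX\B", r"\BX\b"]

def mark(pattern, s):
    # Replace every matched X with [X]; the \b and \B assertions are
    # zero-width, so only the X itself is consumed and replaced.
    return re.sub(pattern, "[X]", s)

for p in patterns:
    print(f"{p:8} {mark(p, text)}")
```

For a quick sanity check on a smaller input, mark(r"\bX\b", "X-X XX") gives "[X]-[X] XX", and mark(r"\bX\b", "X_X") leaves the string unchanged because the underscore is a word character.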

User · Answer

\b matches a word boundary. \B matches non-word-boundaries, and is equivalent to (?!\b), not to [^\b] (thanks to @Alan Moore for the correction!). Both are zero-width.

See http://www.regular-expressions.info/wordboundaries.html for details. The site is extremely useful for many basic regex questions.
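
The equivalence can be spot-checked in Python by comparing match positions. This is a sketch on one sample string, not a proof; the helper name positions is made up.

```python
import re

def positions(pattern, text):
    # Start indexes of all (possibly zero-length) matches.
    return [m.start() for m in re.finditer(pattern, text)]

sample = "a-b --"

non_boundary = positions(r"\B", sample)
lookahead = positions(r"(?!\b)", sample)
same = non_boundary == lookahead

# Inside a character class, Python reads \b as the backspace character,
# so [^\b] consumes one character instead of asserting a position.
consumed = re.findall(r"[^\b]", "a-")

print(non_boundary, lookahead, same)
print(consumed)
```

The negative lookahead succeeds exactly where \b fails, which is the definition of \B, whereas the character class matches ordinary characters and is not zero-width.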
