Regular Expressions and negating a whole character group

Question

I'm attempting something which I feel should be fairly obvious to me, but it's not. I'm trying to match a string which does NOT contain a specific sequence of characters. I've tried using [^ab], [^(ab)], etc. to match strings containing no 'a's or 'b's, or only 'a's or only 'b's or 'ba', but not match on 'ab'. The examples I gave won't match 'ab', it's true, but they also won't match 'a' alone, and I need them to. Is there some simple way to do this?

User · Accepted Answer

Use negative lookahead:

    ^(?!.*ab).*$

UPDATE: In the comments below, I stated that this approach is slower than the one given in Peter's answer. I've run some tests since then, and found that it's actually slightly faster. However, the reason to prefer this technique over the other is not speed, but simplicity.

The other technique, known as a tempered greedy token, is suitable for more complex problems, like matching delimited text where the delimiters consist of multiple characters (like HTML, as Luke commented below). For the problem described in the question, it's overkill.

For anyone who's interested, I tested with a large chunk of Lorem Ipsum text, counting the number of lines that don't contain the word "quo". These are the regexes I used:

    (?m)^(?!.*\bquo\b).+$

    (?m)^(?:(?!\bquo\b).)+$

Whether I search for matches in the whole text, or break it up into lines and match them individually, the anchored lookahead consistently outperforms the floating one.
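
As a rough illustration of the anchored lookahead, here is a minimal Python sketch. The pattern is ^(?!.*ab).*$; the helper name `lacks_ab` and the sample strings are made up for this example, and it assumes single-line input (the dot does not cross newlines by default).

```python
import re

# Anchored negative lookahead: at the start of the string, assert that
# "ab" does not occur anywhere ahead, then consume the rest of the string.
NO_AB = re.compile(r"^(?!.*ab).*$")

def lacks_ab(text: str) -> bool:
    """Return True when `text` does not contain the sequence "ab"."""
    return NO_AB.match(text) is not None

# Hypothetical sample inputs, including the cases from the question.
samples = ["a", "b", "ba", "", "ab", "cab", "bamboo", "baobab"]
results = {s: lacks_ab(s) for s in samples}

for s, ok in results.items():
    print(repr(s), ok)
```

Strings such as "a", "b" and "ba" are accepted, while anything containing "ab" is rejected, which is the behaviour the question asks for.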

User · Answer

"abc(?!def) will match abc not followed by def. So it'll match abce, abc, abck, etc." What if I want neither def nor xyz? Will it be abc(?!(def)(xyz))?

I had the same question and found a solution:

    abc(?:(?!def))(?:(?!xyz))

These non-capturing groups are combined by "AND", so this should do the trick. Hope it helps.
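
To make the "AND" behaviour concrete, here is a small Python sketch. It uses the shorter equivalent form abc(?!def)(?!xyz) (the non-capturing wrappers are optional), and contrasts it with abc(?!(def)(xyz)), which only rejects "abc" followed by "defxyz". The constant and helper names are invented for this example.

```python
import re

# Two lookaheads at the same position must both succeed, so "abc" matches
# only when it is followed by neither "def" nor "xyz".
FOLLOWED_BY_NEITHER = re.compile(r"abc(?!def)(?!xyz)")

# For contrast: this lookahead looks for "def" immediately followed by "xyz",
# so it rejects only "abcdefxyz".
BOTH_IN_SEQUENCE = re.compile(r"abc(?!(def)(xyz))")

def found(pattern: re.Pattern, text: str) -> bool:
    """Return True if `pattern` matches anywhere in `text`."""
    return pattern.search(text) is not None

for text in ["abce", "abc", "abcdef", "abcxyz", "abcdefxyz"]:
    print(text, found(FOLLOWED_BY_NEITHER, text), found(BOTH_IN_SEQUENCE, text))
```

An alternation inside a single lookahead, abc(?!def|xyz), expresses the same condition as the two consecutive lookaheads.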

User · Answer

Simplest way is to pull the negation out of the regular expression entirely:

    if (!userName.matches("^([Ss]ys)?admin$")) { ... }
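
The same idea can be sketched in Python: write the positive pattern and negate the result in ordinary code. The pattern ([Ss]ys)?admin and the function name `is_allowed` are illustrative assumptions; `re.fullmatch` is used because it requires the whole string to match, so no explicit anchors are needed.

```python
import re

# Positive pattern for the names we want to reject (illustrative only).
RESERVED = re.compile(r"([Ss]ys)?admin")

def is_allowed(user_name: str) -> bool:
    """Return True unless the whole name is admin, sysadmin or Sysadmin."""
    # The negation lives in the host language, not in the regex.
    return RESERVED.fullmatch(user_name) is None

for name in ["admin", "sysadmin", "Sysadmin", "administrator", "alice"]:
    print(name, is_allowed(name))
```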

User · Answer

The regex [^ab] will match, for example, 'ab ab ab ab' but not 'ab', because it will match on the string ' a' or 'b '.

What language/scenario do you have? Can you subtract results from the original set, and just match ab?

If you are using GNU grep, and are parsing input, use the '-v' flag to invert your results, returning all non-matches. Other regex tools also have a 'return nonmatch' function.

If I understand correctly, you want everything except for those items which contain 'ab' anywhere.

User · Answer

Using a character class such as [^ab] will match a single character that is not within the set of characters. (With the ^ being the negating part.)

To match a string which does not contain the multi-character sequence ab, you want to use a negative lookahead:

    ^(?:(?!ab).)+$

And the above expression dissected in regex comment mode is:

    (?x)    # enable regex comment mode
    ^       # match start of line/string
    (?:     # begin non-capturing group
      (?!   # begin negative lookahead
        ab  # literal text sequence ab
      )     # end negative lookahead
      .     # any single character
    )       # end non-capturing group
    +       # repeat previous match one or more times
    $       # match end of line/string
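
For readers who want to try the commented form, here is a hedged Python sketch using `re.VERBOSE`, which plays the role of the (?x) flag. The constant and function names are made up for illustration. Note that because of the + quantifier this pattern does not match the empty string.

```python
import re

# The lookahead is re-checked before every character, so no position in
# the string may be the start of the sequence ab.
NOT_CONTAINING_AB = re.compile(
    r"""
    ^           # start of string
    (?:         # begin non-capturing group
      (?!ab)    #   negative lookahead: the next two characters are not ab
      .         #   consume one character
    )+          # repeat the group one or more times
    $           # end of string
    """,
    re.VERBOSE,
)

def accepts(text: str) -> bool:
    """Return True when `text` is non-empty and does not contain ab."""
    return NOT_CONTAINING_AB.match(text) is not None

for text in ["a", "ba", "ab", "xaby", ""]:
    print(repr(text), accepts(text))
```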

User · Answer

In this case I might just simply avoid regular expressions altogether and go with something like:

    if (StringToTest.IndexOf("ab") < 0)
        // do stuff

This is likely also going to be much faster (a quick test against the regexes above showed this method to take about 25% of the time of the regex method).

In general, if I know the exact string I'm looking for, I've found regexes are overkill. Since you know you don't want "ab", it's a simple matter to test whether the string contains that string, without using regex.
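
A plain substring test looks like this in Python; it is only a sketch, and the function name and default argument are assumptions for the example. `str.find` returns -1 when the substring is absent, which mirrors the IndexOf check, while the `in` operator is the more idiomatic spelling.

```python
def lacks_sequence(text: str, needle: str = "ab") -> bool:
    """Return True when `needle` does not occur anywhere in `text`."""
    return needle not in text

def lacks_sequence_via_find(text: str, needle: str = "ab") -> bool:
    """Same check, written to mirror an IndexOf(...) < 0 test."""
    return text.find(needle) < 0

for text in ["a", "ba", "bamboo", "ab", "baobab"]:
    print(text, lacks_sequence(text), lacks_sequence_via_find(text))
```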

User · Answer

Yes, it's called negative lookahead. It goes like this: (?!regex here). So abc(?!def) will match abc not followed by def. So it'll match abce, abc, abck, etc.

Similarly there is positive lookahead: (?=regex here). So abc(?=def) will match abc followed by def.

There are also negative and positive lookbehind: (?<!regex here) and (?<=regex here) respectively.

One point to note is that lookarounds are zero-width. That is, they do not count as having taken any space.

So it may look like a(?=b)c will match "abc", but it won't. It will match 'a', then the positive lookahead will succeed on 'b', but it won't move forward in the string. Then it will try to match the 'c' against 'b', which won't work. Similarly, ^a(?=b)b$ will match 'ab' and not 'abb', because the lookarounds are zero-width (in most regex implementations).
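
The zero-width behaviour is easy to check interactively. The following Python sketch runs the patterns discussed above with the standard `re` module; the variable names and the lookbehind test strings are chosen only for this illustration.

```python
import re

# The lookahead inspects "b" but consumes nothing, so the match is just "a".
peek = re.search(r"a(?=b)", "ab")
peek_text, peek_end = peek.group(), peek.end()

# After the lookahead, the engine is still positioned at "b", so "c" fails.
no_abc = re.search(r"a(?=b)c", "abc")

# The "b" seen by the lookahead is then consumed by the literal b.
exact_ab = re.fullmatch(r"a(?=b)b", "ab")
exact_abb = re.fullmatch(r"a(?=b)b", "abb")

# Negative lookahead and the two lookbehinds.
neg_ahead = re.search(r"abc(?!def)", "abcdef")
neg_behind = re.search(r"(?<!x)abc", "xabc")
pos_behind = re.search(r"(?<=x)abc", "xabc")

print(peek_text, peek_end)
print(no_abc, exact_ab is not None, exact_abb)
print(neg_ahead, neg_behind, pos_behind.span())
```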

User · Answer

Just search for "ab" in the string, then negate the result:

    !/ab/.test("bamboo"); // true
    !/ab/.test("baobab"); // false

It seems easier and should be faster too.

User · Answer

Using a regex as you described is the simple way (as far as I am aware). If you want a range you could use [^a-f].
