'negative' pattern matching in Python

Question

I have the following input:

    OK SYS 10 LEN 20 12 43
    1233a.fdads.txt,23 /data/a11134/a.txt
    3232b.ddsss.txt,32 /data/d13f11/b.txt
    3452d.dsasa.txt,1234 /data/c13af4/f.txt
    .

I'd like to extract all of the input except the line containing "OK SYS 10 LEN 20" and the last line, which contains a single "." (dot). That is, I want to extract the following:

    1233a.fdads.txt,23 /data/a11134/a.txt
    3232b.ddsss.txt,32 /data/d13f11/b.txt
    3452d.dsasa.txt,1234 /data/c13af4/f.txt

I tried the following:

    for item in output:
        matchObj = re.search("^(?!OK) | ^(?!\\.)", item)
        if matchObj:
            print("got item " + item)

but it does not work: it produces no output.

User · Answer

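Use a negative match: search for the lines you want to drop and keep everything else. Also note that whitespace is significant inside a regex by default, so don't space things out around the | operator. In your pattern the spaces are literal characters, so neither alternative can match at the start of a line. Alternatively, use re.VERBOSE.

    for item in output:
        matchObj = re.search("^(OK|\\.)", item)
        if not matchObj:
            print("got item " + item)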
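To illustrate the re.VERBOSE alternative mentioned above, here is a minimal sketch. The sample list `output` and the name `UNWANTED` are assumptions made for illustration, and the punctuation of the sample data is inferred from the question rather than confirmed.

```python
import re

# Sample input modelled on the question (exact punctuation is an assumption).
output = [
    "OK SYS 10 LEN 20 12 43",
    "1233a.fdads.txt,23 /data/a11134/a.txt",
    "3232b.ddsss.txt,32 /data/d13f11/b.txt",
    "3452d.dsasa.txt,1234 /data/c13af4/f.txt",
    ".",
]

# With re.VERBOSE, unescaped whitespace and "#" comments inside the
# pattern are ignored, so the alternatives can be spaced out safely.
UNWANTED = re.compile(
    r"""
    ^OK      # the status line starts with OK
    |        # or
    ^\.$     # the line consists of a single dot
    """,
    re.VERBOSE,
)

# Negative match: keep only the items the pattern does NOT match.
kept = [item for item in output if not UNWANTED.search(item)]
for item in kept:
    print("got item " + item)
```

Anchoring the dot alternative with `$` is a design choice here: it rejects only a line that is exactly ".", not every line that happens to begin with a dot.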

User · Answer

You can also do it without negative look-ahead. You just need to add parentheses around the part of the expression that you want to extract. This construction with parentheses is called a group.

Let's write the Python code:

    import re

    string = """OK SYS 10 LEN 20 12 43
    1233a.fdads.txt,23 /data/a11134/a.txt
    3232b.ddsss.txt,32 /data/d13f11/b.txt
    3452d.dsasa.txt,1234 /data/c13af4/f.txt
    ."""

    search_result = re.search(r'^OK.*\n((.|\s)*)\.', string)
    if search_result:
        print(search_result.group(1))

Output is:

    1233a.fdads.txt,23 /data/a11134/a.txt
    3232b.ddsss.txt,32 /data/d13f11/b.txt
    3452d.dsasa.txt,1234 /data/c13af4/f.txt

^OK.*\n finds the first line, which starts with OK. We don't want to extract it, so it is left outside the parentheses. Next comes the part we do want to capture, (.|\s)*, so it goes inside parentheses. At the end of the regexp we look for a dot, \. (escaped so that it matches only a literal dot), which we also don't want to capture.

P.S. I find this answer super helpful for understanding the power of groups: https://stackoverflow.com/a/3513858/4333811
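As a variation on the capturing-group idea, the following sketch uses re.DOTALL so that a plain `.` also matches newlines. It is only one possible formulation. The names `text`, `pattern` and `body` are illustrative, and the sample data's punctuation is an assumption based on the question.

```python
import re

# Sample input modelled on the question (exact punctuation is an assumption).
text = """OK SYS 10 LEN 20 12 43
1233a.fdads.txt,23 /data/a11134/a.txt
3232b.ddsss.txt,32 /data/d13f11/b.txt
3452d.dsasa.txt,1234 /data/c13af4/f.txt
."""

# \A anchors at the start of the string; [^\n]* consumes the rest of the OK line.
# (.*?) lazily captures the body; \n\.\s*\Z requires the last line to be a lone dot.
pattern = re.compile(r"\AOK[^\n]*\n(.*?)\n\.\s*\Z", re.DOTALL)

match = pattern.search(text)
body = match.group(1) if match else ""
print(body)
```

Because the newline before the final dot sits outside the group, the captured text has no trailing blank line.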

User · Answer

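If the OK line is always the first line and the dot is always the last line, you could consider slicing them off like this:

    TestString = '''OK SYS 10 LEN 20 12 43
    1233a.fdads.txt,23 /data/a11134/a.txt
    3232b.ddsss.txt,32 /data/d13f11/b.txt
    3452d.dsasa.txt,1234 /data/c13af4/f.txt
    .'''

    # splitlines() splits on line breaks only; a bare split() would also
    # break each line apart at its spaces.
    print('\n'.join(TestString.splitlines()[1:-1]))

However, if this is a very large string you may run into memory problems, because the whole list of lines is built in memory.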
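If memory is a concern, one option is to stream the lines and hold back a single line at a time, so the last line can be dropped without loading everything. This is a sketch only: the helper name `drop_first_and_last` is hypothetical, and `io.StringIO` stands in for a real file object.

```python
import io


def drop_first_and_last(lines):
    """Yield every line except the first and the last, buffering one line."""
    iterator = iter(lines)
    next(iterator, None)             # discard the first line, if there is one
    previous = next(iterator, None)  # candidate for output, not yet confirmed
    for current in iterator:
        yield previous               # a later line exists, so previous is not last
        previous = current
    # The final value of previous is the last line; it is never yielded.


# Sample input modelled on the question (exact punctuation is an assumption).
text = """OK SYS 10 LEN 20 12 43
1233a.fdads.txt,23 /data/a11134/a.txt
3232b.ddsss.txt,32 /data/d13f11/b.txt
3452d.dsasa.txt,1234 /data/c13af4/f.txt
."""

stream = io.StringIO(text)  # stands in for open("myfile.txt")
result = [line.rstrip("\n") for line in drop_first_and_last(stream)]
for line in result:
    print(line)
```

The generator works with any iterable of lines, including an open file, and yields nothing for inputs of two lines or fewer.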

User · Answer

    if not (line.startswith('OK') or line.strip() == '.'):
        print(line)

User · Answer

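Why don't you match the OK SYS row (and the dot row) and simply not return them?

    for item in output:
        # Anchor both alternatives: an unanchored \. would match every
        # line that contains a dot, including the data lines.
        matchObj = re.search(r"^(OK SYS|\.$)", item)
        if not matchObj:
            print("got item " + item)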

User · Answer

If this is a file, you can simply skip the first and last lines and read the rest with csv. Here the two unwanted lines are filtered out by keeping only rows that have exactly two fields:

    >>> import csv, io
    >>> s = """OK SYS 10 LEN 20 12 43
    ... 1233a.fdads.txt,23 /data/a11134/a.txt
    ... 3232b.ddsss.txt,32 /data/d13f11/b.txt
    ... 3452d.dsasa.txt,1234 /data/c13af4/f.txt
    ... ."""
    >>> stream = io.StringIO(s)
    >>> rows = [row for row in csv.reader(stream, delimiter=',') if len(row) == 2]
    >>> rows
    [['1233a.fdads.txt', '23 /data/a11134/a.txt'], ['3232b.ddsss.txt', '32 /data/d13f11/b.txt'], ['3452d.dsasa.txt', '1234 /data/c13af4/f.txt']]

If it's a file, then you can do this:

    with open('myfile.txt', 'r', newline='') as f:
        rows = [row for row in csv.reader(f, delimiter=',') if len(row) == 2]

User · Answer

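A negative look-ahead works if you write it like this:

    matchObj = re.search("^(?!OK|\\.).*", item)

Don't forget to put .* after the negative look-ahead. A look-ahead consumes no characters, so without .* the match is empty even when it succeeds. ;-)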
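The effect of the trailing `.*` can be checked with a short sketch. The names `with_tail` and `without_tail` are illustrative, and the sample line's punctuation is an assumption based on the question.

```python
import re

with_tail = re.compile(r"^(?!OK|\.).*")   # look-ahead followed by .*
without_tail = re.compile(r"^(?!OK|\.)")  # look-ahead only

# Sample data line modelled on the question (punctuation is an assumption).
line = "1233a.fdads.txt,23 /data/a11134/a.txt"

# A look-ahead is zero-width, so on its own it matches the empty string.
empty_match = without_tail.search(line).group(0)

# With .* appended, the match covers the whole line.
full_match = with_tail.search(line).group(0)

# Lines starting with "OK" or "." fail the look-ahead, so search() returns None.
ok_result = with_tail.search("OK SYS 10 LEN 20 12 43")
dot_result = with_tail.search(".")

print(repr(empty_match))
print(repr(full_match))
print(ok_result, dot_result)
```

In both variants the match object is truthy for a data line, so either works in an `if matchObj:` test. The `.*` matters only when the matched text itself is needed.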

User · Answer

Adding a condition like the following to the if statement also works:

    and re.search("bla_bla_pattern", str(item), re.IGNORECASE) is None
