How do I search for a pattern within a text file using Python, combining regex and string/file operations, and store instances of the pattern?

Question

So essentially I'm looking for a 4-digit code within two angle brackets within a text file. I know that I need to open the text file and then parse it line by line, but I am not sure of the best way to structure my code after "for line in file".

I think I can either split it, strip it, or partition it somehow, but I also wrote a regex which I called compile on, and if that returns a match object I don't think I can use it with those string-based operations. I'm also not sure whether my regex is greedy enough.

I'd like to store all instances of the found hits as strings within either a tuple or a list.

Here is my regex:

    regex = re.compile("(<(\d{4,5})>)?")

I don't think I need to include much more code, considering it's fairly basic so far.

User · Answer

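Doing it in one bulk read:

    import re

    # filename is the path to the text file to search.
    # The trailing ? and the outer group from the question's regex are
    # dropped here; with them, findall also returns empty matches.
    with open(filename, 'r') as textfile:
        filetext = textfile.read()
    matches = re.findall(r"<(\d{4,5})>", filetext)

Line by line:

    import re

    matches = []
    reg = re.compile(r"<(\d{4,5})>")
    with open(filename, 'r') as textfile:
        for line in textfile:
            # Extend the running list with this line's hits.
            matches += reg.findall(line)

But again, the matches this returns will not be useful for anything except counting unless you add an offset counter:

    import re

    matches = []
    offset = 0
    reg = re.compile(r"<(\d{4,5})>")
    with open(filename, 'r') as textfile:
        for line in textfile:
            # Pair this line's hits with the position where the line starts.
            matches.append((reg.findall(line), offset))
            offset += len(line)

But it still makes more sense to read the whole file in at once.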
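As a rough sketch of the bulk-read approach with positions attached, finditer can supply the offset of each hit directly, so no manual counter is needed. The function names and the sample string below are made up for illustration, and the pattern assumes the capture-only form of the regex (no trailing ?).

```python
import re

# Assumed pattern: 4 or 5 digits between angle brackets, digits captured.
CODE_PATTERN = re.compile(r"<(\d{4,5})>")


def find_codes_with_offsets(text):
    """Return (code, offset) pairs; offset is the index of the opening '<'."""
    return [(m.group(1), m.start()) for m in CODE_PATTERN.finditer(text)]


def find_codes_in_file(path):
    """Read the whole file at once and search it in a single pass."""
    with open(path, "r") as handle:
        return find_codes_with_offsets(handle.read())


# Hypothetical sample text standing in for the file contents.
sample = "a <1234> b <56789>"
hits = find_codes_with_offsets(sample)
print(hits)
```

Each entry in hits is a plain string plus an integer, so the codes can be pulled out with a list comprehension or converted to a tuple if an immutable result is preferred.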

User · Answer

    import re

    pattern = re.compile(r"<(\d{4,5})>")

    with open('test.txt') as f:
        for i, line in enumerate(f):
            for match in pattern.finditer(line):
                # group() is the whole match, brackets included;
                # group(1) would be just the digits.
                print('Found on line %s: %s' % (i + 1, match.group()))

A couple of notes about the regex:

- You don't need the ? at the end and the outer (...) if you don't want to match the number with the angle brackets, but only want the number itself.
- It matches either 4 or 5 digits between the angle brackets.

Update: It's important to understand that the match and the capture in a regex can be quite different. The regex in my snippet above matches the pattern with the angle brackets, but I ask to capture only the internal number, without the angle brackets.

More about regex in Python can be found in the Regular Expression HOWTO.
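To make the match versus capture distinction concrete, here is a small sketch that runs both regexes over the same string. The names LOOSE and TIGHT and the sample text are invented for this example; the two patterns themselves are the one from the question and the one from this answer.

```python
import re

LOOSE = re.compile(r"(<(\d{4,5})>)?")  # regex from the question
TIGHT = re.compile(r"<(\d{4,5})>")     # regex from this answer

# Hypothetical sample line; <123> has too few digits to qualify.
sample = "id <1234> and <56789>, but not <123>"

# Because the whole LOOSE pattern is optional, it also matches the empty
# string, so findall returns many ('', '') tuples alongside the real hits.
loose_hits = LOOSE.findall(sample)

# With a single capture group, findall returns just the captured digits.
tight_hits = TIGHT.findall(sample)

# group(0) is the full match (with brackets); group(1) is the capture.
pairs = [(m.group(0), m.group(1)) for m in TIGHT.finditer(sample)]

print(tight_hits)
print(pairs)
```

The list in tight_hits is already the "list of strings" the question asks for; wrap it in tuple(...) if a tuple is wanted instead.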
