Finding multiple occurrences of a string within a string in Python

Question

How do I find multiple occurrences of a string within a string in Python? Consider this:

    >>> text = "Allowed Hello Hollow"
    >>> text.find("ll")
    1
    >>>

So the first occurrence of ll is at 1, as expected. How do I find the next occurrence of it?

The same question is valid for a list. Consider:

    >>> x = ['ll', 'ok', 'll']

How do I find all the ll with their indexes?

User · Accepted Answer

Using regular expressions, you can use re.finditer to find all (non-overlapping) occurrences:

    >>> import re
    >>> text = 'Allowed Hello Hollow'
    >>> for m in re.finditer('ll', text):
    ...     print('ll found', m.start(), m.end())
    ...
    ll found 1 3
    ll found 10 12
    ll found 16 18

Alternatively, if you don't want the overhead of regular expressions, you can also repeatedly use str.find to get the next index:

    >>> text = 'Allowed Hello Hollow'
    >>> index = 0
    >>> while index < len(text):
    ...     index = text.find('ll', index)
    ...     if index == -1:
    ...         break
    ...     print('ll found at', index)
    ...     index += 2  # +2 because len('ll') == 2
    ...
    ll found at 1
    ll found at 10
    ll found at 16

The same idea carries over to lists and other sequences, although lists have no find method, so there you would call index and catch ValueError instead.
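
For the list half of the question, here is a minimal sketch of the same repeated-search loop, assuming list.index with a start argument in place of str.find. The helper name find_all_in_list is made up for illustration and is not part of the answer above.

```python
def find_all_in_list(items, target):
    """Return every index at which target occurs in items."""
    indexes = []
    start = 0
    while True:
        try:
            # list.index raises ValueError when nothing is found,
            # whereas str.find returns -1.
            pos = items.index(target, start)
        except ValueError:
            return indexes
        indexes.append(pos)
        start = pos + 1  # resume just past the last hit


print(find_all_in_list(['ll', 'ok', 'll'], 'll'))  # [0, 2]
```

Catching ValueError plays the role of the -1 check in the string version.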

User · Answer

    >>> for n, c in enumerate(text):
    ...     try:
    ...         if c + text[n + 1] == "ll": print(n)
    ...     except IndexError: pass  # the last character has no successor
    ...
    1
    10
    16

User · Answer

I had randomly gotten this idea just a while ago. Using a while loop with string slicing and string search can work, even for overlapping strings.

    findin = "algorithm alma mater alison alternation alpines"
    search = "al"
    inx = 0
    num_str = 0

    while True:
        inx = findin.find(search)
        if inx == -1:  # breaks before adding 1 to number of string
            break
        inx = inx + 1
        findin = findin[inx:]  # to slice off the 'unsearched' part of the string
        num_str = num_str + 1  # counts no. of string

    if num_str != 0:
        print("There are", num_str, search, "in your string.")
    else:
        print("There are no", search, "in your string.")

I'm an amateur in Python programming (programming of any language, actually), and am not sure what other issues it could have, but I guess it's working fine?

I guess lower() could be used somewhere in it too, if needed.

User · Answer

Brand new to programming in general and working through an online tutorial. I was asked to do this as well, but only using the methods I had learned so far (basically strings and loops). Not sure if this adds any value here, and I know this isn't how you would do it, but I got it to work with this:

    needle = input()
    haystack = input()
    counter = 0
    n = -1
    for i in range(n + 1, len(haystack) + 1):
        for j in range(n + 1, len(haystack) + 1):
            n = -1
            if needle != haystack[i:j]:
                n = n + 1
                continue
            if needle == haystack[i:j]:
                counter = counter + 1
    print(counter)

User · Answer

This version should be linear in the length of the string, and should be fine as long as the sequences aren't too repetitive (in which case you can replace the recursion with a while loop).

    def find_all(st, substr, start_pos=0, accum=[]):
        ix = st.find(substr, start_pos)
        if ix == -1:
            return accum
        return find_all(st, substr, start_pos=ix + 1, accum=accum + [ix])

bstpierre's list comprehension is a good solution for short sequences, but looks to have quadratic complexity and never finished on a long text I was using.

    findall_lc = lambda txt, substr: [n for n in xrange(len(txt))
                                       if txt.find(substr, n) == n]

For a random string of non-trivial length, the two functions give the same result. (This session is Python 2, hence xrange; on Python 3 use range, and the sample indexes may differ.)

    import random, string; random.seed(0)
    s = ''.join([random.choice(string.ascii_lowercase) for _ in range(100000)])

    >>> find_all(s, 'th') == findall_lc(s, 'th')
    True
    >>> findall_lc(s, 'th')[:4]
    [564, 818, 1872, 2470]

But the quadratic version is about 300 times slower:

    %timeit find_all(s, 'th')
    1000 loops, best of 3: 282 µs per loop

    %timeit findall_lc(s, 'th')
    10 loops, best of 3: 92.3 ms per loop
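
The while-loop replacement mentioned at the top of this answer could look like the sketch below. The name find_all_iter is hypothetical. The point is only that a loop avoids one stack frame per match, which matters once the number of matches approaches Python's recursion limit (1000 by default).

```python
def find_all_iter(st, substr):
    """Iterative counterpart of a recursive str.find based search."""
    found = []
    ix = st.find(substr)
    while ix != -1:
        found.append(ix)
        # Restart one character later so overlapping matches are kept.
        ix = st.find(substr, ix + 1)
    return found


print(find_all_iter("Allowed Hello Hollow", "ll"))  # [1, 10, 16]
print(find_all_iter("aaaa", "aa"))                  # [0, 1, 2]
```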

User · Answer

I think what you are looking for is str.count:

    >>> "Allowed Hello Hollow".count('ll')
    3

Hope this helps. NOTE: this only counts non-overlapping occurrences.
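
To make the non-overlapping caveat concrete, here is a small sketch comparing str.count with a count that includes overlaps. It swaps in a regular-expression lookahead, which is not something this answer proposes, and count_overlapping is a name invented for the example.

```python
import re


def count_overlapping(text, sub):
    """Count occurrences of sub in text, including overlapping ones."""
    # A lookahead matches at a position without consuming characters,
    # so the next match may begin just one character later.
    pattern = "(?=" + re.escape(sub) + ")"
    return sum(1 for _ in re.finditer(pattern, text))


print("aaaa".count("aa"))                                # 2
print(count_overlapping("aaaa", "aa"))                   # 3
print(count_overlapping("Allowed Hello Hollow", "ll"))   # 3
```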

User · Answer

    #!/usr/local/bin/python3
    # -*- coding: utf-8 -*-

    main_string = input()
    sub_string = input()

    count = counter = 0

    for i in range(len(main_string)):
        if main_string[i] == sub_string[0]:
            k = i + 1
            for j in range(1, len(sub_string)):
                if k != len(main_string) and main_string[k] == sub_string[j]:
                    count += 1
                    k += 1
            if count == (len(sub_string) - 1):
                counter += 1
            count = 0

    print(counter)

This program counts all occurrences of the substring, even overlapping ones, without using regex. It is a naive implementation, though; for better worst-case behaviour, look at dedicated string-matching data structures and algorithms such as suffix trees or KMP.
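
Since KMP is mentioned, here is a compact sketch of it, written for illustration rather than taken from this answer. The function name kmp_find_all is an assumption. It returns the start index of every match, overlapping ones included, in time proportional to the combined length of the text and the pattern.

```python
def kmp_find_all(text, pattern):
    """Return start indexes of all (possibly overlapping) matches."""
    if not pattern:
        return []

    # failure[i] is the length of the longest proper prefix of
    # pattern[:i + 1] that is also a suffix of it.
    failure = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k

    matches = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = failure[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = failure[k - 1]  # fall back so overlaps are still found
    return matches


print(kmp_find_all("Allowed Hello Hollow", "ll"))  # [1, 10, 16]
print(kmp_find_all("aaaa", "aa"))                  # [0, 1, 2]
```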

User · Answer

For your list example:

    In [1]: x = ['ll', 'ok', 'll']

    In [2]: for idx, value in enumerate(x):
       ...:     if value == 'll':
       ...:         print(idx, value)
    0 ll
    2 ll

If you wanted all the items in a list that contained 'll', you could also do that:

    In [3]: x = ['Allowed', 'Hello', 'World', 'Hollow']

    In [4]: for idx, value in enumerate(x):
       ...:     if 'll' in value:
       ...:         print(idx, value)
    0 Allowed
    1 Hello
    3 Hollow

User · Answer

Maybe not so Pythonic, but somewhat more self-explanatory. It returns the positions of the word in the original string.

    def retrieve_occurences(sequence, word, result, base_counter):
        indx = sequence.find(word)
        if indx == -1:
            return result
        result.append(indx + base_counter)
        base_counter += indx + len(word)
        return retrieve_occurences(sequence[indx + len(word):], word, result, base_counter)

User · Answer

A simple iterative function that returns a list of the indices where the substring occurs:

    def allindices(string, sub):
        l = []
        i = string.find(sub)
        while i >= 0:
            l.append(i)
            i = string.find(sub, i + 1)
        return l

User · Answer

Here is my function for finding multiple occurrences. Unlike the other solutions here, it supports the optional start and end parameters for slicing, just like str.index:

    def all_substring_indexes(string, substring, start=0, end=None):
        result = []
        new_start = start
        while True:
            try:
                index = string.index(substring, new_start, end)
            except ValueError:
                return result
            else:
                result.append(index)
                new_start = index + len(substring)
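
A short usage sketch of how start and end narrow the search. The function is repeated so the snippet runs on its own; the particular start and end values are chosen only for illustration.

```python
def all_substring_indexes(string, substring, start=0, end=None):
    result = []
    new_start = start
    while True:
        try:
            # str.index accepts the same start/end bounds as slicing.
            index = string.index(substring, new_start, end)
        except ValueError:
            return result
        else:
            result.append(index)
            new_start = index + len(substring)


text = "Allowed Hello Hollow"
print(all_substring_indexes(text, "ll"))         # [1, 10, 16]
print(all_substring_indexes(text, "ll", 5))      # [10, 16]
print(all_substring_indexes(text, "ll", 0, 12))  # [1, 10]
```

Because the search resumes at index + len(substring), the matches returned do not overlap.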

User · Answer

FWIW, here are a couple of non-RE alternatives that I think are neater than poke's solution.

The first uses str.index and checks for ValueError:

    def findall(sub, string):
        """
        >>> text = "Allowed Hello Hollow"
        >>> tuple(findall('ll', text))
        (1, 10, 16)
        """
        index = 0 - len(sub)
        try:
            while True:
                index = string.index(sub, index + len(sub))
                yield index
        except ValueError:
            pass

The second uses str.find and checks for the sentinel of -1 by using the two-argument form of iter:

    def findall_iter(sub, string):
        """
        >>> text = "Allowed Hello Hollow"
        >>> tuple(findall_iter('ll', text))
        (1, 10, 16)
        """
        def next_index(length):
            index = 0 - length
            while True:
                index = string.find(sub, index + length)
                yield index
        # On Python 2 the bound method is .next rather than .__next__
        return iter(next_index(len(sub)).__next__, -1)

To apply any of these functions to a list, tuple or other iterable of strings, you can use a higher-level function (one that takes a function as one of its arguments) like this one:

    def findall_each(findall, sub, strings):
        """
        >>> texts = ("fail", "dolly the llama", "Hello", "Hollow", "not ok")
        >>> list(findall_each(findall, 'll', texts))
        [(), (2, 10), (2,), (2,), ()]
        >>> texts = ("parallellized", "illegally", "dillydallying", "hillbillies")
        >>> list(findall_each(findall_iter, 'll', texts))
        [(4, 7), (1, 6), (2, 7), (2, 6)]
        """
        return (tuple(findall(sub, string)) for string in strings)

User · Answer

You can also do it with a conditional list comprehension, like this:

    string1 = "Allowed Hello Hollow"
    string2 = "ll"
    print([num for num in range(len(string1) - len(string2) + 1)
           if string1[num:num + len(string2)] == string2])
    # [1, 10, 16]

User · Answer

I think there's no need to test for the length of the text; just keep finding until there's nothing left to find. Like this:

    >>> text = 'Allowed Hello Hollow'
    >>> place = 0
    >>> while text.find('ll', place) != -1:
    ...     print('ll found at', text.find('ll', place))
    ...     place = text.find('ll', place) + 2
    ...
    ll found at 1
    ll found at 10
    ll found at 16

User · Answer

You can split on the key to get relative positions, then sum the consecutive piece lengths and add (key length * occurrence order) at the same time to get the wanted string indexes.

    >>> key = 'll'
    >>> text = "Allowed Hello Hollow"
    >>> x = [len(i) for i in text.split(key)[:-1]]
    >>> [sum(x[:i + 1]) + i * len(key) for i in range(len(x))]
    [1, 10, 16]
    >>>
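
The arithmetic may be easier to follow as a running total: each piece produced by split, except the last, is followed by exactly one occurrence of the key. A minimal sketch of that reading, with split_indexes as a made-up helper name:

```python
def split_indexes(text, key):
    """Start indexes of non-overlapping occurrences of key in text."""
    indexes = []
    position = 0
    # Every piece except the last is followed by one occurrence of key.
    for piece in text.split(key)[:-1]:
        position += len(piece)   # skip the text before the occurrence
        indexes.append(position)
        position += len(key)     # skip the occurrence itself
    return indexes


print(split_indexes("Allowed Hello Hollow", "ll"))  # [1, 10, 16]
```

Keeping a running position also avoids recomputing sum(x[:i + 1]) for every index.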

User · Answer

The following function finds all the occurrences of a string inside another and reports the position where each occurrence is found. You can call the function using the test cases in the table below. You can try it with words, spaces and numbers all mixed up. The function copes with runs of repeated characters, but after each match it skips ahead by the length of the substring, so the occurrences it counts do not overlap.

    theString                       aString
    --------------------------      -------
    "661444444423666455678966"      "55"
    "661444444423666455678966"      "44"
    "6123666455678966"              "666"
    "66123666455678966"             "66"

Calling examples:

1.
    print("Number of occurrences: ", find_all("123666455556785555966", "5555"))

    output:
        Found in position:  7
        Found in position:  14
        Number of occurrences:  2

2.
    print("Number of occurrences: ", find_all("Allowed Hello Hollow", "ll"))

    output:
        Found in position:  1
        Found in position:  10
        Found in position:  16
        Number of occurrences:  3

3.
    print("Number of occurrences: ", find_all("Aaa bbbcd    abWebbrbbbbrr 123", "bbb"))

    output:
        Found in position:  4
        Found in position:  21
        Number of occurrences:  2

The function:

    def find_all(theString, aString):
        count = 0
        i = len(aString)
        x = 0

        while x < len(theString) - (i - 1):
            if theString[x:x + i] == aString:
                print("Found in position: ", x)
                x = x + i
                count = count + 1
            else:
                x = x + 1
        return count
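
If the positions are wanted as data rather than printed output, a variant along these lines may help. It is a sketch, not the author's code: the name find_all_positions and the overlapping flag are assumptions added for illustration.

```python
def find_all_positions(the_string, a_string, overlapping=False):
    """Collect match positions instead of printing them."""
    if not a_string:
        return []
    positions = []
    # Step past the whole match, or by one character to allow overlaps.
    step = 1 if overlapping else len(a_string)
    x = 0
    while x <= len(the_string) - len(a_string):
        if the_string[x:x + len(a_string)] == a_string:
            positions.append(x)
            x += step
        else:
            x += 1
    return positions


print(find_all_positions("123666455556785555966", "5555"))  # [7, 14]
print(find_all_positions("aaaa", "aa"))                     # [0, 2]
print(find_all_positions("aaaa", "aa", overlapping=True))   # [0, 1, 2]
```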

User · Answer

For the list example, use a comprehension:

    >>> l = ['ll', 'xx', 'll']
    >>> print([n for (n, e) in enumerate(l) if e == 'll'])
    [0, 2]

Similarly for strings:

    >>> text = "Allowed Hello Hollow"
    >>> print([n for n in range(len(text)) if text.find('ll', n) == n])
    [1, 10, 16]

This will list adjacent runs of "ll", which may or may not be what you want:

    >>> text = 'Alllowed Hello Holllow'
    >>> print([n for n in range(len(text)) if text.find('ll', n) == n])
    [1, 2, 11, 17, 18]
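
One possible refinement of the string comprehension, offered as a sketch rather than as part of this answer: text.find('ll', n) scans forward from n until it finds a match, so a long text with few matches gets rescanned many times. str.startswith with a start offset only inspects position n. The helper name indexes_of is invented here.

```python
def indexes_of(text, sub):
    """All start positions of sub in text, overlapping ones included."""
    # str.startswith accepts a start offset, so each position is
    # checked locally instead of being searched forward from.
    return [n for n in range(len(text) - len(sub) + 1)
            if text.startswith(sub, n)]


print(indexes_of("Allowed Hello Hollow", "ll"))    # [1, 10, 16]
print(indexes_of("Alllowed Hello Holllow", "ll"))  # [1, 2, 11, 17, 18]
```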
