[python] How to find char in string and get all the indexes?

I got some simple code:

def find(str, ch):
    for ltr in str:
        if ltr == ch:
            return str.index(ltr)
find("ooottat", "o")

The function only return the first index. If I change return to print, it will print 0 0 0. Why is this and is there any way to get 0 1 2?

This question is related to python string

The answer is


All the other answers have two main flaws:

  1. They do a Python loop through the string, which is horrifically slow, or
  2. They use numpy which is a pretty big additional dependency.
def findall(haystack, needle):
    idx = -1
    while True:
        idx = haystack.find(needle, idx+1)
        if idx == -1:
            break
        yield idx

This iterates through haystack looking for needle, always starting at where the previous iteration ended. It uses the builtin str.find which is much faster than iterating through haystack character-by-character. It doesn't require any new imports.


This is because str.index(ch) will return the index where ch occurs the first time. Try:

def find(s, ch):
    return [i for i, ltr in enumerate(s) if ltr == ch]

This will return a list of all indexes you need.

P.S. Hugh's answer shows a generator function (it makes a difference if the list of indexes can get large). This function can also be adjusted by changing [] to ().


Lev's answer is the one I'd use, however here's something based on your original code:

def find(str, ch):
    for i, ltr in enumerate(str):
        if ltr == ch:
            yield i

>>> list(find("ooottat", "o"))
[0, 1, 2]

def find_offsets(haystack, needle):
    """
    Find the start of all (possibly-overlapping) instances of needle in haystack
    """
    offs = -1
    while True:
        offs = haystack.find(needle, offs+1)
        if offs == -1:
            break
        else:
            yield offs

for offs in find_offsets("ooottat", "o"):
    print offs

results in

0
1
2

x = "abcdabcdabcd"
print(x)
l = -1
while True:
    l = x.find("a", l+1)
    if l == -1:
        break
    print(l)

I would go with Lev, but it's worth pointing out that if you end up with more complex searches that using re.finditer may be worth bearing in mind (but re's often cause more trouble than worth - but sometimes handy to know)

test = "ooottat"
[ (i.start(), i.end()) for i in re.finditer('o', test)]
# [(0, 1), (1, 2), (2, 3)]

[ (i.start(), i.end()) for i in re.finditer('o+', test)]
# [(0, 3)]

def find_idx(str, ch):
    yield [i for i, c in enumerate(str) if c == ch]

for idx in find_idx('babak karchini is a beginner in python ', 'i'):
    print(idx)

output:

[11, 13, 15, 23, 29]

As the rule of thumb, NumPy arrays often outperform other solutions while working with POD, Plain Old Data. A string is an example of POD and a character too. To find all the indices of only one char in a string, NumPy ndarrays may be the fastest way:

def find1(str, ch):
  # 0.100 seconds for 1MB str 
  npbuf = np.frombuffer(str, dtype=np.uint8) # Reinterpret str as a char buffer
  return np.where(npbuf == ord(ch))          # Find indices with numpy

def find2(str, ch):
  # 0.920 seconds for 1MB str 
  return [i for i, c in enumerate(str) if c == ch] # Find indices with python

This is slightly modified version of Mark Ransom's answer that works if ch could be more than one character in length.

def find(term, ch):
    """Find all places with ch in str
    """
    for i in range(len(term)):
        if term[i:i + len(ch)] == ch:
            yield i

You could try this

def find(ch,string1):
    for i in range(len(string1)):
        if ch == string1[i]:
            pos.append(i)