All the other answers have two main flaws:
def findall(haystack, needle):
    idx = -1
    while True:
        idx = haystack.find(needle, idx+1)
        if idx == -1:
            break
        yield idx
This iterates through haystack looking for needle, always starting at where the previous iteration ended. It uses the builtin str.find, which is much faster than iterating through haystack character-by-character. It doesn't require any new imports.
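To check the speed claim on your own machine, here is a rough sketch that times the str.find loop against a character-by-character comprehension. The name findall_slow, the test string and the repeat count are all made up for illustration, and the numbers will vary by machine and input, so treat it as a quick experiment rather than a benchmark.

```python
import timeit

def findall(haystack, needle):
    """Yield each index where needle starts, using str.find."""
    idx = -1
    while True:
        idx = haystack.find(needle, idx + 1)
        if idx == -1:
            break
        yield idx

def findall_slow(haystack, ch):
    """Hypothetical baseline: inspect every character in a Python loop."""
    return [i for i, c in enumerate(haystack) if c == ch]

# Hypothetical test data: a long string in which the needle is rare.
text = ("x" * 999 + "o") * 200

if __name__ == "__main__":
    fast = timeit.timeit(lambda: list(findall(text, "o")), number=10)
    slow = timeit.timeit(lambda: findall_slow(text, "o"), number=10)
    print(f"str.find loop: {fast:.4f}s, enumerate loop: {slow:.4f}s")
```

Because the search restarts at idx + 1 rather than past the end of the match, overlapping matches are reported too: list(findall("aaaa", "aa")) gives [0, 1, 2].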
This is because str.index(ch) will return the index where ch occurs the first time. Try:
def find(s, ch):
    return [i for i, ltr in enumerate(s) if ltr == ch]
This will return a list of all indexes you need.
P.S. Hugh's answer shows a generator function (it makes a difference if the list of indexes can get large). This function can also be adjusted by changing [] to ().
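As a small sketch of both points: str.index only ever reports the first occurrence, and swapping the brackets for parentheses turns the comprehension into a generator expression that produces indexes on demand. The name find_lazy is just an illustrative choice, not something from the answers.

```python
def find_lazy(s, ch):
    """Return a generator expression instead of building a list."""
    return (i for i, ltr in enumerate(s) if ltr == ch)

s = "ooottat"
print(s.index("o"))             # 0: only the first occurrence
print(list(find_lazy(s, "o")))  # [0, 1, 2]: every occurrence

# The generator does no work until asked, so the first hit can be
# taken without scanning the rest of the string.
print(next(find_lazy(s, "t")))  # 3
```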
Lev's answer is the one I'd use; however, here's something based on your original code:
def find(s, ch):  # parameter named s so it does not shadow the builtin str
    for i, ltr in enumerate(s):
        if ltr == ch:
            yield i
>>> list(find("ooottat", "o"))
[0, 1, 2]
def find_offsets(haystack, needle):
    """
    Find the start of all (possibly-overlapping) instances of needle in haystack
    """
    offs = -1
    while True:
        offs = haystack.find(needle, offs+1)
        if offs == -1:
            break
        else:
            yield offs

for offs in find_offsets("ooottat", "o"):
    print(offs)
results in
0
1
2
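If overlapping matches are not wanted, one possible variation is to resume the search after the end of each match instead of one character past its start. This is only a sketch: the function name and the empty-needle check are my own additions, not part of the answer above.

```python
def find_offsets_nonoverlapping(haystack, needle):
    """Yield the start of each non-overlapping instance of needle."""
    if not needle:
        # An empty needle would never advance the search position.
        raise ValueError("needle must be non-empty")
    start = 0
    while True:
        offs = haystack.find(needle, start)
        if offs == -1:
            break
        yield offs
        start = offs + len(needle)  # skip past the whole match

print(list(find_offsets_nonoverlapping("aaaa", "aa")))    # [0, 2]
print(list(find_offsets_nonoverlapping("ooottat", "o")))  # [0, 1, 2]
```

For a single-character needle the two versions behave the same; they only differ when the needle can overlap itself.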
x = "abcdabcdabcd"
print(x)
l = -1
while True:
    l = x.find("a", l+1)
    if l == -1:
        break
    print(l)
I would go with Lev's answer, but it's worth pointing out that if you end up with more complex searches, re.finditer may be worth bearing in mind (regular expressions often cause more trouble than they're worth, but they're sometimes handy to know):
import re

test = "ooottat"
[(i.start(), i.end()) for i in re.finditer('o', test)]
# [(0, 1), (1, 2), (2, 3)]
[(i.start(), i.end()) for i in re.finditer('o+', test)]
# [(0, 3)]
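One thing to watch with re.finditer is that matches do not overlap by default, and characters such as "." have a special meaning in a pattern. A possible workaround, sketched below with a made-up helper name, is to escape the needle with re.escape and wrap it in a zero-width lookahead so that every starting position is tested.

```python
import re

def find_overlapping(haystack, needle):
    """Hypothetical helper: start index of every match, overlaps included."""
    # re.escape makes needle match literally; the lookahead (?=...) is
    # zero-width, so the scan advances one character at a time.
    pattern = re.compile("(?=" + re.escape(needle) + ")")
    return [m.start() for m in pattern.finditer(haystack)]

print(find_overlapping("ooottat", "o"))  # [0, 1, 2]
print(find_overlapping("aaaa", "aa"))    # [0, 1, 2]
print(find_overlapping("a.b.c", "."))    # [1, 3]
```

Without the lookahead, re.finditer("aa", "aaaa") would report only the matches starting at 0 and 2.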
def find_idx(str, ch):
    # Note: this yields one list holding all the indexes, not one index at a time
    yield [i for i, c in enumerate(str) if c == ch]

for idx in find_idx('babak karchini is a beginner in python ', 'i'):
    print(idx)
output:
[11, 13, 15, 23, 29]
As a rule of thumb, NumPy arrays often outperform other solutions when working with POD (Plain Old Data). A string is an example of POD, and so is a character. To find all the indices of a single character in a string, NumPy ndarrays may be the fastest way:
import numpy as np

def find1(s, ch):
    # 0.100 seconds for 1MB str
    # np.frombuffer needs a bytes-like object; in Python 3 this assumes s is ASCII-only
    npbuf = np.frombuffer(s.encode("ascii"), dtype=np.uint8)  # Reinterpret s as a char buffer
    return np.where(npbuf == ord(ch))  # Find indices with numpy (a 1-tuple holding the index array)

def find2(s, ch):
    # 0.920 seconds for 1MB str
    return [i for i, c in enumerate(s) if c == ch]  # Find indices with python
This is a slightly modified version of Mark Ransom's answer that works if ch could be more than one character in length.
def find(term, ch):
    """Find all places where ch starts in term
    """
    for i in range(len(term)):
        if term[i:i + len(ch)] == ch:
            yield i
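The same idea can be written with str.startswith, which accepts a start position and so avoids building a slice at every index. This is just an alternative sketch; the name find_sub and the empty-substring check are assumptions of mine rather than part of the answer.

```python
def find_sub(term, sub):
    """Yield every index at which sub starts in term (overlaps included)."""
    if not sub:
        raise ValueError("sub must be non-empty")
    # Stop early: sub cannot start in the last len(sub) - 1 positions.
    for i in range(len(term) - len(sub) + 1):
        if term.startswith(sub, i):
            yield i

print(list(find_sub("abcabc", "bc")))  # [1, 4]
print(list(find_sub("aaaa", "aa")))    # [0, 1, 2]
```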
You could try this:
def find(ch, string1):
    pos = []  # collect the matching indexes
    for i in range(len(string1)):
        if ch == string1[i]:
            pos.append(i)
    return pos
Source: Stackoverflow.com