This is entirely inspired by laurasia's answer above, but it refines the structure.
It also adds some checks. It correctly returns `0` when searching an empty file for the empty string; in laurasia's answer, this edge case returns `-1`. It also raises a `ValueError` if the goal string is longer than the buffer.

In practice, the goal string should be much smaller than the buffer for efficiency, and there are more efficient methods of searching if the size of the goal string is very close to the size of the buffer.
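
def fnd(fname, goal, start=0, bsize=4096):
    # goal must be a bytes object, since the file is opened in binary mode.
    if bsize < len(goal):
        raise ValueError("The buffer size must be at least as large as the string being searched for.")
    with open(fname, 'rb') as f:
        if start > 0:
            f.seek(start)
        overlap = len(goal) - 1
        while True:
            buffer = f.read(bsize)
            pos = buffer.find(goal)
            if pos >= 0:
                return f.tell() - len(buffer) + pos
            # A buffer no longer than the overlap means the end of the file
            # was reached without a match; stop rather than seek back forever.
            if len(buffer) <= overlap:
                return -1
            # Step back so a match straddling two reads is not missed.
            f.seek(f.tell() - overlap)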
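
To see why the function seeks back by `len(goal) - 1` bytes after each read, here is a small sketch that runs the same kind of chunked search over an in-memory stream, with and without an overlap. The helper name `chunk_find` and its `overlap` parameter are made up for this illustration and are not part of the answer's code.

```python
import io

def chunk_find(stream, goal, bsize, overlap=0):
    """Chunked search over a seekable binary stream (illustration only).

    overlap is the number of bytes re-read at the start of each new chunk;
    it must be smaller than bsize so that the search makes progress.
    """
    while True:
        start = stream.tell()
        chunk = stream.read(bsize)
        pos = chunk.find(goal)
        if pos >= 0:
            return start + pos
        # Nothing new left to read: the goal is not in the stream.
        if len(chunk) <= overlap:
            return -1
        stream.seek(stream.tell() - overlap)

data = b"abcdefgh"
goal = b"def"  # straddles the boundary between b"abcd" and b"efgh"

# Without an overlap, the match is split across two chunks and is missed.
print(chunk_find(io.BytesIO(data), goal, bsize=4))

# Re-reading len(goal) - 1 bytes lets the second chunk contain the whole match.
print(chunk_find(io.BytesIO(data), goal, bsize=4, overlap=len(goal) - 1))
```

The first call reports no match even though `data.find(goal)` is 3, while the second call finds it. Re-reading one byte fewer than the goal length is enough, because a match that was missed must begin within the last `len(goal) - 1` bytes of the previous chunk.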