I did a little experiment to see which of these methods
string.startswith('hello')
string.rfind('hello') == 0
string.rpartition('hello')[0] == ''
string.rindex('hello') == 0
are most efficient to return whether a certain string begins with another string.
Here is the result of one of the many test runs I've made, where each list is ordered to show the least time it took (in seconds) to parse 5 million of each of the above expressions during each iteration of the while
loop I used:
['startswith: 1.37', 'rpartition: 1.38', 'rfind: 1.62', 'rindex: 1.62']
['startswith: 1.28', 'rpartition: 1.44', 'rindex: 1.67', 'rfind: 1.68']
['startswith: 1.29', 'rpartition: 1.42', 'rindex: 1.63', 'rfind: 1.64']
['startswith: 1.28', 'rpartition: 1.43', 'rindex: 1.61', 'rfind: 1.62']
['rpartition: 1.48', 'startswith: 1.48', 'rfind: 1.62', 'rindex: 1.67']
['startswith: 1.34', 'rpartition: 1.43', 'rfind: 1.64', 'rindex: 1.64']
['startswith: 1.36', 'rpartition: 1.44', 'rindex: 1.61', 'rfind: 1.63']
['startswith: 1.29', 'rpartition: 1.37', 'rindex: 1.64', 'rfind: 1.67']
['startswith: 1.34', 'rpartition: 1.44', 'rfind: 1.66', 'rindex: 1.68']
['startswith: 1.44', 'rpartition: 1.41', 'rindex: 1.61', 'rfind: 2.24']
['startswith: 1.34', 'rpartition: 1.45', 'rindex: 1.62', 'rfind: 1.67']
['startswith: 1.34', 'rpartition: 1.38', 'rindex: 1.67', 'rfind: 1.74']
['rpartition: 1.37', 'startswith: 1.38', 'rfind: 1.61', 'rindex: 1.64']
['startswith: 1.32', 'rpartition: 1.39', 'rfind: 1.64', 'rindex: 1.61']
['rpartition: 1.35', 'startswith: 1.36', 'rfind: 1.63', 'rindex: 1.67']
['startswith: 1.29', 'rpartition: 1.36', 'rfind: 1.65', 'rindex: 1.84']
['startswith: 1.41', 'rpartition: 1.44', 'rfind: 1.63', 'rindex: 1.71']
['startswith: 1.34', 'rpartition: 1.46', 'rindex: 1.66', 'rfind: 1.74']
['startswith: 1.32', 'rpartition: 1.46', 'rfind: 1.64', 'rindex: 1.74']
['startswith: 1.38', 'rpartition: 1.48', 'rfind: 1.68', 'rindex: 1.68']
['startswith: 1.35', 'rpartition: 1.42', 'rfind: 1.63', 'rindex: 1.68']
['startswith: 1.32', 'rpartition: 1.46', 'rfind: 1.65', 'rindex: 1.75']
['startswith: 1.37', 'rpartition: 1.46', 'rfind: 1.74', 'rindex: 1.75']
['startswith: 1.31', 'rpartition: 1.48', 'rfind: 1.67', 'rindex: 1.74']
['startswith: 1.44', 'rpartition: 1.46', 'rindex: 1.69', 'rfind: 1.74']
['startswith: 1.44', 'rpartition: 1.42', 'rfind: 1.65', 'rindex: 1.65']
['startswith: 1.36', 'rpartition: 1.44', 'rfind: 1.64', 'rindex: 1.74']
['startswith: 1.34', 'rpartition: 1.46', 'rfind: 1.61', 'rindex: 1.74']
['startswith: 1.35', 'rpartition: 1.56', 'rfind: 1.68', 'rindex: 1.69']
['startswith: 1.32', 'rpartition: 1.48', 'rindex: 1.64', 'rfind: 1.65']
['startswith: 1.28', 'rpartition: 1.43', 'rfind: 1.59', 'rindex: 1.66']
I believe that it is pretty obvious from the start that the startswith
method would come out the most efficient, as returning whether a string begins with the specified string is its main purpose.
What surprises me is that the seemingly impractical string.rpartition('hello')[0] == ''
method always finds a way to be listed first, before the string.startswith('hello')
method, every now and then. The results show that using str.partition
to determine if a string starts with another string is more efficient then using both rfind
and rindex
.
Another thing I've noticed is that string.rindex('hello') == 0
and string.rindex('hello') == 0
have a good battle going on, each rising from fourth to third place, and dropping from third to fourth place, which makes sense, as their main purposes are the same.
Here is the code:
from time import perf_counter
string = 'hello world'
places = dict()
while True:
start = perf_counter()
for _ in range(5000000):
string.startswith('hello')
end = perf_counter()
places['startswith'] = round(end - start, 2)
start = perf_counter()
for _ in range(5000000):
string.rfind('hello') == 0
end = perf_counter()
places['rfind'] = round(end - start, 2)
start = perf_counter()
for _ in range(5000000):
string.rpartition('hello')[0] == ''
end = perf_counter()
places['rpartition'] = round(end - start, 2)
start = perf_counter()
for _ in range(5000000):
string.rindex('hello') == 0
end = perf_counter()
places['rindex'] = round(end - start, 2)
print([f'{b}: {str(a).ljust(4, "4")}' for a, b in sorted(i[::-1] for i in places.items())])