I have a string buffer of about 2000 characters and need to check whether it contains a specific string.
The check will run in an ASP.NET 2.0 web app on every web request.
Does anyone know whether the String.Contains method performs better than the String.IndexOf method?
// 2000 characters in s1, search token in s2
string s1 = "Many characters. The quick brown fox jumps over the lazy dog";
string s2 = "fox";
bool b;
b = s1.Contains(s2);
int i;
i = s1.IndexOf(s2);
For anyone still reading this, indexOf() will probably perform better on most enterprise systems, as contains() is not compatible with IE!
As an update: I've been doing some testing, and provided your input string is fairly large, parallel Regex is the fastest C# method I've found (provided you have more than one core, I imagine).
For example, to count how many of the needles occur in the haystack:
needles.AsParallel().Sum(l => Regex.IsMatch(haystack, Regex.Escape(l)) ? 1 : 0);
Hope this helps!
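For what it's worth, here is the one-liner above wrapped into a self-contained sketch. The names NeedleCounter and CountNeedlesFound are made up for illustration and are not from the answer. Note that it counts how many needles occur at least once, not the total number of occurrences.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper; the class and method names are illustrative only.
public static class NeedleCounter
{
    // Returns how many of the needles occur at least once in the haystack.
    // Regex.Escape makes each needle match literally, and AsParallel lets
    // PLINQ spread the per-needle checks across the available cores.
    public static int CountNeedlesFound(string haystack, string[] needles)
    {
        return needles
            .AsParallel()
            .Sum(needle => Regex.IsMatch(haystack, Regex.Escape(needle)) ? 1 : 0);
    }
}
```

For example, CountNeedlesFound("The quick brown fox", new string[] { "fox", "cat", "quick" }) returns 2. Whether this actually beats a plain loop will depend on the number of needles and the size of the haystack, so it is worth measuring on your own data.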
Use a benchmark library, like this recent foray from Jon Skeet, to measure it.
As with all (micro-)performance questions, this depends on the versions of the software you are using, the details of the data inspected, and the code surrounding the call.
As with all (micro-)performance questions, the first step is to get a running version that is easily maintainable. Then benchmarking, profiling, and tuning can be applied to the measured bottlenecks instead of guessing.
From a little reading, it appears that under the hood the String.Contains method simply calls String.IndexOf. The difference is that String.Contains returns a boolean, while String.IndexOf returns an integer, with -1 indicating that the substring was not found.
I would suggest writing a little test with 100,000 or so iterations and seeing for yourself. If I were to guess, I'd say that IndexOf may be slightly faster, but like I said, it's just a guess.
Jeff Atwood has a good article on strings on his blog. It's more about concatenation but may be helpful nonetheless.
Using Reflector, you can see that Contains is implemented using IndexOf. Here's the implementation:
public bool Contains(string value)
{
    return (this.IndexOf(value, StringComparison.Ordinal) >= 0);
}
So Contains is likely a wee bit slower than calling IndexOf directly, but I doubt that it will have any significance for the actual performance.
Contains(s2) is many times (on my computer, 10 times) faster than IndexOf(s2), because Contains uses StringComparison.Ordinal, which is faster than the culture-sensitive search that IndexOf does by default (but that may change in .NET 4.0: http://davesbox.com/archive/2008/11/12/breaking-changes-to-the-string-class.aspx).
Contains has exactly the same performance as IndexOf(s2, StringComparison.Ordinal) >= 0 in my tests, but it's shorter and makes your intent clear.
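To make the comparison explicit in code, here is a minimal sketch that passes the StringComparison to IndexOf rather than relying on its culture-sensitive default. StringSearch and both method names are hypothetical helpers for illustration. The case-insensitive variant is an extra I added to show why you might still reach for IndexOf.

```csharp
using System;

// Hypothetical helpers; names are illustrative, not framework APIs.
public static class StringSearch
{
    // The same check the decompiled Contains performs: an ordinal,
    // char-by-char search with no culture rules applied.
    public static bool ContainsOrdinal(string buffer, string token)
    {
        return buffer.IndexOf(token, StringComparison.Ordinal) >= 0;
    }

    // Ordinal search that ignores case. This goes through IndexOf
    // because it needs an explicit StringComparison argument.
    public static bool ContainsIgnoreCase(string buffer, string token)
    {
        return buffer.IndexOf(token, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

With the question's sample string, ContainsOrdinal(s1, "fox") is true, ContainsOrdinal(s1, "FOX") is false, and ContainsIgnoreCase(s1, "FOX") is true. When timing Contains against IndexOf, passing the same comparison to both keeps the measurement like-for-like.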
I am running a real case (as opposed to a synthetic benchmark):
if("=,<=,=>,<>,<,>,!=,==,".IndexOf(tmps)>=0) {
versus
if("=,<=,=>,<>,<,>,!=,==,".Contains(tmps)) {
It is a vital part of my system, and it is executed 131,953 times (thanks, dotTrace).
Surprisingly, however, the result is the opposite of what I expected
:-/
.NET Framework 4.0 (updated as of 13-02-2012)
If you really want to micro-optimise your code, your best approach is always benchmarking.
The .NET Framework has an excellent stopwatch implementation: System.Diagnostics.Stopwatch
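As a rough sketch of what such a measurement could look like, the helper below times a delegate with Stopwatch. MicroTimer and Measure are names I made up for illustration. The single warm-up call is my own assumption about what makes the number less noisy, not something Stopwatch requires.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing helper; not part of the framework.
public static class MicroTimer
{
    // Calls the action once untimed as a warm-up, then times
    // `iterations` further calls and returns the total elapsed time.
    public static TimeSpan Measure(Action action, int iterations)
    {
        action(); // warm-up call, excluded from the measurement

        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

Called once as MicroTimer.Measure(() => s1.Contains(s2), 100000) and once with s1.IndexOf(s2, StringComparison.Ordinal) in the lambda, it gives two TimeSpan values to compare. Treat the result as a rough indication only, since a dedicated benchmark library handles warm-up, repetition, and statistics far more carefully.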
Probably, it will not matter at all. Read this post on Coding Horror ;): http://www.codinghorror.com/blog/archives/001218.html
Tried it today on a 1.3 GB text file. Among other things, every line is checked for the presence of a '@' character. 17,000,000 calls to Contains/IndexOf are made. Result: 12.5 sec for all Contains('@') calls, 2.5 sec for all IndexOf('@') calls. => IndexOf performs 5 times faster! (.NET 4.8)
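One possible explanation, offered as an assumption rather than something measured here: as far as I know, String on .NET Framework has no Contains(char) overload, so with System.Linq in scope Contains('@') would bind to LINQ's Enumerable.Contains, which walks the string through an enumerator. A minimal sketch of the IndexOf-based check follows. CharSearch and its method names are hypothetical.

```csharp
using System;

// Hypothetical helpers; names are illustrative only.
public static class CharSearch
{
    // String.IndexOf(char) scans the string directly and returns -1
    // when the character is absent.
    public static bool ContainsChar(string line, char c)
    {
        return line.IndexOf(c) >= 0;
    }

    // Counts the lines that contain the character at least once.
    public static int CountLinesWith(string[] lines, char c)
    {
        int count = 0;
        foreach (string line in lines)
        {
            if (ContainsChar(line, c))
            {
                count++;
            }
        }
        return count;
    }
}
```

For example, CountLinesWith(new string[] { "a@b.com", "no address", "x@y" }, '@') returns 2.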
Source: Stackoverflow.com