[c#] Check if a string contains an element from a list (of strings)

For the following block of code:

For I = 0 To listOfStrings.Count - 1
    If myString.Contains(listOfStrings.Item(I)) Then
        Return True
    End If
Next
Return False

The output is:

Case 1:

myString: C:\Files\myfile.doc
listOfStrings: C:\Files\, C:\Files2\
Result: True

Case 2:

myString: C:\Files3\myfile.doc
listOfStrings: C:\Files\, C:\Files2\
Result: False

The list (listOfStrings) may contain several items (at least 20), and it has to be checked against thousands of strings (like myString).

Is there a better (more efficient) way to write this code?

This question is related to c# vb.net list coding-style performance

The answer is


myList.Any(myString.Contains);

There were a number of suggestions from an earlier similar question "Best way to test for existing string against a large list of comparables".

Regex might be sufficient for your requirement. The expression would be a concatenation of all the candidate substrings, with an OR "|" operator between them. Of course, you'll have to watch out for unescaped characters when building the expression, or a failure to compile it because of complexity or size limitations.
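As a rough illustration of the regex idea, the sketch below joins the candidates with "|" after passing each one through Regex.Escape, so that characters such as the backslash are matched literally. The class and method names (RegexMatcher, Build) are made up for this example, and it assumes the candidate list is not empty, since an empty pattern would match every string.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class RegexMatcher
{
    // Builds one alternation pattern from all candidates.
    // Regex.Escape makes metacharacters such as '\' and '.' literal.
    // Assumption: candidates contains at least one non-empty string.
    public static Regex Build(IEnumerable<string> candidates)
    {
        string pattern = string.Join("|", candidates.Select(s => Regex.Escape(s)));
        return new Regex(pattern, RegexOptions.Compiled);
    }
}
```

Build the Regex once and reuse it for every myString, for example `RegexMatcher.Build(listOfStrings).IsMatch(myString)`. Whether this beats the plain loop depends on the data, so it is worth measuring.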

Another way to do this would be to construct a trie data structure to represent all the candidate substrings (this may somewhat duplicate what the regex matcher is doing). As you step through each character in the test string, you would create a new pointer to the root of the trie, and advance existing pointers to the appropriate child (if any). You get a match when any pointer reaches a node that marks the end of a candidate (a leaf, unless one candidate is a prefix of another).
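The pointer-stepping idea can be sketched roughly as follows. This is only an illustration under a few assumptions: the names SubstringTrie and ContainsAny are invented here, matching is ordinal and case-sensitive, and empty candidates are skipped.

```csharp
using System.Collections.Generic;

// Hypothetical class illustrating the trie approach described above.
public sealed class SubstringTrie
{
    private sealed class Node
    {
        public readonly Dictionary<char, Node> Children = new Dictionary<char, Node>();
        public bool IsEnd; // true when a candidate ends at this node
    }

    private readonly Node _root = new Node();

    public SubstringTrie(IEnumerable<string> candidates)
    {
        foreach (string candidate in candidates)
        {
            if (string.IsNullOrEmpty(candidate)) continue; // assumption: ignore empty candidates
            Node node = _root;
            foreach (char c in candidate)
            {
                Node next;
                if (!node.Children.TryGetValue(c, out next))
                {
                    next = new Node();
                    node.Children.Add(c, next);
                }
                node = next;
            }
            node.IsEnd = true;
        }
    }

    public bool ContainsAny(string text)
    {
        var active = new List<Node>();
        foreach (char c in text)
        {
            active.Add(_root); // a match may start at this character
            var survivors = new List<Node>();
            foreach (Node pointer in active)
            {
                Node child;
                if (pointer.Children.TryGetValue(c, out child))
                {
                    if (child.IsEnd) return true; // a whole candidate was consumed
                    survivors.Add(child);
                }
                // pointers with no matching child are dropped
            }
            active = survivors;
        }
        return false;
    }
}
```

The trie is built once from listOfStrings and then reused for each myString. The number of live pointers is bounded by the length of the longest candidate.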


As I needed to check if there are items from a list in a (long) string, I ended up with this one:

listOfStrings.Any(x => myString.ToUpper().Contains(x.ToUpper()));

Or in vb.net:

listOfStrings.Any(Function(x) myString.ToUpper().Contains(x.ToUpper()))

If speed is critical, you might want to look at the Aho-Corasick algorithm for sets of patterns.

It's a trie with failure links; its complexity is O(n+m+k), where n is the length of the input text, m the cumulative length of the patterns and k the number of matches. You just have to modify the algorithm to terminate after the first match is found.
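A compact sketch of that idea, stopping at the first match, might look like the code below. It is not taken from any library: the class name AhoCorasick and its members are chosen for this example, matching is ordinal and case-sensitive, and empty patterns are ignored.

```csharp
using System.Collections.Generic;

// Hypothetical minimal Aho-Corasick matcher that only answers "is any pattern present?".
public sealed class AhoCorasick
{
    private sealed class Node
    {
        public readonly Dictionary<char, Node> Next = new Dictionary<char, Node>();
        public Node Fail;
        public bool IsMatch; // a pattern ends here, or at a node reachable through failure links
    }

    private readonly Node _root = new Node();

    public AhoCorasick(IEnumerable<string> patterns)
    {
        // Step 1: build the trie of all patterns.
        foreach (string pattern in patterns)
        {
            if (string.IsNullOrEmpty(pattern)) continue;
            Node node = _root;
            foreach (char c in pattern)
            {
                Node child;
                if (!node.Next.TryGetValue(c, out child))
                {
                    child = new Node();
                    node.Next.Add(c, child);
                }
                node = child;
            }
            node.IsMatch = true;
        }

        // Step 2: breadth-first traversal to assign failure links.
        var queue = new Queue<Node>();
        foreach (Node first in _root.Next.Values)
        {
            first.Fail = _root;
            queue.Enqueue(first);
        }
        while (queue.Count > 0)
        {
            Node current = queue.Dequeue();
            foreach (KeyValuePair<char, Node> edge in current.Next)
            {
                // Follow failure links until some node has an edge for this character.
                Node fail = current.Fail;
                while (fail != _root && !fail.Next.ContainsKey(edge.Key))
                    fail = fail.Fail;
                Node target;
                edge.Value.Fail = fail.Next.TryGetValue(edge.Key, out target) ? target : _root;
                edge.Value.IsMatch |= edge.Value.Fail.IsMatch;
                queue.Enqueue(edge.Value);
            }
        }
    }

    public bool ContainsAny(string text)
    {
        Node node = _root;
        foreach (char c in text)
        {
            while (node != _root && !node.Next.ContainsKey(c))
                node = node.Fail;
            Node next;
            if (node.Next.TryGetValue(c, out next))
                node = next;
            if (node.IsMatch) return true; // terminate after the first match
        }
        return false;
    }
}
```

The automaton is built once from listOfStrings; each call to ContainsAny then scans myString a single time, regardless of how many patterns there are.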


Have you tested the speed?

i.e. Have you created a sample set of data and profiled it? It may not be as bad as you think.

This might also be something you could spawn off into a separate thread and give the illusion of speed!
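For what it's worth, a quick measurement can be as simple as the sketch below, which times the straightforward Any/Contains check with System.Diagnostics.Stopwatch. The helper name ContainsBenchmark.Measure is invented here, and the sample data you feed it should resemble your real paths for the numbers to mean anything.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Hypothetical timing helper; not a rigorous benchmark.
public static class ContainsBenchmark
{
    // Checks every input against all candidates and reports the elapsed time.
    // 'matches' receives the number of inputs that contained at least one candidate.
    public static TimeSpan Measure(IList<string> candidates, IList<string> inputs, out int matches)
    {
        Stopwatch watch = Stopwatch.StartNew();
        int count = 0;
        foreach (string input in inputs)
        {
            if (candidates.Any(c => input.Contains(c))) count++;
        }
        watch.Stop();
        matches = count;
        return watch.Elapsed;
    }
}
```

Run it a few times and ignore the first run, since JIT compilation skews a single measurement.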


When you construct your strings inline, it should look like this:

bool inact = new string[] { "SUSPENDARE", "DIZOLVARE" }.Any(s=>stare.Contains(s));

A drawback of the single-argument Contains method is that it doesn't let you specify the comparison type, which is often important when comparing strings. It always performs an ordinal (case-sensitive, culture-insensitive) comparison. So I think WhoIsRich's answer is valuable; I just want to show a simpler alternative. Note that it tests each list item for equality with myString rather than for containment:

listOfStrings.Any(s => s.Equals(myString, StringComparison.OrdinalIgnoreCase))

Based on your patterns one improvement would be to change to using StartsWith instead of Contains. StartsWith need only iterate through each string until it finds the first mismatch instead of having to restart the search at every character position when it finds one.
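A minimal sketch of the StartsWith variant is shown below. The helper name PrefixCheck.StartsWithAny is made up, and the use of StringComparison.OrdinalIgnoreCase is an assumption based on Windows paths being case-insensitive; use StringComparison.Ordinal if case matters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper illustrating the prefix check.
public static class PrefixCheck
{
    // True when myString begins with any of the given prefixes.
    // Assumption: case-insensitive ordinal comparison suits Windows paths.
    public static bool StartsWithAny(string myString, IEnumerable<string> listOfStrings)
    {
        return listOfStrings.Any(
            prefix => myString.StartsWith(prefix, StringComparison.OrdinalIgnoreCase));
    }
}
```

Passing an explicit StringComparison also avoids the culture-sensitive comparison that the single-argument StartsWith overload performs.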

Also, based on your patterns, it looks like you may be able to extract the first part of the path for myString, then reverse the comparison -- looking for the starting path of myString in the list of strings rather than the other way around.

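// Assumes a Windows path with at least one folder, e.g. C:\Files\myfile.doc,
// so that Path.DirectorySeparatorChar is '\'. Needs System.IO and System.Linq.
string[] pathComponents = myString.Split( Path.DirectorySeparatorChar );
// Drive plus first folder, e.g. "C:\Files\" (index 0 alone would only give "C:\").
string startPath = pathComponents[0] + Path.DirectorySeparatorChar
                 + pathComponents[1] + Path.DirectorySeparatorChar;

return listOfStrings.Contains( startPath );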

EDIT: This would be even faster using the HashSet idea @Marc Gravell mentions, since HashSet<string>.Contains is an O(1) lookup on average instead of the O(N) scan that List<string>.Contains performs. You would have to make sure that the paths match exactly. Note that this is not a general solution, as @Marc Gravell's is, but is tailored to your examples.
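To make the HashSet variant concrete, here is a small sketch. It assumes Windows-style paths, that every entry in the set ends with a backslash, and that a file counts as a match only when its immediate folder is in the set (so C:\Files\sub\a.doc would not match C:\Files\). The names FolderLookup, BuildSet and IsInKnownFolder are invented for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for exact folder lookups.
public static class FolderLookup
{
    // Build once; the comparer makes lookups case-insensitive (an assumption for Windows paths).
    public static HashSet<string> BuildSet(IEnumerable<string> folders)
    {
        return new HashSet<string>(folders, StringComparer.OrdinalIgnoreCase);
    }

    // Extracts the folder part of the path, including the trailing '\', and looks it up.
    public static bool IsInKnownFolder(string path, HashSet<string> folders)
    {
        int lastSeparator = path.LastIndexOf('\\');
        if (lastSeparator < 0) return false; // no folder part at all
        string folder = path.Substring(0, lastSeparator + 1);
        return folders.Contains(folder);
    }
}
```

Each lookup is then a single hash computation, no matter how many folders the set holds, at the cost of only supporting exact folder matches.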

Sorry for the C# example. I haven't had enough coffee to translate to VB.


This is an old question, but since VB.NET was the original requirement, here is the VB.NET version using the same names as the accepted answer:

listOfStrings.Any(Function(s) myString.Contains(s))

I liked Marc's answer, but needed the Contains matching to be CaSe InSenSiTiVe.

This was the solution:

bool b = listOfStrings.Any(s => myString.IndexOf(s, StringComparison.OrdinalIgnoreCase) >= 0);
