[c#] LINQ query to find if items in a list are contained in another list

I have the following code:

List<string> test1 = new List<string> { "@bob.com", "@tom.com" };
List<string> test2 = new List<string> { "[email protected]", "[email protected]" };

I need to remove anyone in test2 that has @bob.com or @tom.com.

What I have tried is this:

bool bContained1 = test1.Contains(test2);
bool bContained2 = test2.Contains(test1);

bContained1 = false but bContained2 = true. I would prefer not to loop through each list but instead use a Linq query to retrieve the data. bContained1 is the same condition for the Linq query that I have created below:

List<string> test3 = test1.Where(w => !test2.Contains(w)).ToList();

The query above works on an exact match but not partial matches.

I have looked at other queries but I can find a close comparison to this with Linq. Any ideas or anywhere you can point me to would be a great help.

This question is related to c# linq

The answer is


var output = emails.Where(e => domains.All(d => !e.EndsWith(d)));

Or if you prefer:

var output = emails.Where(e => !domains.Any(d => e.EndsWith(d)));

I think this would be easiest one:

test1.ForEach(str => test2.RemoveAll(x=>x.Contains(str)));

List<string> l = new List<string> { "@bob.com", "@tom.com" };
List<string> l2 = new List<string> { "[email protected]", "[email protected]" };
List<string> myboblist= (l2.Where (i=>i.Contains("bob")).ToList<string>());
foreach (var bob in myboblist)
    Console.WriteLine(bob.ToString());

something like this:

List<string> test1 = new List<string> { "@bob.com", "@tom.com" };
List<string> test2 = new List<string> { "[email protected]", "[email protected]" };

var res = test2.Where(f => test1.Count(z => f.Contains(z)) == 0)

Live example: here


No need to use Linq like this here, because there already exists an extension method to do this for you.

Enumerable.Except<TSource>

http://msdn.microsoft.com/en-us/library/bb336390.aspx

You just need to create your own comparer to compare as needed.


Try the following:

List<string> test1 = new List<string> { "@bob.com", "@tom.com" };
List<string> test2 = new List<string> { "[email protected]", "[email protected]" };
var output = from goodEmails in test2
            where !(from email in test2
                from domain in test1
                where email.EndsWith(domain)
                select email).Contains(goodEmails)
            select goodEmails;

This works with the test set provided (and looks correct).


List<string> test1 = new List<string> { "@bob.com", "@tom.com" };
List<string> test2 = new List<string> { "[email protected]", "[email protected]", "[email protected]" };

var result = (from t2 in test2
              where test1.Any(t => t2.Contains(t)) == false
              select t2);

If query form is what you want to use, this is legible and more or less as "performant" as this could be.

What i mean is that what you are trying to do is an O(N*M) algorithm, that is, you have to traverse N items and compare them against M values. What you want is to traverse the first list only once, and compare against the other list just as many times as needed (worst case is when the email is valid since it has to compare against every black listed domain).

from t2 in test we loop the email list once.

test1.Any(t => t2.Contains(t)) == false we compare with the blacklist and when we found one match return (hence not comparing against the whole list if is not needed)

select t2 keep the ones that are clean.

So this is what I would use.


bool doesL1ContainsL2 = l1.Intersect(l2).Count() == l2.Count;

L1 and L2 are both List<T>