[c#] How would you do a "not in" query with LINQ?

I have two collections which have property Email in both collections. I need to get a list of the items in the first list where Email does not exist in the second list. With SQL I would just use "not in", but I do not know the equivalent in LINQ. How is that done?

So far I have a join, like...

var matches = from item1 in list1
join item2 in list2 on item1.Email equals item2.Email
select new { Email = list1.Email };

But I cannot join since I need the difference and the join would fail. I need some way of using Contains or Exists I believe. I just have not found an example to do that yet.

This question is related to c# linq

The answer is


You want the Except operator.

var answer = list1.Except(list2);

Better explanation here: https://docs.microsoft.com/archive/blogs/charlie/linq-farm-more-on-set-operators

NOTE: This technique works best for primitive types only, since you have to implement an IEqualityComparer to use the Except method with complex types.


While Except is part of the answer, it's not the whole answer. By default, Except (like several of the LINQ operators) does a reference comparison on reference types. To compare by values in the objects, you'll have to

  • implement IEquatable<T> in your type, or
  • override Equals and GetHashCode in your type, or
  • pass in an instance of a type implementing IEqualityComparer<T> for your type

items in the first list where the Email does not exist in the second list.

from item1 in List1
where !(list2.Any(item2 => item2.Email == item1.Email))
select item1;

You can take both the collections in two different lists, say list1 and list2.

Then just write

list1.RemoveAll(Item => list2.Contains(Item));

This will work.


In the case where one is using the ADO.NET Entity Framework, EchoStorm's solution also works perfectly. But it took me a few minutes to wrap my head around it. Assuming you have a database context, dc, and want to find rows in table x not linked in table y, the complete answer answer looks like:

var linked =
  from x in dc.X
  from y in dc.Y
  where x.MyProperty == y.MyProperty
  select x;
var notLinked =
  dc.X.Except(linked);

In response to Andy's comment, yes, one can have two from's in a LINQ query. Here's a complete working example, using lists. Each class, Foo and Bar, has an Id. Foo has a "foreign key" reference to Bar via Foo.BarId. The program selects all Foo's not linked to a corresponding Bar.

class Program
{
    static void Main(string[] args)
    {
        // Creates some foos
        List<Foo> fooList = new List<Foo>();
        fooList.Add(new Foo { Id = 1, BarId = 11 });
        fooList.Add(new Foo { Id = 2, BarId = 12 });
        fooList.Add(new Foo { Id = 3, BarId = 13 });
        fooList.Add(new Foo { Id = 4, BarId = 14 });
        fooList.Add(new Foo { Id = 5, BarId = -1 });
        fooList.Add(new Foo { Id = 6, BarId = -1 });
        fooList.Add(new Foo { Id = 7, BarId = -1 });

        // Create some bars
        List<Bar> barList = new List<Bar>();
        barList.Add(new Bar { Id = 11 });
        barList.Add(new Bar { Id = 12 });
        barList.Add(new Bar { Id = 13 });
        barList.Add(new Bar { Id = 14 });
        barList.Add(new Bar { Id = 15 });
        barList.Add(new Bar { Id = 16 });
        barList.Add(new Bar { Id = 17 });

        var linked = from foo in fooList
                     from bar in barList
                     where foo.BarId == bar.Id
                     select foo;
        var notLinked = fooList.Except(linked);
        foreach (Foo item in notLinked)
        {
            Console.WriteLine(
                String.Format(
                "Foo.Id: {0} | Bar.Id: {1}",
                item.Id, item.BarId));
        }
        Console.WriteLine("Any key to continue...");
        Console.ReadKey();
    }
}

class Foo
{
    public int Id { get; set; }
    public int BarId { get; set; }
}

class Bar
{
    public int Id { get; set; }
}

For anyone who also wants to use a SQL-alike IN operator in C#, download this package :

Mshwf.NiceLinq

It has In and NotIn methods:

var result = list1.In(x => x.Email, list2.Select(z => z.Email));

Even you can use it this way

var result = list1.In(x => x.Email, "[email protected]", "[email protected]", "[email protected]");

Alternatively you can do like this:

var result = list1.Where(p => list2.All(x => x.Id != p.Id));

var secondEmails = (from item in list2
                    select new { Email = item.Email }
                   ).ToList();

var matches = from item in list1
              where !secondEmails.Contains(item.Email)
              select new {Email = item.Email};

 DynamicWebsiteEntities db = new DynamicWebsiteEntities();
    var data = (from dt_sub in db.Subjects_Details
                                //Sub Query - 1
                            let sub_s_g = (from sg in db.Subjects_In_Group
                                           where sg.GroupId == groupId
                                           select sg.SubjectId)
                            //Where Cause
                            where !sub_s_g.Contains(dt_sub.Id) && dt_sub.IsLanguage == false
                            //Order By Cause
                            orderby dt_sub.Subject_Name

                            select dt_sub)
                           .AsEnumerable();
                  
                                SelectList multiSelect = new SelectList(data, "Id", "Subject_Name", selectedValue);

    //======================================OR===========================================

    var data = (from dt_sub in db.Subjects_Details

                               
                            //Where Cause
                            where !(from sg in db.Subjects_In_Group
                                           where sg.GroupId == groupId
                                           select sg.SubjectId).Contains(dt_sub.Id) && dt_sub.IsLanguage == false

                            //Order By Cause
                            orderby dt_sub.Subject_Name

                            select dt_sub)

                           .AsEnumerable();

var secondEmails = (from item in list2
                    select new { Email = item.Email }
                   ).ToList();

var matches = from item in list1
              where !secondEmails.Contains(item.Email)
              select new {Email = item.Email};

You can use a combination of Where and Any for finding not in:

var NotInRecord =list1.Where(p => !list2.Any(p2 => p2.Email  == p.Email));

For anyone who also wants to use a SQL-alike IN operator in C#, download this package :

Mshwf.NiceLinq

It has In and NotIn methods:

var result = list1.In(x => x.Email, list2.Select(z => z.Email));

Even you can use it this way

var result = list1.In(x => x.Email, "[email protected]", "[email protected]", "[email protected]");

While Except is part of the answer, it's not the whole answer. By default, Except (like several of the LINQ operators) does a reference comparison on reference types. To compare by values in the objects, you'll have to

  • implement IEquatable<T> in your type, or
  • override Equals and GetHashCode in your type, or
  • pass in an instance of a type implementing IEqualityComparer<T> for your type

Couldn't you do an outer join, only selecting the items from the first list if the group is empty? Something like:

Dim result = (From a In list1
              Group Join b In list2 
                  On a.Value Equals b.Value 
                  Into grp = Group
              Where Not grp.Any
              Select a)

I'm unsure whether this would work in any sort of efficient way with the Entity framework.


In the case where one is using the ADO.NET Entity Framework, EchoStorm's solution also works perfectly. But it took me a few minutes to wrap my head around it. Assuming you have a database context, dc, and want to find rows in table x not linked in table y, the complete answer answer looks like:

var linked =
  from x in dc.X
  from y in dc.Y
  where x.MyProperty == y.MyProperty
  select x;
var notLinked =
  dc.X.Except(linked);

In response to Andy's comment, yes, one can have two from's in a LINQ query. Here's a complete working example, using lists. Each class, Foo and Bar, has an Id. Foo has a "foreign key" reference to Bar via Foo.BarId. The program selects all Foo's not linked to a corresponding Bar.

class Program
{
    static void Main(string[] args)
    {
        // Creates some foos
        List<Foo> fooList = new List<Foo>();
        fooList.Add(new Foo { Id = 1, BarId = 11 });
        fooList.Add(new Foo { Id = 2, BarId = 12 });
        fooList.Add(new Foo { Id = 3, BarId = 13 });
        fooList.Add(new Foo { Id = 4, BarId = 14 });
        fooList.Add(new Foo { Id = 5, BarId = -1 });
        fooList.Add(new Foo { Id = 6, BarId = -1 });
        fooList.Add(new Foo { Id = 7, BarId = -1 });

        // Create some bars
        List<Bar> barList = new List<Bar>();
        barList.Add(new Bar { Id = 11 });
        barList.Add(new Bar { Id = 12 });
        barList.Add(new Bar { Id = 13 });
        barList.Add(new Bar { Id = 14 });
        barList.Add(new Bar { Id = 15 });
        barList.Add(new Bar { Id = 16 });
        barList.Add(new Bar { Id = 17 });

        var linked = from foo in fooList
                     from bar in barList
                     where foo.BarId == bar.Id
                     select foo;
        var notLinked = fooList.Except(linked);
        foreach (Foo item in notLinked)
        {
            Console.WriteLine(
                String.Format(
                "Foo.Id: {0} | Bar.Id: {1}",
                item.Id, item.BarId));
        }
        Console.WriteLine("Any key to continue...");
        Console.ReadKey();
    }
}

class Foo
{
    public int Id { get; set; }
    public int BarId { get; set; }
}

class Bar
{
    public int Id { get; set; }
}

Alternatively you can do like this:

var result = list1.Where(p => list2.All(x => x.Id != p.Id));

var secondEmails = (from item in list2
                    select new { Email = item.Email }
                   ).ToList();

var matches = from item in list1
              where !secondEmails.Contains(item.Email)
              select new {Email = item.Email};

You want the Except operator.

var answer = list1.Except(list2);

Better explanation here: https://docs.microsoft.com/archive/blogs/charlie/linq-farm-more-on-set-operators

NOTE: This technique works best for primitive types only, since you have to implement an IEqualityComparer to use the Except method with complex types.


var secondEmails = (from item in list2
                    select new { Email = item.Email }
                   ).ToList();

var matches = from item in list1
              where !secondEmails.Contains(item.Email)
              select new {Email = item.Email};

Example using List of int for simplicity.

List<int> list1 = new List<int>();
// fill data
List<int> list2 = new List<int>();
// fill data

var results = from i in list1
              where !list2.Contains(i)
              select i;

foreach (var result in results)
    Console.WriteLine(result.ToString());

I did not test this with LINQ to Entities:

NorthwindDataContext dc = new NorthwindDataContext();    
dc.Log = Console.Out;

var query =    
    from c in dc.Customers 
    where !dc.Orders.Any(o => o.CustomerID == c.CustomerID)   
    select c;

Alternatively:

NorthwindDataContext dc = new NorthwindDataContext();    
dc.Log = Console.Out;

var query =    
    from c in dc.Customers 
    where dc.Orders.All(o => o.CustomerID != c.CustomerID)   
    select c;

foreach (var c in query) 
    Console.WriteLine( c );

Couldn't you do an outer join, only selecting the items from the first list if the group is empty? Something like:

Dim result = (From a In list1
              Group Join b In list2 
                  On a.Value Equals b.Value 
                  Into grp = Group
              Where Not grp.Any
              Select a)

I'm unsure whether this would work in any sort of efficient way with the Entity framework.


One could also use All()

var notInList = list1.Where(p => list2.All(p2 => p2.Email != p.Email));

While Except is part of the answer, it's not the whole answer. By default, Except (like several of the LINQ operators) does a reference comparison on reference types. To compare by values in the objects, you'll have to

  • implement IEquatable<T> in your type, or
  • override Equals and GetHashCode in your type, or
  • pass in an instance of a type implementing IEqualityComparer<T> for your type

Example using List of int for simplicity.

List<int> list1 = new List<int>();
// fill data
List<int> list2 = new List<int>();
// fill data

var results = from i in list1
              where !list2.Contains(i)
              select i;

foreach (var result in results)
    Console.WriteLine(result.ToString());

For people who start with a group of in-memory objects and are querying against a database, I've found this to be the best way to go:

var itemIds = inMemoryList.Select(x => x.Id).ToArray();
var otherObjects = context.ItemList.Where(x => !itemIds.Contains(x.Id));

This produces a nice WHERE ... IN (...) clause in SQL.


items in the first list where the Email does not exist in the second list.

from item1 in List1
where !(list2.Any(item2 => item2.Email == item1.Email))
select item1;

While Except is part of the answer, it's not the whole answer. By default, Except (like several of the LINQ operators) does a reference comparison on reference types. To compare by values in the objects, you'll have to

  • implement IEquatable<T> in your type, or
  • override Equals and GetHashCode in your type, or
  • pass in an instance of a type implementing IEqualityComparer<T> for your type

One could also use All()

var notInList = list1.Where(p => list2.All(p2 => p2.Email != p.Email));

items in the first list where the Email does not exist in the second list.

from item1 in List1
where !(list2.Any(item2 => item2.Email == item1.Email))
select item1;

You can take both the collections in two different lists, say list1 and list2.

Then just write

list1.RemoveAll(Item => list2.Contains(Item));

This will work.


Example using List of int for simplicity.

List<int> list1 = new List<int>();
// fill data
List<int> list2 = new List<int>();
// fill data

var results = from i in list1
              where !list2.Contains(i)
              select i;

foreach (var result in results)
    Console.WriteLine(result.ToString());

I did not test this with LINQ to Entities:

NorthwindDataContext dc = new NorthwindDataContext();    
dc.Log = Console.Out;

var query =    
    from c in dc.Customers 
    where !dc.Orders.Any(o => o.CustomerID == c.CustomerID)   
    select c;

Alternatively:

NorthwindDataContext dc = new NorthwindDataContext();    
dc.Log = Console.Out;

var query =    
    from c in dc.Customers 
    where dc.Orders.All(o => o.CustomerID != c.CustomerID)   
    select c;

foreach (var c in query) 
    Console.WriteLine( c );

You want the Except operator.

var answer = list1.Except(list2);

Better explanation here: https://docs.microsoft.com/archive/blogs/charlie/linq-farm-more-on-set-operators

NOTE: This technique works best for primitive types only, since you have to implement an IEqualityComparer to use the Except method with complex types.


You can use a combination of Where and Any for finding not in:

var NotInRecord =list1.Where(p => !list2.Any(p2 => p2.Email  == p.Email));