[linq] Linq to SQL how to do "where [column] in (list of values)"

I have a function where I get a list of ids, and I need to return the a list matching a description that is associated with the id. E.g.:

public class CodeData
{
    string CodeId {get; set;}
    string Description {get; set;}
}

public List<CodeData> GetCodeDescriptionList(List<string> codeIDs)
    //Given the list of institution codes, return a list of CodeData
    //having the given CodeIds
}

So if I were creating the sql for this myself, I would simply do something like the following (where the in clause contains all the values in the codeIds argument):

Select CodeId, Description FROM CodeTable WHERE CodeId IN ('1a','2b','3')

In Linq to Sql I can't seem to find the equivalent of the "IN" clause. The best I've found so far (which doesn't work) is:

 var foo = from codeData in channel.AsQueryable<CodeData>()
           where codeData.CodeId == "1" || codeData.CodeId == "2"
           select codeData;

The problem being, that I can't dynamically generate a list of "OR" clauses for linq to sql, because they are set at compile time.

How does one accomplish a where clause that checks a column is in a dynamic list of values using Linq to Sql?

This question is related to linq linq-to-sql

The answer is


You could also use:

List<int> codes = new List<int>();

codes.add(1);
codes.add(2);

var foo = from codeData in channel.AsQueryable<CodeData>()
          where codes.Any(code => codeData.CodeID.Equals(code))
          select codeData;

 var filterTransNos = (from so in db.SalesOrderDetails
                    where  ItemDescription.Contains(ItemDescription)
                            select new { so.TransNo }).AsEnumerable();    


listreceipt = listreceipt.Where(p => filterTransNos.Any(p2 => p2.TransNo == p.TransNo)).ToList();

Here is how I do it by using HashSet

        HashSet<String> hs = new HashSet<string>(new String[] { "Pluto", "Earth", "Neptune" });
        String[] arr =
        {
            "Pluto",
            "Earth",
            "Neptune",
            "Jupiter",
            "Saturn",
            "Mercury",
            "Pluto",
            "Earth",
            "Neptune",
            "Jupiter",
            "Saturn",
            "Mercury",
            // etc.
        };
        ICollection<String> coll = arr;

        String[] arrStrFiltered = coll.Where(str => hs.Contains(str)).ToArray();

HashSet is basically almost to O(1) so your complexity remains O(n).


I had been using the method in Jon Skeet's answer, but another one occurred to me using Concat. The Concat method performed slightly better in a limited test, but it's a hassle and I'll probably just stick with Contains, or maybe I'll write a helper method to do this for me. Either way, here's another option if anyone is interested:

The Method

// Given an array of id's
var ids = new Guid[] { ... };

// and a DataContext
var dc = new MyDataContext();

// start the queryable
var query = (
    from thing in dc.Things
    where thing.Id == ids[ 0 ]
    select thing 
);

// then, for each other id
for( var i = 1; i < ids.Count(); i++ ) {
    // select that thing and concat to queryable
    query.Concat(
        from thing in dc.Things
        where thing.Id == ids[ i ]
        select thing
    );
}

Performance Test

This was not remotely scientific. I imagine your database structure and the number of IDs involved in the list would have a significant impact.

I set up a test where I did 100 trials each of Concat and Contains where each trial involved selecting 25 rows specified by a randomized list of primary keys. I've run this about a dozen times, and most times the Concat method comes out 5 - 10% faster, although one time the Contains method won by just a smidgen.