[c#] Select multiple records based on list of Id's with linq

I have a list containing Id's of my UserProfile table. How can i select all UserProfiles based on the list of Id's i got in a var using LINQ?

var idList = new int[1, 2, 3, 4, 5];
var userProfiles = _dataContext.UserProfile.Where(......);

I got stuck right here. I can do this using for loops etc. But I'd rather do this with LINQ.

This question is related to c# linq

The answer is


That should be simple. Try this:

var idList = new int[1, 2, 3, 4, 5];
var userProfiles = _dataContext.UserProfile.Where(e => idList.Contains(e));

Nice answers abowe, but don't forget one IMPORTANT thing - they provide different results!

  var idList = new int[1, 2, 2, 2, 2]; // same user is selected 4 times
  var userProfiles = _dataContext.UserProfile.Where(e => idList.Contains(e)).ToList();

This will return 2 rows from DB (and this could be correct, if you just want a distinct sorted list of users)

BUT in many cases, you could want an unsorted list of results. You always have to think about it like about a SQL query. Please see the example with eshop shopping cart to illustrate what's going on:

  var priceListIDs = new int[1, 2, 2, 2, 2]; // user has bought 4 times item ID 2
  var shoppingCart = _dataContext.ShoppingCart
                     .Join(priceListIDs, sc => sc.PriceListID, pli => pli, (sc, pli) => sc)
                     .ToList();

This will return 5 results from DB. Using 'contains' would be wrong in this case.


Solution with .Where and .Contains has complexity of O(N square). Simple .Join should have a lot better performance (close to O(N) due to hashing). So the correct code is:

_dataContext.UserProfile.Join(idList, up => up.ID, id => id, (up, id) => up);

And now result of my measurement. I generated 100 000 UserProfiles and 100 000 ids. Join took 32ms and .Where with .Contains took 2 minutes and 19 seconds! I used pure IEnumerable for this testing to prove my statement. If you use List instead of IEnumerable, .Where and .Contains will be faster. Anyway the difference is significant. The fastest .Where .Contains is with Set<>. All it depends on complexity of underlying coletions for .Contains. Look at this post to learn about linq complexity.Look at my test sample below:

    private static void Main(string[] args)
    {
        var userProfiles = GenerateUserProfiles();
        var idList = GenerateIds();
        var stopWatch = new Stopwatch();
        stopWatch.Start();
        userProfiles.Join(idList, up => up.ID, id => id, (up, id) => up).ToArray();
        Console.WriteLine("Elapsed .Join time: {0}", stopWatch.Elapsed);
        stopWatch.Restart();
        userProfiles.Where(up => idList.Contains(up.ID)).ToArray();
        Console.WriteLine("Elapsed .Where .Contains time: {0}", stopWatch.Elapsed);
        Console.ReadLine();
    }

    private static IEnumerable<int> GenerateIds()
    {
       // var result = new List<int>();
        for (int i = 100000; i > 0; i--)
        {
            yield return i;
        }
    }

    private static IEnumerable<UserProfile> GenerateUserProfiles()
    {
        for (int i = 0; i < 100000; i++)
        {
            yield return new UserProfile {ID = i};
        }
    }

Console output:

Elapsed .Join time: 00:00:00.0322546

Elapsed .Where .Contains time: 00:02:19.4072107