[c#] IEnumerable vs List - What to Use? How do they work?

I have some doubts over how Enumerators work, and LINQ. Consider these two simple selects:

List<Animal> sel = (from animal in Animals 
                    join race in Species
                    on animal.SpeciesKey equals race.SpeciesKey
                    select animal).Distinct().ToList();

or

IEnumerable<Animal> sel = (from animal in Animals 
                           join race in Species
                           on animal.SpeciesKey equals race.SpeciesKey
                           select animal).Distinct();

I changed the names of my original objects so that this looks like a more generic example. The query itself is not that important. What I want to ask is this:

foreach (Animal animal in sel) { /*do stuff*/ }
  1. I noticed that if I use IEnumerable, when I debug and inspect "sel", which in that case is the IEnumerable, it has some interesting members: "inner", "outer", "innerKeySelector" and "outerKeySelector", these last 2 appear to be delegates. The "inner" member does not have "Animal" instances in it, but rather "Species" instances, which was very strange for me. The "outer" member does contain "Animal" instances. I presume that the two delegates determine which goes in and what goes out of it?

  2. I noticed that if I use "Distinct", the "inner" contains 6 items (this is incorrect as only 2 are Distinct), but the "outer" does contain the correct values. Again, probably the delegated methods determine this but this is a bit more than I know about IEnumerable.

  3. Most importantly, which of the two options is the best performance-wise?

The evil List conversion via .ToList()?

Or maybe using the enumerator directly?

If you can, please also explain a bit or throw some links that explain this use of IEnumerable.

This question is related to c# linq list ienumerable

The answer is


There is a very good article written by: Claudio Bernasconi's TechBlog here: When to use IEnumerable, ICollection, IList and List

Here some basics points about scenarios and functions:

enter image description here enter image description here


A class that implement IEnumerable allows you to use the foreach syntax.

Basically it has a method to get the next item in the collection. It doesn't need the whole collection to be in memory and doesn't know how many items are in it, foreach just keeps getting the next item until it runs out.

This can be very useful in certain circumstances, for instance in a massive database table you don't want to copy the entire thing into memory before you start processing the rows.

Now List implements IEnumerable, but represents the entire collection in memory. If you have an IEnumerable and you call .ToList() you create a new list with the contents of the enumeration in memory.

Your linq expression returns an enumeration, and by default the expression executes when you iterate through using the foreach. An IEnumerable linq statement executes when you iterate the foreach, but you can force it to iterate sooner using .ToList().

Here's what I mean:

var things = 
    from item in BigDatabaseCall()
    where ....
    select item;

// this will iterate through the entire linq statement:
int count = things.Count();

// this will stop after iterating the first one, but will execute the linq again
bool hasAnyRecs = things.Any();

// this will execute the linq statement *again*
foreach( var thing in things ) ...

// this will copy the results to a list in memory
var list = things.ToList()

// this won't iterate through again, the list knows how many items are in it
int count2 = list.Count();

// this won't execute the linq statement - we have it copied to the list
foreach( var thing in list ) ...

The most important thing to realize is that, using Linq, the query does not get evaluated immediately. It is only run as part of iterating through the resulting IEnumerable<T> in a foreach - that's what all the weird delegates are doing.

So, the first example evaluates the query immediately by calling ToList and putting the query results in a list.
The second example returns an IEnumerable<T> that contains all the information needed to run the query later on.

In terms of performance, the answer is it depends. If you need the results to be evaluated at once (say, you're mutating the structures you're querying later on, or if you don't want the iteration over the IEnumerable<T> to take a long time) use a list. Else use an IEnumerable<T>. The default should be to use the on-demand evaluation in the second example, as that generally uses less memory, unless there is a specific reason to store the results in a list.


In addition to all the answers posted above, here is my two cents. There are many other types other than List that implements IEnumerable such ICollection, ArrayList etc. So if we have IEnumerable as parameter of any method, we can pass any collection types to the function. Ie we can have method to operate on abstraction not any specific implementation.


The advantage of IEnumerable is deferred execution (usually with databases). The query will not get executed until you actually loop through the data. It's a query waiting until it's needed (aka lazy loading).

If you call ToList, the query will be executed, or "materialized" as I like to say.

There are pros and cons to both. If you call ToList, you may remove some mystery as to when the query gets executed. If you stick to IEnumerable, you get the advantage that the program doesn't do any work until it's actually required.


If all you want to do is enumerate them, use the IEnumerable.

Beware, though, that changing the original collection being enumerated is a dangerous operation - in this case, you will want to ToList first. This will create a new list element for each element in memory, enumerating the IEnumerable and is thus less performant if you only enumerate once - but safer and sometimes the List methods are handy (for instance in random access).


I will share one misused concept that I fell into one day:

var names = new List<string> {"mercedes", "mazda", "bmw", "fiat", "ferrari"};

var startingWith_M = names.Where(x => x.StartsWith("m"));

var startingWith_F = names.Where(x => x.StartsWith("f"));


// updating existing list
names[0] = "ford";

// Guess what should be printed before continuing
print( startingWith_M.ToList() );
print( startingWith_F.ToList() );

Expected result

// I was expecting    
print( startingWith_M.ToList() ); // mercedes, mazda
print( startingWith_F.ToList() ); // fiat, ferrari

Actual result

// what printed actualy   
print( startingWith_M.ToList() ); // mazda
print( startingWith_F.ToList() ); // ford, fiat, ferrari

Explanation

As per other answers, the evaluation of the result was deferred until calling ToList or similar invocation methods for example ToArray.

So I can rewrite the code in this case as:

var names = new List<string> {"mercedes", "mazda", "bmw", "fiat", "ferrari"};

// updating existing list
names[0] = "ford";

// before calling ToList directly
var startingWith_M = names.Where(x => x.StartsWith("m"));

var startingWith_F = names.Where(x => x.StartsWith("f"));

print( startingWith_M.ToList() );
print( startingWith_F.ToList() );

Play arround

https://repl.it/E8Ki/0


Nobody mentioned one crucial difference, ironically answered on a question closed as a duplicated of this.

IEnumerable is read-only and List is not.

See Practical difference between List and IEnumerable


There are many cases (such as an infinite list or a very large list) where IEnumerable cannot be transformed to a List. The most obvious examples are all the prime numbers, all the users of facebook with their details, or all the items on ebay.

The difference is that "List" objects are stored "right here and right now", whereas "IEnumerable" objects work "just one at a time". So if I am going through all the items on ebay, one at a time would be something even a small computer can handle, but ".ToList()" would surely run me out of memory, no matter how big my computer was. No computer can by itself contain and handle such a huge amount of data.

[Edit] - Needless to say - it's not "either this or that". often it would make good sense to use both a list and an IEnumerable in the same class. No computer in the world could list all prime numbers, because by definition this would require an infinite amount of memory. But you could easily think of a class PrimeContainer which contains an IEnumerable<long> primes, which for obvious reasons also contains a SortedList<long> _primes. all the primes calculated so far. the next prime to be checked would only be run against the existing primes (up to the square root). That way you gain both - primes one at a time (IEnumerable) and a good list of "primes so far", which is a pretty good approximation of the entire (infinite) list.


Examples related to c#

How can I convert this one line of ActionScript to C#? Microsoft Advertising SDK doesn't deliverer ads How to use a global array in C#? How to correctly write async method? C# - insert values from file into two arrays Uploading into folder in FTP? Are these methods thread safe? dotnet ef not found in .NET Core 3 HTTP Error 500.30 - ANCM In-Process Start Failure Best way to "push" into C# array

Examples related to linq

Async await in linq select How to resolve Value cannot be null. Parameter name: source in linq? What does Include() do in LINQ? Selecting multiple columns with linq query and lambda expression System.Collections.Generic.List does not contain a definition for 'Select' lambda expression join multiple tables with select and where clause LINQ select one field from list of DTO objects to array The model backing the 'ApplicationDbContext' context has changed since the database was created Check if two lists are equal Why is this error, 'Sequence contains no elements', happening?

Examples related to list

Convert List to Pandas Dataframe Column Python find elements in one list that are not in the other Sorting a list with stream.sorted() in Java Python Loop: List Index Out of Range How to combine two lists in R How do I multiply each element in a list by a number? Save a list to a .txt file The most efficient way to remove first N elements in a list? TypeError: list indices must be integers or slices, not str Parse JSON String into List<string>

Examples related to ienumerable

How to concatenate two IEnumerable<T> into a new IEnumerable<T>? Remove an item from an IEnumerable<T> collection Converting from IEnumerable to List Shorter syntax for casting from a List<X> to a List<Y>? filtering a list using LINQ How to check if IEnumerable is null or empty? Convert from List into IEnumerable format IEnumerable vs List - What to Use? How do they work? Cannot apply indexing with [] to an expression of type 'System.Collections.Generic.IEnumerable<> Convert DataTable to IEnumerable<T>