[c#] When should I use a List vs a LinkedList

When is it better to use a List vs a LinkedList?

This question is related to c# .net vb.net data-structures linked-list

The answer is


My previous answer was not enough accurate. As truly it was horrible :D But now I can post much more useful and correct answer.


I did some additional tests. You can find it's source by the following link and reCheck it on your environment by your own: https://github.com/ukushu/DataStructuresTestsAndOther.git

Short results:

  • Array need to use:

    • So often as possible. It's fast and takes smallest RAM range for same amount information.
    • If you know exact count of cells needed
    • If data saved in array < 85000 b (85000/32 = 2656 elements for integer data)
    • If needed high Random Access speed
  • List need to use:

    • If needed to add cells to the end of list (often)
    • If needed to add cells in the beginning/middle of the list (NOT OFTEN)
    • If data saved in array < 85000 b (85000/32 = 2656 elements for integer data)
    • If needed high Random Access speed
  • LinkedList need to use:

    • If needed to add cells in the beginning/middle/end of the list (often)
    • If needed only sequential access (forward/backward)
    • If you need to save LARGE items, but items count is low.
    • Better do not use for large amount of items, as it's use additional memory for links.

More details:

??????? ???? ???????? ??????????? Interesting to know:

  1. LinkedList<T> internally is not a List in .NET. It's even does not implement IList<T>. And that's why there are absent indexes and methods related to indexes.

  2. LinkedList<T> is node-pointer based collection. In .NET it's in doubly linked implementation. This means that prior/next elements have link to current element. And data is fragmented -- different list objects can be located in different places of RAM. Also there will be more memory used for LinkedList<T> than for List<T> or Array.

  3. List<T> in .Net is Java's alternative of ArrayList<T>. This means that this is array wrapper. So it's allocated in memory as one contiguous block of data. If allocated data size exceeds 85000 bytes, it will be moved to Large Object Heap. Depending on the size, this can lead to heap fragmentation(a mild form of memory leak). But in the same time if size < 85000 bytes -- this provides a very compact and fast-access representation in memory.

  4. Single contiguous block is preferred for random access performance and memory consumption but for collections that need to change size regularly a structure such as an Array generally need to be copied to a new location whereas a linked list only needs to manage the memory for the newly inserted/deleted nodes.


I do agree with most of the point made above. And I also agree that List looks like a more obvious choice in most of the cases.

But, I just want to add that there are many instance where LinkedList are far better choice than List for better efficiency.

  1. Suppose you are traversing through the elements and you want to perform lot of insertions/deletion; LinkedList does it in linear O(n) time, whereas List does it in quadratic O(n^2) time.
  2. Suppose you want to access bigger objects again and again, LinkedList become very more useful.
  3. Deque() and queue() are better implemented using LinkedList.
  4. Increasing the size of LinkedList is much easier and better once you are dealing with many and bigger objects.

Hope someone would find these comments useful.


The primary advantage of linked lists over arrays is that the links provide us with the capability to rearrange the items efficiently. Sedgewick, p. 91


The difference between List and LinkedList lies in their underlying implementation. List is array based collection (ArrayList). LinkedList is node-pointer based collection (LinkedListNode). On the API level usage, both of them are pretty much the same since both implement same set of interfaces such as ICollection, IEnumerable, etc.

The key difference comes when performance matter. For example, if you are implementing the list that has heavy "INSERT" operation, LinkedList outperforms List. Since LinkedList can do it in O(1) time, but List may need to expand the size of underlying array. For more information/detail you might want to read up on the algorithmic difference between LinkedList and array data structures. http://en.wikipedia.org/wiki/Linked_list and Array

Hope this help,


Thinking of a linked list as a list can be a bit misleading. It's more like a chain. In fact, in .NET, LinkedList<T> does not even implement IList<T>. There is no real concept of index in a linked list, even though it may seem there is. Certainly none of the methods provided on the class accept indexes.

Linked lists may be singly linked, or doubly linked. This refers to whether each element in the chain has a link only to the next one (singly linked) or to both the prior/next elements (doubly linked). LinkedList<T> is doubly linked.

Internally, List<T> is backed by an array. This provides a very compact representation in memory. Conversely, LinkedList<T> involves additional memory to store the bidirectional links between successive elements. So the memory footprint of a LinkedList<T> will generally be larger than for List<T> (with the caveat that List<T> can have unused internal array elements to improve performance during append operations.)

They have different performance characteristics too:

Append

  • LinkedList<T>.AddLast(item) constant time
  • List<T>.Add(item) amortized constant time, linear worst case

Prepend

  • LinkedList<T>.AddFirst(item) constant time
  • List<T>.Insert(0, item) linear time

Insertion

  • LinkedList<T>.AddBefore(node, item) constant time
  • LinkedList<T>.AddAfter(node, item) constant time
  • List<T>.Insert(index, item) linear time

Removal

  • LinkedList<T>.Remove(item) linear time
  • LinkedList<T>.Remove(node) constant time
  • List<T>.Remove(item) linear time
  • List<T>.RemoveAt(index) linear time

Count

  • LinkedList<T>.Count constant time
  • List<T>.Count constant time

Contains

  • LinkedList<T>.Contains(item) linear time
  • List<T>.Contains(item) linear time

Clear

  • LinkedList<T>.Clear() linear time
  • List<T>.Clear() linear time

As you can see, they're mostly equivalent. In practice, the API of LinkedList<T> is more cumbersome to use, and details of its internal needs spill out into your code.

However, if you need to do many insertions/removals from within a list, it offers constant time. List<T> offers linear time, as extra items in the list must be shuffled around after the insertion/removal.


Use LinkedList<> when

  1. You don't know how many objects are coming through the flood gate. For example, Token Stream.
  2. When you ONLY wanted to delete\insert at the ends.

For everything else, it is better to use List<>.


In most cases, List<T> is more useful. LinkedList<T> will have less cost when adding/removing items in the middle of the list, whereas List<T> can only cheaply add/remove at the end of the list.

LinkedList<T> is only at it's most efficient if you are accessing sequential data (either forwards or backwards) - random access is relatively expensive since it must walk the chain each time (hence why it doesn't have an indexer). However, because a List<T> is essentially just an array (with a wrapper) random access is fine.

List<T> also offers a lot of support methods - Find, ToArray, etc; however, these are also available for LinkedList<T> with .NET 3.5/C# 3.0 via extension methods - so that is less of a factor.


Essentially, a List<> in .NET is a wrapper over an array. A LinkedList<> is a linked list. So the question comes down to, what is the difference between an array and a linked list, and when should an array be used instead of a linked list. Probably the two most important factors in your decision of which to use would come down to:

  • Linked lists have much better insertion/removal performance, so long as the insertions/removals are not on the last element in the collection. This is because an array must shift all remaining elements that come after the insertion/removal point. If the insertion/removal is at the tail end of the list however, this shift is not needed (although the array may need to be resized, if its capacity is exceeded).
  • Arrays have much better accessing capabilities. Arrays can be indexed into directly (in constant time). Linked lists must be traversed (linear time).

A common circumstance to use LinkedList is like this:

Suppose you want to remove many certain strings from a list of strings with a large size, say 100,000. The strings to remove can be looked up in HashSet dic, and the list of strings is believed to contain between 30,000 to 60,000 such strings to remove.

Then what's the best type of List for storing the 100,000 Strings? The answer is LinkedList. If the they are stored in an ArrayList, then iterating over it and removing matched Strings whould take up to billions of operations, while it takes just around 100,000 operations by using an iterator and the remove() method.

LinkedList<String> strings = readStrings();
HashSet<String> dic = readDic();
Iterator<String> iterator = strings.iterator();
while (iterator.hasNext()){
    String string = iterator.next();
    if (dic.contains(string))
    iterator.remove();
}

This is adapted from Tono Nam's accepted answer correcting a few wrong measurements in it.

The test:

static void Main()
{
    LinkedListPerformance.AddFirst_List(); // 12028 ms
    LinkedListPerformance.AddFirst_LinkedList(); // 33 ms

    LinkedListPerformance.AddLast_List(); // 33 ms
    LinkedListPerformance.AddLast_LinkedList(); // 32 ms

    LinkedListPerformance.Enumerate_List(); // 1.08 ms
    LinkedListPerformance.Enumerate_LinkedList(); // 3.4 ms

    //I tried below as fun exercise - not very meaningful, see code
    //sort of equivalent to insertion when having the reference to middle node

    LinkedListPerformance.AddMiddle_List(); // 5724 ms
    LinkedListPerformance.AddMiddle_LinkedList1(); // 36 ms
    LinkedListPerformance.AddMiddle_LinkedList2(); // 32 ms
    LinkedListPerformance.AddMiddle_LinkedList3(); // 454 ms

    Environment.Exit(-1);
}

And the code:

using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

namespace stackoverflow
{
    static class LinkedListPerformance
    {
        class Temp
        {
            public decimal A, B, C, D;

            public Temp(decimal a, decimal b, decimal c, decimal d)
            {
                A = a; B = b; C = c; D = d;
            }
        }



        static readonly int start = 0;
        static readonly int end = 123456;
        static readonly IEnumerable<Temp> query = Enumerable.Range(start, end - start).Select(temp);

        static Temp temp(int i)
        {
            return new Temp(i, i, i, i);
        }

        static void StopAndPrint(this Stopwatch watch)
        {
            watch.Stop();
            Console.WriteLine(watch.Elapsed.TotalMilliseconds);
        }

        public static void AddFirst_List()
        {
            var list = new List<Temp>();
            var watch = Stopwatch.StartNew();

            for (var i = start; i < end; i++)
                list.Insert(0, temp(i));

            watch.StopAndPrint();
        }

        public static void AddFirst_LinkedList()
        {
            var list = new LinkedList<Temp>();
            var watch = Stopwatch.StartNew();

            for (int i = start; i < end; i++)
                list.AddFirst(temp(i));

            watch.StopAndPrint();
        }

        public static void AddLast_List()
        {
            var list = new List<Temp>();
            var watch = Stopwatch.StartNew();

            for (var i = start; i < end; i++)
                list.Add(temp(i));

            watch.StopAndPrint();
        }

        public static void AddLast_LinkedList()
        {
            var list = new LinkedList<Temp>();
            var watch = Stopwatch.StartNew();

            for (int i = start; i < end; i++)
                list.AddLast(temp(i));

            watch.StopAndPrint();
        }

        public static void Enumerate_List()
        {
            var list = new List<Temp>(query);
            var watch = Stopwatch.StartNew();

            foreach (var item in list)
            {

            }

            watch.StopAndPrint();
        }

        public static void Enumerate_LinkedList()
        {
            var list = new LinkedList<Temp>(query);
            var watch = Stopwatch.StartNew();

            foreach (var item in list)
            {

            }

            watch.StopAndPrint();
        }

        //for the fun of it, I tried to time inserting to the middle of 
        //linked list - this is by no means a realistic scenario! or may be 
        //these make sense if you assume you have the reference to middle node

        //insertion to the middle of list
        public static void AddMiddle_List()
        {
            var list = new List<Temp>();
            var watch = Stopwatch.StartNew();

            for (var i = start; i < end; i++)
                list.Insert(list.Count / 2, temp(i));

            watch.StopAndPrint();
        }

        //insertion in linked list in such a fashion that 
        //it has the same effect as inserting into the middle of list
        public static void AddMiddle_LinkedList1()
        {
            var list = new LinkedList<Temp>();
            var watch = Stopwatch.StartNew();

            LinkedListNode<Temp> evenNode = null, oddNode = null;
            for (int i = start; i < end; i++)
            {
                if (list.Count == 0)
                    oddNode = evenNode = list.AddLast(temp(i));
                else
                    if (list.Count % 2 == 1)
                        oddNode = list.AddBefore(evenNode, temp(i));
                    else
                        evenNode = list.AddAfter(oddNode, temp(i));
            }

            watch.StopAndPrint();
        }

        //another hacky way
        public static void AddMiddle_LinkedList2()
        {
            var list = new LinkedList<Temp>();
            var watch = Stopwatch.StartNew();

            for (var i = start + 1; i < end; i += 2)
                list.AddLast(temp(i));
            for (int i = end - 2; i >= 0; i -= 2)
                list.AddLast(temp(i));

            watch.StopAndPrint();
        }

        //OP's original more sensible approach, but I tried to filter out
        //the intermediate iteration cost in finding the middle node.
        public static void AddMiddle_LinkedList3()
        {
            var list = new LinkedList<Temp>();
            var watch = Stopwatch.StartNew();

            for (var i = start; i < end; i++)
            {
                if (list.Count == 0)
                    list.AddLast(temp(i));
                else
                {
                    watch.Stop();
                    var curNode = list.First;
                    for (var j = 0; j < list.Count / 2; j++)
                        curNode = curNode.Next;
                    watch.Start();

                    list.AddBefore(curNode, temp(i));
                }
            }

            watch.StopAndPrint();
        }
    }
}

You can see the results are in accordance with theoretical performance others have documented here. Quite clear - LinkedList<T> gains big time in case of insertions. I haven't tested for removal from the middle of list, but the result should be the same. Of course List<T> has other areas where it performs way better like O(1) random access.


So many average answers here...

Some linked list implementations use underlying blocks of pre allocated nodes. If they don't do this than constant time / linear time is less relevant as memory performance will be poor and cache performance even worse.

Use linked lists when

1) You want thread safety. You can build better thread safe algos. Locking costs will dominate a concurrent style list.

2) If you have a large queue like structures and want to remove or add anywhere but the end all the time . >100K lists exists but are not that common.


I asked a similar question related to performance of the LinkedList collection, and discovered Steven Cleary's C# implement of Deque was a solution. Unlike the Queue collection, Deque allows moving items on/off front and back. It is similar to linked list, but with improved performance.


When you need built-in indexed access, sorting (and after this binary searching), and "ToArray()" method, you should use List.


Linked lists provide very fast insertion or deletion of a list member. Each member in a linked list contains a pointer to the next member in the list so to insert a member at position i:

  • update the pointer in member i-1 to point to the new member
  • set the pointer in the new member to point to member i

The disadvantage to a linked list is that random access is not possible. Accessing a member requires traversing the list until the desired member is found.


Examples related to c#

How can I convert this one line of ActionScript to C#? Microsoft Advertising SDK doesn't deliverer ads How to use a global array in C#? How to correctly write async method? C# - insert values from file into two arrays Uploading into folder in FTP? Are these methods thread safe? dotnet ef not found in .NET Core 3 HTTP Error 500.30 - ANCM In-Process Start Failure Best way to "push" into C# array

Examples related to .net

You must add a reference to assembly 'netstandard, Version=2.0.0.0 How to use Bootstrap 4 in ASP.NET Core No authenticationScheme was specified, and there was no DefaultChallengeScheme found with default authentification and custom authorization .net Core 2.0 - Package was restored using .NetFramework 4.6.1 instead of target framework .netCore 2.0. The package may not be fully compatible Update .NET web service to use TLS 1.2 EF Core add-migration Build Failed What is the difference between .NET Core and .NET Standard Class Library project types? Visual Studio 2017 - Could not load file or assembly 'System.Runtime, Version=4.1.0.0' or one of its dependencies Nuget connection attempt failed "Unable to load the service index for source" Token based authentication in Web API without any user interface

Examples related to vb.net

How to get parameter value for date/time column from empty MaskedTextBox HTTP 415 unsupported media type error when calling Web API 2 endpoint variable is not declared it may be inaccessible due to its protection level Differences Between vbLf, vbCrLf & vbCr Constants Simple working Example of json.net in VB.net How to open up a form from another form in VB.NET? Delete a row in DataGridView Control in VB.NET How to get cell value from DataGridView in VB.Net? Set default format of datetimepicker as dd-MM-yyyy How to configure SMTP settings in web.config

Examples related to data-structures

Program to find largest and second largest number in array golang why don't we have a set datastructure How to initialize a vector with fixed length in R C compiling - "undefined reference to"? List of all unique characters in a string? Binary Search Tree - Java Implementation How to clone object in C++ ? Or Is there another solution? How to check queue length in Python Difference between "Complete binary tree", "strict binary tree","full binary Tree"? Write code to convert given number into words (eg 1234 as input should output one thousand two hundred and thirty four)

Examples related to linked-list

Creating a node class in Java Simple linked list in C++ Insert node at a certain position in a linked list C++ Printing out a linked list using toString How to work with string fields in a C struct? JTable - Selected Row click event When to use HashMap over LinkedList or ArrayList and vice-versa C: How to free nodes in the linked list? Java how to sort a Linked List? C linked list inserting node at the end