[c#] Remove duplicates in the list using linq

I have a class Items with properties (Id, Name, Code, Price).

The List of Items is populated with duplicated items.

For ex.:

1         Item1       IT00001        $100
2         Item2       IT00002        $200
3         Item3       IT00003        $150
1         Item1       IT00001        $100
3         Item3       IT00003        $150

How to remove the duplicates in the list using linq?

This question is related to c# linq linq-to-objects generic-list

The answer is


When you don't want to write IEqualityComparer you can try something like following.

 class Program
{

    private static void Main(string[] args)
    {

        var items = new List<Item>();
        items.Add(new Item {Id = 1, Name = "Item1"});
        items.Add(new Item {Id = 2, Name = "Item2"});
        items.Add(new Item {Id = 3, Name = "Item3"});

        //Duplicate item
        items.Add(new Item {Id = 4, Name = "Item4"});
        //Duplicate item
        items.Add(new Item {Id = 2, Name = "Item2"});

        items.Add(new Item {Id = 3, Name = "Item3"});

        var res = items.Select(i => new {i.Id, i.Name})
            .Distinct().Select(x => new Item {Id = x.Id, Name = x.Name}).ToList();

        // now res contains distinct records
    }



}


public class Item
{
    public int Id { get; set; }

    public string Name { get; set; }
}

Use Distinct() but keep in mind that it uses the default equality comparer to compare values, so if you want anything beyond that you need to implement your own comparer.

Please see http://msdn.microsoft.com/en-us/library/bb348436.aspx for an example.


This is how I was able to group by with Linq. Hope it helps.

var query = collection.GroupBy(x => x.title).Select(y => y.FirstOrDefault());

List<Employee> employees = new List<Employee>()
{
    new Employee{Id =1,Name="AAAAA"}
    , new Employee{Id =2,Name="BBBBB"}
    , new Employee{Id =3,Name="AAAAA"}
    , new Employee{Id =4,Name="CCCCC"}
    , new Employee{Id =5,Name="AAAAA"}
};

List<Employee> duplicateEmployees = employees.Except(employees.GroupBy(i => i.Name)
                                             .Select(ss => ss.FirstOrDefault()))
                                            .ToList();

An universal extension method:

public static class EnumerableExtensions
{
    public static IEnumerable<T> DistinctBy<T, TKey>(this IEnumerable<T> enumerable, Func<T, TKey> keySelector)
    {
        return enumerable.GroupBy(keySelector).Select(grp => grp.First());
    }
}

Example of usage:

var lstDst = lst.DistinctBy(item => item.Key);

If there is something that is throwing off your Distinct query, you might want to look at MoreLinq and use the DistinctBy operator and select distinct objects by id.

var distinct = items.DistinctBy( i => i.Id );

Try this extension method out. Hopefully this could help.

public static class DistinctHelper
{
    public static IEnumerable<TSource> DistinctBy<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        var identifiedKeys = new HashSet<TKey>();
        return source.Where(element => identifiedKeys.Add(keySelector(element)));
    }
}

Usage:

var outputList = sourceList.DistinctBy(x => x.TargetProperty);

Another workaround, not beautiful buy workable.

I have an XML file with an element called "MEMDES" with two attribute as "GRADE" and "SPD" to record the RAM module information. There are lot of dupelicate items in SPD.

So here is the code I use to remove the dupelicated items:

        IEnumerable<XElement> MList =
            from RAMList in PREF.Descendants("MEMDES")
            where (string)RAMList.Attribute("GRADE") == "DDR4"
            select RAMList;

        List<string> sellist = new List<string>();

        foreach (var MEMList in MList)
        {
            sellist.Add((string)MEMList.Attribute("SPD").Value);
        }

        foreach (string slist in sellist.Distinct())
        {
            comboBox1.Items.Add(slist);
        }

var distinctItems = items.GroupBy(x => x.Id).Select(y => y.First());

You have three option here for removing duplicate item in your List:

  1. Use a a custom equality comparer and then use Distinct(new DistinctItemComparer()) as @Christian Hayter mentioned.
  2. Use GroupBy, but please note in GroupBy you should Group by all of the columns because if you just group by Id it doesn't remove duplicate items always. For example consider the following example:

    List<Item> a = new List<Item>
    {
        new Item {Id = 1, Name = "Item1", Code = "IT00001", Price = 100},
        new Item {Id = 2, Name = "Item2", Code = "IT00002", Price = 200},
        new Item {Id = 3, Name = "Item3", Code = "IT00003", Price = 150},
        new Item {Id = 1, Name = "Item1", Code = "IT00001", Price = 100},
        new Item {Id = 3, Name = "Item3", Code = "IT00003", Price = 150},
        new Item {Id = 3, Name = "Item3", Code = "IT00004", Price = 250}
    };
    var distinctItems = a.GroupBy(x => x.Id).Select(y => y.First());
    

    The result for this grouping will be:

    {Id = 1, Name = "Item1", Code = "IT00001", Price = 100}
    {Id = 2, Name = "Item2", Code = "IT00002", Price = 200}
    {Id = 3, Name = "Item3", Code = "IT00003", Price = 150}
    

    Which is incorrect because it considers {Id = 3, Name = "Item3", Code = "IT00004", Price = 250} as duplicate. So the correct query would be:

    var distinctItems = a.GroupBy(c => new { c.Id , c.Name , c.Code , c.Price})
                         .Select(c => c.First()).ToList();
    

    3.Override Equal and GetHashCode in item class:

    public class Item
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public string Code { get; set; }
        public int Price { get; set; }
    
        public override bool Equals(object obj)
        {
            if (!(obj is Item))
                return false;
            Item p = (Item)obj;
            return (p.Id == Id && p.Name == Name && p.Code == Code && p.Price == Price);
        }
        public override int GetHashCode()
        {
            return String.Format("{0}|{1}|{2}|{3}", Id, Name, Code, Price).GetHashCode();
        }
    }
    

    Then you can use it like this:

    var distinctItems = a.Distinct();
    

Examples related to c#

How can I convert this one line of ActionScript to C#? Microsoft Advertising SDK doesn't deliverer ads How to use a global array in C#? How to correctly write async method? C# - insert values from file into two arrays Uploading into folder in FTP? Are these methods thread safe? dotnet ef not found in .NET Core 3 HTTP Error 500.30 - ANCM In-Process Start Failure Best way to "push" into C# array

Examples related to linq

Async await in linq select How to resolve Value cannot be null. Parameter name: source in linq? What does Include() do in LINQ? Selecting multiple columns with linq query and lambda expression System.Collections.Generic.List does not contain a definition for 'Select' lambda expression join multiple tables with select and where clause LINQ select one field from list of DTO objects to array The model backing the 'ApplicationDbContext' context has changed since the database was created Check if two lists are equal Why is this error, 'Sequence contains no elements', happening?

Examples related to linq-to-objects

Using LINQ to group a list of objects Linq select objects in list where exists IN (A,B,C) Find() vs. Where().FirstOrDefault() how to query LIST using linq How can I get LINQ to return the object which has the max value for a given property? Remove duplicates in the list using linq Sorting a list using Lambda/Linq to objects Filtering lists using LINQ Dynamic LINQ OrderBy on IEnumerable<T> / IQueryable<T>

Examples related to generic-list

Best way to update an element in a generic List Convert an enum to List<string> Fastest way to Remove Duplicate Value from a list<> by lambda c# foreach (property in object)... Is there a simple way of doing this? Editing an item in a list<T> Remove duplicates in the list using linq How can I easily convert DataReader to List<T>? C# generic list <T> how to get the type of T? How to add item to the beginning of List<T>? Randomize a List<T>