[c#] LINQ's Distinct() on a particular property

I am playing with LINQ to learn about it, but I can't figure out how to use Distinct when I do not have a simple list (a simple list of integers is pretty easy to do, this is not the question). What I if want to use Distinct on a list of an Object on one or more properties of the object?

Example: If an object is Person, with Property Id. How can I get all Person and use Distinct on them with the property Id of the object?

Person1: Id=1, Name="Test1"
Person2: Id=1, Name="Test1"
Person3: Id=2, Name="Test2"

How can I get just Person1 and Person3? Is that possible?

If it's not possible with LINQ, what would be the best way to have a list of Person depending on some of its properties in .NET 3.5?

This question is related to c# linq .net-3.5 distinct

The answer is


What if I want to obtain a distinct list based on one or more properties?

Simple! You want to group them and pick a winner out of the group.

List<Person> distinctPeople = allPeople
  .GroupBy(p => p.PersonId)
  .Select(g => g.First())
  .ToList();

If you want to define groups on multiple properties, here's how:

List<Person> distinctPeople = allPeople
  .GroupBy(p => new {p.PersonId, p.FavoriteColor} )
  .Select(g => g.First())
  .ToList();

Personally I use the following class:

public class LambdaEqualityComparer<TSource, TDest> : 
    IEqualityComparer<TSource>
{
    private Func<TSource, TDest> _selector;

    public LambdaEqualityComparer(Func<TSource, TDest> selector)
    {
        _selector = selector;
    }

    public bool Equals(TSource obj, TSource other)
    {
        return _selector(obj).Equals(_selector(other));
    }

    public int GetHashCode(TSource obj)
    {
        return _selector(obj).GetHashCode();
    }
}

Then, an extension method:

public static IEnumerable<TSource> Distinct<TSource, TCompare>(
    this IEnumerable<TSource> source, Func<TSource, TCompare> selector)
{
    return source.Distinct(new LambdaEqualityComparer<TSource, TCompare>(selector));
}

Finally, the intended usage:

var dates = new List<DateTime>() { /* ... */ }
var distinctYears = dates.Distinct(date => date.Year);

The advantage I found using this approach is the re-usage of LambdaEqualityComparer class for other methods that accept an IEqualityComparer. (Oh, and I leave the yield stuff to the original LINQ implementation...)


When we faced such a task in our project we defined a small API to compose comparators.

So, the use case was like this:

var wordComparer = KeyEqualityComparer.Null<Word>().
    ThenBy(item => item.Text).
    ThenBy(item => item.LangID);
...
source.Select(...).Distinct(wordComparer);

And API itself looks like this:

using System;
using System.Collections;
using System.Collections.Generic;

public static class KeyEqualityComparer
{
    public static IEqualityComparer<T> Null<T>()
    {
        return null;
    }

    public static IEqualityComparer<T> EqualityComparerBy<T, K>(
        this IEnumerable<T> source,
        Func<T, K> keyFunc)
    {
        return new KeyEqualityComparer<T, K>(keyFunc);
    }

    public static KeyEqualityComparer<T, K> ThenBy<T, K>(
        this IEqualityComparer<T> equalityComparer,
        Func<T, K> keyFunc)
    {
        return new KeyEqualityComparer<T, K>(keyFunc, equalityComparer);
    }
}

public struct KeyEqualityComparer<T, K>: IEqualityComparer<T>
{
    public KeyEqualityComparer(
        Func<T, K> keyFunc,
        IEqualityComparer<T> equalityComparer = null)
    {
        KeyFunc = keyFunc;
        EqualityComparer = equalityComparer;
    }

    public bool Equals(T x, T y)
    {
        return ((EqualityComparer == null) || EqualityComparer.Equals(x, y)) &&
                EqualityComparer<K>.Default.Equals(KeyFunc(x), KeyFunc(y));
    }

    public int GetHashCode(T obj)
    {
        var hash = EqualityComparer<K>.Default.GetHashCode(KeyFunc(obj));

        if (EqualityComparer != null)
        {
            var hash2 = EqualityComparer.GetHashCode(obj);

            hash ^= (hash2 << 5) + hash2;
        }

        return hash;
    }

    public readonly Func<T, K> KeyFunc;
    public readonly IEqualityComparer<T> EqualityComparer;
}

More details is on our site: IEqualityComparer in LINQ.


Personally I use the following class:

public class LambdaEqualityComparer<TSource, TDest> : 
    IEqualityComparer<TSource>
{
    private Func<TSource, TDest> _selector;

    public LambdaEqualityComparer(Func<TSource, TDest> selector)
    {
        _selector = selector;
    }

    public bool Equals(TSource obj, TSource other)
    {
        return _selector(obj).Equals(_selector(other));
    }

    public int GetHashCode(TSource obj)
    {
        return _selector(obj).GetHashCode();
    }
}

Then, an extension method:

public static IEnumerable<TSource> Distinct<TSource, TCompare>(
    this IEnumerable<TSource> source, Func<TSource, TCompare> selector)
{
    return source.Distinct(new LambdaEqualityComparer<TSource, TCompare>(selector));
}

Finally, the intended usage:

var dates = new List<DateTime>() { /* ... */ }
var distinctYears = dates.Distinct(date => date.Year);

The advantage I found using this approach is the re-usage of LambdaEqualityComparer class for other methods that accept an IEqualityComparer. (Oh, and I leave the yield stuff to the original LINQ implementation...)


In case you need a Distinct method on multiple properties, you can check out my PowerfulExtensions library. Currently it's in a very young stage, but already you can use methods like Distinct, Union, Intersect, Except on any number of properties;

This is how you use it:

using PowerfulExtensions.Linq;
...
var distinct = myArray.Distinct(x => x.A, x => x.B);

Override Equals(object obj) and GetHashCode() methods:

class Person
{
    public int Id { get; set; }
    public int Name { get; set; }

    public override bool Equals(object obj)
    {
        return ((Person)obj).Id == Id;
        // or: 
        // var o = (Person)obj;
        // return o.Id == Id && o.Name == Name;
    }
    public override int GetHashCode()
    {
        return Id.GetHashCode();
    }
}

and then just call:

List<Person> distinctList = new[] { person1, person2, person3 }.Distinct().ToList();

I think it is enough:

list.Select(s => s.MyField).Distinct();

You could also use query syntax if you want it to look all LINQ-like:

var uniquePeople = from p in people
                   group p by new {p.ID} //or group by new {p.ID, p.Name, p.Whatever}
                   into mygroup
                   select mygroup.FirstOrDefault();

What if I want to obtain a distinct list based on one or more properties?

Simple! You want to group them and pick a winner out of the group.

List<Person> distinctPeople = allPeople
  .GroupBy(p => p.PersonId)
  .Select(g => g.First())
  .ToList();

If you want to define groups on multiple properties, here's how:

List<Person> distinctPeople = allPeople
  .GroupBy(p => new {p.PersonId, p.FavoriteColor} )
  .Select(g => g.First())
  .ToList();

The best way to do this that will be compatible with other .NET versions is to override Equals and GetHash to handle this (see Stack Overflow question This code returns distinct values. However, what I want is to return a strongly typed collection as opposed to an anonymous type), but if you need something that is generic throughout your code, the solutions in this article are great.


Solution first group by your fields then select firstordefault item.

    List<Person> distinctPeople = allPeople
   .GroupBy(p => p.PersonId)
   .Select(g => g.FirstOrDefault())
   .ToList();

You can use DistinctBy() for getting Distinct records by an object property. Just add the following statement before using it:

using Microsoft.Ajax.Utilities;

and then use it like following:

var listToReturn = responseList.DistinctBy(x => x.Index).ToList();

where 'Index' is the property on which i want the data to be distinct.


You should be able to override Equals on person to actually do Equals on Person.id. This ought to result in the behavior you're after.


You can do it (albeit not lightning-quickly) like so:

people.Where(p => !people.Any(q => (p != q && p.Id == q.Id)));

That is, "select all people where there isn't another different person in the list with the same ID."

Mind you, in your example, that would just select person 3. I'm not sure how to tell which you want, out of the previous two.


I've written an article that explains how to extend the Distinct function so that you can do as follows:

var people = new List<Person>();

people.Add(new Person(1, "a", "b"));
people.Add(new Person(2, "c", "d"));
people.Add(new Person(1, "a", "b"));

foreach (var person in people.Distinct(p => p.ID))
    // Do stuff with unique list here.

Here's the article (now in the Web Archive): Extending LINQ - Specifying a Property in the Distinct Function


The following code is functionally equivalent to Jon Skeet's answer.

Tested on .NET 4.5, should work on any earlier version of LINQ.

public static IEnumerable<TSource> DistinctBy<TSource, TKey>(
  this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
  HashSet<TKey> seenKeys = new HashSet<TKey>();
  return source.Where(element => seenKeys.Add(keySelector(element)));
}

Incidentially, check out Jon Skeet's latest version of DistinctBy.cs on Google Code.


You can do this with the standard Linq.ToLookup(). This will create a collection of values for each unique key. Just select the first item in the collection

Persons.ToLookup(p => p.Id).Select(coll => coll.First());

You can use DistinctBy() for getting Distinct records by an object property. Just add the following statement before using it:

using Microsoft.Ajax.Utilities;

and then use it like following:

var listToReturn = responseList.DistinctBy(x => x.Index).ToList();

where 'Index' is the property on which i want the data to be distinct.


You should be able to override Equals on person to actually do Equals on Person.id. This ought to result in the behavior you're after.


You can do it (albeit not lightning-quickly) like so:

people.Where(p => !people.Any(q => (p != q && p.Id == q.Id)));

That is, "select all people where there isn't another different person in the list with the same ID."

Mind you, in your example, that would just select person 3. I'm not sure how to tell which you want, out of the previous two.


You should be able to override Equals on person to actually do Equals on Person.id. This ought to result in the behavior you're after.


You can do it (albeit not lightning-quickly) like so:

people.Where(p => !people.Any(q => (p != q && p.Id == q.Id)));

That is, "select all people where there isn't another different person in the list with the same ID."

Mind you, in your example, that would just select person 3. I'm not sure how to tell which you want, out of the previous two.


The best way to do this that will be compatible with other .NET versions is to override Equals and GetHash to handle this (see Stack Overflow question This code returns distinct values. However, what I want is to return a strongly typed collection as opposed to an anonymous type), but if you need something that is generic throughout your code, the solutions in this article are great.


Solution first group by your fields then select firstordefault item.

    List<Person> distinctPeople = allPeople
   .GroupBy(p => p.PersonId)
   .Select(g => g.FirstOrDefault())
   .ToList();

You can do this with the standard Linq.ToLookup(). This will create a collection of values for each unique key. Just select the first item in the collection

Persons.ToLookup(p => p.Id).Select(coll => coll.First());

List<Person>lst=new List<Person>
        var result1 = lst.OrderByDescending(a => a.ID).Select(a =>new Player {ID=a.ID,Name=a.Name} ).Distinct();

What if I want to obtain a distinct list based on one or more properties?

Simple! You want to group them and pick a winner out of the group.

List<Person> distinctPeople = allPeople
  .GroupBy(p => p.PersonId)
  .Select(g => g.First())
  .ToList();

If you want to define groups on multiple properties, here's how:

List<Person> distinctPeople = allPeople
  .GroupBy(p => new {p.PersonId, p.FavoriteColor} )
  .Select(g => g.First())
  .ToList();

You could also use query syntax if you want it to look all LINQ-like:

var uniquePeople = from p in people
                   group p by new {p.ID} //or group by new {p.ID, p.Name, p.Whatever}
                   into mygroup
                   select mygroup.FirstOrDefault();

You can do it (albeit not lightning-quickly) like so:

people.Where(p => !people.Any(q => (p != q && p.Id == q.Id)));

That is, "select all people where there isn't another different person in the list with the same ID."

Mind you, in your example, that would just select person 3. I'm not sure how to tell which you want, out of the previous two.


If you don't want to add the MoreLinq library to your project just to get the DistinctBy functionality then you can get the same end result using the overload of Linq's Distinct method that takes in an IEqualityComparer argument.

You begin by creating a generic custom equality comparer class that uses lambda syntax to perform custom comparison of two instances of a generic class:

public class CustomEqualityComparer<T> : IEqualityComparer<T>
{
    Func<T, T, bool> _comparison;
    Func<T, int> _hashCodeFactory;

    public CustomEqualityComparer(Func<T, T, bool> comparison, Func<T, int> hashCodeFactory)
    {
        _comparison = comparison;
        _hashCodeFactory = hashCodeFactory;
    }

    public bool Equals(T x, T y)
    {
        return _comparison(x, y);
    }

    public int GetHashCode(T obj)
    {
        return _hashCodeFactory(obj);
    }
}

Then in your main code you use it like so:

Func<Person, Person, bool> areEqual = (p1, p2) => int.Equals(p1.Id, p2.Id);

Func<Person, int> getHashCode = (p) => p.Id.GetHashCode();

var query = people.Distinct(new CustomEqualityComparer<Person>(areEqual, getHashCode));

Voila! :)

The above assumes the following:

  • Property Person.Id is of type int
  • The people collection does not contain any null elements

If the collection could contain nulls then simply rewrite the lambdas to check for null, e.g.:

Func<Person, Person, bool> areEqual = (p1, p2) => 
{
    return (p1 != null && p2 != null) ? int.Equals(p1.Id, p2.Id) : false;
};

EDIT

This approach is similar to the one in Vladimir Nesterovsky's answer but simpler.

It is also similar to the one in Joel's answer but allows for complex comparison logic involving multiple properties.

However, if your objects can only ever differ by Id then another user gave the correct answer that all you need to do is override the default implementations of GetHashCode() and Equals() in your Person class and then just use the out-of-the-box Distinct() method of Linq to filter out any duplicates.


List<Person>lst=new List<Person>
        var result1 = lst.OrderByDescending(a => a.ID).Select(a =>new Player {ID=a.ID,Name=a.Name} ).Distinct();

Please give a try with below code.

var Item = GetAll().GroupBy(x => x .Id).ToList();

What if I want to obtain a distinct list based on one or more properties?

Simple! You want to group them and pick a winner out of the group.

List<Person> distinctPeople = allPeople
  .GroupBy(p => p.PersonId)
  .Select(g => g.First())
  .ToList();

If you want to define groups on multiple properties, here's how:

List<Person> distinctPeople = allPeople
  .GroupBy(p => new {p.PersonId, p.FavoriteColor} )
  .Select(g => g.First())
  .ToList();

Use:

List<Person> pList = new List<Person>();
/* Fill list */

var result = pList.Where(p => p.Name != null).GroupBy(p => p.Id).Select(grp => grp.FirstOrDefault());

The where helps you filter the entries (could be more complex) and the groupby and select perform the distinct function.


If you don't want to add the MoreLinq library to your project just to get the DistinctBy functionality then you can get the same end result using the overload of Linq's Distinct method that takes in an IEqualityComparer argument.

You begin by creating a generic custom equality comparer class that uses lambda syntax to perform custom comparison of two instances of a generic class:

public class CustomEqualityComparer<T> : IEqualityComparer<T>
{
    Func<T, T, bool> _comparison;
    Func<T, int> _hashCodeFactory;

    public CustomEqualityComparer(Func<T, T, bool> comparison, Func<T, int> hashCodeFactory)
    {
        _comparison = comparison;
        _hashCodeFactory = hashCodeFactory;
    }

    public bool Equals(T x, T y)
    {
        return _comparison(x, y);
    }

    public int GetHashCode(T obj)
    {
        return _hashCodeFactory(obj);
    }
}

Then in your main code you use it like so:

Func<Person, Person, bool> areEqual = (p1, p2) => int.Equals(p1.Id, p2.Id);

Func<Person, int> getHashCode = (p) => p.Id.GetHashCode();

var query = people.Distinct(new CustomEqualityComparer<Person>(areEqual, getHashCode));

Voila! :)

The above assumes the following:

  • Property Person.Id is of type int
  • The people collection does not contain any null elements

If the collection could contain nulls then simply rewrite the lambdas to check for null, e.g.:

Func<Person, Person, bool> areEqual = (p1, p2) => 
{
    return (p1 != null && p2 != null) ? int.Equals(p1.Id, p2.Id) : false;
};

EDIT

This approach is similar to the one in Vladimir Nesterovsky's answer but simpler.

It is also similar to the one in Joel's answer but allows for complex comparison logic involving multiple properties.

However, if your objects can only ever differ by Id then another user gave the correct answer that all you need to do is override the default implementations of GetHashCode() and Equals() in your Person class and then just use the out-of-the-box Distinct() method of Linq to filter out any duplicates.


When we faced such a task in our project we defined a small API to compose comparators.

So, the use case was like this:

var wordComparer = KeyEqualityComparer.Null<Word>().
    ThenBy(item => item.Text).
    ThenBy(item => item.LangID);
...
source.Select(...).Distinct(wordComparer);

And API itself looks like this:

using System;
using System.Collections;
using System.Collections.Generic;

public static class KeyEqualityComparer
{
    public static IEqualityComparer<T> Null<T>()
    {
        return null;
    }

    public static IEqualityComparer<T> EqualityComparerBy<T, K>(
        this IEnumerable<T> source,
        Func<T, K> keyFunc)
    {
        return new KeyEqualityComparer<T, K>(keyFunc);
    }

    public static KeyEqualityComparer<T, K> ThenBy<T, K>(
        this IEqualityComparer<T> equalityComparer,
        Func<T, K> keyFunc)
    {
        return new KeyEqualityComparer<T, K>(keyFunc, equalityComparer);
    }
}

public struct KeyEqualityComparer<T, K>: IEqualityComparer<T>
{
    public KeyEqualityComparer(
        Func<T, K> keyFunc,
        IEqualityComparer<T> equalityComparer = null)
    {
        KeyFunc = keyFunc;
        EqualityComparer = equalityComparer;
    }

    public bool Equals(T x, T y)
    {
        return ((EqualityComparer == null) || EqualityComparer.Equals(x, y)) &&
                EqualityComparer<K>.Default.Equals(KeyFunc(x), KeyFunc(y));
    }

    public int GetHashCode(T obj)
    {
        var hash = EqualityComparer<K>.Default.GetHashCode(KeyFunc(obj));

        if (EqualityComparer != null)
        {
            var hash2 = EqualityComparer.GetHashCode(obj);

            hash ^= (hash2 << 5) + hash2;
        }

        return hash;
    }

    public readonly Func<T, K> KeyFunc;
    public readonly IEqualityComparer<T> EqualityComparer;
}

More details is on our site: IEqualityComparer in LINQ.


In case you need a Distinct method on multiple properties, you can check out my PowerfulExtensions library. Currently it's in a very young stage, but already you can use methods like Distinct, Union, Intersect, Except on any number of properties;

This is how you use it:

using PowerfulExtensions.Linq;
...
var distinct = myArray.Distinct(x => x.A, x => x.B);

I think it is enough:

list.Select(s => s.MyField).Distinct();

Please give a try with below code.

var Item = GetAll().GroupBy(x => x .Id).ToList();

I've written an article that explains how to extend the Distinct function so that you can do as follows:

var people = new List<Person>();

people.Add(new Person(1, "a", "b"));
people.Add(new Person(2, "c", "d"));
people.Add(new Person(1, "a", "b"));

foreach (var person in people.Distinct(p => p.ID))
    // Do stuff with unique list here.

Here's the article (now in the Web Archive): Extending LINQ - Specifying a Property in the Distinct Function


The following code is functionally equivalent to Jon Skeet's answer.

Tested on .NET 4.5, should work on any earlier version of LINQ.

public static IEnumerable<TSource> DistinctBy<TSource, TKey>(
  this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
  HashSet<TKey> seenKeys = new HashSet<TKey>();
  return source.Where(element => seenKeys.Add(keySelector(element)));
}

Incidentially, check out Jon Skeet's latest version of DistinctBy.cs on Google Code.


Use:

List<Person> pList = new List<Person>();
/* Fill list */

var result = pList.Where(p => p.Name != null).GroupBy(p => p.Id).Select(grp => grp.FirstOrDefault());

The where helps you filter the entries (could be more complex) and the groupby and select perform the distinct function.


Override Equals(object obj) and GetHashCode() methods:

class Person
{
    public int Id { get; set; }
    public int Name { get; set; }

    public override bool Equals(object obj)
    {
        return ((Person)obj).Id == Id;
        // or: 
        // var o = (Person)obj;
        // return o.Id == Id && o.Name == Name;
    }
    public override int GetHashCode()
    {
        return Id.GetHashCode();
    }
}

and then just call:

List<Person> distinctList = new[] { person1, person2, person3 }.Distinct().ToList();

Examples related to c#

How can I convert this one line of ActionScript to C#? Microsoft Advertising SDK doesn't deliverer ads How to use a global array in C#? How to correctly write async method? C# - insert values from file into two arrays Uploading into folder in FTP? Are these methods thread safe? dotnet ef not found in .NET Core 3 HTTP Error 500.30 - ANCM In-Process Start Failure Best way to "push" into C# array

Examples related to linq

Async await in linq select How to resolve Value cannot be null. Parameter name: source in linq? What does Include() do in LINQ? Selecting multiple columns with linq query and lambda expression System.Collections.Generic.List does not contain a definition for 'Select' lambda expression join multiple tables with select and where clause LINQ select one field from list of DTO objects to array The model backing the 'ApplicationDbContext' context has changed since the database was created Check if two lists are equal Why is this error, 'Sequence contains no elements', happening?

Examples related to .net-3.5

Get first element from a dictionary redistributable offline .NET Framework 3.5 installer for Windows 8 How to read an entire file to a string using C#? HTTP Error 500.22 - Internal Server Error (An ASP.NET setting has been detected that does not apply in Integrated managed pipeline mode.) Change the Textbox height? How to get current date in 'YYYY-MM-DD' format in ASP.NET? C# Numeric Only TextBox Control Will the IE9 WebBrowser Control Support all of IE9's features, including SVG? Use LINQ to get items in one List<>, that are not in another List<> Get Absolute URL from Relative path (refactored method)

Examples related to distinct

Using DISTINCT along with GROUP BY in SQL Server How to "select distinct" across multiple data frame columns in pandas? Laravel Eloquent - distinct() and count() not working properly together SQL - select distinct only on one column SQL: Group by minimum value in one field while selecting distinct rows Count distinct value pairs in multiple columns in SQL sql query distinct with Row_Number Eliminating duplicate values based on only one column of the table MongoDB distinct aggregation Pandas count(distinct) equivalent