[c#] How to use LINQ Distinct() with multiple fields

I have the following EF class derived from a database (simplified)

class Product
{ 
     public string ProductId;
     public string ProductName;
     public string CategoryId;
     public string CategoryName;
}

ProductId is the Primary Key of the table.

For a bad design decision made by the DB designer (I cannot modify it), I have CategoryId and CategoryName in this table.

I need a DropDownList with (distinct) CategoryId as Value and CategoryName as Text. Therefore I applied the following code:

product.Select(m => new {m.CategoryId, m.CategoryName}).Distinct();

which logically it should create an anonymous object with CategoryId and CategoryName as properties. The Distinct() guarantees that there are no duplicates pair (CategoryId, CategoryName).

But actually it does not work. As far as I understood the Distinct() works just when there is just one field in the collection otherwise it just ignores them...is it correct? Is there any workaround? Thanks!

UPDATE

Sorry product is:

List<Product> product = new List<Product>();

I found an alternative way to get the same result as Distinct():

product.GroupBy(d => new {d.CategoryId, d.CategoryName}) 
       .Select(m => new {m.Key.CategoryId, m.Key.CategoryName})

This question is related to c# linq distinct

The answer is


Employee emp1 = new Employee() { ID = 1, Name = "Narendra1", Salary = 11111, Experience = 3, Age = 30 };Employee emp2 = new Employee() { ID = 2, Name = "Narendra2", Salary = 21111, Experience = 10, Age = 38 };
Employee emp3 = new Employee() { ID = 3, Name = "Narendra3", Salary = 31111, Experience = 4, Age = 33 };
Employee emp4 = new Employee() { ID = 3, Name = "Narendra4", Salary = 41111, Experience = 7, Age = 33 };

List<Employee> lstEmployee = new List<Employee>();

lstEmployee.Add(emp1);
lstEmployee.Add(emp2);
lstEmployee.Add(emp3);
lstEmployee.Add(emp4);

var eemmppss=lstEmployee.Select(cc=>new {cc.ID,cc.Age}).Distinct();

Use the Key keyword in your select will work, like below.

product.Select(m => new {Key m.CategoryId, Key m.CategoryName}).Distinct();

I realize this is bringing up an old thread but figured it might help some people. I generally code in VB.NET when working with .NET so Key may translate differently into C#.


This is my solution, it supports keySelectors of different types:

public static IEnumerable<TSource> DistinctBy<TSource>(this IEnumerable<TSource> source, params Func<TSource, object>[] keySelectors)
{
    // initialize the table
    var seenKeysTable = keySelectors.ToDictionary(x => x, x => new HashSet<object>());

    // loop through each element in source
    foreach (var element in source)
    {
        // initialize the flag to true
        var flag = true;

        // loop through each keySelector a
        foreach (var (keySelector, hashSet) in seenKeysTable)
        {                    
            // if all conditions are true
            flag = flag && hashSet.Add(keySelector(element));
        }

        // if no duplicate key was added to table, then yield the list element
        if (flag)
        {
            yield return element;
        }
    }
}

To use it:

list.DistinctBy(d => d.CategoryId, d => d.CategoryName)

the solution to your problem looks like this:

public class Category {
  public long CategoryId { get; set; }
  public string CategoryName { get; set; }
} 

...

public class CategoryEqualityComparer : IEqualityComparer<Category>
{
   public bool Equals(Category x, Category y)
     => x.CategoryId.Equals(y.CategoryId)
          && x.CategoryName .Equals(y.CategoryName, 
 StringComparison.OrdinalIgnoreCase);

   public int GetHashCode(Mapping obj)
     => obj == null 
         ? 0
         : obj.CategoryId.GetHashCode()
           ^ obj.CategoryName.GetHashCode();
}

...

 var distinctCategories = product
     .Select(_ => 
        new Category {
           CategoryId = _.CategoryId, 
           CategoryName = _.CategoryName
        })
     .Distinct(new CategoryEqualityComparer())
     .ToList();

Answering the headline of the question (what attracted people here) and ignoring that the example used anonymous types....

This solution will also work for non-anonymous types. It should not be needed for anonymous types.

Helper class:

/// <summary>
/// Allow IEqualityComparer to be configured within a lambda expression.
/// From https://stackoverflow.com/questions/98033/wrap-a-delegate-in-an-iequalitycomparer
/// </summary>
/// <typeparam name="T"></typeparam>
public class LambdaEqualityComparer<T> : IEqualityComparer<T>
{
    readonly Func<T, T, bool> _comparer;
    readonly Func<T, int> _hash;

    /// <summary>
    /// Simplest constructor, provide a conversion to string for type T to use as a comparison key (GetHashCode() and Equals().
    /// https://stackoverflow.com/questions/98033/wrap-a-delegate-in-an-iequalitycomparer, user "orip"
    /// </summary>
    /// <param name="toString"></param>
    public LambdaEqualityComparer(Func<T, string> toString)
        : this((t1, t2) => toString(t1) == toString(t2), t => toString(t).GetHashCode())
    {
    }

    /// <summary>
    /// Constructor.  Assumes T.GetHashCode() is accurate.
    /// </summary>
    /// <param name="comparer"></param>
    public LambdaEqualityComparer(Func<T, T, bool> comparer)
        : this(comparer, t => t.GetHashCode())
    {
    }

    /// <summary>
    /// Constructor, provide a equality comparer and a hash.
    /// </summary>
    /// <param name="comparer"></param>
    /// <param name="hash"></param>
    public LambdaEqualityComparer(Func<T, T, bool> comparer, Func<T, int> hash)
    {
        _comparer = comparer;
        _hash = hash;
    }

    public bool Equals(T x, T y)
    {
        return _comparer(x, y);
    }

    public int GetHashCode(T obj)
    {
        return _hash(obj);
    }    
}

Simplest usage:

List<Product> products = duplicatedProducts.Distinct(
    new LambdaEqualityComparer<Product>(p =>
        String.Format("{0}{1}{2}{3}",
            p.ProductId,
            p.ProductName,
            p.CategoryId,
            p.CategoryName))
        ).ToList();

The simplest (but not that efficient) usage is to map to a string representation so that custom hashing is avoided. Equal strings already have equal hash codes.

Reference:
Wrap a delegate in an IEqualityComparer


public List<ItemCustom2> GetBrandListByCat(int id)
    {

        var OBJ = (from a in db.Items
                   join b in db.Brands on a.BrandId equals b.Id into abc1
                   where (a.ItemCategoryId == id)
                   from b in abc1.DefaultIfEmpty()
                   select new
                   {
                       ItemCategoryId = a.ItemCategoryId,
                       Brand_Name = b.Name,
                       Brand_Id = b.Id,
                       Brand_Pic = b.Pic,

                   }).Distinct();


        List<ItemCustom2> ob = new List<ItemCustom2>();
        foreach (var item in OBJ)
        {
            ItemCustom2 abc = new ItemCustom2();
            abc.CategoryId = item.ItemCategoryId;
            abc.BrandId = item.Brand_Id;
            abc.BrandName = item.Brand_Name;
            abc.BrandPic = item.Brand_Pic;
            ob.Add(abc);
        }
        return ob;

    }

The Distinct() guarantees that there are no duplicates pair (CategoryId, CategoryName).

- exactly that

Anonymous types 'magically' implement Equals and GetHashcode

I assume another error somewhere. Case sensitivity? Mutable classes? Non-comparable fields?


Distinct method returns distinct elements from a sequence.

If you take a look on its implementation with Reflector, you'll see that it creates DistinctIterator for your anonymous type. Distinct iterator adds elements to Set when enumerating over collection. This enumerator skips all elements which are already in Set. Set uses GetHashCode and Equals methods for defining if element already exists in Set.

How GetHashCode and Equals implemented for anonymous type? As it stated on msdn:

Equals and GetHashCode methods on anonymous types are defined in terms of the Equals and GetHashcode methods of the properties, two instances of the same anonymous type are equal only if all their properties are equal.

So, you definitely should have distinct anonymous objects, when iterating on distinct collection. And result does not depend on how many fields you use for your anonymous type.


Examples related to c#

How can I convert this one line of ActionScript to C#? Microsoft Advertising SDK doesn't deliverer ads How to use a global array in C#? How to correctly write async method? C# - insert values from file into two arrays Uploading into folder in FTP? Are these methods thread safe? dotnet ef not found in .NET Core 3 HTTP Error 500.30 - ANCM In-Process Start Failure Best way to "push" into C# array

Examples related to linq

Async await in linq select How to resolve Value cannot be null. Parameter name: source in linq? What does Include() do in LINQ? Selecting multiple columns with linq query and lambda expression System.Collections.Generic.List does not contain a definition for 'Select' lambda expression join multiple tables with select and where clause LINQ select one field from list of DTO objects to array The model backing the 'ApplicationDbContext' context has changed since the database was created Check if two lists are equal Why is this error, 'Sequence contains no elements', happening?

Examples related to distinct

Using DISTINCT along with GROUP BY in SQL Server How to "select distinct" across multiple data frame columns in pandas? Laravel Eloquent - distinct() and count() not working properly together SQL - select distinct only on one column SQL: Group by minimum value in one field while selecting distinct rows Count distinct value pairs in multiple columns in SQL sql query distinct with Row_Number Eliminating duplicate values based on only one column of the table MongoDB distinct aggregation Pandas count(distinct) equivalent