[c#] How to use LINQ to select object with minimum or maximum property value

I have a Person object with a Nullable DateOfBirth property. Is there a way to use LINQ to query a list of Person objects for the one with the earliest/smallest DateOfBirth value.

Here's what I started with:

var firstBornDate = People.Min(p => p.DateOfBirth.GetValueOrDefault(DateTime.MaxValue));

Null DateOfBirth values are set to DateTime.MaxValue in order to rule them out of the Min consideration (assuming at least one has a specified DOB).

But all that does for me is to set firstBornDate to a DateTime value. What I'd like to get is the Person object that matches that. Do I need to write a second query like so:

var firstBorn = People.Single(p=> (p.DateOfBirth ?? DateTime.MaxValue) == firstBornDate);

Or is there a leaner way of doing it?

This question is related to c# .net linq

The answer is


public class Foo {
    public int bar;
    public int stuff;
};

void Main()
{
    List<Foo> fooList = new List<Foo>(){
    new Foo(){bar=1,stuff=2},
    new Foo(){bar=3,stuff=4},
    new Foo(){bar=2,stuff=3}};

    Foo result = fooList.Aggregate((u,v) => u.bar < v.bar ? u: v);
    result.Dump();
}

Unfortunately there isn't a built-in method to do this, but it's easy enough to implement for yourself. Here are the guts of it:

public static TSource MinBy<TSource, TKey>(this IEnumerable<TSource> source,
    Func<TSource, TKey> selector)
{
    return source.MinBy(selector, null);
}

public static TSource MinBy<TSource, TKey>(this IEnumerable<TSource> source,
    Func<TSource, TKey> selector, IComparer<TKey> comparer)
{
    if (source == null) throw new ArgumentNullException("source");
    if (selector == null) throw new ArgumentNullException("selector");
    comparer ??= Comparer<TKey>.Default;

    using (var sourceIterator = source.GetEnumerator())
    {
        if (!sourceIterator.MoveNext())
        {
            throw new InvalidOperationException("Sequence contains no elements");
        }
        var min = sourceIterator.Current;
        var minKey = selector(min);
        while (sourceIterator.MoveNext())
        {
            var candidate = sourceIterator.Current;
            var candidateProjected = selector(candidate);
            if (comparer.Compare(candidateProjected, minKey) < 0)
            {
                min = candidate;
                minKey = candidateProjected;
            }
        }
        return min;
    }
}

Example usage:

var firstBorn = People.MinBy(p => p.DateOfBirth ?? DateTime.MaxValue);

Note that this will throw an exception if the sequence is empty, and will return the first element with the minimal value if there's more than one.

Alternatively, you can use the implementation we've got in MoreLINQ, in MinBy.cs. (There's a corresponding MaxBy, of course.)

Install via package manager console:

PM> Install-Package morelinq


Another implementation, which could work with nullable selector keys, and for the collection of reference type returns null if no suitable elements found. This could be helpful then processing database results for example.

  public static class IEnumerableExtensions
  {
    /// <summary>
    /// Returns the element with the maximum value of a selector function.
    /// </summary>
    /// <typeparam name="TSource">The type of the elements of source.</typeparam>
    /// <typeparam name="TKey">The type of the key returned by keySelector.</typeparam>
    /// <param name="source">An IEnumerable collection values to determine the element with the maximum value of.</param>
    /// <param name="keySelector">A function to extract the key for each element.</param>
    /// <exception cref="System.ArgumentNullException">source or keySelector is null.</exception>
    /// <exception cref="System.InvalidOperationException">source contains no elements.</exception>
    /// <returns>The element in source with the maximum value of a selector function.</returns>
    public static TSource MaxBy<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector) => MaxOrMinBy(source, keySelector, 1);

    /// <summary>
    /// Returns the element with the minimum value of a selector function.
    /// </summary>
    /// <typeparam name="TSource">The type of the elements of source.</typeparam>
    /// <typeparam name="TKey">The type of the key returned by keySelector.</typeparam>
    /// <param name="source">An IEnumerable collection values to determine the element with the minimum value of.</param>
    /// <param name="keySelector">A function to extract the key for each element.</param>
    /// <exception cref="System.ArgumentNullException">source or keySelector is null.</exception>
    /// <exception cref="System.InvalidOperationException">source contains no elements.</exception>
    /// <returns>The element in source with the minimum value of a selector function.</returns>
    public static TSource MinBy<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector) => MaxOrMinBy(source, keySelector, -1);


    private static TSource MaxOrMinBy<TSource, TKey>
      (IEnumerable<TSource> source, Func<TSource, TKey> keySelector, int sign)
    {
      if (source == null) throw new ArgumentNullException(nameof(source));
      if (keySelector == null) throw new ArgumentNullException(nameof(keySelector));
      Comparer<TKey> comparer = Comparer<TKey>.Default;
      TKey value = default(TKey);
      TSource result = default(TSource);

      bool hasValue = false;

      foreach (TSource element in source)
      {
        TKey x = keySelector(element);
        if (x != null)
        {
          if (!hasValue)
          {
            value = x;
            result = element;
            hasValue = true;
          }
          else if (sign * comparer.Compare(x, value) > 0)
          {
            value = x;
            result = element;
          }
        }
      }

      if ((result != null) && !hasValue)
        throw new InvalidOperationException("The source sequence is empty");

      return result;
    }
  }

Example:

public class A
{
  public int? a;
  public A(int? a) { this.a = a; }
}

var b = a.MinBy(x => x.a);
var c = a.MaxBy(x => x.a);

EDIT again:

Sorry. Besides missing the nullable I was looking at the wrong function,

Min<(Of <(TSource, TResult>)>)(IEnumerable<(Of <(TSource>)>), Func<(Of <(TSource, TResult>)>)) does return the result type as you said.

I would say one possible solution is to implement IComparable and use Min<(Of <(TSource>)>)(IEnumerable<(Of <(TSource>)>)), which really does return an element from the IEnumerable. Of course, that doesn't help you if you can't modify the element. I find MS's design a bit weird here.

Of course, you can always do a for loop if you need to, or use the MoreLINQ implementation Jon Skeet gave.


Solution with no extra packages:

var min = lst.OrderBy(i => i.StartDate).FirstOrDefault();
var max = lst.OrderBy(i => i.StartDate).LastOrDefault();

also you can wrap it into extension:

public static class LinqExtensions
{
    public static T MinBy<T, TProp>(this IEnumerable<T> source, Func<T, TProp> propSelector)
    {
        return source.OrderBy(propSelector).FirstOrDefault();
    }

    public static T MaxBy<T, TProp>(this IEnumerable<T> source, Func<T, TProp> propSelector)
    {
        return source.OrderBy(propSelector).LastOrDefault();
    }
}

and in this case:

var min = lst.MinBy(i => i.StartDate);
var max = lst.MaxBy(i => i.StartDate);

By the way... O(n^2) is not the best solution. Paul Betts gave fatster solution than my. But my is still LINQ solution and it's more simple and more short than other solutions here.


IF you want to select object with minimum or maximum property value. another way is to use Implementing IComparable.

public struct Money : IComparable<Money>
{
   public Money(decimal value) : this() { Value = value; }
   public decimal Value { get; private set; }
   public int CompareTo(Money other) { return Value.CompareTo(other.Value); }
}

Max Implementation will be.

var amounts = new List<Money> { new Money(20), new Money(10) };
Money maxAmount = amounts.Max();

Min Implementation will be.

var amounts = new List<Money> { new Money(20), new Money(10) };
Money maxAmount = amounts.Min();

In this way, you can compare any object and get the Max and Min while returning the object type.

Hope This will help someone.


So you are asking for ArgMin or ArgMax. C# doesn't have a built-in API for those.

I've been looking for a clean and efficient (O(n) in time) way to do this. And I think I found one:

The general form of this pattern is:

var min = data.Select(x => (key(x), x)).Min().Item2;
                            ^           ^       ^
              the sorting key           |       take the associated original item
                                Min by key(.)

Specially, using the example in original question:

For C# 7.0 and above that supports value tuple:

var youngest = people.Select(p => (p.DateOfBirth, p)).Min().Item2;

For C# version before 7.0, anonymous type can be used instead:

var youngest = people.Select(p => new {age = p.DateOfBirth, ppl = p}).Min().ppl;

They work because both value tuple and anonymous type have sensible default comparers: for (x1, y1) and (x2, y2), it first compares x1 vs x2, then y1 vs y2. That's why the built-in .Min can be used on those types.

And since both anonymous type and value tuple are value types, they should be both very efficient.

NOTE

In my above ArgMin implementations I assumed DateOfBirth to take type DateTime for simplicity and clarity. The original question asks to exclude those entries with null DateOfBirth field:

Null DateOfBirth values are set to DateTime.MaxValue in order to rule them out of the Min consideration (assuming at least one has a specified DOB).

It can be achieved with a pre-filtering

people.Where(p => p.DateOfBirth.HasValue)

So it's immaterial to the question of implementing ArgMin or ArgMax.

NOTE 2

The above approach has a caveat that when there are two instances that have the same min value, then the Min() implementation will try to compare the instances as a tie-breaker. However, if the class of the instances does not implement IComparable, then a runtime error will be thrown:

At least one object must implement IComparable

Luckily, this can still be fixed rather cleanly. The idea is to associate a distanct "ID" with each entry that serves as the unambiguous tie-breaker. We can use an incremental ID for each entry. Still using the people age as example:

var youngest = Enumerable.Range(0, int.MaxValue)
               .Zip(people, (idx, ppl) => (ppl.DateOfBirth, idx, ppl)).Min().Item3;

A way via extension function on IEnumerable that returns both the object and the minimum found. It takes a Func that can do any operation on the object in the collection:

public static (double min, T obj) tMin<T>(this IEnumerable<T> ienum, 
            Func<T, double> aFunc)
        {
            var okNull = default(T);
            if (okNull != null)
                throw new ApplicationException("object passed to Min not nullable");

            (double aMin, T okObj) best = (double.MaxValue, okNull);
            foreach (T obj in ienum)
            {
                double q = aFunc(obj);
                if (q < best.aMin)
                    best = (q, obj);
            }
            return (best);
        }

Example where object is an Airport and we want to find closest Airport to a given (latitude, longitude). Airport has a dist(lat, lon) function.

(double okDist, Airport best) greatestPort = airPorts.tMin(x => x.dist(okLat, okLon));

NOTE: I include this answer for completeness since the OP didn't mention what the data source is and we shouldn't make any assumptions.

This query gives the correct answer, but could be slower since it might have to sort all the items in People, depending on what data structure People is:

var oldest = People.OrderBy(p => p.DateOfBirth ?? DateTime.MaxValue).First();

UPDATE: Actually I shouldn't call this solution "naive", but the user does need to know what he is querying against. This solution's "slowness" depends on the underlying data. If this is a array or List<T>, then LINQ to Objects has no choice but to sort the entire collection first before selecting the first item. In this case it will be slower than the other solution suggested. However, if this is a LINQ to SQL table and DateOfBirth is an indexed column, then SQL Server will use the index instead of sorting all the rows. Other custom IEnumerable<T> implementations could also make use of indexes (see i4o: Indexed LINQ, or the object database db4o) and make this solution faster than Aggregate() or MaxBy()/MinBy() which need to iterate the whole collection once. In fact, LINQ to Objects could have (in theory) made special cases in OrderBy() for sorted collections like SortedList<T>, but it doesn't, as far as I know.


The following is the more generic solution. It essentially does the same thing (in O(N) order) but on any IEnumberable types and can mixed with types whose property selectors could return null.

public static class LinqExtensions
{
    public static T MinBy<T>(this IEnumerable<T> source, Func<T, IComparable> selector)
    {
        if (source == null)
        {
            throw new ArgumentNullException(nameof(source));
        }
        if (selector == null)
        {
            throw new ArgumentNullException(nameof(selector));
        }
        return source.Aggregate((min, cur) =>
        {
            if (min == null)
            {
                return cur;
            }
            var minComparer = selector(min);
            if (minComparer == null)
            {
                return cur;
            }
            var curComparer = selector(cur);
            if (curComparer == null)
            {
                return min;
            }
            return minComparer.CompareTo(curComparer) > 0 ? cur : min;
        });
    }
}

Tests:

var nullableInts = new int?[] {5, null, 1, 4, 0, 3, null, 1};
Assert.AreEqual(0, nullableInts.MinBy(i => i));//should pass

Try the following idea:

var firstBornDate = People.GroupBy(p => p.DateOfBirth).Min(g => g.Key).FirstOrDefault();

Perfectly simple use of aggregate (equivalent to fold in other languages):

var firstBorn = People.Aggregate((min, x) => x.DateOfBirth < min.DateOfBirth ? x : min);

The only downside is that the property is accessed twice per sequence element, which might be expensive. That's hard to fix.


People.OrderBy(p => p.DateOfBirth.GetValueOrDefault(DateTime.MaxValue)).First()

Would do the trick


I was looking for something similar myself, preferably without using a library or sorting the entire list. My solution ended up similar to the question itself, just simplified a bit.

var firstBorn = People.FirstOrDefault(p => p.DateOfBirth == People.Min(p2 => p2.DateOfBirth));

Examples related to c#

How can I convert this one line of ActionScript to C#? Microsoft Advertising SDK doesn't deliverer ads How to use a global array in C#? How to correctly write async method? C# - insert values from file into two arrays Uploading into folder in FTP? Are these methods thread safe? dotnet ef not found in .NET Core 3 HTTP Error 500.30 - ANCM In-Process Start Failure Best way to "push" into C# array

Examples related to .net

You must add a reference to assembly 'netstandard, Version=2.0.0.0 How to use Bootstrap 4 in ASP.NET Core No authenticationScheme was specified, and there was no DefaultChallengeScheme found with default authentification and custom authorization .net Core 2.0 - Package was restored using .NetFramework 4.6.1 instead of target framework .netCore 2.0. The package may not be fully compatible Update .NET web service to use TLS 1.2 EF Core add-migration Build Failed What is the difference between .NET Core and .NET Standard Class Library project types? Visual Studio 2017 - Could not load file or assembly 'System.Runtime, Version=4.1.0.0' or one of its dependencies Nuget connection attempt failed "Unable to load the service index for source" Token based authentication in Web API without any user interface

Examples related to linq

Async await in linq select How to resolve Value cannot be null. Parameter name: source in linq? What does Include() do in LINQ? Selecting multiple columns with linq query and lambda expression System.Collections.Generic.List does not contain a definition for 'Select' lambda expression join multiple tables with select and where clause LINQ select one field from list of DTO objects to array The model backing the 'ApplicationDbContext' context has changed since the database was created Check if two lists are equal Why is this error, 'Sequence contains no elements', happening?