Coder Perfect

How to use LINQ to select object with minimum or maximum property value

Problem

I have a Person object with a nullable DateOfBirth property. Is there a way to use LINQ to query a list of Person objects for the one with the earliest/smallest DateOfBirth value?

Here’s how I got started:

var firstBornDate = People.Min(p => p.DateOfBirth.GetValueOrDefault(DateTime.MaxValue));

Null DateOfBirth values are replaced with DateTime.MaxValue to rule them out of the Min consideration (assuming at least one person has a specified DOB).

But all that does for me is assign a DateTime value to firstBornDate. I’d like to retrieve the Person object that corresponds to it. Is it necessary for me to write a second query in the following format:

var firstBorn = People.Single(p=> (p.DateOfBirth ?? DateTime.MaxValue) == firstBornDate);

Is there a more efficient method to accomplish it?

Asked by slolife

Solution #1

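// Apply the same MaxValue default to curMin: a comparison against a null
// DateOfBirth is always false, so a leading person with no date would never be replaced.
People.Aggregate((curMin, x) =>
    (curMin == null ||
     (x.DateOfBirth ?? DateTime.MaxValue) < (curMin.DateOfBirth ?? DateTime.MaxValue))
        ? x : curMin)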
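
As a self-contained sketch of the Aggregate approach, assuming a hypothetical Person class with a Name and a nullable DateOfBirth (neither the class shape nor the sample data comes from the question):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model; the question only says DateOfBirth is nullable.
public class Person
{
    public string Name { get; set; }
    public DateTime? DateOfBirth { get; set; }
}

public static class AggregateExample
{
    // Single pass: keep whichever of the two people has the earlier date,
    // treating a missing date as DateTime.MaxValue on both sides.
    public static Person FirstBorn(IEnumerable<Person> people)
    {
        return people.Aggregate((curMin, x) =>
            (x.DateOfBirth ?? DateTime.MaxValue) < (curMin.DateOfBirth ?? DateTime.MaxValue)
                ? x
                : curMin);
    }

    public static void Main()
    {
        var people = new List<Person>
        {
            new Person { Name = "Ann", DateOfBirth = null },
            new Person { Name = "Bob", DateOfBirth = new DateTime(1985, 3, 14) },
            new Person { Name = "Cy",  DateOfBirth = new DateTime(1979, 7, 2) },
        };

        Console.WriteLine(FirstBorn(people).Name); // prints "Cy"
    }
}
```

Note that Aggregate without a seed throws InvalidOperationException on an empty sequence, so an empty list needs to be handled separately.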

Answered by Ana Betts

Solution #2

Before .NET 6 added Enumerable.MinBy and MaxBy, there was no built-in method for this, but it's easy enough to implement yourself. Here's the core of it:

public static TSource MinBy<TSource, TKey>(this IEnumerable<TSource> source,
    Func<TSource, TKey> selector)
{
    return source.MinBy(selector, null);
}

public static TSource MinBy<TSource, TKey>(this IEnumerable<TSource> source,
    Func<TSource, TKey> selector, IComparer<TKey> comparer)
{
    if (source == null) throw new ArgumentNullException(nameof(source));
    if (selector == null) throw new ArgumentNullException(nameof(selector));
    comparer ??= Comparer<TKey>.Default;

    using (var sourceIterator = source.GetEnumerator())
    {
        if (!sourceIterator.MoveNext())
        {
            throw new InvalidOperationException("Sequence contains no elements");
        }
        var min = sourceIterator.Current;
        var minKey = selector(min);
        while (sourceIterator.MoveNext())
        {
            var candidate = sourceIterator.Current;
            var candidateProjected = selector(candidate);
            if (comparer.Compare(candidateProjected, minKey) < 0)
            {
                min = candidate;
                minKey = candidateProjected;
            }
        }
        return min;
    }
}

Example usage:

var firstBorn = People.MinBy(p => p.DateOfBirth ?? DateTime.MaxValue);

Note that this throws an exception if the sequence is empty, and returns the first element with the minimal value if more than one element shares it.

Alternatively, you can use the implementation in MoreLINQ, in MinBy.cs. (There's a corresponding MaxBy, of course.)

MoreLINQ is distributed as a NuGet package and can be installed from the package manager console.
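
For the hand-rolled version above, a MaxBy counterpart is a near copy with the comparison flipped. A minimal sketch follows; the name MaxByKey is hypothetical, chosen here only to avoid clashing with the MaxBy methods in MoreLINQ and .NET 6:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MaxByExtensions
{
    // Hypothetical name, picked to avoid ambiguity with other MaxBy methods.
    public static TSource MaxByKey<TSource, TKey>(this IEnumerable<TSource> source,
        Func<TSource, TKey> selector, IComparer<TKey> comparer = null)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (selector == null) throw new ArgumentNullException(nameof(selector));
        comparer = comparer ?? Comparer<TKey>.Default;

        using (var it = source.GetEnumerator())
        {
            if (!it.MoveNext())
            {
                throw new InvalidOperationException("Sequence contains no elements");
            }
            var max = it.Current;
            var maxKey = selector(max);
            while (it.MoveNext())
            {
                var candidate = it.Current;
                var candidateKey = selector(candidate);
                // Strictly greater: on a tie the earlier element is kept.
                if (comparer.Compare(candidateKey, maxKey) > 0)
                {
                    max = candidate;
                    maxKey = candidateKey;
                }
            }
            return max;
        }
    }
}

public static class MaxByDemo
{
    public static void Main()
    {
        var words = new[] { "pear", "fig", "plum", "kiwi" };
        Console.WriteLine(words.MaxByKey(w => w.Length)); // prints "pear"
    }
}
```

To find the last-born person while ignoring missing dates, the default would flip as well: People.MaxByKey(p => p.DateOfBirth ?? DateTime.MinValue).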

Answered by Jon Skeet

Solution #3

NOTE: I’m including this answer for completeness’ sake because the OP didn’t specify the data source and we shouldn’t make any assumptions.

This query gives the correct answer, but could be slower since it might have to sort all the items in People, depending on what data structure People is:

var oldest = People.OrderBy(p => p.DateOfBirth ?? DateTime.MaxValue).First();

UPDATE: I wouldn't actually call this solution "naive", but the user does need to know what they are querying against. How slow this solution is depends on the underlying data. If People is an array or a List<T>, LINQ to Objects must sort the entire collection before selecting the first item, so it will be slower than the other solutions suggested. However, if People is a LINQ to SQL table and DateOfBirth is an indexed column, SQL Server will use the index instead of sorting all the rows. Other custom IEnumerable<T> implementations could also make use of indexes (see i4o: Indexed LINQ, or the object database db4o) and make this approach faster than Aggregate() or MaxBy()/MinBy(), which have to iterate over the whole collection once.

Answered by Lucas

Solution #4

People.OrderBy(p => p.DateOfBirth.GetValueOrDefault(DateTime.MaxValue)).First()

That would suffice.

Answered by Rune FS

Solution #5

You’re looking for ArgMin or ArgMax, correct? There isn’t a built-in API for those in C#.

I had been looking for a clean and efficient (O(n) in time) way to do this, and I think I found one:

The general form of this pattern is:

var min = data.Select(x => (key(x), x)).Min().Item2;
                            ^           ^       ^
              the sorting key           |       take the associated original item
                                Min by key(.)

Applied to the example in the original question:

For C# 7.0 and above, which supports value tuples:

var oldest = people.Select(p => (p.DateOfBirth, p)).Min().Item2;

For C# versions before 7.0, System.Tuple can be used instead (an anonymous type will not work here, because anonymous types do not implement IComparable):

var oldest = people.Select(p => Tuple.Create(p.DateOfBirth, p)).Min().Item2;

These work because both ValueTuple and System.Tuple have a sensible default comparison: for (x1, y1) and (x2, y2), x1 is compared with x2 first, and y1 with y2 only if those are equal. That is why the built-in Min can be used on these types.

Since a value tuple is a value type, the C# 7.0 version needs no heap allocation per item and should be very efficient; System.Tuple is a class, so that version allocates one object per item.
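
To see the tuple ordering at work, here is a small sketch with a non-nullable DateOfBirth and made-up data; the Person class and the method names are assumptions. All dates are distinct, so the comparison never reaches the Person item.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model with a non-nullable date, as assumed in this answer.
public class Person
{
    public string Name { get; set; }
    public DateTime DateOfBirth { get; set; }
}

public static class ArgMinExample
{
    // ArgMin: pair each person with the key, take the smallest pair, unwrap.
    public static Person EarliestBorn(IEnumerable<Person> people)
    {
        return people.Select(p => (p.DateOfBirth, p)).Min().Item2;
    }

    // ArgMax: same pattern with Max().
    public static Person LatestBorn(IEnumerable<Person> people)
    {
        return people.Select(p => (p.DateOfBirth, p)).Max().Item2;
    }

    public static void Main()
    {
        var people = new List<Person>
        {
            new Person { Name = "Ann", DateOfBirth = new DateTime(1990, 5, 1) },
            new Person { Name = "Bob", DateOfBirth = new DateTime(1985, 3, 14) },
            new Person { Name = "Cy",  DateOfBirth = new DateTime(1979, 7, 2) },
        };

        Console.WriteLine(EarliestBorn(people).Name); // prints "Cy"
        Console.WriteLine(LatestBorn(people).Name);   // prints "Ann"
    }
}
```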

NOTE

For simplicity and clarity, the ArgMin implementations above assume DateOfBirth is of type DateTime. The original question asks to exclude entries with a null DateOfBirth.

That can be achieved with a pre-filter:

people.Where(p => p.DateOfBirth.HasValue)

So null handling is immaterial to the question of how to implement ArgMin or ArgMax.
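
Putting the pre-filter and the tuple pattern together for a nullable DateOfBirth might look like the sketch below (hypothetical Person class, method names, and data). It also shows what happens without the filter: under the default comparer a null DateTime? sorts before every real date, so the person with no date would be returned.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model matching the question's nullable DateOfBirth.
public class Person
{
    public string Name { get; set; }
    public DateTime? DateOfBirth { get; set; }
}

public static class PreFilterExample
{
    // Drop people without a date, then apply the tuple ArgMin pattern.
    public static Person EarliestKnown(IEnumerable<Person> people)
    {
        return people
            .Where(p => p.DateOfBirth.HasValue)
            .Select(p => (p.DateOfBirth.Value, p))
            .Min()
            .Item2;
    }

    // Without the filter, a null date compares as smaller than any real date.
    public static Person EarliestUnfiltered(IEnumerable<Person> people)
    {
        return people.Select(p => (p.DateOfBirth, p)).Min().Item2;
    }

    public static void Main()
    {
        var people = new List<Person>
        {
            new Person { Name = "Ann", DateOfBirth = null },
            new Person { Name = "Bob", DateOfBirth = new DateTime(1985, 3, 14) },
            new Person { Name = "Cy",  DateOfBirth = new DateTime(1979, 7, 2) },
        };

        Console.WriteLine(EarliestKnown(people).Name);      // prints "Cy"
        Console.WriteLine(EarliestUnfiltered(people).Name); // prints "Ann"
    }
}
```

If nobody has a date, the filtered sequence is empty and Min() throws InvalidOperationException, so that case needs its own check.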

NOTE 2

The approach above has one caveat: when two instances have the same minimum value, Min() falls back to comparing the instances themselves as a tie-breaker. If their class does not implement IComparable, an ArgumentException is thrown at runtime.

Fortunately, this is still a relatively simple remedy. The objective is to assign each item a distinct “ID” that will act as an unambiguous tie-breaker. For each entry, we can use an incremental ID. Continuing with the age of the persons as an example:

// The earliest DateOfBirth belongs to the oldest person.
var oldest = Enumerable.Range(0, int.MaxValue)
             .Zip(people, (idx, ppl) => (ppl.DateOfBirth, idx, ppl)).Min().Item3;
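
The sketch below exercises the tie case with made-up data and a hypothetical Person class that does not implement IComparable. It uses the Select overload that supplies the element index, which has the same effect as the Enumerable.Range and Zip combination.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model; deliberately does not implement IComparable.
public class Person
{
    public string Name { get; set; }
    public DateTime DateOfBirth { get; set; }
}

public static class TieBreakExample
{
    // The index is unique, so comparison stops at the second item
    // and never has to compare two Person instances.
    public static Person EarliestWithTieBreak(IEnumerable<Person> people)
    {
        return people.Select((p, idx) => (p.DateOfBirth, idx, p)).Min().Item3;
    }

    // No tie-breaker: equal dates force a Person-to-Person comparison.
    public static Person EarliestNoTieBreak(IEnumerable<Person> people)
    {
        return people.Select(p => (p.DateOfBirth, p)).Min().Item2;
    }

    public static void Main()
    {
        var people = new List<Person>
        {
            new Person { Name = "Ann", DateOfBirth = new DateTime(1980, 1, 1) },
            new Person { Name = "Bob", DateOfBirth = new DateTime(1980, 1, 1) },
            new Person { Name = "Cy",  DateOfBirth = new DateTime(1990, 6, 15) },
        };

        // Ann and Bob tie on the date; the lower index wins.
        Console.WriteLine(EarliestWithTieBreak(people).Name); // prints "Ann"

        try
        {
            EarliestNoTieBreak(people);
        }
        catch (ArgumentException e)
        {
            Console.WriteLine("Tie without an index: " + e.Message);
        }
    }
}
```

A side effect of the index is that ties resolve deterministically to the first matching element, the same behavior as the MinBy implementation in Solution #2.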

Answered by KFL

Post is based on https://stackoverflow.com/questions/914109/how-to-use-linq-to-select-object-with-minimum-or-maximum-property-value