[c#] Create a list from two object lists with linq

I have the following situation

class Person
{
    string Name;
    int Value;
    int Change;
}

List<Person> list1;
List<Person> list2;

I need to combine the 2 lists into a new List<Person> in case it's the same person the combine record would have that name, value of the person in list2, change would be the value of list2 - the value of list1. Change is 0 if no duplicate

This question is related to c# linq

The answer is


You need something like a full outer join. System.Linq.Enumerable has no method that implements a full outer join, so we have to do it ourselves.

var dict1 = list1.ToDictionary(l1 => l1.Name);
var dict2 = list2.ToDictionary(l2 => l2.Name);
    //get the full list of names.
var names = dict1.Keys.Union(dict2.Keys).ToList();
    //produce results
var result = names
.Select( name =>
{
  Person p1 = dict1.ContainsKey(name) ? dict1[name] : null;
  Person p2 = dict2.ContainsKey(name) ? dict2[name] : null;
      //left only
  if (p2 == null)
  {
    p1.Change = 0;
    return p1;
  }
      //right only
  if (p1 == null)
  {
    p2.Change = 0;
    return p2;
  }
      //both
  p2.Change = p2.Value - p1.Value;
  return p2;
}).ToList();

public void Linq95()
{
    List<Customer> customers = GetCustomerList();
    List<Product> products = GetProductList();

    var customerNames =
        from c in customers
        select c.CompanyName;
    var productNames =
        from p in products
        select p.ProductName;

    var allNames = customerNames.Concat(productNames);

    Console.WriteLine("Customer and product names:");
    foreach (var n in allNames)
    {
        Console.WriteLine(n);
    }
}

There are a few pieces to doing this, assuming each list does not contain duplicates, Name is a unique identifier, and neither list is ordered.

First create an append extension method to get a single list:

static class Ext {
  public static IEnumerable<T> Append(this IEnumerable<T> source,
                                      IEnumerable<T> second) {
    foreach (T t in source) { yield return t; }
    foreach (T t in second) { yield return t; }
  }
}

Thus can get a single list:

var oneList = list1.Append(list2);

Then group on name

var grouped = oneList.Group(p => p.Name);

Then can process each group with a helper to process one group at a time

public Person MergePersonGroup(IGrouping<string, Person> pGroup) {
  var l = pGroup.ToList(); // Avoid multiple enumeration.
  var first = l.First();
  var result = new Person {
    Name = first.Name,
    Value = first.Value
  };
  if (l.Count() == 1) {
    return result;
  } else if (l.Count() == 2) {
    result.Change = first.Value - l.Last().Value;
    return result;
  } else {
    throw new ApplicationException("Too many " + result.Name);
  }
}

Which can be applied to each element of grouped:

var finalResult = grouped.Select(g => MergePersonGroup(g));

(Warning: untested.)


Why you don't just use Concat?

Concat is a part of linq and more efficient than doing an AddRange()

in your case:

List<Person> list1 = ...
List<Person> list2 = ...
List<Person> total = list1.Concat(list2);

This can easily be done by using the Linq extension method Union. For example:

var mergedList = list1.Union(list2).ToList();

This will return a List in which the two lists are merged and doubles are removed. If you don't specify a comparer in the Union extension method like in my example, it will use the default Equals and GetHashCode methods in your Person class. If you for example want to compare persons by comparing their Name property, you must override these methods to perform the comparison yourself. Check the following code sample to accomplish that. You must add this code to your Person class.

/// <summary>
/// Checks if the provided object is equal to the current Person
/// </summary>
/// <param name="obj">Object to compare to the current Person</param>
/// <returns>True if equal, false if not</returns>
public override bool Equals(object obj)
{        
    // Try to cast the object to compare to to be a Person
    var person = obj as Person;

    return Equals(person);
}

/// <summary>
/// Returns an identifier for this instance
/// </summary>
public override int GetHashCode()
{
    return Name.GetHashCode();
}

/// <summary>
/// Checks if the provided Person is equal to the current Person
/// </summary>
/// <param name="personToCompareTo">Person to compare to the current person</param>
/// <returns>True if equal, false if not</returns>
public bool Equals(Person personToCompareTo)
{
    // Check if person is being compared to a non person. In that case always return false.
    if (personToCompareTo == null) return false;

    // If the person to compare to does not have a Name assigned yet, we can't define if it's the same. Return false.
    if (string.IsNullOrEmpty(personToCompareTo.Name) return false;

    // Check if both person objects contain the same Name. In that case they're assumed equal.
    return Name.Equals(personToCompareTo.Name);
}

If you don't want to set the default Equals method of your Person class to always use the Name to compare two objects, you can also write a comparer class which uses the IEqualityComparer interface. You can then provide this comparer as the second parameter in the Linq extension Union method. More information on how to write such a comparer method can be found on http://msdn.microsoft.com/en-us/library/system.collections.iequalitycomparer.aspx


Does the following code work for your problem? I've used a foreach with a bit of linq inside to do the combining of lists and assumed that people are equal if their names match, and it seems to print the expected values out when run. Resharper doesn't offer any suggestions to convert the foreach into linq so this is probably as good as it'll get doing it this way.

public class Person
{
   public string Name { get; set; }
   public int Value { get; set; }
   public int Change { get; set; }

   public Person(string name, int value)
   {
      Name = name;
      Value = value;
      Change = 0;
   }
}


class Program
{
   static void Main(string[] args)
   {
      List<Person> list1 = new List<Person>
                              {
                                 new Person("a", 1),
                                 new Person("b", 2),
                                 new Person("c", 3),
                                 new Person("d", 4)
                              };
      List<Person> list2 = new List<Person>
                              {
                                 new Person("a", 4),
                                 new Person("b", 5),
                                 new Person("e", 6),
                                 new Person("f", 7)
                              };

      List<Person> list3 = list2.ToList();

      foreach (var person in list1)
      {
         var existingPerson = list3.FirstOrDefault(x => x.Name == person.Name);
         if (existingPerson != null)
         {
            existingPerson.Change = existingPerson.Value - person.Value;
         }
         else
         {
            list3.Add(person);
         }
      }

      foreach (var person in list3)
      {
         Console.WriteLine("{0} {1} {2} ", person.Name,person.Value,person.Change);
      }
      Console.Read();
   }
}

I noticed that this question was not marked as answered after 2 years - I think the closest answer is Richards, but it can be simplified quite a lot to this:

list1.Concat(list2)
    .ToLookup(p => p.Name)
    .Select(g => g.Aggregate((p1, p2) => new Person 
    {
        Name = p1.Name,
        Value = p1.Value, 
        Change = p2.Value - p1.Value 
    }));

Although this won't error in the case where you have duplicate names in either set.

Some other answers have suggested using unioning - this is definitely not the way to go as it will only get you a distinct list, without doing the combining.


This is Linq

var mergedList = list1.Union(list2).ToList();

This is Normaly (AddRange)

var mergedList=new List<Person>();
mergeList.AddRange(list1);
mergeList.AddRange(list2);

This is Normaly (Foreach)

var mergedList=new List<Person>();

foreach(var item in list1)
{
    mergedList.Add(item);
}
foreach(var item in list2)
{
     mergedList.Add(item);
}

This is Normaly (Foreach-Dublice)

var mergedList=new List<Person>();

foreach(var item in list1)
{
    mergedList.Add(item);
}
foreach(var item in list2)
{
   if(!mergedList.Contains(item))
   {
     mergedList.Add(item);
   }
}