I am trying to remove duplicates from a List of objects based on some property.
Is there a simple way to do this using Java 8?
List<Employee> employee
Can we remove duplicates from it based on the id property of Employee? I have seen posts about removing duplicate strings from an ArrayList of strings.
If order does not matter and running in parallel is more performant, collect to a Map and then take its values:
employee.stream().collect(Collectors.toConcurrentMap(Employee::getId, Function.identity(), (p, q) -> p)).values()
The easiest way to do it directly in the list is
Set<Object> seen = new HashSet<>();
employee.removeIf(e -> !seen.add(e.getId()));
removeIf will remove an element if it meets the specified criteria.
Set.add will return false if it did not modify the Set, i.e. the Set already contains the value.
Of course, it only works if the list supports removal of elements.
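The removeIf approach can be sketched as a small runnable example. The Employee class here is a hypothetical stand-in (an int id and a name) since the question does not show it, and RemoveIfDemo and dedupeById are illustrative names. Copying into an ArrayList first is one way to sidestep lists that do not support removal:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the Employee class from the question.
class Employee {
    private final int id;
    private final String name;
    Employee(int id, String name) { this.id = id; this.name = name; }
    int getId() { return id; }
    String getName() { return name; }
}

class RemoveIfDemo {
    // Returns a new mutable list holding the first employee seen for each id.
    static List<Employee> dedupeById(List<Employee> input) {
        // Copy first: the input may be fixed-size (Arrays.asList) or immutable.
        List<Employee> result = new ArrayList<>(input);
        Set<Integer> seen = new HashSet<>();
        // add() returns false when the id is already present, so that element is removed.
        result.removeIf(e -> !seen.add(e.getId()));
        return result;
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
                new Employee(1, "Ann"), new Employee(2, "Bob"), new Employee(1, "Ann again"));
        for (Employee e : dedupeById(employees)) {
            System.out.println(e.getId() + " " + e.getName());
        }
    }
}
```

The input list is left untouched, and the original order of the surviving elements is preserved.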
Another solution is to use a Predicate, then you can use this in any filter:
public static <T> Predicate<T> distinctBy(Function<? super T, ?> f) {
    // The JDK has no ConcurrentHashSet; ConcurrentHashMap.newKeySet() gives a thread-safe Set.
    Set<Object> objects = ConcurrentHashMap.newKeySet();
    return t -> objects.add(f.apply(t));
}
Then simply reuse the predicate anywhere:
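employees.stream().filter(distinctBy(Employee::getId));
Note: the JavaDoc of filter says it takes a stateless Predicate, and this one is stateful. In practice it works even if the stream is parallel, because the backing set is thread-safe, but on a parallel stream it is not deterministic which of several duplicates is kept.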
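A self-contained sketch of the predicate approach, for anyone who wants to try it. The Employee class is a hypothetical stand-in, and DistinctByDemo and distinctById are illustrative names. It uses ConcurrentHashMap.newKeySet() as the thread-safe set, which means a null key would throw a NullPointerException:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical stand-in for the Employee class from the question.
class Employee {
    private final int id;
    private final String name;
    Employee(int id, String name) { this.id = id; this.name = name; }
    int getId() { return id; }
    String getName() { return name; }
}

class DistinctByDemo {
    // Builds a stateful predicate that accepts an element only the first time its key is seen.
    // Keys must be non-null because the backing set is based on ConcurrentHashMap.
    static <T> Predicate<T> distinctBy(Function<? super T, ?> keyExtractor) {
        Set<Object> seen = ConcurrentHashMap.newKeySet();
        return t -> seen.add(keyExtractor.apply(t));
    }

    // Sequential stream, so the first employee per id is the one that is kept.
    static List<Employee> distinctById(List<Employee> employees) {
        return employees.stream()
                .filter(distinctBy(Employee::getId))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
                new Employee(1, "Ann"), new Employee(2, "Bob"), new Employee(1, "Ann again"));
        distinctById(employees).forEach(e -> System.out.println(e.getId() + " " + e.getName()));
    }
}
```

Each call to distinctBy creates a fresh set, so a predicate instance should not be shared between unrelated streams.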
About other solutions:
1) Using .collect(Collectors.toConcurrentMap(..)).values() is a good solution, but it's annoying if you want to sort and keep the order.
2) list.removeIf(e -> !seen.add(e.getId())); is also a very good solution. But we need to make sure the collection supports removeIf; for example, it will throw an UnsupportedOperationException if the collection was constructed with Arrays.asList(..).
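If the ordering concern in point 1 matters, one possible variation is to collect into a LinkedHashMap with the four-argument Collectors.toMap, which keeps keys in first-seen order. This is a sketch rather than something taken from the answers here; the Employee class is a hypothetical stand-in, and OrderedMapDemo and dedupeKeepingOrder are illustrative names:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical stand-in for the Employee class from the question.
class Employee {
    private final int id;
    private final String name;
    Employee(int id, String name) { this.id = id; this.name = name; }
    int getId() { return id; }
    String getName() { return name; }
}

class OrderedMapDemo {
    // Keeps the first employee per id, in the order the ids first appear.
    static List<Employee> dedupeKeepingOrder(List<Employee> employees) {
        Map<Integer, Employee> byId = employees.stream().collect(Collectors.toMap(
                Employee::getId,             // key: the id
                Function.identity(),         // value: the employee itself
                (first, second) -> first,    // on a duplicate id, keep the earlier one
                LinkedHashMap::new));        // insertion-ordered map
        return new ArrayList<>(byId.values());
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
                new Employee(2, "Bob"), new Employee(1, "Ann"), new Employee(2, "Bob again"));
        dedupeKeepingOrder(employees)
                .forEach(e -> System.out.println(e.getId() + " " + e.getName()));
    }
}
```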
If you can make use of equals, then filter the list by using distinct within a stream (see the other answers). If you cannot or don't want to override the equals method, you can filter the stream in the following way for any property, e.g. for the property Name (the same works for the property Id etc.):
Set<String> nameSet = new HashSet<>();
List<Employee> employeesDistinctByName = employees.stream()
        .filter(e -> nameSet.add(e.getName()))
        .collect(Collectors.toList());
Another simple version uses a TreeSet with a comparator on the id. The result is sorted by id, and an employee whose id is already in the set is ignored:
TreeSet<Employee> outputList = new TreeSet<Employee>(Comparator.comparing(Employee::getId));
outputList.addAll(employees);
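The same TreeSet idea can also be written as a stream collector. The following is a minimal sketch, assuming a hypothetical Employee class with an int id; TreeSetDemo and distinctSortedById are illustrative names:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical stand-in for the Employee class from the question.
class Employee {
    private final int id;
    private final String name;
    Employee(int id, String name) { this.id = id; this.name = name; }
    int getId() { return id; }
    String getName() { return name; }
}

class TreeSetDemo {
    // Collects into a TreeSet ordered by id; employees whose id is already present are dropped.
    static List<Employee> distinctSortedById(List<Employee> employees) {
        TreeSet<Employee> unique = employees.stream().collect(
                Collectors.toCollection(
                        () -> new TreeSet<>(Comparator.comparingInt(Employee::getId))));
        // Copy to a List; the elements come out sorted by id, not in input order.
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
                new Employee(3, "Cy"), new Employee(1, "Ann"),
                new Employee(3, "Cy again"), new Employee(2, "Bob"));
        distinctSortedById(employees)
                .forEach(e -> System.out.println(e.getId() + " " + e.getName()));
    }
}
```

Because the set compares by id only, this trades the original order for id order, which may or may not be what you want.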
Try this code:
// put() overwrites, so the last employee seen for each id is the one that is kept
Collection<Employee> nonDuplicatedEmployees = employees.stream()
        .<Map<Integer, Employee>>collect(HashMap::new, (m, e) -> m.put(e.getId(), e), Map::putAll)
        .values();
This worked for me:
list.stream().distinct().collect(Collectors.toList());
You need to implement equals (and hashCode, consistently with it), of course.
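For illustration, here is a minimal sketch of what that could look like if two employees count as equal whenever their ids match. The Employee class shown is a hypothetical stand-in, not the asker's actual class, and EqualsDemo is an illustrative name:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical Employee whose identity is defined by its id alone.
class Employee {
    private final int id;
    private final String name;
    Employee(int id, String name) { this.id = id; this.name = name; }
    int getId() { return id; }
    String getName() { return name; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Employee)) return false;
        return id == ((Employee) o).id;
    }

    // Must agree with equals: equal employees produce equal hash codes.
    @Override
    public int hashCode() {
        return Integer.hashCode(id);
    }
}

class EqualsDemo {
    // distinct() relies on equals/hashCode; on an ordered stream the first occurrence is kept.
    static List<Employee> distinctEmployees(List<Employee> employees) {
        return employees.stream().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
                new Employee(1, "Ann"), new Employee(2, "Bob"), new Employee(1, "Ann again"));
        distinctEmployees(employees)
                .forEach(e -> System.out.println(e.getId() + " " + e.getName()));
    }
}
```

Keep in mind that this changes equality for Employee everywhere, not just for this one deduplication.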
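There are a lot of good answers here, but I didn't find one that uses the reduce method. For your case, you can apply it in the following way (on a sequential stream only, because the accumulator mutates the identity list, and note that the noneMatch scan makes it quadratic in the list size):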
List<Employee> employeeList = employees.stream()
        .reduce(new ArrayList<>(), (List<Employee> accumulator, Employee employee) ->
        {
            if (accumulator.stream().noneMatch(emp -> emp.getId().equals(employee.getId())))
            {
                accumulator.add(employee);
            }
            return accumulator;
        }, (acc1, acc2) ->
        {
            acc1.addAll(acc2);
            return acc1;
        });
Source: Stackoverflow.com