[java] Is there a no-duplicate List implementation out there?

I know about SortedSet, but in my case I need something that implements List, and not Set. So is there an implementation out there, in the API or elsewhere?

It shouldn't be hard to implement myself, but I figured why not ask people here first?

This question is related to java list collections duplicates

The answer is


Why not encapsulate a set with a list, sort like:

new ArrayList( new LinkedHashSet() )

This leaves the other implementation for someone who is a real master of Collections ;-)


So here's what I did eventually. I hope this helps someone else.

class NoDuplicatesList<E> extends LinkedList<E> {
    @Override
    public boolean add(E e) {
        if (this.contains(e)) {
            return false;
        }
        else {
            return super.add(e);
        }
    }

    @Override
    public boolean addAll(Collection<? extends E> collection) {
        Collection<E> copy = new LinkedList<E>(collection);
        copy.removeAll(this);
        return super.addAll(copy);
    }

    @Override
    public boolean addAll(int index, Collection<? extends E> collection) {
        Collection<E> copy = new LinkedList<E>(collection);
        copy.removeAll(this);
        return super.addAll(index, copy);
    }

    @Override
    public void add(int index, E element) {
        if (this.contains(element)) {
            return;
        }
        else {
            super.add(index, element);
        }
    }
}   

I just made my own UniqueList in my own little library like this:

package com.bprog.collections;//my own little set of useful utilities and classes

import java.util.HashSet;
import java.util.ArrayList;
import java.util.List;
/**
*
* @author Jonathan
*/
public class UniqueList {

private HashSet masterSet = new HashSet();
private ArrayList growableUniques;
private Object[] returnable;

public UniqueList() {
    growableUniques = new ArrayList();
}

public UniqueList(int size) {
    growableUniques = new ArrayList(size);
}

public void add(Object thing) {
    if (!masterSet.contains(thing)) {
        masterSet.add(thing);
        growableUniques.add(thing);
    }
}

/**
 * Casts to an ArrayList of unique values
 * @return 
 */
public List getList(){
    return growableUniques;
}

public Object get(int index) {
    return growableUniques.get(index);
}

public Object[] toObjectArray() {
    int size = growableUniques.size();
    returnable = new Object[size];
    for (int i = 0; i < size; i++) {
        returnable[i] = growableUniques.get(i);
    }
    return returnable;
    }
}

I have a TestCollections class that looks like this:

package com.bprog.collections;
import com.bprog.out.Out;
/**
*
* @author Jonathan
*/
public class TestCollections {
    public static void main(String[] args){
        UniqueList ul = new UniqueList();
        ul.add("Test");
        ul.add("Test");
        ul.add("Not a copy");
        ul.add("Test"); 
        //should only contain two things
        Object[] content = ul.toObjectArray();
        Out.pl("Array Content",content);
    }
}

Works fine. All it does is it adds to a set if it does not have it already and there's an Arraylist that is returnable, as well as an object array.


The documentation for collection interfaces says:

Set — a collection that cannot contain duplicate elements.
List — an ordered collection (sometimes called a sequence). Lists can contain duplicate elements.

So if you don't want duplicates, you probably shouldn't use a list.


So here's what I did eventually. I hope this helps someone else.

class NoDuplicatesList<E> extends LinkedList<E> {
    @Override
    public boolean add(E e) {
        if (this.contains(e)) {
            return false;
        }
        else {
            return super.add(e);
        }
    }

    @Override
    public boolean addAll(Collection<? extends E> collection) {
        Collection<E> copy = new LinkedList<E>(collection);
        copy.removeAll(this);
        return super.addAll(copy);
    }

    @Override
    public boolean addAll(int index, Collection<? extends E> collection) {
        Collection<E> copy = new LinkedList<E>(collection);
        copy.removeAll(this);
        return super.addAll(index, copy);
    }

    @Override
    public void add(int index, E element) {
        if (this.contains(element)) {
            return;
        }
        else {
            super.add(index, element);
        }
    }
}   

NOTE: it does not take subList implementation into account.

import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

public class UniqueList<T> extends ArrayList<T> {

    private static final long serialVersionUID = 1L;

    /** Unique elements SET */
    private final Set<T> set=new HashSet();

    /** Used by addAll methods */
    private Collection<T> addUnique(Collection<? extends T> col) {
        Collection<T> unique=new ArrayList();
        for(T e: col){
            if (set.add(e)) unique.add(e);
        }
        return unique;
    }

    @Override
    public boolean add(T e) {
        return set.add(e) ? super.add(e) : false;
    }

    @Override
    public boolean addAll(Collection<? extends T> col) {
        return super.addAll(addUnique(col));
    }

    @Override
    public void add(int index, T e) {
        if (set.add(e)) super.add(index, e);
    }

    @Override
    public boolean addAll(int index, Collection<? extends T> col) {
        return super.addAll(index, addUnique(col));
    }

}

You should seriously consider dhiller's answer:

  1. Instead of worrying about adding your objects to a duplicate-less List, add them to a Set (any implementation), which will by nature filter out the duplicates.
  2. When you need to call the method that requires a List, wrap it in a new ArrayList(set) (or a new LinkedList(set), whatever).

I think that the solution you posted with the NoDuplicatesList has some issues, mostly with the contains() method, plus your class does not handle checking for duplicates in the Collection passed to your addAll() method.


in add method, why not using HashSet.add() to check duplicates instead of HashSet.consist(). HashSet.add() will return true if no duplicate and false otherwise.


You should seriously consider dhiller's answer:

  1. Instead of worrying about adding your objects to a duplicate-less List, add them to a Set (any implementation), which will by nature filter out the duplicates.
  2. When you need to call the method that requires a List, wrap it in a new ArrayList(set) (or a new LinkedList(set), whatever).

I think that the solution you posted with the NoDuplicatesList has some issues, mostly with the contains() method, plus your class does not handle checking for duplicates in the Collection passed to your addAll() method.


Off the top of my head, lists allow duplicates. You could quickly implement a UniqueArrayList and override all the add / insert functions to check for contains() before you call the inherited methods. For personal use, you could only implement the add method you use, and override the others to throw an exception in case future programmers try to use the list in a different manner.


Here is what I did and it works.

Assuming I have an ArrayList to work with the first thing I did was created a new LinkedHashMap.

LinkedHashSet<E> hashSet = new LinkedHashSet<E>()

Then I attempt to add my new element to the LinkedHashSet. The add method does not alter the LinkedHasSet and returns false if the new element is a duplicate. So this becomes a condition I can test before adding to the ArrayList.

if (hashSet.add(E)) arrayList.add(E);

This is a simple and elegant way to prevent duplicates from being added to an array list. If you want you can encapsulate it in and override of the add method in a class that extends the ArrayList. Just remember to deal with addAll by looping through the elements and calling the add method.


The documentation for collection interfaces says:

Set — a collection that cannot contain duplicate elements.
List — an ordered collection (sometimes called a sequence). Lists can contain duplicate elements.

So if you don't want duplicates, you probably shouldn't use a list.


You should seriously consider dhiller's answer:

  1. Instead of worrying about adding your objects to a duplicate-less List, add them to a Set (any implementation), which will by nature filter out the duplicates.
  2. When you need to call the method that requires a List, wrap it in a new ArrayList(set) (or a new LinkedList(set), whatever).

I think that the solution you posted with the NoDuplicatesList has some issues, mostly with the contains() method, plus your class does not handle checking for duplicates in the Collection passed to your addAll() method.


The documentation for collection interfaces says:

Set — a collection that cannot contain duplicate elements.
List — an ordered collection (sometimes called a sequence). Lists can contain duplicate elements.

So if you don't want duplicates, you probably shouldn't use a list.


I just made my own UniqueList in my own little library like this:

package com.bprog.collections;//my own little set of useful utilities and classes

import java.util.HashSet;
import java.util.ArrayList;
import java.util.List;
/**
*
* @author Jonathan
*/
public class UniqueList {

private HashSet masterSet = new HashSet();
private ArrayList growableUniques;
private Object[] returnable;

public UniqueList() {
    growableUniques = new ArrayList();
}

public UniqueList(int size) {
    growableUniques = new ArrayList(size);
}

public void add(Object thing) {
    if (!masterSet.contains(thing)) {
        masterSet.add(thing);
        growableUniques.add(thing);
    }
}

/**
 * Casts to an ArrayList of unique values
 * @return 
 */
public List getList(){
    return growableUniques;
}

public Object get(int index) {
    return growableUniques.get(index);
}

public Object[] toObjectArray() {
    int size = growableUniques.size();
    returnable = new Object[size];
    for (int i = 0; i < size; i++) {
        returnable[i] = growableUniques.get(i);
    }
    return returnable;
    }
}

I have a TestCollections class that looks like this:

package com.bprog.collections;
import com.bprog.out.Out;
/**
*
* @author Jonathan
*/
public class TestCollections {
    public static void main(String[] args){
        UniqueList ul = new UniqueList();
        ul.add("Test");
        ul.add("Test");
        ul.add("Not a copy");
        ul.add("Test"); 
        //should only contain two things
        Object[] content = ul.toObjectArray();
        Out.pl("Array Content",content);
    }
}

Works fine. All it does is it adds to a set if it does not have it already and there's an Arraylist that is returnable, as well as an object array.


So here's what I did eventually. I hope this helps someone else.

class NoDuplicatesList<E> extends LinkedList<E> {
    @Override
    public boolean add(E e) {
        if (this.contains(e)) {
            return false;
        }
        else {
            return super.add(e);
        }
    }

    @Override
    public boolean addAll(Collection<? extends E> collection) {
        Collection<E> copy = new LinkedList<E>(collection);
        copy.removeAll(this);
        return super.addAll(copy);
    }

    @Override
    public boolean addAll(int index, Collection<? extends E> collection) {
        Collection<E> copy = new LinkedList<E>(collection);
        copy.removeAll(this);
        return super.addAll(index, copy);
    }

    @Override
    public void add(int index, E element) {
        if (this.contains(element)) {
            return;
        }
        else {
            super.add(index, element);
        }
    }
}   

I needed something like that, so I went to the commons collections and used the SetUniqueList, but when I ran some performance test, I found that it seems not optimized comparing to the case if I want to use a Set and obtain an Array using the Set.toArray() method.

The SetUniqueTest took 20:1 time to fill and then traverse 100,000 Strings comparing to the other implementation, which is a big deal difference.

So, if you worry about the performance, I recommend you to use the Set and Get an Array instead of using the SetUniqueList, unless you really need the logic of the SetUniqueList, then you'll need to check other solutions...

Testing code main method:

public static void main(String[] args) {


SetUniqueList pq = SetUniqueList.decorate(new ArrayList());
Set s = new TreeSet();

long t1 = 0L;
long t2 = 0L;
String t;


t1 = System.nanoTime();
for (int i = 0; i < 200000; i++) {
    pq.add("a" + Math.random());
}
while (!pq.isEmpty()) {
    t = (String) pq.remove(0);
}
t1 = System.nanoTime() - t1;

t2 = System.nanoTime();
for (int i = 0; i < 200000; i++) {
    s.add("a" + Math.random());
}

s.clear();
String[] d = (String[]) s.toArray(new String[0]);
s.clear();
for (int i = 0; i < d.length; i++) {
    t = d[i];

}
t2 = System.nanoTime() - t2;

System.out.println((double)t1/1000/1000/1000); //seconds
System.out.println((double)t2/1000/1000/1000); //seconds
System.out.println(((double) t1) / t2);        //comparing results

}

Regards, Mohammed Sleem


Why not encapsulate a set with a list, sort like:

new ArrayList( new LinkedHashSet() )

This leaves the other implementation for someone who is a real master of Collections ;-)


Off the top of my head, lists allow duplicates. You could quickly implement a UniqueArrayList and override all the add / insert functions to check for contains() before you call the inherited methods. For personal use, you could only implement the add method you use, and override the others to throw an exception in case future programmers try to use the list in a different manner.


Why not encapsulate a set with a list, sort like:

new ArrayList( new LinkedHashSet() )

This leaves the other implementation for someone who is a real master of Collections ;-)


I needed something like that, so I went to the commons collections and used the SetUniqueList, but when I ran some performance test, I found that it seems not optimized comparing to the case if I want to use a Set and obtain an Array using the Set.toArray() method.

The SetUniqueTest took 20:1 time to fill and then traverse 100,000 Strings comparing to the other implementation, which is a big deal difference.

So, if you worry about the performance, I recommend you to use the Set and Get an Array instead of using the SetUniqueList, unless you really need the logic of the SetUniqueList, then you'll need to check other solutions...

Testing code main method:

public static void main(String[] args) {


SetUniqueList pq = SetUniqueList.decorate(new ArrayList());
Set s = new TreeSet();

long t1 = 0L;
long t2 = 0L;
String t;


t1 = System.nanoTime();
for (int i = 0; i < 200000; i++) {
    pq.add("a" + Math.random());
}
while (!pq.isEmpty()) {
    t = (String) pq.remove(0);
}
t1 = System.nanoTime() - t1;

t2 = System.nanoTime();
for (int i = 0; i < 200000; i++) {
    s.add("a" + Math.random());
}

s.clear();
String[] d = (String[]) s.toArray(new String[0]);
s.clear();
for (int i = 0; i < d.length; i++) {
    t = d[i];

}
t2 = System.nanoTime() - t2;

System.out.println((double)t1/1000/1000/1000); //seconds
System.out.println((double)t2/1000/1000/1000); //seconds
System.out.println(((double) t1) / t2);        //comparing results

}

Regards, Mohammed Sleem


NOTE: it does not take subList implementation into account.

import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

public class UniqueList<T> extends ArrayList<T> {

    private static final long serialVersionUID = 1L;

    /** Unique elements SET */
    private final Set<T> set=new HashSet();

    /** Used by addAll methods */
    private Collection<T> addUnique(Collection<? extends T> col) {
        Collection<T> unique=new ArrayList();
        for(T e: col){
            if (set.add(e)) unique.add(e);
        }
        return unique;
    }

    @Override
    public boolean add(T e) {
        return set.add(e) ? super.add(e) : false;
    }

    @Override
    public boolean addAll(Collection<? extends T> col) {
        return super.addAll(addUnique(col));
    }

    @Override
    public void add(int index, T e) {
        if (set.add(e)) super.add(index, e);
    }

    @Override
    public boolean addAll(int index, Collection<? extends T> col) {
        return super.addAll(index, addUnique(col));
    }

}

So here's what I did eventually. I hope this helps someone else.

class NoDuplicatesList<E> extends LinkedList<E> {
    @Override
    public boolean add(E e) {
        if (this.contains(e)) {
            return false;
        }
        else {
            return super.add(e);
        }
    }

    @Override
    public boolean addAll(Collection<? extends E> collection) {
        Collection<E> copy = new LinkedList<E>(collection);
        copy.removeAll(this);
        return super.addAll(copy);
    }

    @Override
    public boolean addAll(int index, Collection<? extends E> collection) {
        Collection<E> copy = new LinkedList<E>(collection);
        copy.removeAll(this);
        return super.addAll(index, copy);
    }

    @Override
    public void add(int index, E element) {
        if (this.contains(element)) {
            return;
        }
        else {
            super.add(index, element);
        }
    }
}   

Why not encapsulate a set with a list, sort like:

new ArrayList( new LinkedHashSet() )

This leaves the other implementation for someone who is a real master of Collections ;-)


You should seriously consider dhiller's answer:

  1. Instead of worrying about adding your objects to a duplicate-less List, add them to a Set (any implementation), which will by nature filter out the duplicates.
  2. When you need to call the method that requires a List, wrap it in a new ArrayList(set) (or a new LinkedList(set), whatever).

I think that the solution you posted with the NoDuplicatesList has some issues, mostly with the contains() method, plus your class does not handle checking for duplicates in the Collection passed to your addAll() method.


The documentation for collection interfaces says:

Set — a collection that cannot contain duplicate elements.
List — an ordered collection (sometimes called a sequence). Lists can contain duplicate elements.

So if you don't want duplicates, you probably shouldn't use a list.


in add method, why not using HashSet.add() to check duplicates instead of HashSet.consist(). HashSet.add() will return true if no duplicate and false otherwise.


Off the top of my head, lists allow duplicates. You could quickly implement a UniqueArrayList and override all the add / insert functions to check for contains() before you call the inherited methods. For personal use, you could only implement the add method you use, and override the others to throw an exception in case future programmers try to use the list in a different manner.


Examples related to java

Under what circumstances can I call findViewById with an Options Menu / Action Bar item? How much should a function trust another function How to implement a simple scenario the OO way Two constructors How do I get some variable from another class in Java? this in equals method How to split a string in two and store it in a field How to do perspective fixing? String index out of range: 4 My eclipse won't open, i download the bundle pack it keeps saying error log

Examples related to list

Convert List to Pandas Dataframe Column Python find elements in one list that are not in the other Sorting a list with stream.sorted() in Java Python Loop: List Index Out of Range How to combine two lists in R How do I multiply each element in a list by a number? Save a list to a .txt file The most efficient way to remove first N elements in a list? TypeError: list indices must be integers or slices, not str Parse JSON String into List<string>

Examples related to collections

Kotlin's List missing "add", "remove", Map missing "put", etc? How to unset (remove) a collection element after fetching it? How can I get a List from some class properties with Java 8 Stream? Java 8 stream map to list of keys sorted by values How to convert String into Hashmap in java How can I turn a List of Lists into a List in Java 8? MongoDB Show all contents from all collections Get nth character of a string in Swift programming language Java 8 Distinct by property Is there a typescript List<> and/or Map<> class/library?

Examples related to duplicates

Remove duplicates from dataframe, based on two columns A,B, keeping row with max value in another column C Remove duplicates from a dataframe in PySpark How to "select distinct" across multiple data frame columns in pandas? How to find duplicate records in PostgreSQL Drop all duplicate rows across multiple columns in Python Pandas Left Join without duplicate rows from left table Finding duplicate integers in an array and display how many times they occurred How do I use SELECT GROUP BY in DataTable.Select(Expression)? How to delete duplicate rows in SQL Server? Python copy files to a new directory and rename if file name already exists