[java] A good Sorted List for Java

I'm looking for a good sorted list for java. Googling around give me some hints about using TreeSet/TreeMap. But these components is lack of one thing: random access to an element in the set. For example, I want to access nth element in the sorted set, but with TreeSet, I must iterate over other n-1 elements before I can get there. It would be a waste since I would have upto several thousands elements in my Set.

Basically, I'm looking for some thing similar to a sorted list in .NET, with ability to add element fast, remove element fast, and have random access to any element in the list.

Has this kind of sorted list implemented somewhere? Thanks.

Edited

My interest in SortedList grows out of this problems: I need to maintains a list of many thousands object (and can grow up to many hundred of thousands). These objects will be persisted to database. I want to randomly select few dozens of element from the whole list. So, I tried to maintain a separated on-memory list that contains the primary keys (Long numbers) of all objects. I need to add/remove keys from the list when object is added / removed from database. I'm using an ArrayList right now but I'm afraid ArrayList would not suit it when the number of records grows. (Imagine you have to iterate over several hundred thousands of elements every time an object is removed from database). Back to the time when I did .NET programming, then I would use a sorted List (List is a .NET class that once Sorted property set to true, will maintain order of its element, and provide binary search that help remove/insert element very quick). I'm hoping that I can find some thing similar from java BCL but unluckily, I didn't find a good match.

This question is related to java sorting

The answer is


SortedList decorator from Java Happy Libraries can be used to decorate TreeList from Apache Collections. That would produce a new list which performance is compareable to TreeSet. https://sourceforge.net/p/happy-guys/wiki/Sorted%20List/


GlazedLists has a very, very good sorted list implementation


Depending on how you're using the list, it may be worth it to use a TreeSet and then use the toArray() method at the end. I had a case where I needed a sorted list, and I found that the TreeSet + toArray() was much faster than adding to an array and merge sorting at the end.


This is the SortedList implementation I am using. Maybe this helps with your problem:

import java.util.Collection;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedList;
/**
 * This class is a List implementation which sorts the elements using the
 * comparator specified when constructing a new instance.
 * 
 * @param <T>
 */
public class SortedList<T> extends ArrayList<T> {
    /**
     * Needed for serialization.
     */
    private static final long serialVersionUID = 1L;
    /**
     * Comparator used to sort the list.
     */
    private Comparator<? super T> comparator = null;
    /**
     * Construct a new instance with the list elements sorted in their
     * {@link java.lang.Comparable} natural ordering.
     */
    public SortedList() {
    }
    /**
     * Construct a new instance using the given comparator.
     * 
     * @param comparator
     */
    public SortedList(Comparator<? super T> comparator) {
        this.comparator = comparator;
    }
    /**
     * Construct a new instance containing the elements of the specified
     * collection with the list elements sorted in their
     * {@link java.lang.Comparable} natural ordering.
     * 
     * @param collection
     */
    public SortedList(Collection<? extends T> collection) {
        addAll(collection);
    }
    /**
     * Construct a new instance containing the elements of the specified
     * collection with the list elements sorted using the given comparator.
     * 
     * @param collection
     * @param comparator
     */
    public SortedList(Collection<? extends T> collection, Comparator<? super T> comparator) {
        this(comparator);
        addAll(collection);
    }
    /**
     * Add a new entry to the list. The insertion point is calculated using the
     * comparator.
     * 
     * @param paramT
     * @return <code>true</code> if this collection changed as a result of the call.
     */
    @Override
    public boolean add(T paramT) {
        int initialSize = this.size();
        // Retrieves the position of an existing, equal element or the 
        // insertion position for new elements (negative).
        int insertionPoint = Collections.binarySearch(this, paramT, comparator);
        super.add((insertionPoint > -1) ? insertionPoint : (-insertionPoint) - 1, paramT);
        return (this.size() != initialSize);
    }
    /**
     * Adds all elements in the specified collection to the list. Each element
     * will be inserted at the correct position to keep the list sorted.
     * 
     * @param paramCollection
     * @return <code>true</code> if this collection changed as a result of the call.
     */
    @Override
    public boolean addAll(Collection<? extends T> paramCollection) {
        boolean result = false;
        if (paramCollection.size() > 4) {
            result = super.addAll(paramCollection);
            Collections.sort(this, comparator);
        }
        else {
            for (T paramT:paramCollection) {
                result |= add(paramT);
            }
        }
        return result;
    }
    /**
     * Check, if this list contains the given Element. This is faster than the
     * {@link #contains(Object)} method, since it is based on binary search.
     * 
     * @param paramT
     * @return <code>true</code>, if the element is contained in this list;
     * <code>false</code>, otherwise.
     */
    public boolean containsElement(T paramT) {
        return (Collections.binarySearch(this, paramT, comparator) > -1);
    }
    /**
     * @return The comparator used for sorting this list.
     */
    public Comparator<? super T> getComparator() {
        return comparator;
    }
    /**
     * Assign a new comparator and sort the list using this new comparator.
     * 
     * @param comparator
     */
    public void setComparator(Comparator<? super T> comparator) {
        this.comparator = comparator;
        Collections.sort(this, comparator);
    }
}

This solution is very flexible and uses existing Java functions:

  • Completely based on generics
  • Uses java.util.Collections for finding and inserting list elements
  • Option to use a custom Comparator for list sorting

Some notes:

  • This sorted list is not synchronized since it inherits from java.util.ArrayList. Use Collections.synchronizedList if you need this (refer to the Java documentation for java.util.ArrayList for details).
  • The initial solution was based on java.util.LinkedList. For better performance, specifically for finding the insertion point (Logan's comment) and quicker get operations (https://dzone.com/articles/arraylist-vs-linkedlist-vs), this has been changed to java.util.ArrayList.

Phuong:

Sorting 40,000 random numbers:

0.022 seconds

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;


public class test
{
    public static void main(String[] args)
    {
        List<Integer> nums = new ArrayList<Integer>();
        Random rand = new Random();
        for( int i = 0; i < 40000; i++ )
        {
            nums.add( rand.nextInt(Integer.MAX_VALUE) );
        }

        long start = System.nanoTime();
        Collections.sort(nums);
        long end = System.nanoTime();

        System.out.println((end-start)/1e9);
    }
}   

Since you rarely need sorting, as per your problem statement, this is probably more efficient than it needs to be.


To test the efficiancy of earlier awnser by Konrad Holl, I did a quick comparison with what I thought would be the slow way of doing it:

package util.collections;

import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.ListIterator;

/**
 *
 * @author Earl Bosch
 * @param <E> Comparable Element
 *
 */
public class SortedList<E extends Comparable> implements List<E> {

    /**
     * The list of elements
     */
    private final List<E> list = new ArrayList();

    public E first() {
        return list.get(0);
    }

    public E last() {
        return list.get(list.size() - 1);
    }

    public E mid() {
        return list.get(list.size() >>> 1);
    }

    @Override
    public void clear() {
        list.clear();
    }

    @Override
    public boolean add(E e) {
        list.add(e);
        Collections.sort(list);
        return true;
    }

    @Override
    public int size() {
        return list.size();
    }

    @Override
    public boolean isEmpty() {
        return list.isEmpty();
    }

    @Override
    public boolean contains(Object obj) {
        return list.contains((E) obj);
    }

    @Override
    public Iterator<E> iterator() {
        return list.iterator();
    }

    @Override
    public Object[] toArray() {
        return list.toArray();
    }

    @Override
    public <T> T[] toArray(T[] arg0) {
        return list.toArray(arg0);
    }

    @Override
    public boolean remove(Object obj) {
        return list.remove((E) obj);
    }

    @Override
    public boolean containsAll(Collection<?> c) {
        return list.containsAll(c);
    }

    @Override
    public boolean addAll(Collection<? extends E> c) {

        list.addAll(c);
        Collections.sort(list);
        return true;
    }

    @Override
    public boolean addAll(int index, Collection<? extends E> c) {
        throw new UnsupportedOperationException("Not supported.");
    }

    @Override
    public boolean removeAll(Collection<?> c) {
        return list.removeAll(c);
    }

    @Override
    public boolean retainAll(Collection<?> c) {
        return list.retainAll(c);
    }

    @Override
    public E get(int index) {
        return list.get(index);
    }

    @Override
    public E set(int index, E element) {
        throw new UnsupportedOperationException("Not supported.");
    }

    @Override
    public void add(int index, E element) {
        throw new UnsupportedOperationException("Not supported.");
    }

    @Override
    public E remove(int index) {
        return list.remove(index);
    }

    @Override
    public int indexOf(Object obj) {
        return list.indexOf((E) obj);
    }

    @Override
    public int lastIndexOf(Object obj) {
        return list.lastIndexOf((E) obj);
    }

    @Override
    public ListIterator<E> listIterator() {
        return list.listIterator();
    }

    @Override
    public ListIterator<E> listIterator(int index) {
        return list.listIterator(index);
    }

    @Override
    public List<E> subList(int fromIndex, int toIndex) {
        throw new UnsupportedOperationException("Not supported.");
    }

}

Turns out its about twice as fast! I think its because of SortedLinkList slow get - which make's it not a good choice for a list.

Compared times for same random list:

  • SortedLinkList : 15731.460
  • SortedList : 6895.494
  • ca.odell.glazedlists.SortedList : 712.460
  • org.apache.commons.collections4.TreeList : 3226.546

Seems glazedlists.SortedList is really fast...


What about using a HashMap? Insertion, deletion, and retrieval are all O(1) operations. If you wanted to sort everything, you could grab a List of the values in the Map and run them through an O(n log n) sorting algorithm.

edit

A quick search has found LinkedHashMap, which maintains insertion order of your keys. It's not an exact solution, but it's pretty close.


You need no sorted list. You need no sorting at all.

I need to add/remove keys from the list when object is added / removed from database.

But not immediately, the removal can wait. Use an ArrayList containing the ID's all alive objects plus at most some bounded percentage of deleted objects. Use a separate HashSet to keep track of deleted objects.

private List<ID> mostlyAliveIds = new ArrayList<>();
private Set<ID> deletedIds = new HashSet<>();

I want to randomly select few dozens of element from the whole list.

ID selectOne(Random random) {
    checkState(deletedIds.size() < mostlyAliveIds.size());
    while (true) {
        int index = random.nextInt(mostlyAliveIds.size());
        ID id = mostlyAliveIds.get(index);
        if (!deletedIds.contains(ID)) return ID;
    }
}

Set<ID> selectSome(Random random, int count) {
    checkArgument(deletedIds.size() <= mostlyAliveIds.size() - count);
    Set<ID> result = new HashSet<>();
    while (result.size() < count) result.add(selectOne(random));
}

For maintaining the data, do something like

void insert(ID id) {
    if (!deletedIds.remove(id)) mostlyAliveIds.add(ID);
} 

void delete(ID id) {
    if (!deletedIds.add(id)) {
         throw new ImpossibleException("Deleting a deleted element);
    }
    if (deletedIds.size() > 0.1 * mostlyAliveIds.size()) {
        mostlyAliveIds.removeAll(deletedIds);
        deletedIds.clear();
    }
}

The only tricky part is the insert which has to check if an already deleted ID was resurrected.

The delete ensures that no more than 10% of elements in mostlyAliveIds are deleted IDs. When this happens, they get all removed in one sweep (I didn't check the JDK sources, but I hope, they do it right) and the show goes on.

With no more than 10% of dead IDs, the overhead of selectOne is no more than 10% on the average.

I'm pretty sure that it's faster than any sorting as the amortized complexity is O(n).


Generally you can't have constant time look up and log time deletions/insertions, but if you're happy with log time look ups then you can use a SortedList.

Not sure if you'll trust my coding but I recently wrote a SortedList implementation in Java, which you can download from http://www.scottlogic.co.uk/2010/12/sorted_lists_in_java/. This implementation allows you to look up the i-th element of the list in log time.