[java] Java 8 Streams: multiple filters vs. complex condition

Sometimes you want to filter a Stream with more than one condition:

myList.stream().filter(x -> x.size() > 10).filter(x -> x.isCool()) ...

or you could do the same with a complex condition and a single filter:

myList.stream().filter(x -> x.size() > 10 && x -> x.isCool()) ...

My guess is that the second approach has better performance characteristics, but I don't know it.

The first approach wins in readability, but what is better for the performance?

This question is related to java lambda filter java-8 java-stream

The answer is


The code that has to be executed for both alternatives is so similar that you can’t predict a result reliably. The underlying object structure might differ but that’s no challenge to the hotspot optimizer. So it depends on other surrounding conditions which will yield to a faster execution, if there is any difference.

Combining two filter instances creates more objects and hence more delegating code but this can change if you use method references rather than lambda expressions, e.g. replace filter(x -> x.isCool()) by filter(ItemType::isCool). That way you have eliminated the synthetic delegating method created for your lambda expression. So combining two filters using two method references might create the same or lesser delegation code than a single filter invocation using a lambda expression with &&.

But, as said, this kind of overhead will be eliminated by the HotSpot optimizer and is negligible.

In theory, two filters could be easier parallelized than a single filter but that’s only relevant for rather computational intense tasks¹.

So there is no simple answer.

The bottom line is, don’t think about such performance differences below the odor detection threshold. Use what is more readable.


¹…and would require an implementation doing parallel processing of subsequent stages, a road currently not taken by the standard Stream implementation


This is the result of the 6 different combinations of the sample test shared by @Hank D It's evident that predicate of form u -> exp1 && exp2 is highly performant in all the cases.

one filter with predicate of form u -> exp1 && exp2, list size 10000000, averaged over 100 runs: LongSummaryStatistics{count=100, sum=3372, min=31, average=33.720000, max=47}
two filters with predicates of form u -> exp1, list size 10000000, averaged over 100 runs: LongSummaryStatistics{count=100, sum=9150, min=85, average=91.500000, max=118}
one filter with predicate of form predOne.and(pred2), list size 10000000, averaged over 100 runs: LongSummaryStatistics{count=100, sum=9046, min=81, average=90.460000, max=150}

one filter with predicate of form u -> exp1 && exp2, list size 10000000, averaged over 100 runs: LongSummaryStatistics{count=100, sum=8336, min=77, average=83.360000, max=189}
one filter with predicate of form predOne.and(pred2), list size 10000000, averaged over 100 runs: LongSummaryStatistics{count=100, sum=9094, min=84, average=90.940000, max=176}
two filters with predicates of form u -> exp1, list size 10000000, averaged over 100 runs: LongSummaryStatistics{count=100, sum=10501, min=99, average=105.010000, max=136}

two filters with predicates of form u -> exp1, list size 10000000, averaged over 100 runs: LongSummaryStatistics{count=100, sum=11117, min=98, average=111.170000, max=238}
one filter with predicate of form u -> exp1 && exp2, list size 10000000, averaged over 100 runs: LongSummaryStatistics{count=100, sum=8346, min=77, average=83.460000, max=113}
one filter with predicate of form predOne.and(pred2), list size 10000000, averaged over 100 runs: LongSummaryStatistics{count=100, sum=9089, min=81, average=90.890000, max=137}

two filters with predicates of form u -> exp1, list size 10000000, averaged over 100 runs: LongSummaryStatistics{count=100, sum=10434, min=98, average=104.340000, max=132}
one filter with predicate of form predOne.and(pred2), list size 10000000, averaged over 100 runs: LongSummaryStatistics{count=100, sum=9113, min=81, average=91.130000, max=179}
one filter with predicate of form u -> exp1 && exp2, list size 10000000, averaged over 100 runs: LongSummaryStatistics{count=100, sum=8258, min=77, average=82.580000, max=100}

one filter with predicate of form predOne.and(pred2), list size 10000000, averaged over 100 runs: LongSummaryStatistics{count=100, sum=9131, min=81, average=91.310000, max=139}
two filters with predicates of form u -> exp1, list size 10000000, averaged over 100 runs: LongSummaryStatistics{count=100, sum=10265, min=97, average=102.650000, max=131}
one filter with predicate of form u -> exp1 && exp2, list size 10000000, averaged over 100 runs: LongSummaryStatistics{count=100, sum=8442, min=77, average=84.420000, max=156}

one filter with predicate of form predOne.and(pred2), list size 10000000, averaged over 100 runs: LongSummaryStatistics{count=100, sum=8553, min=81, average=85.530000, max=125}
one filter with predicate of form u -> exp1 && exp2, list size 10000000, averaged over 100 runs: LongSummaryStatistics{count=100, sum=8219, min=77, average=82.190000, max=142}
two filters with predicates of form u -> exp1, list size 10000000, averaged over 100 runs: LongSummaryStatistics{count=100, sum=10305, min=97, average=103.050000, max=132}

This test shows that your second option can perform significantly better. Findings first, then the code:

one filter with predicate of form u -> exp1 && exp2, list size 10000000, averaged over 100 runs: LongSummaryStatistics{count=100, sum=4142, min=29, average=41.420000, max=82}
two filters with predicates of form u -> exp1, list size 10000000, averaged over 100 runs: LongSummaryStatistics{count=100, sum=13315, min=117, average=133.150000, max=153}
one filter with predicate of form predOne.and(pred2), list size 10000000, averaged over 100 runs: LongSummaryStatistics{count=100, sum=10320, min=82, average=103.200000, max=127}

now the code:

enum Gender {
    FEMALE,
    MALE
}

static class User {
    Gender gender;
    int age;

    public User(Gender gender, int age){
        this.gender = gender;
        this.age = age;
    }

    public Gender getGender() {
        return gender;
    }

    public void setGender(Gender gender) {
        this.gender = gender;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }
}

static long test1(List<User> users){
    long time1 = System.currentTimeMillis();
    users.stream()
            .filter((u) -> u.getGender() == Gender.FEMALE && u.getAge() % 2 == 0)
            .allMatch(u -> true);                   // least overhead terminal function I can think of
    long time2 = System.currentTimeMillis();
    return time2 - time1;
}

static long test2(List<User> users){
    long time1 = System.currentTimeMillis();
    users.stream()
            .filter(u -> u.getGender() == Gender.FEMALE)
            .filter(u -> u.getAge() % 2 == 0)
            .allMatch(u -> true);                   // least overhead terminal function I can think of
    long time2 = System.currentTimeMillis();
    return time2 - time1;
}

static long test3(List<User> users){
    long time1 = System.currentTimeMillis();
    users.stream()
            .filter(((Predicate<User>) u -> u.getGender() == Gender.FEMALE).and(u -> u.getAge() % 2 == 0))
            .allMatch(u -> true);                   // least overhead terminal function I can think of
    long time2 = System.currentTimeMillis();
    return time2 - time1;
}

public static void main(String... args) {
    int size = 10000000;
    List<User> users =
    IntStream.range(0,size)
            .mapToObj(i -> i % 2 == 0 ? new User(Gender.MALE, i % 100) : new User(Gender.FEMALE, i % 100))
            .collect(Collectors.toCollection(()->new ArrayList<>(size)));
    repeat("one filter with predicate of form u -> exp1 && exp2", users, Temp::test1, 100);
    repeat("two filters with predicates of form u -> exp1", users, Temp::test2, 100);
    repeat("one filter with predicate of form predOne.and(pred2)", users, Temp::test3, 100);
}

private static void repeat(String name, List<User> users, ToLongFunction<List<User>> test, int iterations) {
    System.out.println(name + ", list size " + users.size() + ", averaged over " + iterations + " runs: " + IntStream.range(0, iterations)
            .mapToLong(i -> test.applyAsLong(users))
            .summaryStatistics());
}

A complex filter condition is better in performance perspective, but the best performance will show old fashion for loop with a standard if clause is the best option. The difference on a small array 10 elements difference might ~ 2 times, for a large array the difference is not that big.
You can take a look on my GitHub project, where I did performance tests for multiple array iteration options

For small array 10 element throughput ops/s: 10 element array For medium 10,000 elements throughput ops/s: enter image description here For large array 1,000,000 elements throughput ops/s: 1M elements

NOTE: tests runs on

  • 8 CPU
  • 1 GB RAM
  • OS version: 16.04.1 LTS (Xenial Xerus)
  • java version: 1.8.0_121
  • jvm: -XX:+UseG1GC -server -Xmx1024m -Xms1024m

UPDATE: Java 11 has some progress on the performance, but the dynamics stay the same

Benchmark mode: Throughput, ops/time Java 8vs11


Examples related to java

Under what circumstances can I call findViewById with an Options Menu / Action Bar item? How much should a function trust another function How to implement a simple scenario the OO way Two constructors How do I get some variable from another class in Java? this in equals method How to split a string in two and store it in a field How to do perspective fixing? String index out of range: 4 My eclipse won't open, i download the bundle pack it keeps saying error log

Examples related to lambda

Java 8 optional: ifPresent return object orElseThrow exception How to properly apply a lambda function into a pandas data frame column What are functional interfaces used for in Java 8? Java 8 lambda get and remove element from list Variable used in lambda expression should be final or effectively final Filter values only if not null using lambda in Java8 forEach loop Java 8 for Map entry set Java 8 Lambda Stream forEach with multiple statements Java 8 stream map to list of keys sorted by values Task.Run with Parameter(s)?

Examples related to filter

Monitoring the Full Disclosure mailinglist Pyspark: Filter dataframe based on multiple conditions How Spring Security Filter Chain works Copy filtered data to another sheet using VBA Filter object properties by key in ES6 How do I filter date range in DataTables? How do I filter an array with TypeScript in Angular 2? Filtering array of objects with lodash based on property value How to filter an array from all elements of another array How to specify "does not contain" in dplyr filter

Examples related to java-8

Default interface methods are only supported starting with Android N Class has been compiled by a more recent version of the Java Environment Why is ZoneOffset.UTC != ZoneId.of("UTC")? Modify property value of the objects in list using Java 8 streams How to use if-else logic in Java 8 stream forEach Android Studio Error: Error:CreateProcess error=216, This version of %1 is not compatible with the version of Windows you're running Error:could not create the Java Virtual Machine Error:A fatal exception has occured.Program will exit What are functional interfaces used for in Java 8? java.time.format.DateTimeParseException: Text could not be parsed at index 21 Java 8 lambda get and remove element from list

Examples related to java-stream

Sorting a list with stream.sorted() in Java Modify property value of the objects in list using Java 8 streams How to use if-else logic in Java 8 stream forEach Java 8 lambda get and remove element from list Create list of object from another using Java 8 Streams Java 8 Stream API to find Unique Object matching a property value Reverse a comparator in Java 8 Ignore duplicates when producing map using streams Modifying Objects within stream in Java8 while iterating How can I get a List from some class properties with Java 8 Stream?