Advanced Stream Operations

Java Functional Programming

5.1 Parallel Streams and Performance

Java’s Streams API supports parallel streams, which process data concurrently on multiple CPU cores. The stream’s source is split into chunks, the chunks are processed on separate threads, and the partial results are combined. For CPU-bound work on large datasets this can reduce running time substantially, although the gain depends on the data source, the cost of each operation, and the number of available cores.

How Parallel Streams Work

When you create a parallel stream (via .parallelStream() on a collection or .parallel() on an existing stream), the framework by default runs the work on the common ForkJoinPool, whose worker threads are sized according to the number of available CPU cores. Each thread processes a portion of the data independently, and the results are merged at the end.

This approach contrasts with sequential streams, where processing happens on a single thread, executing operations one element at a time in order.
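
To make the difference visible, the following sketch records which threads handle the elements of a small stream, once sequentially and once in parallel. The class name ParallelThreadsDemo and its helper threadsUsed are illustrative names, not part of the Streams API, and the thread names and counts you see depend on your machine.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

public class ParallelThreadsDemo {

    // Collects the names of the threads that process the stream's elements.
    // (Hypothetical helper written for this illustration.)
    static Set<String> threadsUsed(boolean parallel) {
        Set<String> names = ConcurrentHashMap.newKeySet(); // thread-safe set
        IntStream range = IntStream.rangeClosed(1, 1_000);
        if (parallel) {
            range = range.parallel();
        }
        range.forEach(i -> names.add(Thread.currentThread().getName()));
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Common pool parallelism: "
                + ForkJoinPool.commonPool().getParallelism());
        // A sequential stream runs entirely on the calling thread.
        System.out.println("Sequential threads: " + threadsUsed(false));
        // A parallel stream may also use ForkJoinPool worker threads.
        System.out.println("Parallel threads:   " + threadsUsed(true));
    }
}
```

On a single-core machine the parallel run may report only one thread as well, which is one reason to measure before relying on parallelism.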

Potential Benefits

Common Pitfalls and Considerations
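
Side effects are one such pitfall. The sketch below is a minimal illustration, not a benchmark; the class and method names (SideEffectPitfall, unsafeEvenCount, safeEvenCount) are invented for this example. It contrasts mutating shared state from a parallel forEach with letting the stream compute the result itself.

```java
import java.util.stream.IntStream;

public class SideEffectPitfall {

    // Unsafe: several threads update the same unsynchronized counter,
    // so increments can be lost and the result may differ between runs.
    static int unsafeEvenCount(int n) {
        int[] counter = {0}; // shared, unsynchronized state
        IntStream.range(0, n)
                 .parallel()
                 .forEach(i -> {
                     if (i % 2 == 0) {
                         counter[0]++; // data race
                     }
                 });
        return counter[0];
    }

    // Safe: no shared mutable state; the stream combines partial counts.
    static long safeEvenCount(int n) {
        return IntStream.range(0, n)
                        .parallel()
                        .filter(i -> i % 2 == 0)
                        .count();
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        System.out.println("Unsafe count (may be too low): " + unsafeEvenCount(n));
        System.out.println("Safe count: " + safeEvenCount(n)); // 500000
    }
}
```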

When to Use Parallel Streams

Measuring Performance: Sequential vs. Parallel

Here’s a simple benchmark comparing sequential and parallel streams for summing a large range of numbers:

import java.util.stream.LongStream;

public class ParallelStreamBenchmark {
    public static void main(String[] args) {
        long max = 10_000_000L;

        // Sequential sum. LongStream is used because the total
        // (50,000,005,000,000) overflows int, which IntStream.sum() returns.
        long start = System.nanoTime();
        long seqSum = LongStream.rangeClosed(1, max)
                                .sum();
        long end = System.nanoTime();
        System.out.println("Sequential sum: " + seqSum + " in " + (end - start) / 1_000_000 + " ms");

        // Parallel sum
        start = System.nanoTime();
        long parSum = LongStream.rangeClosed(1, max)
                                .parallel()
                                .sum();
        end = System.nanoTime();
        System.out.println("Parallel sum: " + parSum + " in " + (end - start) / 1_000_000 + " ms");
    }
}

Illustrative output (exact timings vary by machine and run):

Sequential sum: 50000005000000 in 150 ms
Parallel sum: 50000005000000 in 50 ms

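The parallel version often executes faster on multi-core machines, but results vary with the CPU, JVM optimizations such as JIT warm-up, and system load. A single timed run like this is only a rough indication; a benchmarking harness such as JMH gives more reliable measurements.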

Conclusion

Parallel streams are a convenient way to utilize multiple CPU cores and improve performance for large-scale data processing. However, understanding when to use them and avoiding common pitfalls like side effects and order-dependence is crucial. Always measure and profile your application to ensure parallel streams deliver the desired performance benefits.

5.2 Short-circuiting Operations: limit, findFirst, anyMatch

Short-circuiting operations in Java Streams can end the pipeline early, as soon as the result is known, which avoids unnecessary processing. They save time and resources on large data sources and make it possible to work with infinite ones.

limit(long maxSize)

The limit operation is an intermediate operation that truncates the stream to at most maxSize elements; elements beyond that point are never processed. This is especially useful for implementing pagination or sampling.

Example: Pagination with limit

List<String> names = List.of("Alice", "Bob", "Charlie", "Diana", "Evan");

List<String> firstTwo = names.stream()
                             .limit(2)
                             .collect(Collectors.toList());

System.out.println(firstTwo); // Output: [Alice, Bob]
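
For pages beyond the first, limit is usually combined with skip, which this section does not otherwise cover. The following is a rough sketch under that assumption; PaginationSketch and its page method are hypothetical names chosen for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

public class PaginationSketch {

    // Returns one page of items; pageIndex is zero-based.
    // (Illustrative helper, not a library method.)
    static <T> List<T> page(List<T> items, int pageIndex, int pageSize) {
        return items.stream()
                    .skip((long) pageIndex * pageSize) // drop the earlier pages
                    .limit(pageSize)                   // keep at most one page
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = List.of("Alice", "Bob", "Charlie", "Diana", "Evan");
        System.out.println(page(names, 0, 2)); // [Alice, Bob]
        System.out.println(page(names, 1, 2)); // [Charlie, Diana]
        System.out.println(page(names, 2, 2)); // [Evan]
    }
}
```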

findFirst()

The findFirst operation returns the first element of the stream, if there is one. It takes no predicate itself; to find the first element that meets some criterion, place a filter before it. The result is an Optional<T>, so the case where no element matches is handled safely.

Example: Find the first long name

List<String> names = List.of("Bob", "Alice", "Charlie", "Diana");

Optional<String> firstLongName = names.stream()
                                      .filter(name -> name.length() > 5)
                                      .findFirst();

firstLongName.ifPresent(System.out::println); // Output: Charlie

findFirst is useful for early termination in searches where only one match is needed.
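
One way to observe the early termination is to count how many elements the filter actually examines. The sketch below does this for a sequential stream; the class FindFirstLaziness and its method are hypothetical, and the counter is a deliberate side effect used only for demonstration.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class FindFirstLaziness {

    // Counts how many elements the filter examines before findFirst returns.
    static int elementsExamined(List<String> names) {
        AtomicInteger examined = new AtomicInteger();
        names.stream()
             .filter(name -> {
                 examined.incrementAndGet(); // demonstration-only side effect
                 return name.length() > 5;
             })
             .findFirst(); // stops pulling elements after the first match
        return examined.get();
    }

    public static void main(String[] args) {
        List<String> names = List.of("Bob", "Alice", "Charlie", "Diana");
        // "Diana" is never examined because "Charlie" already matched.
        System.out.println("Examined: " + elementsExamined(names)); // Examined: 3
    }
}
```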

anyMatch(Predicate<T> predicate)

The anyMatch operation checks whether any element in the stream matches the given predicate. It returns true immediately when a match is found, or false if none matches.

Example: Quick check for a condition

List<Integer> numbers = List.of(1, 3, 5, 8, 9);

boolean hasEven = numbers.stream()
                         .anyMatch(n -> n % 2 == 0);

System.out.println(hasEven); // Output: true (because of 8)

anyMatch is ideal for quickly verifying the existence of elements that meet a condition, which can significantly reduce processing time on large datasets.

Summary

| Operation | Purpose | Result Type | Use Case Example |
|-----------|---------|-------------|------------------|
| limit(n) | Process only first n elements | Stream | Pagination or sampling |
| findFirst | Get first matching element | Optional<T> | Early search termination |
| anyMatch | Check if any element matches | boolean | Quick condition checks |

Why Use Short-circuiting?

Short-circuiting saves computation by stopping the pipeline as soon as the result is known, which is especially beneficial for large or infinite streams.
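
The infinite-stream case can be sketched as follows. Stream.iterate produces an unbounded sequence, and the pipeline terminates only because limit or findFirst cuts it short. The class and method names are illustrative, and firstSquareAbove would never return if no element could match.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class InfiniteStreamSketch {

    // First n powers of two, taken from an unbounded stream.
    static List<Long> powersOfTwo(int n) {
        return Stream.iterate(1L, x -> x * 2) // 1, 2, 4, 8, ... without end
                     .limit(n)                // short-circuit after n elements
                     .collect(Collectors.toList());
    }

    // Smallest perfect square strictly greater than the threshold.
    static long firstSquareAbove(long threshold) {
        return Stream.iterate(1L, x -> x + 1) // 1, 2, 3, ... without end
                     .map(x -> x * x)
                     .filter(sq -> sq > threshold)
                     .findFirst()             // short-circuit at the first match
                     .orElseThrow(IllegalStateException::new);
    }

    public static void main(String[] args) {
        System.out.println(powersOfTwo(5));       // [1, 2, 4, 8, 16]
        System.out.println(firstSquareAbove(50)); // 64
    }
}
```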

These operations demonstrate the flexibility and efficiency of the Streams API, allowing you to build performant data-processing pipelines that terminate early when possible.

5.3 FlatMap for Nested Data

When working with nested data structures—such as lists of lists or collections inside objects—the flatMap operation becomes essential. It flattens multiple levels of nested streams into a single continuous stream, simplifying processing.

Difference Between map and flatMap

In short: map produces a stream of streams when the mapping function returns a stream, while flatMap merges those streams into a single stream.

Example 1: Flattening a List of Lists

Suppose you have a list of lists of integers:

List<List<Integer>> listOfLists = List.of(
    List.of(1, 2, 3),
    List.of(4, 5),
    List.of(6, 7, 8, 9)
);

Using map would produce a stream of streams:

Stream<Stream<Integer>> mapped = listOfLists.stream()
    .map(list -> list.stream());

mapped.forEach(s -> s.forEach(System.out::println));
// Prints all numbers, but needs a nested forEach

Using flatMap flattens this into a single stream of integers:

List<Integer> flattened = listOfLists.stream()
    .flatMap(List::stream)
    .collect(Collectors.toList());

System.out.println(flattened);
// Output: [1, 2, 3, 4, 5, 6, 7, 8, 9]

Example 2: Extracting Nested Fields from Objects

Imagine you have a Person class where each person has a list of phone numbers:

class Person {
    private String name;
    private List<String> phoneNumbers;

    // Constructor, getters omitted for brevity
}

List<Person> people = List.of(
    new Person("Alice", List.of("123-456", "234-567")),
    new Person("Bob", List.of("345-678")),
    new Person("Charlie", List.of("456-789", "567-890"))
);

To get a flat list of all phone numbers:

List<String> allNumbers = people.stream()
    .flatMap(person -> person.getPhoneNumbers().stream())
    .collect(Collectors.toList());

System.out.println(allNumbers);
// Output: [123-456, 234-567, 345-678, 456-789, 567-890]

Full runnable code:

import java.util.*;
import java.util.stream.*;

class Person {
    private String name;
    private List<String> phoneNumbers;

    public Person(String name, List<String> phoneNumbers) {
        this.name = name;
        this.phoneNumbers = phoneNumbers;
    }

    public String getName() {
        return name;
    }

    public List<String> getPhoneNumbers() {
        return phoneNumbers;
    }
}

public class MapFlatMapExample {
    public static void main(String[] args) {
        // Example 1: Flattening a List of Lists
        List<List<Integer>> listOfLists = List.of(
            List.of(1, 2, 3),
            List.of(4, 5),
            List.of(6, 7, 8, 9)
        );

        System.out.println("Using map (Stream<Stream<Integer>>):");
        Stream<Stream<Integer>> mapped = listOfLists.stream()
            .map(list -> list.stream());
        mapped.forEach(stream -> stream.forEach(System.out::println)); // nested iteration

        System.out.println("\nUsing flatMap (Stream<Integer>):");
        List<Integer> flattened = listOfLists.stream()
            .flatMap(List::stream)
            .collect(Collectors.toList());
        System.out.println("Flattened list: " + flattened); // [1, 2, 3, 4, 5, 6, 7, 8, 9]

        // Example 2: Extracting Nested Fields from Objects
        List<Person> people = List.of(
            new Person("Alice", List.of("123-456", "234-567")),
            new Person("Bob", List.of("345-678")),
            new Person("Charlie", List.of("456-789", "567-890"))
        );

        List<String> allNumbers = people.stream()
            .flatMap(person -> person.getPhoneNumbers().stream())
            .collect(Collectors.toList());

        System.out.println("\nAll phone numbers: " + allNumbers);
        // Output: [123-456, 234-567, 345-678, 456-789, 567-890]
    }
}

When to Use flatMap
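
One common case is a mapping function that yields several values per input element, such as splitting lines of text into words. The following is a small sketch of that case; WordsSketch and its words method are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class WordsSketch {

    // Splits every line on whitespace and returns all words in one list.
    static List<String> words(List<String> lines) {
        return lines.stream()
                    // Each line becomes a Stream<String> of its words;
                    // flatMap merges those streams into a single stream.
                    .flatMap(line -> Arrays.stream(line.split("\\s+")))
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = List.of("to be or", "not to be");
        System.out.println(words(lines)); // [to, be, or, not, to, be]
    }
}
```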

Summary

| Operation | Result |
|-----------|--------|
| map | Stream of streams (nested) |
| flatMap | Flattened, single-level stream |

By mastering flatMap, you can handle deeply nested or complex data structures cleanly and efficiently, unlocking more powerful data processing patterns in Java’s functional programming paradigm.
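
For structures nested more than one level deep, flatMap can be applied once per level. The sketch below assumes a three-level list; the class name DeepFlattenSketch and the sample data are made up for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

public class DeepFlattenSketch {

    // Flattens a three-level structure by applying flatMap once per nesting level.
    static List<Integer> flatten(List<List<List<Integer>>> nested) {
        return nested.stream()
                     .flatMap(List::stream) // now a Stream<List<Integer>>
                     .flatMap(List::stream) // now a Stream<Integer>
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<List<Integer>>> nested = List.of(
            List.of(List.of(1, 2), List.of(3)),
            List.of(List.of(4, 5, 6))
        );
        System.out.println(flatten(nested)); // [1, 2, 3, 4, 5, 6]
    }
}
```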

5.4 Example: Processing Nested Collections

When working with nested collections—such as a list of departments each containing a list of employees—handling the data can become verbose and cumbersome using traditional loops. The flatMap operation in Java Streams simplifies this by flattening nested streams into a single stream, allowing seamless processing of deeply nested data.

Let’s consider a practical example: we have a List<Department>, where each Department holds a list of Employee objects. Our goal is to find all employees with a salary greater than $75,000 across all departments.

Here is how we might model the classes:

import java.util.*;
import java.util.stream.Collectors;

class Employee {
    String name;
    double salary;

    Employee(String name, double salary) {
        this.name = name;
        this.salary = salary;
    }

    @Override
    public String toString() {
        return name + " ($" + salary + ")";
    }
}

class Department {
    String name;
    List<Employee> employees;

    Department(String name, List<Employee> employees) {
        this.name = name;
        this.employees = employees;
    }
}

Traditional nested loops approach:

List<Employee> highEarners = new ArrayList<>();
for (Department dept : departments) {
    for (Employee emp : dept.employees) {
        if (emp.salary > 75000) {
            highEarners.add(emp);
        }
    }
}

While this works, it quickly becomes bulky as complexity grows.

Using Streams and flatMap:

List<Employee> highEarners = departments.stream()
    // Flatten the stream of departments into a stream of employees
    .flatMap(dept -> dept.employees.stream())
    // Filter employees with salary > 75,000
    .filter(emp -> emp.salary > 75000)
    // Collect the results into a list
    .collect(Collectors.toList());

This approach is concise, readable, and expressive. flatMap replaces the nested iteration by producing one continuous stream of employees from all departments, so you can apply filters and other operations directly.
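
A plain flatMap to employees discards which department each employee came from. If that context is needed, one option is to filter and map inside the flatMap lambda, where the department is still in scope. The sketch below is one possible way to do this, not the only one; the nested classes are minimal stand-ins for the Employee and Department classes in the text, and highEarnersWithDepartment and sampleDepartments are hypothetical helpers.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class DepartmentContextSketch {

    // Minimal stand-ins for the classes shown in the text.
    static class Employee {
        final String name;
        final double salary;
        Employee(String name, double salary) {
            this.name = name;
            this.salary = salary;
        }
    }

    static class Department {
        final String name;
        final List<Employee> employees;
        Department(String name, List<Employee> employees) {
            this.name = name;
            this.employees = employees;
        }
    }

    // Labels each high earner with the department it came from.
    static List<String> highEarnersWithDepartment(List<Department> departments,
                                                  double threshold) {
        return departments.stream()
            .flatMap(dept -> dept.employees.stream()
                .filter(emp -> emp.salary > threshold)
                .map(emp -> dept.name + ": " + emp.name)) // dept is still in scope
            .collect(Collectors.toList());
    }

    // Same sample data as the example in this section.
    static List<Department> sampleDepartments() {
        return Arrays.asList(
            new Department("Engineering", Arrays.asList(
                new Employee("Alice", 90000),
                new Employee("Bob", 60000),
                new Employee("Charlie", 80000))),
            new Department("HR", Arrays.asList(
                new Employee("Diana", 70000),
                new Employee("Evan", 85000))),
            new Department("Sales", Arrays.asList(
                new Employee("Fiona", 72000),
                new Employee("George", 78000))));
    }

    public static void main(String[] args) {
        highEarnersWithDepartment(sampleDepartments(), 75000)
            .forEach(System.out::println);
        // Engineering: Alice
        // Engineering: Charlie
        // HR: Evan
        // Sales: George
    }
}
```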

Sample Data and Full Example:

public class NestedCollectionsExample {
    public static void main(String[] args) {
        List<Department> departments = Arrays.asList(
            new Department("Engineering", Arrays.asList(
                new Employee("Alice", 90000),
                new Employee("Bob", 60000),
                new Employee("Charlie", 80000)
            )),
            new Department("HR", Arrays.asList(
                new Employee("Diana", 70000),
                new Employee("Evan", 85000)
            )),
            new Department("Sales", Arrays.asList(
                new Employee("Fiona", 72000),
                new Employee("George", 78000)
            ))
        );

        List<Employee> highEarners = departments.stream()
            .flatMap(dept -> dept.employees.stream())
            .filter(emp -> emp.salary > 75000)
            .collect(Collectors.toList());

        System.out.println("Employees with salary > $75,000:");
        highEarners.forEach(System.out::println);
    }
}

Expected Output:

Employees with salary > $75,000:
Alice ($90000.0)
Charlie ($80000.0)
Evan ($85000.0)
George ($78000.0)

Summary
