Java’s Streams API offers a powerful feature called parallel streams, which enables processing data concurrently using multiple CPU cores. This is achieved by splitting the stream’s data into multiple chunks, processing them in parallel threads, and then combining the results. Parallel streams can dramatically speed up data-intensive operations on large datasets, making them a valuable tool for performance optimization.
When you create a parallel stream (via .parallelStream() or .stream().parallel()), the framework uses the common ForkJoinPool to distribute tasks across available CPU cores. Each thread processes a portion of the data independently, and the results are merged at the end.
This approach contrasts with sequential streams, where processing happens on a single thread, executing operations one element at a time in order.
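To make the threading difference visible, the following minimal sketch records which threads touch the elements of a sequential and a parallel stream. The class and method names (StreamThreadsSketch, threadsUsed) are hypothetical, and the exact worker names printed depend on the JVM and the number of cores.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

class StreamThreadsSketch {

    // Returns the names of all threads that processed at least one element.
    static Set<String> threadsUsed(boolean parallel) {
        // Thread-safe set, because several workers may add to it at once.
        Set<String> names = ConcurrentHashMap.newKeySet();
        IntStream range = IntStream.rangeClosed(1, 10_000);
        if (parallel) {
            range = range.parallel();
        }
        range.forEach(i -> names.add(Thread.currentThread().getName()));
        return names;
    }

    public static void main(String[] args) {
        // A sequential stream runs entirely on the calling thread.
        System.out.println("Sequential: " + threadsUsed(false));
        // A parallel stream typically also uses ForkJoinPool common-pool workers.
        System.out.println("Parallel:   " + threadsUsed(true));
    }
}
```

The shared set is a deliberate side effect used only for observation; ordinary pipelines are better off without shared mutable state.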
Some operations (such as forEach) do not guarantee order in parallel streams, which can be a problem if order matters.
Use parallel streams when:
Avoid parallel streams when:
Here’s a simple benchmark comparing sequential and parallel streams for summing a large range of numbers:
import java.util.stream.LongStream;
public class ParallelStreamBenchmark {
    public static void main(String[] args) {
        // LongStream is used because the total exceeds Integer.MAX_VALUE;
        // IntStream.sum() returns an int and would silently overflow.
        long max = 10_000_000L;
        // Sequential sum
        long start = System.currentTimeMillis();
        long seqSum = LongStream.rangeClosed(1, max)
                .sum();
        long end = System.currentTimeMillis();
        System.out.println("Sequential sum: " + seqSum + " in " + (end - start) + " ms");
        // Parallel sum
        start = System.currentTimeMillis();
        long parSum = LongStream.rangeClosed(1, max)
                .parallel()
                .sum();
        end = System.currentTimeMillis();
        System.out.println("Parallel sum: " + parSum + " in " + (end - start) + " ms");
    }
}
Typical output:
Sequential sum: 50000005000000 in 150 ms
Parallel sum: 50000005000000 in 50 ms
The parallel version often executes faster on multi-core machines, but results may vary depending on CPU, JVM optimizations, and system load.
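Because a single timed run is strongly affected by JIT compilation and system load, a slightly more careful measurement warms the code up first and then keeps the best of several runs. The sketch below is one way to do that; the names WarmedUpBenchmarkSketch and bestMillis are hypothetical, and the warm-up and run counts are arbitrary assumptions. For serious measurements, a dedicated harness such as JMH is the more reliable choice.

```java
import java.util.function.LongSupplier;
import java.util.stream.LongStream;

class WarmedUpBenchmarkSketch {

    static long sequentialSum(long max) {
        return LongStream.rangeClosed(1, max).sum();
    }

    static long parallelSum(long max) {
        return LongStream.rangeClosed(1, max).parallel().sum();
    }

    // Runs the task untimed a few times, then returns the fastest timed run in milliseconds.
    static double bestMillis(LongSupplier task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            task.getAsLong();
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.getAsLong();
            best = Math.min(best, System.nanoTime() - start);
        }
        return best / 1_000_000.0;
    }

    public static void main(String[] args) {
        long max = 10_000_000L;
        System.out.printf("Sequential: %.2f ms%n", bestMillis(() -> sequentialSum(max), 5, 5));
        System.out.printf("Parallel:   %.2f ms%n", bestMillis(() -> parallelSum(max), 5, 5));
    }
}
```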
Parallel streams are a convenient way to utilize multiple CPU cores and improve performance for large-scale data processing. However, understanding when to use them and avoiding common pitfalls like side effects and order-dependence is crucial. Always measure and profile your application to ensure parallel streams deliver the desired performance benefits.
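The ordering and side-effect pitfalls mentioned above can be illustrated with a small sketch. It contrasts forEach on a parallel stream (order not guaranteed) with two order-preserving alternatives, collect and forEachOrdered. The class and method names are hypothetical, chosen only for this illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelPitfallsSketch {

    // Safe: collect() fills separate intermediate containers and merges them
    // in encounter order, so no shared mutable state is involved.
    static List<Integer> collectInOrder(int n) {
        return IntStream.rangeClosed(1, n)
                .parallel()
                .boxed()
                .collect(Collectors.toList());
    }

    // forEachOrdered processes elements in encounter order, even on a parallel stream.
    static List<Integer> visitInOrder(int n) {
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>());
        IntStream.rangeClosed(1, n).parallel().forEachOrdered(seen::add);
        return seen;
    }

    public static void main(String[] args) {
        // forEach on a parallel stream: the printed order may differ between runs.
        IntStream.rangeClosed(1, 8).parallel().forEach(i -> System.out.print(i + " "));
        System.out.println();

        System.out.println(collectInOrder(8));
        System.out.println(visitInOrder(8));
    }
}
```

Adding to a plain ArrayList from a parallel forEach, by contrast, is a data race and can lose elements or throw; collecting is usually the simpler fix.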
limit, findFirst, anyMatch
Short-circuiting operations in Java Streams are powerful tools that can terminate the stream pipeline early once a certain condition is met, improving efficiency by avoiding unnecessary processing. These operations help save time and resources, especially when dealing with large or potentially infinite data sources.
limit(long maxSize)
The limit operation restricts the stream to process only the first maxSize elements, ignoring the rest. This is especially useful for implementing pagination or sampling.
Example: Pagination with limit
List<String> names = List.of("Alice", "Bob", "Charlie", "Diana", "Evan");
List<String> firstTwo = names.stream()
        .limit(2)
        .collect(Collectors.toList());
System.out.println(firstTwo); // Output: [Alice, Bob]
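The example above returns only the first page. To reach later pages, limit is commonly paired with skip, which is also part of the Stream API but is not covered in the text above. The following is a minimal sketch; the helper name page and the zero-based pageIndex convention are assumptions made for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

class PaginationSketch {

    // Returns one page of items; pageIndex is zero-based (an assumption of this sketch).
    static <T> List<T> page(List<T> items, int pageIndex, int pageSize) {
        return items.stream()
                .skip((long) pageIndex * pageSize) // drop the earlier pages
                .limit(pageSize)                   // keep at most one page
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = List.of("Alice", "Bob", "Charlie", "Diana", "Evan");
        System.out.println(page(names, 0, 2)); // [Alice, Bob]
        System.out.println(page(names, 1, 2)); // [Charlie, Diana]
        System.out.println(page(names, 2, 2)); // [Evan]
        System.out.println(page(names, 3, 2)); // []
    }
}
```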
findFirst()
The findFirst operation returns the first element of the stream, typically the first one that passes a preceding filter. It returns an Optional<T>, so the case where no element matches is handled safely.
Example: Find the first long name
List<String> names = List.of("Bob", "Alice", "Charlie", "Diana");
Optional<String> firstLongName = names.stream()
        .filter(name -> name.length() > 5)
        .findFirst();
firstLongName.ifPresent(System.out::println); // Output: Charlie
findFirst is useful for early termination in searches where only one match is needed.
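To see the early termination in action, the sketch below counts how many elements actually flow through the pipeline before findFirst stops it, and shows one way to handle the empty Optional. The class and method names are hypothetical; the data is the same list used above.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class FindFirstSketch {

    static final List<String> NAMES = List.of("Bob", "Alice", "Charlie", "Diana");

    // Counts how many elements were pulled through the pipeline before it stopped.
    static int elementsInspected(int minLength) {
        AtomicInteger inspected = new AtomicInteger();
        NAMES.stream()
                .peek(name -> inspected.incrementAndGet())
                .filter(name -> name.length() > minLength)
                .findFirst();
        return inspected.get();
    }

    // Falls back to a placeholder when no name is long enough.
    static String firstLongerThan(int minLength) {
        return NAMES.stream()
                .filter(name -> name.length() > minLength)
                .findFirst()
                .orElse("(none)");
    }

    public static void main(String[] args) {
        System.out.println(firstLongerThan(5));    // Charlie
        System.out.println(elementsInspected(5));  // 3 (Diana is never examined)
        System.out.println(firstLongerThan(10));   // (none)
        System.out.println(elementsInspected(10)); // 4 (no match, so every element is examined)
    }
}
```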
anyMatch(Predicate<T>)
The anyMatch operation checks whether any element in the stream matches the given predicate. It returns true immediately when a match is found, or false if none matches.
Example: Quick check for a condition
List<Integer> numbers = List.of(1, 3, 5, 8, 9);
boolean hasEven = numbers.stream()
        .anyMatch(n -> n % 2 == 0);
System.out.println(hasEven); // Output: true (because of 8)
anyMatch is ideal for quickly verifying the existence of elements that meet a condition, which can significantly reduce processing time on large datasets.
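The saving can be made concrete by counting predicate evaluations. The sketch below compares anyMatch with a non-short-circuiting alternative, filter followed by count, on the same list as above. The class and method names are hypothetical.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class AnyMatchSketch {

    static final List<Integer> NUMBERS = List.of(1, 3, 5, 8, 9);

    // anyMatch stops at the first even number.
    static int checksWithAnyMatch() {
        AtomicInteger checks = new AtomicInteger();
        NUMBERS.stream().anyMatch(n -> {
            checks.incrementAndGet();
            return n % 2 == 0;
        });
        return checks.get();
    }

    // filter(...).count() has to test every element to produce the count.
    static int checksWithFilterCount() {
        AtomicInteger checks = new AtomicInteger();
        NUMBERS.stream().filter(n -> {
            checks.incrementAndGet();
            return n % 2 == 0;
        }).count();
        return checks.get();
    }

    public static void main(String[] args) {
        System.out.println(checksWithAnyMatch());    // 4 (stops at 8)
        System.out.println(checksWithFilterCount()); // 5 (examines all elements)
    }
}
```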
Operation | Purpose | Result Type | Use Case Example |
---|---|---|---|
limit(n) | Process only first n elements | Stream<T> | Pagination or sampling |
findFirst | Get first matching element | Optional<T> | Early search termination |
anyMatch | Check if any element matches | boolean | Quick condition checks |
Short-circuiting saves computation by stopping the pipeline as soon as the result is known, which is especially beneficial for large or infinite streams.
These operations demonstrate the flexibility and efficiency of the Streams API, allowing you to build performant data-processing pipelines that terminate early when possible.
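Infinite streams are where short-circuiting is not just faster but required for the pipeline to finish at all. The following sketch applies all three operations to an endless stream of powers of two built with Stream.iterate. The class and method names are hypothetical, and the thresholds are kept well below 2^30 because int arithmetic would overflow beyond that.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class InfiniteStreamSketch {

    // Endless stream 1, 2, 4, 8, ... (overflows int after 2^30, so keep thresholds small).
    static Stream<Integer> powersOfTwo() {
        return Stream.iterate(1, n -> n * 2);
    }

    static List<Integer> firstPowers(int count) {
        return powersOfTwo().limit(count).collect(Collectors.toList());
    }

    static int firstPowerAbove(int threshold) {
        return powersOfTwo()
                .filter(n -> n > threshold)
                .findFirst()
                .orElseThrow(IllegalStateException::new);
    }

    static boolean hasPowerAbove(int threshold) {
        return powersOfTwo().anyMatch(n -> n > threshold);
    }

    public static void main(String[] args) {
        System.out.println(firstPowers(5));          // [1, 2, 4, 8, 16]
        System.out.println(firstPowerAbove(1000));   // 1024
        System.out.println(hasPowerAbove(1_000_000)); // true
    }
}
```

Without limit, findFirst, or anyMatch, each of these pipelines would never terminate.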
When working with nested data structures, such as lists of lists or collections inside objects, the flatMap operation becomes essential. It flattens multiple levels of nested streams into a single continuous stream, simplifying processing.
map and flatMap
map transforms each element into another element (or stream) but preserves the nesting.
flatMap transforms each element into a stream and then flattens those streams into one stream.
In short: map produces a stream of streams when the mapping function returns a stream, while flatMap merges those streams into a single stream.
Suppose you have a list of lists of integers:
List<List<Integer>> listOfLists = List.of(
        List.of(1, 2, 3),
        List.of(4, 5),
        List.of(6, 7, 8, 9)
);
Using map would produce a stream of streams:
Stream<Stream<Integer>> mapped = listOfLists.stream()
        .map(list -> list.stream());
mapped.forEach(s -> s.forEach(System.out::println));
// Prints all numbers but requires nested loops
Using flatMap flattens this into a single stream of integers:
List<Integer> flattened = listOfLists.stream()
        .flatMap(List::stream)
        .collect(Collectors.toList());
System.out.println(flattened);
// Output: [1, 2, 3, 4, 5, 6, 7, 8, 9]
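The inner streams do not have to come from existing lists; the mapping function can build them on the fly. As a small additional illustration, the sketch below splits lines of text into words and flattens the result. The class name WordsSketch and the sample sentences are made up for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class WordsSketch {

    // Each line becomes a stream of its words; flatMap joins them into one stream.
    static List<String> words(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split("\\s+")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = List.of("to be or", "not to be");
        System.out.println(words(lines)); // [to, be, or, not, to, be]
    }
}
```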
Imagine you have a Person class where each person has a list of phone numbers:
class Person {
    private String name;
    private List<String> phoneNumbers;
    // Constructor, getters omitted for brevity
}
List<Person> people = List.of(
        new Person("Alice", List.of("123-456", "234-567")),
        new Person("Bob", List.of("345-678")),
        new Person("Charlie", List.of("456-789", "567-890"))
);
To get a flat list of all phone numbers:
List<String> allNumbers = people.stream()
        .flatMap(person -> person.getPhoneNumbers().stream())
        .collect(Collectors.toList());
System.out.println(allNumbers);
// Output: [123-456, 234-567, 345-678, 456-789, 567-890]
import java.util.*;
import java.util.stream.*;
class Person {
    private String name;
    private List<String> phoneNumbers;
    public Person(String name, List<String> phoneNumbers) {
        this.name = name;
        this.phoneNumbers = phoneNumbers;
    }
    public String getName() {
        return name;
    }
    public List<String> getPhoneNumbers() {
        return phoneNumbers;
    }
}
public class MapFlatMapExample {
    public static void main(String[] args) {
        // Example 1: Flattening a List of Lists
        List<List<Integer>> listOfLists = List.of(
                List.of(1, 2, 3),
                List.of(4, 5),
                List.of(6, 7, 8, 9)
        );
        System.out.println("Using map (Stream<Stream<Integer>>):");
        Stream<Stream<Integer>> mapped = listOfLists.stream()
                .map(list -> list.stream());
        mapped.forEach(stream -> stream.forEach(System.out::println)); // nested iteration
        System.out.println("\nUsing flatMap (Stream<Integer>):");
        List<Integer> flattened = listOfLists.stream()
                .flatMap(List::stream)
                .collect(Collectors.toList());
        System.out.println("Flattened list: " + flattened); // [1, 2, 3, 4, 5, 6, 7, 8, 9]
        // Example 2: Extracting Nested Fields from Objects
        List<Person> people = List.of(
                new Person("Alice", List.of("123-456", "234-567")),
                new Person("Bob", List.of("345-678")),
                new Person("Charlie", List.of("456-789", "567-890"))
        );
        List<String> allNumbers = people.stream()
                .flatMap(person -> person.getPhoneNumbers().stream())
                .collect(Collectors.toList());
        System.out.println("\nAll phone numbers: " + allNumbers);
        // Output: [123-456, 234-567, 345-678, 456-789, 567-890]
    }
}
flatMap
Operation | Result |
---|---|
map | Stream of streams (nested) |
flatMap | Flattened, single-level stream |
By mastering flatMap, you can handle deeply nested or complex data structures cleanly and efficiently, unlocking more powerful data processing patterns in Java’s functional programming paradigm.
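For structures nested more than one level deep, flatMap can simply be applied once per level. The following minimal sketch flattens a three-level list; the class name DeepFlattenSketch and the sample data are assumptions for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

class DeepFlattenSketch {

    // Each flatMap call removes exactly one level of nesting.
    static List<Integer> flatten(List<List<List<Integer>>> nested) {
        return nested.stream()
                .flatMap(List::stream) // Stream<List<Integer>>
                .flatMap(List::stream) // Stream<Integer>
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<List<Integer>>> nested = List.of(
                List.of(List.of(1, 2), List.of(3)),
                List.of(List.of(4, 5, 6))
        );
        System.out.println(flatten(nested)); // [1, 2, 3, 4, 5, 6]
    }
}
```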
When working with nested collections, such as a list of departments each containing a list of employees, handling the data can become verbose and cumbersome using traditional loops. The flatMap operation in Java Streams simplifies this by flattening nested streams into a single stream, allowing seamless processing of deeply nested data.
Let’s consider a practical example: we have a List<Department>, where each Department holds a list of Employee objects. Our goal is to find all employees with a salary greater than $75,000 across all departments.
Here is how we might model the classes:
import java.util.*;
import java.util.stream.Collectors;
class Employee {
    String name;
    double salary;
    Employee(String name, double salary) {
        this.name = name;
        this.salary = salary;
    }
    @Override
    public String toString() {
        return name + " ($" + salary + ")";
    }
}
class Department {
    String name;
    List<Employee> employees;
    Department(String name, List<Employee> employees) {
        this.name = name;
        this.employees = employees;
    }
}
With traditional nested loops:
List<Employee> highEarners = new ArrayList<>();
for (Department dept : departments) {
    for (Employee emp : dept.employees) {
        if (emp.salary > 75000) {
            highEarners.add(emp);
        }
    }
}
While this works, it quickly becomes bulky as complexity grows.
With flatMap:
List<Employee> highEarners = departments.stream()
        // Flatten the stream of departments into a stream of employees
        .flatMap(dept -> dept.employees.stream())
        // Filter employees with salary > 75,000
        .filter(emp -> emp.salary > 75000)
        // Collect the results into a list
        .collect(Collectors.toList());
This approach is concise, readable, and expressive. flatMap replaces the nested iteration by producing one continuous stream of employees from all departments, so you can apply filters and other operations directly.
public class NestedCollectionsExample {
    public static void main(String[] args) {
        List<Department> departments = Arrays.asList(
                new Department("Engineering", Arrays.asList(
                        new Employee("Alice", 90000),
                        new Employee("Bob", 60000),
                        new Employee("Charlie", 80000)
                )),
                new Department("HR", Arrays.asList(
                        new Employee("Diana", 70000),
                        new Employee("Evan", 85000)
                )),
                new Department("Sales", Arrays.asList(
                        new Employee("Fiona", 72000),
                        new Employee("George", 78000)
                ))
        );
        List<Employee> highEarners = departments.stream()
                .flatMap(dept -> dept.employees.stream())
                .filter(emp -> emp.salary > 75000)
                .collect(Collectors.toList());
        System.out.println("Employees with salary > $75,000:");
        highEarners.forEach(System.out::println);
    }
}
Employees with salary > $75,000:
Alice ($90000.0)
Charlie ($80000.0)
Evan ($85000.0)
George ($78000.0)
The List<Department> to List<Employee> transformation is streamlined by flatMap.
Without flatMap, nested loops are needed to iterate through departments and employees.
flatMap "flattens" the nested lists into a single stream, enabling operations like filter to work across all employees.
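One thing plain flattening loses is which department each employee came from. If that context is needed, the filtering and mapping can be done inside the flatMap lambda, where the department is still in scope. The sketch below is one possible way to do this; it uses compact nested stand-ins for the Employee and Department classes so that it is self-contained, and the names HighEarnerLabelsSketch, highEarnerLabels, and sampleDepartments are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class HighEarnerLabelsSketch {

    // Compact stand-ins for the Employee and Department classes used above.
    static class Employee {
        final String name;
        final double salary;
        Employee(String name, double salary) {
            this.name = name;
            this.salary = salary;
        }
    }

    static class Department {
        final String name;
        final List<Employee> employees;
        Department(String name, List<Employee> employees) {
            this.name = name;
            this.employees = employees;
        }
    }

    // Filters and labels inside the flatMap lambda, where dept is still accessible.
    static List<String> highEarnerLabels(List<Department> departments, double threshold) {
        return departments.stream()
                .flatMap(dept -> dept.employees.stream()
                        .filter(emp -> emp.salary > threshold)
                        .map(emp -> dept.name + ": " + emp.name))
                .collect(Collectors.toList());
    }

    static List<Department> sampleDepartments() {
        return Arrays.asList(
                new Department("Engineering", Arrays.asList(
                        new Employee("Alice", 90000),
                        new Employee("Bob", 60000),
                        new Employee("Charlie", 80000))),
                new Department("HR", Arrays.asList(
                        new Employee("Diana", 70000),
                        new Employee("Evan", 85000))),
                new Department("Sales", Arrays.asList(
                        new Employee("Fiona", 72000),
                        new Employee("George", 78000))));
    }

    public static void main(String[] args) {
        System.out.println(highEarnerLabels(sampleDepartments(), 75000));
        // [Engineering: Alice, Engineering: Charlie, HR: Evan, Sales: George]
    }
}
```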