Index

Reducing and Aggregating Data

Java Streams

7.1 Using reduce() for Aggregation

The reduce() method in Java Streams is a terminal operation that combines stream elements into a single result by repeatedly applying a binary operation. It is often used for aggregation tasks such as summing numbers, concatenating strings, or computing a product.

Three Variations of reduce()

  1. Optional<T> reduce(BinaryOperator<T> accumulator)

    • Reduces elements using the accumulator function.
    • Returns an Optional because the stream might be empty.
    • Example: Summing integers without an initial value.
  2. T reduce(T identity, BinaryOperator<T> accumulator)

    • Uses an identity value as the initial result.
    • Returns the reduced value directly.
    • The identity is a neutral element for the accumulator (e.g., 0 for addition).
  3. <U> U reduce(U identity, BiFunction<U, ? super T, U> accumulator, BinaryOperator<U> combiner)

    • Supports parallel execution by separating accumulation and combining.
    • accumulator processes each element into the result type.
    • combiner merges partial results.
    • Useful for complex reductions or when result type differs from stream element type.

Key Concepts
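Two ideas underpin every form of reduce(): the identity is the starting value and must leave any element unchanged when combined with it, and the accumulator folds one element at a time into the running result. The sketch below is a minimal illustration, not library code: the class name ReduceAsLoop and the helper fold are hypothetical, and the loop mirrors the sequential behaviour of reduce(identity, accumulator).

```java
import java.util.List;
import java.util.function.BinaryOperator;

public class ReduceAsLoop {

    // Hypothetical helper: sequential equivalent of stream.reduce(identity, accumulator).
    static <T> T fold(List<T> items, T identity, BinaryOperator<T> accumulator) {
        T result = identity;
        for (T item : items) {
            result = accumulator.apply(result, item);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> numbers = List.of(1, 2, 3, 4, 5);

        int viaLoop = fold(numbers, 0, Integer::sum);
        int viaReduce = numbers.stream().reduce(0, Integer::sum);
        System.out.println(viaLoop + " " + viaReduce); // 15 15

        // A non-neutral identity shifts the result: 10 + 1 + 2 + 3 + 4 + 5
        System.out.println(fold(numbers, 10, Integer::sum)); // 25
    }
}
```

With 0 as the identity the loop and the stream agree. With 10 the total is off by ten, which is why the identity should be the neutral element of the operation.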

Examples

Example 1: String Concatenation with reduce()

import java.util.Optional;
import java.util.stream.Stream;

public class ReduceExample {
    public static void main(String[] args) {
        Optional<String> result = Stream.of("Java", "Streams", "Reduce")
                                       .reduce((a, b) -> a + " " + b);

        result.ifPresent(System.out::println);  // Output: Java Streams Reduce
    }
}

Example 2: Numerical Sum with Identity and Method Reference

import java.util.stream.Stream;

public class ReduceExample2 {
    public static void main(String[] args) {
        int sum = Stream.of(1, 2, 3, 4, 5)
                        .reduce(0, Integer::sum);

        System.out.println("Sum: " + sum); // Output: Sum: 15
    }
}

Example 3: Using Three-Parameter reduce() for Parallel Processing

import java.util.Arrays;
import java.util.List;

public class ReduceExample3 {
    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "banana", "cherry");

        String result = words.parallelStream()
                             .reduce(
                                 "",                         // identity
                                 (partial, word) -> partial + word.toUpperCase(), // accumulator
                                 (s1, s2) -> s1 + s2        // combiner
                             );

        System.out.println(result); // Output: APPLEBANANACHERRY
    }
}

When to Prefer reduce() Over collect()
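As a rough guide, reduce() fits best when each step produces a new immutable value, such as a number or a short string, while collect() is usually the better choice when the result is built up in a mutable container. The following sketch compares the two for string joining; the class and method names are illustrative assumptions.

```java
import java.util.List;
import java.util.stream.Collectors;

public class ReduceVersusCollect {

    // reduce(): every step creates a new String from the previous partial result.
    static String joinWithReduce(List<String> words) {
        return words.stream().reduce("", (a, b) -> a + b);
    }

    // collect(): Collectors.joining() appends into an internal mutable buffer.
    static String joinWithCollect(List<String> words) {
        return words.stream().collect(Collectors.joining());
    }

    public static void main(String[] args) {
        List<String> words = List.of("apple", "banana", "cherry");

        System.out.println(joinWithReduce(words));  // applebananacherry
        System.out.println(joinWithCollect(words)); // applebananacherry
    }
}
```

Both produce the same text. For long inputs the collect() version avoids copying the growing string at every step.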

By mastering reduce(), you gain fine-grained control over aggregation, enabling concise yet expressive data summarization in your stream pipelines.


7.2 Common Reduction Patterns (sum, max, min)

Reduction operations like calculating the sum, maximum, minimum, or product are some of the most frequent tasks in stream processing. Java Streams allow you to perform these aggregations using the flexible reduce() method, but also provide specialized terminal operations for primitive streams that offer better readability and performance.

Using reduce() for Common Aggregations

Sum using reduce()

import java.util.stream.Stream;

public class SumReduce {
    public static void main(String[] args) {
        int sum = Stream.of(1, 2, 3, 4, 5)
                        .reduce(0, Integer::sum);

        System.out.println("Sum: " + sum);  // Output: Sum: 15
    }
}

This example sums integers with an identity value of 0 and the built-in method reference Integer::sum.

Maximum using reduce()

import java.util.Optional;
import java.util.stream.Stream;

public class MaxReduce {
    public static void main(String[] args) {
        Optional<Integer> max = Stream.of(3, 7, 2, 9, 5)
                                     .reduce(Integer::max);

        max.ifPresent(m -> System.out.println("Max: " + m));  // Output: Max: 9
    }
}

Here, no identity is provided, so the result is wrapped in an Optional.
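The difference between the two forms shows up most clearly on an empty stream. In this minimal sketch (class and method names are hypothetical), the identity-free form yields an empty Optional, while the identity form simply returns the identity.

```java
import java.util.List;
import java.util.Optional;

public class EmptyStreamReduce {

    // No identity: an empty input produces Optional.empty().
    static Optional<Integer> maxOf(List<Integer> values) {
        return values.stream().reduce(Integer::max);
    }

    // With identity: an empty input produces the identity itself.
    static int sumOf(List<Integer> values) {
        return values.stream().reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> empty = List.of();

        System.out.println(maxOf(empty).isPresent());           // false
        System.out.println(maxOf(List.of(3, 7, 2)).orElse(-1)); // 7
        System.out.println(sumOf(empty));                       // 0
    }
}
```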

Minimum using reduce()

import java.util.Optional;
import java.util.stream.Stream;

public class MinReduce {
    public static void main(String[] args) {
        Optional<Integer> min = Stream.of(3, 7, 2, 9, 5)
                                     .reduce(Integer::min);

        min.ifPresent(m -> System.out.println("Min: " + m));  // Output: Min: 2
    }
}

Product using reduce()

import java.util.stream.Stream;

public class ProductReduce {
    public static void main(String[] args) {
        int product = Stream.of(1, 2, 3, 4)
                            .reduce(1, (a, b) -> a * b);

        System.out.println("Product: " + product);  // Output: Product: 24
    }
}

Product does not have a built-in shortcut method, so reduce() is ideal here.

Comparing with Primitive Stream Terminal Operations

For primitive streams such as IntStream, LongStream, and DoubleStream, Java provides direct terminal methods, including sum(), max(), min(), average(), and count().

Example:

import java.util.stream.IntStream;

public class PrimitiveSumMaxMin {
    public static void main(String[] args) {
        int sum = IntStream.of(1, 2, 3, 4, 5).sum();
        // The no-argument orElseThrow() requires Java 10 or later.
        int max = IntStream.of(3, 7, 2, 9, 5).max().orElseThrow();
        int min = IntStream.of(3, 7, 2, 9, 5).min().orElseThrow();

        System.out.printf("Sum: %d, Max: %d, Min: %d%n", sum, max, min);
        // Output: Sum: 15, Max: 9, Min: 2
    }
}

Readability, Performance, and Precision
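Primitive streams read more directly, avoid boxing each partial result, and make the result type explicit, which matters for precision. The sketch below is a minimal illustration under assumed names: it shows mapToInt() for an unboxed sum, the silent wrap-around of IntStream.sum() when the total exceeds the int range, and widening to long as one way around it.

```java
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.stream.IntStream;

public class PrimitiveStreamPrecision {

    // mapToInt() switches to IntStream, so partial sums are not boxed as Integer.
    static int sumUnboxed(List<Integer> values) {
        return values.stream().mapToInt(Integer::intValue).sum();
    }

    // IntStream.sum() returns int, so a total above Integer.MAX_VALUE wraps around.
    static int overflowingSum() {
        return IntStream.of(Integer.MAX_VALUE, 1).sum();
    }

    // Widening to long before summing keeps the exact total.
    static long wideSum() {
        return IntStream.of(Integer.MAX_VALUE, 1).asLongStream().sum();
    }

    public static void main(String[] args) {
        System.out.println(sumUnboxed(List.of(1, 2, 3, 4, 5))); // 15
        System.out.println(overflowingSum());                   // -2147483648
        System.out.println(wideSum());                          // 2147483648

        // summaryStatistics() gathers min, max, and a long sum in a single pass.
        IntSummaryStatistics stats = IntStream.of(3, 7, 2, 9, 5).summaryStatistics();
        System.out.println(stats.getMin() + " " + stats.getMax() + " " + stats.getSum()); // 2 9 26
    }
}
```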

Summary


7.3 Custom Reduction Operations

Beyond simple numeric reductions, the reduce() method enables complex aggregations involving custom objects and multi-step logic. This is particularly useful when you want to combine multiple fields or accumulate rich summaries from a stream of data.

Reduction in Parallel Streams: Key Requirements
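A parallel stream may split its input and regroup partial results in any way, so the accumulator and combiner need to be associative and stateless, and the identity must be neutral. The following sketch (helper names are hypothetical) checks a single sample triple to show that addition survives regrouping while subtraction does not.

```java
import java.util.List;
import java.util.function.BinaryOperator;

public class ParallelReduceRequirements {

    // Compares (a op b) op c with a op (b op c) for one sample triple.
    static boolean sameUnderRegrouping(BinaryOperator<Integer> op, int a, int b, int c) {
        int leftFirst = op.apply(op.apply(a, b), c);
        int rightFirst = op.apply(a, op.apply(b, c));
        return leftFirst == rightFirst;
    }

    // Addition is associative and 0 is neutral, so the parallel result is stable.
    static int parallelSum(List<Integer> values) {
        return values.parallelStream().reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(sameUnderRegrouping(Integer::sum, 10, 5, 2));    // true
        System.out.println(sameUnderRegrouping((x, y) -> x - y, 10, 5, 2)); // false: 3 vs 7
        System.out.println(parallelSum(List.of(1, 2, 3, 4, 5)));            // 15
    }
}
```

Passing one triple does not prove associativity in general; the check is only meant to make the failure mode of a non-associative operation visible.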

Example: Aggregating User Records into a Summary Report

Suppose we have a User class and want to produce a summary containing the total number of users, the combined length of all usernames, and the set of unique domains in their email addresses.

import java.util.*;
import java.util.stream.*;

class User {
    String username;
    String email;

    User(String username, String email) {
        this.username = username;
        this.email = email;
    }
}

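class UserSummary {
    // Immutable on purpose: reduce() may share the identity across threads,
    // so accumulate() and combine() return new objects instead of mutating this one.
    final int userCount;
    final int totalNameLength;
    final Set<String> emailDomains;

    UserSummary() {
        this(0, 0, new HashSet<>());
    }

    private UserSummary(int userCount, int totalNameLength, Set<String> emailDomains) {
        this.userCount = userCount;
        this.totalNameLength = totalNameLength;
        this.emailDomains = emailDomains;
    }

    // Accumulator: returns a new summary that also includes the given user.
    UserSummary accumulate(User user) {
        String domain = user.email.substring(user.email.indexOf('@') + 1);
        Set<String> domains = new HashSet<>(emailDomains);
        domains.add(domain);
        return new UserSummary(userCount + 1,
                               totalNameLength + user.username.length(),
                               domains);
    }

    // Combiner: returns a new summary merging two partial results.
    UserSummary combine(UserSummary other) {
        Set<String> domains = new HashSet<>(emailDomains);
        domains.addAll(other.emailDomains);
        return new UserSummary(userCount + other.userCount,
                               totalNameLength + other.totalNameLength,
                               domains);
    }

    @Override
    public String toString() {
        return String.format("UserCount: %d, TotalNameLength: %d, Domains: %s",
                             userCount, totalNameLength, emailDomains);
    }
}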

public class CustomReduceExample {
    public static void main(String[] args) {
        List<User> users = List.of(
            new User("alice", "alice@example.com"),
            new User("bob", "bob@openai.com"),
            new User("charlie", "charlie@example.com")
        );

        UserSummary summary = users.parallelStream()
                                   .reduce(
                                       new UserSummary(),        // identity
                                       UserSummary::accumulate,  // accumulator
                                       UserSummary::combine      // combiner
                                   );

        System.out.println(summary);
    }
}

Step-by-Step Walkthrough
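One way to picture a three-argument reduction is as two phases: each chunk of the input is folded into a partial result starting from the identity, and the partial results are then merged by the combiner. The sketch below replays that by hand for a two-way split and compares it with the stream version. It is an illustration under stated assumptions: Tally, add, merge, replay, and viaStream are hypothetical names, the username is taken to be the part of the address before '@', and a real parallel stream may split the input differently.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class ReduceWalkthrough {

    // Hypothetical immutable summary; every step returns a new instance.
    static final class Tally {
        final int count;
        final int nameLength;
        final Set<String> domains;

        Tally(int count, int nameLength, Set<String> domains) {
            this.count = count;
            this.nameLength = nameLength;
            this.domains = domains;
        }

        // Accumulator step: fold one "name@domain" address into a new Tally.
        Tally add(String email) {
            int at = email.indexOf('@'); // assumes the address contains '@'
            Set<String> merged = new TreeSet<>(domains);
            merged.add(email.substring(at + 1));
            return new Tally(count + 1, nameLength + at, merged);
        }

        // Combiner step: merge two partial results into a new Tally.
        Tally merge(Tally other) {
            Set<String> merged = new TreeSet<>(domains);
            merged.addAll(other.domains);
            return new Tally(count + other.count, nameLength + other.nameLength, merged);
        }

        @Override
        public String toString() {
            return count + " users, " + nameLength + " name chars, domains " + domains;
        }
    }

    // Replays a possible two-way split: accumulate each chunk, then combine.
    static Tally replay(List<String> left, List<String> right) {
        Tally identity = new Tally(0, 0, new TreeSet<>());
        Tally leftPartial = identity;
        for (String email : left) {
            leftPartial = leftPartial.add(email);
        }
        Tally rightPartial = identity;
        for (String email : right) {
            rightPartial = rightPartial.add(email);
        }
        return leftPartial.merge(rightPartial);
    }

    // The same reduction expressed with the three-argument reduce().
    static Tally viaStream(List<String> emails) {
        return emails.parallelStream()
                     .reduce(new Tally(0, 0, new TreeSet<>()), Tally::add, Tally::merge);
    }

    public static void main(String[] args) {
        Tally byHand = replay(
            List.of("alice@example.com", "bob@openai.com"),
            List.of("charlie@example.com"));
        System.out.println(byHand);
        // 3 users, 15 name chars, domains [example.com, openai.com]

        Tally byStream = viaStream(
            List.of("alice@example.com", "bob@openai.com", "charlie@example.com"));
        System.out.println(byStream.toString().equals(byHand.toString())); // true
    }
}
```

Because add and merge never modify an existing Tally, the shared identity stays untouched no matter how the work is divided.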

Why Use This Approach?

Summary

Custom reduction with reduce() lets you aggregate richer data structures by defining explicitly how partial results are accumulated and combined. When designing these operations, make sure the accumulator and combiner are associative and stateless, and that neither of them mutates the identity or an existing partial result; otherwise parallel execution can produce incorrect results.
