Index

Grouping and Partitioning Data

Java Streams

11.1 Grouping with Collectors.groupingBy()

The Collectors.groupingBy() method is one of the most powerful collectors in the Java Streams API. It groups the elements of a stream according to a classification function and returns a Map whose keys are the classification results and whose values are collections (or other reduced results) of the elements that share each key.

Basic Syntax

Map<K, List<T>> grouped = stream.collect(Collectors.groupingBy(classifier));

You can also specify downstream collectors to change the result collection type (e.g., Set, Map) or perform further reduction.
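As a rough illustration of the downstream options mentioned above, the following sketch uses two overloads of groupingBy(): one with a downstream collector (Collectors.counting()) and one that also takes a map factory (TreeMap::new) so that the keys iterate in sorted order. The class name, method names, and sample words are assumptions made up for this illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class GroupingDownstreamSketch {
    // Downstream reduction: each group is reduced to a count instead of a List.
    static Map<Integer, Long> countByLength(List<String> words) {
        return words.stream()
            .collect(Collectors.groupingBy(String::length, Collectors.counting()));
    }

    // Map factory plus downstream: a TreeMap keeps keys sorted, and each group becomes a Set.
    static Map<Integer, Set<String>> sortedSetsByLength(List<String> words) {
        return words.stream()
            .collect(Collectors.groupingBy(String::length, TreeMap::new, Collectors.toSet()));
    }

    public static void main(String[] args) {
        // Hypothetical sample data
        List<String> words = List.of("kiwi", "fig", "pear", "plum", "fig");

        System.out.println(countByLength(words).get(4));        // 3
        System.out.println(sortedSetsByLength(words).keySet()); // [3, 4]
        System.out.println(sortedSetsByLength(words).get(3));   // [fig]
    }
}
```

Passing a map factory is one way to get a predictable key order when the result is printed or iterated.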

Example 1: Grouping Users by Age Bracket

Suppose we want to group a list of users by their age bracket (e.g., "Youth", "Adult", "Senior"):

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupingExample1 {
    static class User {
        String name;
        int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public String toString() {
            return name + "(" + age + ")";
        }
    }

    public static void main(String[] args) {
        List<User> users = List.of(
            new User("Alice", 23),
            new User("Bob", 45),
            new User("Charlie", 17),
            new User("Diana", 65),
            new User("Eve", 34)
        );

        Map<String, List<User>> groupedByAgeBracket = users.stream()
            .collect(Collectors.groupingBy(user -> {
                if (user.age < 18) return "Youth";
                else if (user.age < 60) return "Adult";
                else return "Senior";
            }));

        System.out.println("Users grouped by age bracket:");
        groupedByAgeBracket.forEach((ageGroup, groupUsers) -> {
            System.out.println(ageGroup + ": " + groupUsers);
        });
    }
}

Output (the order of the keys may vary, because the returned Map does not guarantee any iteration order):

Users grouped by age bracket:
Youth: [Charlie(17)]
Adult: [Alice(23), Bob(45), Eve(34)]
Senior: [Diana(65)]

Example 2: Grouping Orders by Status with Different Collection Types

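This example groups orders by their status and collects each group into a Set instead of the default List. Note that a Set removes duplicates only according to equals() and hashCode(); because Order does not override them here, two distinct Order objects with the same id would both be kept.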

import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class GroupingExample2 {
    static class Order {
        String id;
        String status;

        Order(String id, String status) {
            this.id = id;
            this.status = status;
        }

        @Override
        public String toString() {
            return id;
        }
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
            new Order("1001", "SHIPPED"),
            new Order("1002", "PENDING"),
            new Order("1003", "SHIPPED"),
            new Order("1004", "CANCELLED"),
            new Order("1005", "PENDING")
        );

        Map<String, Set<Order>> groupedByStatus = orders.stream()
            .collect(Collectors.groupingBy(
                order -> order.status,
                Collectors.toSet()  // Collect into a Set instead of default List
            ));

        System.out.println("Orders grouped by status:");
        groupedByStatus.forEach((status, orderSet) -> {
            System.out.println(status + ": " + orderSet);
        });
    }
}

Output (the order of the keys and of the elements within each Set may vary):

Orders grouped by status:
SHIPPED: [1003, 1001]
PENDING: [1005, 1002]
CANCELLED: [1004]
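Collecting into a Set removes duplicates only when the element type defines value-based equality. The sketch below is a hypothetical variant of the Order class that overrides equals() and hashCode(), so that two orders with the same id and status collapse into one entry. The class and method names are assumptions for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.stream.Collectors;

public class SetDeduplicationSketch {
    // Hypothetical Order variant with value-based equality on id and status.
    static final class Order {
        final String id;
        final String status;

        Order(String id, String status) {
            this.id = id;
            this.status = status;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Order)) return false;
            Order other = (Order) o;
            return id.equals(other.id) && status.equals(other.status);
        }

        @Override
        public int hashCode() {
            return Objects.hash(id, status);
        }

        @Override
        public String toString() {
            return id;
        }
    }

    // Groups orders by status; equal orders are stored once per Set.
    static Map<String, Set<Order>> groupByStatus(List<Order> orders) {
        return orders.stream()
            .collect(Collectors.groupingBy(order -> order.status, Collectors.toSet()));
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
            new Order("1001", "SHIPPED"),
            new Order("1001", "SHIPPED"),  // duplicate by value
            new Order("1003", "SHIPPED")
        );

        // The duplicate is dropped, so two distinct orders remain.
        System.out.println(groupByStatus(orders).get("SHIPPED").size()); // 2
    }
}
```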

Summary

By mastering groupingBy(), you can transform flat streams into structured maps that reflect the logical organization of your data.


11.2 Multi-level Grouping

Multi-level grouping allows you to group data hierarchically by applying Collectors.groupingBy() multiple times in a nested fashion. This technique produces nested maps, where the value of one grouping is itself a map resulting from a further grouping operation.

Why Multi-level Grouping?

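Sometimes, a single classification is not enough. For example, you may want to group employees by department and then by role, or customers by country and then by city.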

Syntax Overview

Map<K1, Map<K2, List<T>>> nestedGrouping = stream.collect(
    Collectors.groupingBy(
        classifier1,
        Collectors.groupingBy(classifier2)
    )
);

Here, classifier1 is the outer grouping function, and classifier2 is the inner grouping function applied within each outer group.
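The inner groupingBy() can itself take a downstream collector, so the innermost values do not have to be lists. The following sketch, with a hypothetical Employee class and getter names chosen for illustration, counts employees per department and role and yields a Map<String, Map<String, Long>>.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class NestedCountingSketch {
    // Hypothetical element type for this illustration.
    static final class Employee {
        private final String name;
        private final String department;
        private final String role;

        Employee(String name, String department, String role) {
            this.name = name;
            this.department = department;
            this.role = role;
        }

        String getDepartment() { return department; }
        String getRole() { return role; }
    }

    // Outer grouping by department, inner grouping by role, innermost reduction to a count.
    static Map<String, Map<String, Long>> headcount(List<Employee> employees) {
        return employees.stream()
            .collect(Collectors.groupingBy(
                Employee::getDepartment,
                Collectors.groupingBy(Employee::getRole, Collectors.counting())
            ));
    }

    public static void main(String[] args) {
        List<Employee> employees = List.of(
            new Employee("Alice", "HR", "Manager"),
            new Employee("Bob", "HR", "Recruiter"),
            new Employee("Charlie", "IT", "Developer"),
            new Employee("Diana", "IT", "Developer"),
            new Employee("Eve", "IT", "Manager")
        );

        Map<String, Map<String, Long>> counts = headcount(employees);
        System.out.println(counts.get("IT").get("Developer")); // 2
        System.out.println(counts.get("HR").get("Manager"));   // 1
    }
}
```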

Example 1: Grouping Employees by Department, then Role

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MultiLevelGroupingExample1 {
    static class Employee {
        String name;
        String department;
        String role;

        Employee(String name, String department, String role) {
            this.name = name;
            this.department = department;
            this.role = role;
        }

        @Override
        public String toString() {
            return name;
        }
    }

    public static void main(String[] args) {
        List<Employee> employees = List.of(
            new Employee("Alice", "HR", "Manager"),
            new Employee("Bob", "HR", "Recruiter"),
            new Employee("Charlie", "IT", "Developer"),
            new Employee("Diana", "IT", "Developer"),
            new Employee("Eve", "IT", "Manager")
        );

        Map<String, Map<String, List<Employee>>> grouped = employees.stream()
            .collect(Collectors.groupingBy(
                emp -> emp.department,
                Collectors.groupingBy(emp -> emp.role)
            ));

        System.out.println("Employees grouped by department and role:");
        grouped.forEach((dept, roleMap) -> {
            System.out.println(dept + ":");
            roleMap.forEach((role, emps) -> {
                System.out.println("  " + role + " -> " + emps);
            });
        });
    }
}

Output (the order of the map keys is not guaranteed and may vary):

Employees grouped by department and role:
HR:
  Manager -> [Alice]
  Recruiter -> [Bob]
IT:
  Developer -> [Charlie, Diana]
  Manager -> [Eve]

Example 2: Grouping Customers by Country and City

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MultiLevelGroupingExample2 {
    static class Customer {
        String name;
        String country;
        String city;

        Customer(String name, String country, String city) {
            this.name = name;
            this.country = country;
            this.city = city;
        }

        @Override
        public String toString() {
            return name;
        }
    }

    public static void main(String[] args) {
        List<Customer> customers = List.of(
            new Customer("John", "USA", "New York"),
            new Customer("Jane", "USA", "Boston"),
            new Customer("Pierre", "France", "Paris"),
            new Customer("Marie", "France", "Lyon"),
            new Customer("Steve", "USA", "New York")
        );

        Map<String, Map<String, List<Customer>>> groupedByCountryCity = customers.stream()
            .collect(Collectors.groupingBy(
                c -> c.country,
                Collectors.groupingBy(c -> c.city)
            ));

        System.out.println("Customers grouped by country and city:");
        groupedByCountryCity.forEach((country, cityMap) -> {
            System.out.println(country + ":");
            cityMap.forEach((city, custs) -> {
                System.out.println("  " + city + " -> " + custs);
            });
        });
    }
}

Output (the order of the map keys is not guaranteed and may vary):

Customers grouped by country and city:
USA:
  New York -> [John, Steve]
  Boston -> [Jane]
France:
  Paris -> [Pierre]
  Lyon -> [Marie]

Traversing Nested Groupings

The resulting nested map is a structure like:

Map<OuterKey, Map<InnerKey, List<Element>>>

You can traverse it using nested loops or stream operations, as shown in the examples. This structure allows you to drill down from coarse to fine groupings easily.
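As one possible stream-based traversal, the sketch below flattens a nested grouping into "outer/inner=size" strings with flatMap() and sorts them so the printed order is stable. The class name, method name, and the use of plain String elements are assumptions for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class NestedTraversalSketch {
    // Walks every (outer key, inner key) pair and reports the size of the innermost list.
    static List<String> flatten(Map<String, Map<String, List<String>>> nested) {
        return nested.entrySet().stream()
            .flatMap(outer -> outer.getValue().entrySet().stream()
                .map(inner -> outer.getKey() + "/" + inner.getKey()
                    + "=" + inner.getValue().size()))
            .sorted()  // sort the lines so the result does not depend on map iteration order
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical nested grouping, shaped like the country/city example
        Map<String, Map<String, List<String>>> nested = Map.of(
            "USA", Map.of(
                "New York", List.of("John", "Steve"),
                "Boston", List.of("Jane")),
            "France", Map.of(
                "Paris", List.of("Pierre"))
        );

        System.out.println(flatten(nested));
        // [France/Paris=1, USA/Boston=1, USA/New York=2]
    }
}
```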

Summary

Mastering multi-level grouping empowers you to organize complex datasets intuitively and efficiently within your Java Streams workflows.


11.3 Partitioning with Collectors.partitioningBy()

The Collectors.partitioningBy() method splits a stream into two groups based on a boolean predicate. Unlike groupingBy(), which can produce any number of groups keyed by any value, partitioningBy() returns a Map<Boolean, List<T>> by default, dividing elements into those that satisfy the predicate (true key) and those that don’t (false key). Both keys are always present in the result, even when one partition is empty.

Key Differences from groupingBy()

Aspect           | partitioningBy()                              | groupingBy()
-----------------|-----------------------------------------------|----------------------------
Number of groups | Always 2 (true and false)                     | Arbitrary number of groups
Return type      | Map<Boolean, List<T>> (by default)            | Map<K, List<T>>
Use case         | Binary classification                         | Multi-class classification
Performance      | Slightly more efficient for boolean grouping  | General-purpose grouping

partitioningBy() is simpler and may be slightly more efficient when you only need a true/false split, because the result has a fixed pair of keys rather than an arbitrary set of them.
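One practical difference is easy to miss: partitioningBy() always produces entries for both true and false, whereas groupingBy() with a boolean classifier creates only the keys that actually occur. The sketch below, with class and method names assumed for illustration, runs both collectors over a list that contains only odd numbers.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PartitionKeysSketch {
    // Splits numbers with partitioningBy(): both keys are always present.
    static Map<Boolean, List<Integer>> partitionEven(List<Integer> numbers) {
        return numbers.stream()
            .collect(Collectors.partitioningBy(n -> n % 2 == 0));
    }

    // Same predicate used as a groupingBy() classifier: only observed keys appear.
    static Map<Boolean, List<Integer>> groupEven(List<Integer> numbers) {
        return numbers.stream()
            .collect(Collectors.groupingBy(n -> n % 2 == 0));
    }

    public static void main(String[] args) {
        List<Integer> odds = List.of(1, 3, 5);

        System.out.println(partitionEven(odds)); // {false=[1, 3, 5], true=[]}
        System.out.println(groupEven(odds));     // {false=[1, 3, 5]}

        // With groupingBy(), the missing key yields null rather than an empty list.
        System.out.println(groupEven(odds).get(true)); // null
    }
}
```

With partitioningBy(), calling get(true) or get(false) never returns null, which avoids a null check before using either partition.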

Example 1: Partitioning Numbers into Even and Odd

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PartitioningExample1 {
    public static void main(String[] args) {
        List<Integer> numbers = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

        Map<Boolean, List<Integer>> partitioned = numbers.stream()
            .collect(Collectors.partitioningBy(n -> n % 2 == 0));

        System.out.println("Even numbers: " + partitioned.get(true));
        System.out.println("Odd numbers: " + partitioned.get(false));
    }
}

Output:

Even numbers: [2, 4, 6, 8, 10]
Odd numbers: [1, 3, 5, 7, 9]

Example 2: Partitioning Users into Active and Inactive

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PartitioningExample2 {
    static class User {
        String name;
        boolean active;

        User(String name, boolean active) {
            this.name = name;
            this.active = active;
        }

        @Override
        public String toString() {
            return name;
        }
    }

    public static void main(String[] args) {
        List<User> users = List.of(
            new User("Alice", true),
            new User("Bob", false),
            new User("Charlie", true),
            new User("Diana", false)
        );

        Map<Boolean, List<User>> partitioned = users.stream()
            .collect(Collectors.partitioningBy(user -> user.active));

        System.out.println("Active users: " + partitioned.get(true));
        System.out.println("Inactive users: " + partitioned.get(false));
    }
}

Output:

Active users: [Alice, Charlie]
Inactive users: [Bob, Diana]

Using Downstream Collectors

You can further customize the result of each partition with a downstream collector. For example, reusing the users list from Example 2, you can count how many users are active vs. inactive:

Map<Boolean, Long> countByActivity = users.stream()
    .collect(Collectors.partitioningBy(
        user -> user.active,
        Collectors.counting()
    ));

System.out.println("Count by activity: " + countByActivity);

Output:

Count by activity: {false=2, true=2}
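Downstream collectors can also transform the elements of each partition before collecting them. The following sketch, using a hypothetical User class with getters chosen for illustration, combines Collectors.mapping() and Collectors.joining() to produce one comma-separated string of names per partition.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PartitionDownstreamSketch {
    // Hypothetical element type for this illustration.
    static final class User {
        private final String name;
        private final boolean active;

        User(String name, boolean active) {
            this.name = name;
            this.active = active;
        }

        String getName() { return name; }
        boolean isActive() { return active; }
    }

    // Partitions by the active flag, then maps each user to a name and joins the names.
    static Map<Boolean, String> namesByActivity(List<User> users) {
        return users.stream()
            .collect(Collectors.partitioningBy(
                User::isActive,
                Collectors.mapping(User::getName, Collectors.joining(", "))
            ));
    }

    public static void main(String[] args) {
        List<User> users = List.of(
            new User("Alice", true),
            new User("Bob", false),
            new User("Charlie", true),
            new User("Diana", false)
        );

        Map<Boolean, String> names = namesByActivity(users);
        System.out.println("Active: " + names.get(true));    // Active: Alice, Charlie
        System.out.println("Inactive: " + names.get(false)); // Inactive: Bob, Diana
    }
}
```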

Summary

Using partitioningBy() makes binary data splits clear and concise, and it states the intent of a true/false split more directly than groupingBy() with a boolean classifier.
