Grouping and Partitioning Data

Java Streams

11.1 Grouping with `Collectors.groupingBy()`

The Collectors.groupingBy() method is one of the most powerful collectors in the Java Streams API. It groups the elements of a stream according to a classification function and collects them into a Map whose keys are the classification results and whose values are collections (or other results) of the elements that share each key.

Basic Syntax

Map<K, List<T>> grouped = stream.collect(Collectors.groupingBy(classifier));

classifier: a function that maps each element to a key (the grouping criterion).
By default, elements with the same key are collected into a List.

You can also specify a downstream collector to change the type of the values (e.g., a Set or a nested Map) or to reduce each group further (e.g., by counting or summing).

Example 1: Grouping Users by Age Bracket

Suppose we want to group a list of users by their age bracket (e.g., "Youth", "Adult", "Senior"):

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupingExample1 {
    static class User {
        String name;
        int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public String toString() {
            return name + "(" + age + ")";
        }
    }

    public static void main(String[] args) {
        List<User> users = List.of(
            new User("Alice", 23),
            new User("Bob", 45),
            new User("Charlie", 17),
            new User("Diana", 65),
            new User("Eve", 34)
        );

        Map<String, List<User>> groupedByAgeBracket = users.stream()
            .collect(Collectors.groupingBy(user -> {
                if (user.age < 18) return "Youth";
                else if (user.age < 60) return "Adult";
                else return "Senior";
            }));

        System.out.println("Users grouped by age bracket:");
        groupedByAgeBracket.forEach((ageGroup, groupUsers) -> {
            System.out.println(ageGroup + ": " + groupUsers);
        });
    }
}

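Output (the order of the groups may differ; groupingBy() makes no guarantee about the type of Map it returns, and the default is hash-based in practice):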

Users grouped by age bracket:
Youth: [Charlie(17)]
Adult: [Alice(23), Bob(45), Eve(34)]
Senior: [Diana(65)]
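
When a predictable key order matters, the three-argument overload of groupingBy() accepts a map factory. The following is a minimal sketch, assuming a sorted TreeMap is acceptable; the class name SortedGroupingSketch and the helper bracket() are hypothetical and only mirror the age brackets used above.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class SortedGroupingSketch {
    // Hypothetical helper: maps an age to the same brackets used in Example 1.
    static String bracket(int age) {
        if (age < 18) return "Youth";
        else if (age < 60) return "Adult";
        else return "Senior";
    }

    // Groups ages by bracket into a TreeMap, so keys iterate in sorted order.
    static Map<String, List<Integer>> groupByBracket(List<Integer> ages) {
        return ages.stream()
            .collect(Collectors.groupingBy(
                SortedGroupingSketch::bracket, // classifier
                TreeMap::new,                  // map factory
                Collectors.toList()            // downstream collector
            ));
    }

    public static void main(String[] args) {
        // Prints {Adult=[23, 45, 34], Senior=[65], Youth=[17]}
        System.out.println(groupByBracket(List.of(23, 45, 17, 65, 34)));
    }
}
```

The keys come out in alphabetical order here; pass LinkedHashMap::new instead if the groups should appear in the order their keys are first encountered.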

Example 2: Grouping Orders by Status with Different Collection Types

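This example groups orders by their status and collects each group into a Set instead of the default List. Note that Order does not override equals() and hashCode(), so the Set only eliminates repeated references to the same object; define both methods if orders with equal fields should count as duplicates.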

import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class GroupingExample2 {
    static class Order {
        String id;
        String status;

        Order(String id, String status) {
            this.id = id;
            this.status = status;
        }

        @Override
        public String toString() {
            return id;
        }
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
            new Order("1001", "SHIPPED"),
            new Order("1002", "PENDING"),
            new Order("1003", "SHIPPED"),
            new Order("1004", "CANCELLED"),
            new Order("1005", "PENDING")
        );

        Map<String, Set<Order>> groupedByStatus = orders.stream()
            .collect(Collectors.groupingBy(
                order -> order.status,
                Collectors.toSet()  // Collect into a Set instead of default List
            ));

        System.out.println("Orders grouped by status:");
        groupedByStatus.forEach((status, orderSet) -> {
            System.out.println(status + ": " + orderSet);
        });
    }
}

Output (the order of the groups, and of the orders within each Set, may vary):

Orders grouped by status:
SHIPPED: [1003, 1001]
PENDING: [1005, 1002]
CANCELLED: [1004]
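
A downstream collector can also reduce each group to a single value rather than a collection. The sketch below illustrates this with counting(), summingInt(), and mapping(); the class DownstreamAggregationSketch and the amountCents field are assumptions added for illustration and are not part of the Order class above.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class DownstreamAggregationSketch {
    // Hypothetical order type; amountCents is an assumed field for this sketch.
    static class Order {
        final String id;
        final String status;
        final int amountCents;

        Order(String id, String status, int amountCents) {
            this.id = id;
            this.status = status;
            this.amountCents = amountCents;
        }
    }

    static List<Order> sampleOrders() {
        return List.of(
            new Order("1001", "SHIPPED", 2500),
            new Order("1002", "PENDING", 1000),
            new Order("1003", "SHIPPED", 4000),
            new Order("1004", "CANCELLED", 750),
            new Order("1005", "PENDING", 1250)
        );
    }

    // Number of orders per status.
    static Map<String, Long> countByStatus(List<Order> orders) {
        return orders.stream()
            .collect(Collectors.groupingBy(o -> o.status, Collectors.counting()));
    }

    // Sum of amounts (in cents) per status.
    static Map<String, Integer> totalCentsByStatus(List<Order> orders) {
        return orders.stream()
            .collect(Collectors.groupingBy(
                o -> o.status,
                Collectors.summingInt(o -> o.amountCents)));
    }

    // Only the order ids per status, in encounter order.
    static Map<String, List<String>> idsByStatus(List<Order> orders) {
        return orders.stream()
            .collect(Collectors.groupingBy(
                o -> o.status,
                Collectors.mapping(o -> o.id, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<Order> orders = sampleOrders();
        System.out.println("SHIPPED count: " + countByStatus(orders).get("SHIPPED"));       // 2
        System.out.println("PENDING total: " + totalCentsByStatus(orders).get("PENDING"));  // 2250
        System.out.println("SHIPPED ids: " + idsByStatus(orders).get("SHIPPED"));           // [1001, 1003]
    }
}
```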

Summary

Collectors.groupingBy() groups stream elements according to a classification function, producing a Map<K, List<T>> by default.
You can customize the downstream collector to change the type of the values, e.g., Set, Map, or aggregate results.
Grouping is essential for data categorization, reporting, and aggregation tasks, enabling concise and expressive processing pipelines.

By mastering groupingBy(), you can transform flat streams into structured maps that reflect the logical organization of your data.

11.2 Multi-level Grouping

Multi-level grouping allows you to group data hierarchically by applying Collectors.groupingBy() multiple times in a nested fashion. This technique produces nested maps, where the value of one grouping is itself a map resulting from a further grouping operation.

Why Multi-level Grouping?

Sometimes, a single classification is not enough. For example:

Grouping employees first by department, then by role within each department.
Grouping customers by country, then by city.
Grouping products by category, then by brand.

Syntax Overview

Map<K1, Map<K2, List<T>>> nestedGrouping = stream.collect(
    Collectors.groupingBy(
        classifier1,
        Collectors.groupingBy(classifier2)
    )
);

Here, classifier1 is the outer grouping function, and classifier2 is the inner grouping function applied within each outer group.

Example 1: Grouping Employees by Department, then Role

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MultiLevelGroupingExample1 {
    static class Employee {
        String name;
        String department;
        String role;

        Employee(String name, String department, String role) {
            this.name = name;
            this.department = department;
            this.role = role;
        }

        @Override
        public String toString() {
            return name;
        }
    }

    public static void main(String[] args) {
        List<Employee> employees = List.of(
            new Employee("Alice", "HR", "Manager"),
            new Employee("Bob", "HR", "Recruiter"),
            new Employee("Charlie", "IT", "Developer"),
            new Employee("Diana", "IT", "Developer"),
            new Employee("Eve", "IT", "Manager")
        );

        Map<String, Map<String, List<Employee>>> grouped = employees.stream()
            .collect(Collectors.groupingBy(
                emp -> emp.department,
                Collectors.groupingBy(emp -> emp.role)
            ));

        System.out.println("Employees grouped by department and role:");
        grouped.forEach((dept, roleMap) -> {
            System.out.println(dept + ":");
            roleMap.forEach((role, emps) -> {
                System.out.println("  " + role + " -> " + emps);
            });
        });
    }
}

Output (the order of the map entries may vary):

Employees grouped by department and role:
HR:
  Manager -> [Alice]
  Recruiter -> [Bob]
IT:
  Developer -> [Charlie, Diana]
  Manager -> [Eve]

Example 2: Grouping Customers by Country and City

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MultiLevelGroupingExample2 {
    static class Customer {
        String name;
        String country;
        String city;

        Customer(String name, String country, String city) {
            this.name = name;
            this.country = country;
            this.city = city;
        }

        @Override
        public String toString() {
            return name;
        }
    }

    public static void main(String[] args) {
        List<Customer> customers = List.of(
            new Customer("John", "USA", "New York"),
            new Customer("Jane", "USA", "Boston"),
            new Customer("Pierre", "France", "Paris"),
            new Customer("Marie", "France", "Lyon"),
            new Customer("Steve", "USA", "New York")
        );

        Map<String, Map<String, List<Customer>>> groupedByCountryCity = customers.stream()
            .collect(Collectors.groupingBy(
                c -> c.country,
                Collectors.groupingBy(c -> c.city)
            ));

        System.out.println("Customers grouped by country and city:");
        groupedByCountryCity.forEach((country, cityMap) -> {
            System.out.println(country + ":");
            cityMap.forEach((city, custs) -> {
                System.out.println("  " + city + " -> " + custs);
            });
        });
    }
}

Output (the order of the map entries may vary):

Customers grouped by country and city:
USA:
  New York -> [John, Steve]
  Boston -> [Jane]
France:
  Paris -> [Pierre]
  Lyon -> [Marie]
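
The innermost level does not have to be a List. As a minimal sketch, assuming only the number of customers per city is needed, the inner groupingBy() can take counting() as its own downstream collector. The class name NestedCountingSketch is hypothetical, and its Customer type simply mirrors the one in the example above.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class NestedCountingSketch {
    // Mirrors the Customer class from the example above.
    static class Customer {
        final String name;
        final String country;
        final String city;

        Customer(String name, String country, String city) {
            this.name = name;
            this.country = country;
            this.city = city;
        }
    }

    static List<Customer> sampleCustomers() {
        return List.of(
            new Customer("John", "USA", "New York"),
            new Customer("Jane", "USA", "Boston"),
            new Customer("Pierre", "France", "Paris"),
            new Customer("Marie", "France", "Lyon"),
            new Customer("Steve", "USA", "New York")
        );
    }

    // country -> city -> number of customers
    static Map<String, Map<String, Long>> countByCountryAndCity(List<Customer> customers) {
        return customers.stream()
            .collect(Collectors.groupingBy(
                c -> c.country,
                Collectors.groupingBy(c -> c.city, Collectors.counting())
            ));
    }

    public static void main(String[] args) {
        Map<String, Map<String, Long>> counts = countByCountryAndCity(sampleCustomers());
        System.out.println("USA / New York: " + counts.get("USA").get("New York")); // 2
        System.out.println("France / Lyon: " + counts.get("France").get("Lyon"));   // 1
    }
}
```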

Traversing Nested Groupings

The resulting nested map is a structure like:

Map<OuterKey, Map<InnerKey, List<Element>>>

You can traverse it with nested loops or nested forEach() calls, as in the examples, or with stream operations. This structure lets you drill down from coarse to fine groupings easily.
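
A stream-based traversal and a null-safe lookup might look like the sketch below. The class NestedTraversalSketch and its helpers group(), flatten(), and lookup() are hypothetical names, and the input rows of {name, country, city} are an assumption chosen to keep the example short.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class NestedTraversalSketch {
    // Sample rows of {name, country, city}.
    static List<String[]> sampleRows() {
        return List.of(
            new String[] {"John", "USA", "New York"},
            new String[] {"Jane", "USA", "Boston"},
            new String[] {"Pierre", "France", "Paris"},
            new String[] {"Marie", "France", "Lyon"},
            new String[] {"Steve", "USA", "New York"}
        );
    }

    // Builds country -> city -> names.
    static Map<String, Map<String, List<String>>> group(List<String[]> rows) {
        return rows.stream()
            .collect(Collectors.groupingBy(
                row -> row[1],
                Collectors.groupingBy(
                    row -> row[2],
                    Collectors.mapping(row -> row[0], Collectors.toList())
                )
            ));
    }

    // Flattens the nested map into sorted "outer/inner: values" lines.
    static List<String> flatten(Map<String, Map<String, List<String>>> nested) {
        return nested.entrySet().stream()
            .flatMap(outer -> outer.getValue().entrySet().stream()
                .map(inner -> outer.getKey() + "/" + inner.getKey() + ": " + inner.getValue()))
            .sorted()
            .collect(Collectors.toList());
    }

    // Looks up one inner group; returns an empty list if either key is absent.
    static List<String> lookup(Map<String, Map<String, List<String>>> nested,
                               String outerKey, String innerKey) {
        return nested.getOrDefault(outerKey, Map.of())
                     .getOrDefault(innerKey, List.of());
    }

    public static void main(String[] args) {
        Map<String, Map<String, List<String>>> nested = group(sampleRows());
        flatten(nested).forEach(System.out::println);
        System.out.println(lookup(nested, "USA", "New York")); // [John, Steve]
        System.out.println(lookup(nested, "Spain", "Madrid")); // []
    }
}
```

Using getOrDefault() at each level avoids a NullPointerException when the outer key is missing, which a chained get() call would not.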

Summary

Multi-level grouping is a natural extension of groupingBy() for hierarchical data classification.
The outer collector defines the first grouping level, and the inner collector defines subsequent levels.
The result is a nested Map structure that supports flexible data navigation and aggregation.

Mastering multi-level grouping empowers you to organize complex datasets intuitively and efficiently within your Java Streams workflows.

11.3 Partitioning with `Collectors.partitioningBy()`

The Collectors.partitioningBy() method splits a stream into two groups based on a boolean predicate. Unlike groupingBy(), which can produce any number of groups keyed by any value, partitioningBy() always produces a Map with Boolean keys. By default this is a Map<Boolean, List<T>> that holds the elements satisfying the predicate under the true key and the remaining elements under the false key. Both keys are always present, even when one partition is empty.

Key Differences from `groupingBy()`

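Aspect	`partitioningBy()`	`groupingBy()`
Number of groups	Always 2 (true and false)	Arbitrary number of groups
Keys present	Both true and false, even if a partition is empty	Only keys produced by at least one element
Return type	`Map<Boolean, List<T>>` (by default)	`Map<K, List<T>>` (by default)
Use case	Binary classification	Multi-class classification
Performance	May be slightly cheaper for a boolean split (implementation-dependent)	General-purpose grouping

partitioningBy() states the intent more directly when you only need a true/false split. It may also be marginally faster, although the difference is rarely significant.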
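
One practical difference is how the two collectors treat an empty group. The sketch below, with the hypothetical class name EmptyPartitionSketch, applies the same even/odd predicate through both collectors to a list that contains no even numbers.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class EmptyPartitionSketch {
    // Splits numbers with partitioningBy(): both keys are always present.
    static Map<Boolean, List<Integer>> partitionEven(List<Integer> numbers) {
        return numbers.stream()
            .collect(Collectors.partitioningBy(n -> n % 2 == 0));
    }

    // Same predicate with groupingBy(): only keys that occur are present.
    static Map<Boolean, List<Integer>> groupEven(List<Integer> numbers) {
        return numbers.stream()
            .collect(Collectors.groupingBy(n -> n % 2 == 0));
    }

    public static void main(String[] args) {
        List<Integer> odds = List.of(1, 3, 5);

        // partitioningBy(): the true key maps to an empty list.
        System.out.println(partitionEven(odds).get(true)); // []

        // groupingBy(): the true key is absent, so get() returns null.
        System.out.println(groupEven(odds).get(true));     // null
    }
}
```

With partitioningBy(), callers can read both partitions without a null check; with groupingBy(), a lookup such as getOrDefault() is the safer choice.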

Example 1: Partitioning Numbers into Even and Odd

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PartitioningExample1 {
    public static void main(String[] args) {
        List<Integer> numbers = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

        Map<Boolean, List<Integer>> partitioned = numbers.stream()
            .collect(Collectors.partitioningBy(n -> n % 2 == 0));

        System.out.println("Even numbers: " + partitioned.get(true));
        System.out.println("Odd numbers: " + partitioned.get(false));
    }
}

Output:

Even numbers: [2, 4, 6, 8, 10]
Odd numbers: [1, 3, 5, 7, 9]

Example 2: Partitioning Users into Active and Inactive

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PartitioningExample2 {
    static class User {
        String name;
        boolean active;

        User(String name, boolean active) {
            this.name = name;
            this.active = active;
        }

        @Override
        public String toString() {
            return name;
        }
    }

    public static void main(String[] args) {
        List<User> users = List.of(
            new User("Alice", true),
            new User("Bob", false),
            new User("Charlie", true),
            new User("Diana", false)
        );

        Map<Boolean, List<User>> partitioned = users.stream()
            .collect(Collectors.partitioningBy(user -> user.active));

        System.out.println("Active users: " + partitioned.get(true));
        System.out.println("Inactive users: " + partitioned.get(false));
    }
}

Output:

Active users: [Alice, Charlie]
Inactive users: [Bob, Diana]

Using Downstream Collectors

You can further customize the collection of each partition with a downstream collector. For example, counting how many users are active vs inactive:

Map<Boolean, Long> countByActivity = users.stream()
    .collect(Collectors.partitioningBy(
        user -> user.active,
        Collectors.counting()
    ));

System.out.println("Count by activity: " + countByActivity);

Output:

Count by activity: {false=2, true=2}
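
Other downstream collectors work the same way. As a minimal sketch, assuming each partition should be reported as a comma-separated string of names, mapping() can be combined with joining(). The class name PartitionNamesSketch is hypothetical, and its User type mirrors the one in Example 2.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PartitionNamesSketch {
    // Mirrors the User class from Example 2.
    static class User {
        final String name;
        final boolean active;

        User(String name, boolean active) {
            this.name = name;
            this.active = active;
        }
    }

    static List<User> sampleUsers() {
        return List.of(
            new User("Alice", true),
            new User("Bob", false),
            new User("Charlie", true),
            new User("Diana", false)
        );
    }

    // true -> names of active users, false -> names of inactive users
    static Map<Boolean, String> namesByActivity(List<User> users) {
        return users.stream()
            .collect(Collectors.partitioningBy(
                user -> user.active,
                Collectors.mapping(user -> user.name, Collectors.joining(", "))
            ));
    }

    public static void main(String[] args) {
        Map<Boolean, String> names = namesByActivity(sampleUsers());
        System.out.println("Active: " + names.get(true));    // Active: Alice, Charlie
        System.out.println("Inactive: " + names.get(false)); // Inactive: Bob, Diana
    }
}
```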

Summary

partitioningBy() splits stream elements into two groups based on a boolean predicate, returning a Map<Boolean, List<T>> by default.
It is simpler than groupingBy(), and may be slightly more efficient, when only binary partitioning is needed.
Downstream collectors can be used to perform further reductions on each partition.
Common use cases include separating even/odd numbers, active/inactive users, or any yes/no classification.

Using partitioningBy() makes binary data splits clear and concise, which improves code readability.
