Java Streams provide many built-in collectors, but sometimes you need a custom collector to handle special data aggregation or formatting needs. Writing your own collector involves implementing the Collector interface or using the convenient Collector.of() factory method by specifying five core components:
Component | Description |
---|---|
Supplier | Creates a new mutable container to hold partial results (e.g., a StringBuilder or list). |
Accumulator | Adds an element from the stream to the mutable container. |
Combiner | Merges two partial containers (needed for parallel processing). |
Finisher | Converts the container into the final desired result (may be identity if no conversion needed). |
Characteristics | Provides hints about the collector's behavior, such as UNORDERED or IDENTITY_FINISH. |
Suppose you want to collect a stream of strings into a single CSV line, enclosed in quotes and separated by commas, such as:
"apple","banana","cherry"
Here’s how to build this collector step-by-step:
import java.util.List;
import java.util.stream.Collector;

public class CustomCollectorExample {
    public static void main(String[] args) {
        var fruits = List.of("apple", "banana", "cherry");
        String csv = fruits.stream()
                .collect(csvCollector());
        System.out.println(csv); // "apple","banana","cherry"
    }

    public static Collector<String, StringBuilder, String> csvCollector() {
        return Collector.of(
                // Supplier: create a new StringBuilder to accumulate results
                StringBuilder::new,
                // Accumulator: prepend a comma unless this is the first element,
                // then add the string wrapped in quotes
                (sb, s) -> {
                    if (sb.length() > 0) sb.append(",");
                    sb.append("\"").append(s).append("\"");
                },
                // Combiner: merge two StringBuilders (needed for parallel streams)
                (sb1, sb2) -> {
                    if (sb1.length() == 0) return sb2;
                    if (sb2.length() == 0) return sb1;
                    sb1.append(",").append(sb2);
                    return sb1;
                },
                // Finisher: convert StringBuilder to String (final output).
                // No characteristics are passed: the finisher is not the identity,
                // so IDENTITY_FINISH must not be declared.
                StringBuilder::toString
        );
    }
}
Supplier: Creates an empty StringBuilder to accumulate elements.
Accumulator: For each element s, appends a comma if the builder is not empty, then adds the string enclosed in quotes.
Combiner: Joins two partial StringBuilder results by appending a comma between them, which is what makes parallel execution produce the same line as sequential execution.
Finisher: Converts the StringBuilder into the final String.
Characteristics: None are specified. IDENTITY_FINISH would be wrong here because the finisher converts a StringBuilder into a String. Declaring it allows the framework to skip the finisher and cast the container directly to the result type, which typically fails at runtime with a ClassCastException.
Built-in collectors such as Collectors.joining() work well for simple joins but may not handle complex formatting or aggregation logic.
By understanding these components and how to implement them, you can create efficient, flexible collectors that behave correctly on both sequential and parallel streams.
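Before writing a custom collector, it is worth checking whether built-in collectors can be composed to do the job. The following is a minimal sketch of the same quoted CSV line using only Collectors.mapping() and Collectors.joining(); the class name BuiltInCsvSketch and the method toCsv are illustrative names, not part of the original example.

```java
import java.util.List;
import java.util.stream.Collectors;

class BuiltInCsvSketch {
    // Quote each element, then join the quoted values with commas.
    static String toCsv(List<String> values) {
        return values.stream()
                .collect(Collectors.mapping(
                        s -> "\"" + s + "\"",
                        Collectors.joining(",")));
    }

    public static void main(String[] args) {
        // prints "apple","banana","cherry"
        System.out.println(toCsv(List.of("apple", "banana", "cherry")));
    }
}
```

A custom collector becomes the better choice once the formatting or aggregation logic no longer fits this kind of composition.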
The Collector interface in Java Streams defines how to reduce a stream of elements into a single summary result. It abstracts the process of mutable reduction: accumulating elements, combining partial results (for parallel processing), and finishing with a final transformation.
Type parameters T, A, R:
T: The type of input elements in the stream.
A: The mutable accumulation type, i.e., the container or intermediate data structure that holds partial results during processing.
R: The result type returned after the final transformation.
A Collector<T, A, R> consists of these essential components:
Component | Role |
---|---|
supplier() | Provides a fresh mutable container (A) to hold partial results. |
accumulator() | Takes the container (A) and an element (T) and incorporates the element into the container. |
combiner() | Merges two containers (A and A) into one; critical for parallel execution, where partial results are combined. |
finisher() | Transforms the container (A) into the final result type (R). Often an identity function if no transformation is needed. |
characteristics() | Describes behavioral hints about the collector, like concurrency or ordering. |
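To make the three type parameters concrete, here is a small sketch in which T, A, and R are all different types. It averages string lengths, using a two-slot long[] as the container. The class and method names (AverageLengthSketch, averagingLength) are made up for this illustration, and returning 0.0 for an empty stream is an assumption of the sketch.

```java
import java.util.List;
import java.util.stream.Collector;

class AverageLengthSketch {
    // T = String (input), A = long[] holding {sum, count}, R = Double (result).
    static Collector<String, long[], Double> averagingLength() {
        return Collector.of(
                () -> new long[2],                                    // supplier
                (acc, s) -> { acc[0] += s.length(); acc[1]++; },      // accumulator
                (a, b) -> { a[0] += b[0]; a[1] += b[1]; return a; },  // combiner
                acc -> acc[1] == 0 ? 0.0 : (double) acc[0] / acc[1]); // finisher
    }

    public static void main(String[] args) {
        double avg = List.of("a", "bbb").stream().collect(averagingLength());
        System.out.println(avg); // prints 2.0
    }
}
```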
The characteristics() method returns a Set<Collector.Characteristics> with zero or more of these flags:
CONCURRENT: Indicates that the accumulator function can be called concurrently from multiple threads on the same container, so the container itself must be thread-safe. This lets the framework reduce a parallel stream into one shared container instead of combining partial containers. It does so only when the stream is unordered or the collector is also UNORDERED.
UNORDERED: Declares that the collector does not care about the encounter order of the stream elements. This can enable optimizations.
IDENTITY_FINISH: Means the finisher is the identity function, so the framework may skip it and use the container directly as the result. An unchecked cast from the accumulator type A to the result type R must therefore succeed, which in practice means they are the same type.
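How these flags end up on a collector depends on which Collector.of() overload is used. The sketch below, with illustrative names (CharacteristicsSketch, toArrayList, concatenating), relies on the documented behavior that the overload without a finisher adds IDENTITY_FINISH on its own, while the overload with a finisher declares only the flags passed to it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;

class CharacteristicsSketch {
    // No finisher argument: Collector.of marks the collector IDENTITY_FINISH.
    static Collector<String, List<String>, List<String>> toArrayList() {
        return Collector.of(
                ArrayList::new,
                (list, s) -> list.add(s),
                (left, right) -> { left.addAll(right); return left; });
    }

    // Explicit finisher and no flags: the characteristics set is empty.
    static Collector<String, StringBuilder, String> concatenating() {
        return Collector.of(
                StringBuilder::new,
                (sb, s) -> sb.append(s),
                (a, b) -> a.append(b),
                StringBuilder::toString);
    }

    public static void main(String[] args) {
        System.out.println(toArrayList().characteristics());   // [IDENTITY_FINISH]
        System.out.println(concatenating().characteristics()); // []
    }
}
```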
Here’s an abstract view of how the collector’s components fit together during reduction:
// 1. Create a new container to hold partial results
A container = supplier.get();
// 2. Process each element in the stream:
for (T element : stream) {
    accumulator.accept(container, element);
}
// 3. For parallel streams, combine partial containers:
A combined = combiner.apply(container1, container2);
// 4. Final transformation to result:
R result = finisher.apply(combined);
// Simplified view of java.util.stream.Collector (static factory methods omitted)
public interface Collector<T, A, R> {
    Supplier<A> supplier();
    BiConsumer<A, T> accumulator();
    BinaryOperator<A> combiner();
    Function<A, R> finisher();
    Set<Characteristics> characteristics();

    enum Characteristics {
        CONCURRENT, UNORDERED, IDENTITY_FINISH
    }
}
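The reduction steps outlined above can be driven by hand against any collector. The sketch below is a rough illustration, not how the Streams framework is actually implemented: it splits the input into two chunks, as a parallel stream might, and calls the four functions directly. The names ManualReductionSketch and reduceInTwoChunks are hypothetical.

```java
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class ManualReductionSketch {
    // Applies supplier, accumulator, combiner, and finisher explicitly.
    static <T, A, R> R reduceInTwoChunks(Collector<T, A, R> collector,
                                         List<? extends T> left,
                                         List<? extends T> right) {
        A leftContainer = collector.supplier().get();
        for (T element : left) {
            collector.accumulator().accept(leftContainer, element);
        }
        A rightContainer = collector.supplier().get();
        for (T element : right) {
            collector.accumulator().accept(rightContainer, element);
        }
        A combined = collector.combiner().apply(leftContainer, rightContainer);
        return collector.finisher().apply(combined);
    }

    public static void main(String[] args) {
        String joined = reduceInTwoChunks(
                Collectors.joining(", "), List.of("a", "b"), List.of("c"));
        System.out.println(joined); // prints a, b, c
    }
}
```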
The Collector interface defines how streams are reduced by accumulating elements into a mutable container and then producing a final result.
The type parameters <T, A, R> represent the input element type, the mutable accumulator type, and the final result type.
Custom collectors empower you to tailor stream reductions to your unique needs. Here are three practical examples demonstrating how to implement and use such collectors for common real-world tasks.
A histogram counts occurrences of each distinct element. This collector accumulates frequencies in a Map<String, Integer>.
import java.util.*;
import java.util.stream.Collector;

public class HistogramCollector {
    public static Collector<String, Map<String, Integer>, Map<String, Integer>> toHistogram() {
        // No finisher is passed, so the map itself is the result
        // and Collector.of marks the collector as IDENTITY_FINISH.
        return Collector.of(
                HashMap::new,                                    // Supplier: create a new map
                (map, word) -> map.merge(word, 1, Integer::sum), // Accumulator: increment count
                (map1, map2) -> {                                // Combiner: merge two maps
                    map2.forEach((k, v) -> map1.merge(k, v, Integer::sum));
                    return map1;
                }
        );
    }

    public static void main(String[] args) {
        List<String> words = List.of("apple", "banana", "apple", "orange", "banana", "banana");
        Map<String, Integer> histogram = words.stream()
                .collect(toHistogram());
        System.out.println(histogram);
    }
}
Expected Output (entry order may differ, because HashMap does not guarantee iteration order):
{orange=1, banana=3, apple=2}
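If a predictable print order matters, one option is to swap the container type. The following sketch assumes the same merging logic as the histogram collector and only changes the supplier to a TreeMap, so keys come out in natural (alphabetical) order. SortedHistogramSketch and toSortedHistogram are names invented for this illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collector;

class SortedHistogramSketch {
    static Collector<String, Map<String, Integer>, Map<String, Integer>> toSortedHistogram() {
        return Collector.of(
                TreeMap::new,                                    // sorted container
                (map, word) -> map.merge(word, 1, Integer::sum), // count one occurrence
                (left, right) -> {                               // add up counts from both halves
                    right.forEach((k, v) -> left.merge(k, v, Integer::sum));
                    return left;
                });
    }

    public static void main(String[] args) {
        List<String> words = List.of("apple", "banana", "apple", "orange", "banana", "banana");
        System.out.println(words.stream().collect(toSortedHistogram()));
        // prints {apple=2, banana=3, orange=1}
    }
}
```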
This collector joins strings with a delimiter and adds a custom prefix and suffix, useful for generating formatted output.
import java.util.List;
import java.util.stream.Collector;

public class WrappedStringCollector {
    public static Collector<String, StringBuilder, String> wrappedCollector(
            String prefix, String delimiter, String suffix) {
        return Collector.of(
                StringBuilder::new,
                // Accumulator: add the delimiter before every element except the first
                (sb, s) -> {
                    if (sb.length() > 0) sb.append(delimiter);
                    sb.append(s);
                },
                // Combiner: join two partial results with the delimiter
                (sb1, sb2) -> {
                    if (sb1.length() == 0) return sb2;
                    if (sb2.length() == 0) return sb1;
                    return sb1.append(delimiter).append(sb2);
                },
                // Finisher: wrap the joined text in the prefix and suffix
                sb -> prefix + sb.toString() + suffix
                // No characteristics: encounter order matters, so UNORDERED is not declared
        );
    }

    public static void main(String[] args) {
        List<String> items = List.of("red", "green", "blue");
        String result = items.stream()
                .collect(wrappedCollector("[", "; ", "]"));
        System.out.println(result); // Output: [red; green; blue]
    }
}
Expected Output:
[red; green; blue]
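One caveat with a StringBuilder container is that the length check cannot tell "no elements yet" apart from "the only element so far was an empty string", so a leading empty string loses its delimiter. A possible alternative, sketched below under the assumption that empty strings should count as elements, is to use java.util.StringJoiner as the container, since it tracks the delimiter, prefix, and suffix itself. The class and method names here are illustrative.

```java
import java.util.List;
import java.util.StringJoiner;
import java.util.stream.Collector;

class JoinerBackedCollectorSketch {
    static Collector<String, StringJoiner, String> wrapped(
            String prefix, String delimiter, String suffix) {
        return Collector.of(
                () -> new StringJoiner(delimiter, prefix, suffix), // container knows the format
                (joiner, s) -> joiner.add(s),                      // add one element
                (a, b) -> a.merge(b),                              // merge partial results
                StringJoiner::toString);                           // produce the final String
    }

    public static void main(String[] args) {
        System.out.println(List.of("red", "green", "blue").stream()
                .collect(wrapped("[", "; ", "]"))); // prints [red; green; blue]
        System.out.println(List.of("", "a").stream()
                .collect(wrapped("[", "; ", "]"))); // prints [; a]
    }
}
```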
Suppose you have a list of Person objects and want to collect their details into a neatly formatted table string.
import java.util.*;
import java.util.stream.Collector;

public class TableCollector {
    static class Person {
        String name;
        int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    public static Collector<Person, StringBuilder, String> toTable() {
        return Collector.of(
                StringBuilder::new,
                // Accumulator: one row per person, name left-aligned in 10 columns
                (sb, p) -> sb.append(String.format("%-10s | %3d%n", p.name, p.age)),
                // Combiner: append the rows of the second partial result
                (sb1, sb2) -> {
                    sb1.append(sb2);
                    return sb1;
                },
                // Finisher: convert the accumulated rows to a String
                StringBuilder::toString
        );
    }

    public static void main(String[] args) {
        List<Person> people = List.of(
                new Person("Alice", 30),
                new Person("Bob", 25),
                new Person("Carol", 28)
        );
        String table = people.stream()
                .collect(toTable());
        // Header uses the same column width as the rows so the columns line up
        System.out.printf("%-10s | %s%n", "Name", "Age");
        System.out.println("------------------");
        System.out.print(table);
    }
}
Expected Output:
Name       | Age
------------------
Alice      |  30
Bob        |  25
Carol      |  28
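The finisher is also a natural place for work that should happen once, after all rows are collected. As a sketch of that idea, the variant below prepends the header inside the finisher so callers get a complete table from a single collect() call. Its Person class is a hypothetical stand-in, the name toTableWithHeader is invented here, and it uses "\n" rather than the platform line separator to keep the output predictable.

```java
import java.util.List;
import java.util.stream.Collector;

class TableWithHeaderSketch {
    // Hypothetical stand-in for the Person class used in the table example.
    static final class Person {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    static Collector<Person, StringBuilder, String> toTableWithHeader() {
        String header = String.format("%-10s | %3s\n", "Name", "Age")
                + "----------------\n";
        return Collector.of(
                StringBuilder::new,
                (sb, p) -> sb.append(String.format("%-10s | %3d\n", p.name, p.age)),
                (a, b) -> a.append(b),
                sb -> header + sb); // finisher runs once and adds the header
    }

    public static void main(String[] args) {
        List<Person> people = List.of(new Person("Alice", 30), new Person("Bob", 25));
        System.out.print(people.stream().collect(toTableWithHeader()));
    }
}
```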
Each example demonstrates clean, reusable designs and highlights how custom collectors can be tailored to solve varied data processing tasks seamlessly within the Streams API.