Index

Using Streams for Data Processing

Java Functional Programming

4.1 Introduction to Streams API

The Streams API, introduced in Java 8, is one of the most powerful additions to the language and a cornerstone of functional programming in Java. It provides a high-level, declarative way to process sequences of data—whether from collections, arrays, or I/O sources—using a fluent and composable pipeline model.

At its core, a stream is a sequence of elements that supports various operations to compute results. Unlike collections, which store data in memory, streams are designed for computation, not storage. This distinction makes streams ideal for chaining operations like filtering, mapping, and reducing in a functional style.

Streams vs. Collections

Traditional collections like List or Set are data structures, designed to hold and access elements. In contrast, a stream is a view or pipeline on a data source that can be processed in a functional way.

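Key differences:

- Storage: a collection holds its elements in memory; a stream stores nothing and only carries elements from a source through a pipeline.
- Evaluation: a collection's elements exist before you touch them; a stream's intermediate operations run lazily, only when a terminal operation is invoked.
- Reuse: a collection can be traversed any number of times; a stream can be consumed only once.
- Iteration: with a collection you write the loop yourself; with a stream you describe the operations and the library performs the iteration.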

Functional Programming with Streams

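Streams support a wide range of functional operations, such as:

- filter(): keep only the elements that match a condition
- map(): transform each element
- sorted() and distinct(): order elements and remove duplicates
- forEach(), collect(), and reduce(): consume the stream and produce a result or a side effect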

Each intermediate operation in a stream pipeline returns a new stream, enabling composition of multiple operations in a readable and fluent way. A terminal operation ends the pipeline and returns a result instead of a stream.

Streams also support lazy evaluation, meaning intermediate operations are only performed when a terminal operation is triggered. This results in more efficient processing, especially with large datasets or infinite streams.
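
To make the point about infinite streams concrete, here is a minimal sketch. The class and method names are illustrative, not from the text above. It draws from an endless sequence of integers, and the pipeline still terminates because limit() stops pulling elements once enough have passed the filter.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LazyInfiniteDemo {

    // Returns the first `count` even squares, drawn from an infinite stream of integers.
    public static List<Integer> firstEvenSquares(int count) {
        return Stream.iterate(1, n -> n + 1)   // infinite source: 1, 2, 3, ...
            .map(n -> n * n)                   // runs only for elements that are pulled
            .filter(sq -> sq % 2 == 0)
            .limit(count)                      // short-circuits once enough elements pass
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstEvenSquares(3)); // [4, 16, 36]
    }
}
```

Without the limit() call the same pipeline would never finish, since the source has no end.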

Imperative vs. Declarative Example

Let’s compare traditional iteration with a stream-based approach:

Imperative (loop-based):

List<String> names = List.of("Alice", "Bob", "Charlie", "David");
List<String> result = new ArrayList<>();

for (String name : names) {
    if (name.startsWith("A")) {
        result.add(name.toUpperCase());
    }
}
System.out.println(result); // Output: [ALICE]

Declarative (stream-based):

List<String> result = names.stream()
    .filter(name -> name.startsWith("A"))
    .map(String::toUpperCase)
    .collect(Collectors.toList());

System.out.println(result); // Output: [ALICE]

The stream version is shorter, clearer, and more expressive. It describes what to do, not how to do it.

Conclusion

The Streams API revolutionizes how Java developers process data. It encourages a functional approach that is clean, composable, and scalable. With support for parallelism, lazy evaluation, and a rich set of operations, streams form the backbone of modern, functional Java programming.

4.2 Creating Streams: from Collections, Arrays, and I/O

Streams in Java can be created from a variety of data sources including collections, arrays, and I/O sources. This flexibility allows developers to apply functional-style operations across different types of data consistently and efficiently.

Streams from Collections

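Most commonly, streams are created from Java collections such as List or Set by calling stream(). A Map is not a Collection and has no stream() method of its own; stream its entrySet(), keySet(), or values() instead.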

Example – Sequential Stream:

List<String> names = List.of("Alice", "Bob", "Charlie");
names.stream()
     .filter(n -> n.length() > 3)
     .forEach(System.out::println);
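
Because a Map has no stream() method of its own, a common approach is to stream one of its collection views. The following is a small sketch of that idea; the class name, method name, and sample scores are made up for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MapStreamDemo {

    // Returns the keys whose value exceeds the threshold, sorted alphabetically.
    public static List<String> keysAbove(Map<String, Integer> scores, int threshold) {
        return scores.entrySet().stream()          // Stream<Map.Entry<String, Integer>>
            .filter(e -> e.getValue() > threshold)
            .map(Map.Entry::getKey)
            .sorted()                              // Map.of has no defined iteration order
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Integer> scores = Map.of("Alice", 90, "Bob", 72, "Charlie", 85);
        System.out.println(keysAbove(scores, 80)); // [Alice, Charlie]
    }
}
```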

Example – Parallel Stream:

names.parallelStream()
     .map(String::toUpperCase)
     .forEach(System.out::println); // May print in non-deterministic order

Use parallel streams when operations are independent, CPU-intensive, and performance gain outweighs the overhead of multithreading. Be cautious with side effects and shared mutable state.
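
As a rough illustration of an operation that is safe to parallelize, the sketch below sums squares over a numeric range. The class and method names are hypothetical. The lambda reads and writes no shared state, so the sequential and parallel runs should produce the same result. Whether the parallel run is actually faster depends on the workload and the machine, and this sketch does not measure that.

```java
import java.util.stream.LongStream;

public class ParallelSumDemo {

    // Sums the squares of 1..n. The lambda touches no shared state, so the
    // sequential and parallel versions must agree.
    public static long sumOfSquares(long n, boolean parallel) {
        LongStream range = LongStream.rangeClosed(1, n);
        if (parallel) {
            range = range.parallel();
        }
        return range.map(i -> i * i).sum();
    }

    public static void main(String[] args) {
        long sequential = sumOfSquares(1_000, false);
        long parallel = sumOfSquares(1_000, true);
        System.out.println(sequential);             // 333833500
        System.out.println(sequential == parallel); // true
    }
}
```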

Streams from Arrays

Java provides utility methods to create streams from arrays via the Arrays class.

Example – Stream from Array:

int[] numbers = {1, 2, 3, 4, 5};
int sum = Arrays.stream(numbers)            // an IntStream, so no boxing
                .filter(n -> n % 2 == 0)
                .sum();                     // 2 + 4
System.out.println("Sum of evens: " + sum); // Output: Sum of evens: 6

For object arrays, you can use:

String[] fruits = {"apple", "banana", "cherry"};
Stream<String> fruitStream = Arrays.stream(fruits);
fruitStream.forEach(System.out::println);

Streams from I/O Sources

The java.nio.file.Files class provides powerful stream-based methods to read files line by line.

Example – Stream from File:

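import java.io.IOException;
import java.nio.file.*;
import java.util.stream.Stream;

public class FileStreamExample {
    public static void main(String[] args) throws IOException {
        Path path = Paths.get("data.txt");
        // Files.lines holds the file open; try-with-resources closes it
        // once the pipeline has finished.
        try (Stream<String> lines = Files.lines(path)) {
            lines.filter(line -> !line.isBlank())
                 .map(String::trim)
                 .forEach(System.out::println);
        }
    }
}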

This approach reads a file lazily and efficiently—ideal for processing large files without loading the entire content into memory.
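
The stream returned by Files.lines keeps the file open until the stream is closed, so it is usually wrapped in try-with-resources. Below is a self-contained sketch of that pattern. It writes a temporary file so it can run anywhere; the class name, method names, and file contents are assumptions made for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

public class CountLinesDemo {

    // Counts non-blank lines. try-with-resources closes the underlying file
    // once the terminal operation has finished.
    public static long countNonBlankLines(Path path) throws IOException {
        try (Stream<String> lines = Files.lines(path)) {
            return lines.filter(line -> !line.isBlank()).count();
        }
    }

    // Writes four lines (two of them blank) to a temporary file and counts the rest.
    public static long demo() {
        try {
            Path tmp = Files.createTempFile("stream-demo", ".txt");
            try {
                Files.write(tmp, List.of("first", "", "  ", "second"));
                return countNonBlankLines(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 2
    }
}
```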

Summary

Source Type     Method
Collections     stream(), parallelStream()
Arrays          Arrays.stream(array)
Files (text)    Files.lines(Path)

By understanding how to create streams from these sources, you can effectively harness the power of Java’s functional programming features across various data contexts. This sets the foundation for writing clean, declarative, and efficient data processing pipelines.

4.3 Intermediate Operations: map, filter, sorted, distinct

In the Java Streams API, intermediate operations are used to build up a pipeline of transformations. These operations do not produce a final result immediately; instead, they return a new stream, allowing you to chain multiple operations together. Actual processing occurs only when a terminal operation (like collect() or forEach()) is invoked, making intermediate operations lazy.

This laziness enables efficient data processing, as elements are only evaluated as needed, and in some cases, not at all (e.g., when using limit() or findFirst()).
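
One way to observe this laziness is to count how many elements a pipeline actually touches. The sketch below does that with peek() and a counter; the class and method names are illustrative. peek() is used here only as a probe, not as a recommended way to produce side effects.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class ShortCircuitDemo {

    // Counts how many elements are examined before findFirst() finds an even number.
    public static int examinedByFindFirst(List<Integer> numbers) {
        AtomicInteger examined = new AtomicInteger();
        numbers.stream()
            .peek(n -> examined.incrementAndGet()) // probe: runs once per element pulled
            .filter(n -> n % 2 == 0)
            .findFirst();                          // stops at the first match
        return examined.get();
    }

    // Builds the same pipeline but never adds a terminal operation.
    public static int examinedWithoutTerminal(List<Integer> numbers) {
        AtomicInteger examined = new AtomicInteger();
        numbers.stream()
            .peek(n -> examined.incrementAndGet())
            .filter(n -> n % 2 == 0);              // no terminal operation, so nothing runs
        return examined.get();
    }

    public static void main(String[] args) {
        List<Integer> numbers = List.of(1, 3, 4, 5, 6, 7, 8);
        // 1 and 3 are rejected, 4 matches, and 5 through 8 are never touched.
        System.out.println(examinedByFindFirst(numbers));     // 3
        System.out.println(examinedWithoutTerminal(numbers)); // 0
    }
}
```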

Let’s explore four essential intermediate operations: map, filter, sorted, and distinct.

map(): Transforming Elements

The map(Function<T, R>) method transforms each element in a stream from type T to type R. It is ideal for converting data.

Example: Convert a list of names to uppercase

List<String> names = List.of("alice", "bob", "charlie");
List<String> upper = names.stream()
    .map(String::toUpperCase)
    .collect(Collectors.toList());

System.out.println(upper); // Output: [ALICE, BOB, CHARLIE]

filter(): Selecting Elements by Condition

The filter(Predicate<T>) method includes only those elements that match a given condition.

Example: Keep only even numbers

List<Integer> numbers = List.of(1, 2, 3, 4, 5, 6);
List<Integer> evens = numbers.stream()
    .filter(n -> n % 2 == 0)
    .collect(Collectors.toList());

System.out.println(evens); // Output: [2, 4, 6]

sorted(): Ordering Elements

The sorted() method sorts the stream’s elements. It uses natural ordering by default, but you can supply a custom Comparator.

Example: Sort strings by length

List<String> words = List.of("banana", "apple", "kiwi");
List<String> sorted = words.stream()
    .sorted(Comparator.comparingInt(String::length))
    .collect(Collectors.toList());

System.out.println(sorted); // Output: [kiwi, apple, banana]
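
When several elements compare as equal under the first criterion, a second comparator can break the tie. The sketch below chains thenComparing onto the length comparator; the class name, method name, and word list are assumptions made for the example.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class SortedTieBreakDemo {

    // Sorts by length first, then alphabetically among words of equal length.
    public static List<String> byLengthThenAlphabet(List<String> words) {
        return words.stream()
            .sorted(Comparator.comparingInt(String::length)
                              .thenComparing(Comparator.<String>naturalOrder()))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = List.of("pear", "kiwi", "fig", "apple", "plum");
        System.out.println(byLengthThenAlphabet(words)); // [fig, kiwi, pear, plum, apple]
    }
}
```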

distinct(): Removing Duplicates

The distinct() method removes duplicate elements based on equals().

Example: Remove duplicate integers

List<Integer> nums = List.of(1, 2, 2, 3, 3, 3, 4);
List<Integer> unique = nums.stream()
    .distinct()
    .collect(Collectors.toList());

System.out.println(unique); // Output: [1, 2, 3, 4]

Chaining Intermediate Operations

Intermediate operations can be chained fluently to create powerful and readable data pipelines.

Example: Transform and filter names

List<String> names = List.of("Alice", "Bob", "Alex", "Charlie");
List<String> result = names.stream()
    .filter(n -> n.startsWith("A"))
    .map(String::toUpperCase)
    .sorted()
    .collect(Collectors.toList());

System.out.println(result); // Output: [ALEX, ALICE]

Full runnable code:

import java.util.*;
import java.util.stream.*;

public class StreamOperationsDemo {

    public static void main(String[] args) {
        // map(): Convert names to uppercase
        List<String> names = List.of("alice", "bob", "charlie");
        List<String> upper = names.stream()
            .map(String::toUpperCase)
            .collect(Collectors.toList());
        System.out.println("Uppercase names: " + upper); // [ALICE, BOB, CHARLIE]

        // filter(): Keep only even numbers
        List<Integer> numbers = List.of(1, 2, 3, 4, 5, 6);
        List<Integer> evens = numbers.stream()
            .filter(n -> n % 2 == 0)
            .collect(Collectors.toList());
        System.out.println("Even numbers: " + evens); // [2, 4, 6]

        // sorted(): Sort strings by length
        List<String> words = List.of("banana", "apple", "kiwi");
        List<String> sorted = words.stream()
            .sorted(Comparator.comparingInt(String::length))
            .collect(Collectors.toList());
        System.out.println("Sorted by length: " + sorted); // [kiwi, apple, banana]

        // distinct(): Remove duplicate integers
        List<Integer> nums = List.of(1, 2, 2, 3, 3, 3, 4);
        List<Integer> unique = nums.stream()
            .distinct()
            .collect(Collectors.toList());
        System.out.println("Unique values: " + unique); // [1, 2, 3, 4]

        // Chaining: Filter names that start with "A", uppercase, sort
        List<String> mixedNames = List.of("Alice", "Bob", "Alex", "Charlie");
        List<String> result = mixedNames.stream()
            .filter(n -> n.startsWith("A"))
            .map(String::toUpperCase)
            .sorted()
            .collect(Collectors.toList());
        System.out.println("Filtered and transformed: " + result); // [ALEX, ALICE]
    }
}

Conclusion

Intermediate operations are composable and lazy, and they transform or filter data without modifying the underlying source. Because nothing runs until a terminal operation is invoked, a pipeline can skip work that a short-circuiting operation such as limit() or findFirst() makes unnecessary. By combining map, filter, sorted, and distinct, you can write expressive, functional-style code to process data cleanly and efficiently.

4.4 Terminal Operations: forEach, collect, reduce

In Java’s Streams API, terminal operations are the final step in a stream pipeline. They trigger execution of all intermediate operations and produce a result (e.g., a collection, a single value) or cause a side effect (e.g., printing values). After a terminal operation is invoked, the stream can no longer be used.
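
The single-use rule can be seen directly: a second terminal operation on the same stream object fails with an IllegalStateException. A minimal sketch, with an illustrative class and method name:

```java
import java.util.List;
import java.util.stream.Stream;

public class StreamReuseDemo {

    // Returns true if running a second terminal operation on the same stream fails.
    public static boolean secondUseFails() {
        Stream<String> names = List.of("Alice", "Bob").stream();
        names.forEach(n -> { });   // first terminal operation consumes the stream
        try {
            names.count();         // second terminal operation on the same stream
            return false;
        } catch (IllegalStateException e) {
            return true;           // the stream has already been operated upon
        }
    }

    public static void main(String[] args) {
        System.out.println(secondUseFails()); // true
    }
}
```

To process the same data twice, call stream() on the source again to obtain a fresh stream.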

In this section, we explore three of the most commonly used terminal operations: forEach, collect, and reduce.

forEach(): Iteration and Side Effects

The forEach(Consumer<? super T> action) method performs an action for each element in the stream. It is often used for logging, printing, or performing side effects.

Example: Printing items

List<String> names = List.of("Alice", "Bob", "Charlie");

names.stream()
     .filter(name -> name.length() > 3)
     .forEach(System.out::println); // Prints Alice, then Charlie, one per line

Note: forEach should be avoided for modifying external state, especially in parallel streams, as it may lead to unpredictable results.
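
The sketch below contrasts the risky pattern with a safer alternative. The unsafe variant is shown only as a comment, because its outcome is unpredictable. The class and method names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class SafeAccumulationDemo {

    // Risky pattern (not executed here): a parallel forEach that adds to a shared
    // ArrayList can lose elements or throw, because ArrayList is not thread-safe.
    //   List<Integer> shared = new ArrayList<>();
    //   IntStream.range(0, n).parallel().forEach(i -> shared.add(i * 2));

    // Safer: let collect() do the accumulation; it merges per-thread partial results.
    public static List<Integer> doubledInParallel(int n) {
        return IntStream.range(0, n)
            .parallel()
            .map(i -> i * 2)
            .boxed()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> doubled = doubledInParallel(10_000);
        System.out.println(doubled.size()); // 10000
        System.out.println(doubled.get(3)); // 6, because encounter order is preserved
    }
}
```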

collect(): Gathering Results

The collect() method performs a mutable reduction of elements into a collection, string, map, or other structure. It uses the Collectors utility class to define the type of accumulation.

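Common Collectors:

- Collectors.toList() and Collectors.toSet(): gather elements into a List or a Set
- Collectors.joining(delimiter): concatenate strings with a separator
- Collectors.groupingBy(classifier): build a Map from each key to the list of elements that share it
- Collectors.toMap(keyMapper, valueMapper): build a Map with one entry per element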

Example: Collect to list

List<String> upper = names.stream()
    .map(String::toUpperCase)
    .collect(Collectors.toList());

System.out.println(upper); // Output: [ALICE, BOB, CHARLIE]

Example: Grouping by string length

Map<Integer, List<String>> grouped = names.stream()
    .collect(Collectors.groupingBy(String::length));

System.out.println(grouped);
// Typical output: {3=[Bob], 5=[Alice], 7=[Charlie]} (the returned Map makes no ordering guarantee)

Example: Join names into a string

String joined = names.stream()
    .collect(Collectors.joining(", "));

System.out.println(joined); // Output: Alice, Bob, Charlie

The collect() method is powerful and flexible for producing results in many shapes.
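
A few more shapes, as a sketch: counting per group, splitting into two groups, and building a lookup map. The class name, method names, and sample names are made up for the example; the collectors themselves are standard java.util.stream.Collectors factory methods.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class CollectorsDemo {

    // Counts how many names share each length.
    public static Map<Integer, Long> countByLength(List<String> names) {
        return names.stream()
            .collect(Collectors.groupingBy(String::length, Collectors.counting()));
    }

    // Splits names into those longer than 3 characters (true) and the rest (false).
    public static Map<Boolean, List<String>> partitionByLong(List<String> names) {
        return names.stream()
            .collect(Collectors.partitioningBy(n -> n.length() > 3));
    }

    // Maps each name to its length; toMap throws IllegalStateException on duplicate keys.
    public static Map<String, Integer> lengthByName(List<String> names) {
        return names.stream()
            .collect(Collectors.toMap(n -> n, String::length));
    }

    public static void main(String[] args) {
        List<String> names = List.of("Alice", "Bob", "Charlie", "David");
        System.out.println(countByLength(names).get(5));        // 2
        System.out.println(partitionByLong(names).get(false));  // [Bob]
        System.out.println(lengthByName(names).get("Charlie")); // 7
    }
}
```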

reduce(): Aggregating Values

The reduce() method combines stream elements into a single result using an accumulator function.

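There are three common forms:

- reduce(BinaryOperator<T> accumulator): no starting value, so it returns an Optional<T> that is empty for an empty stream
- reduce(T identity, BinaryOperator<T> accumulator): starts from the identity value and returns a T
- reduce(U identity, BiFunction<U, ? super T, U> accumulator, BinaryOperator<U> combiner): allows a result type different from the element type; the combiner merges partial results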

Example: Sum of numbers

List<Integer> nums = List.of(1, 2, 3, 4, 5);
int sum = nums.stream()
    .reduce(0, Integer::sum); // identity: 0, accumulator: sum

System.out.println("Sum: " + sum); // Output: Sum: 15

Example: Find the longest name

Optional<String> longest = names.stream()
    .reduce((a, b) -> a.length() > b.length() ? a : b);

longest.ifPresent(System.out::println); // Output: Charlie
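
The three-argument form is useful when the result type differs from the element type. Below is a small sketch that totals the lengths of a list of names and also shows the Optional returned by the one-argument form on an empty stream. The class and method names are illustrative.

```java
import java.util.List;

public class ReduceFormsDemo {

    // Three-argument reduce: the result type (Integer) differs from the element type (String).
    public static int totalLength(List<String> names) {
        return names.stream()
            .reduce(0,
                    (sum, name) -> sum + name.length(), // accumulator: fold one element in
                    Integer::sum);                      // combiner: merge partial sums
    }

    // One-argument reduce has no identity, so an empty stream yields an empty Optional.
    public static boolean emptyStreamGivesEmptyOptional() {
        return !List.<Integer>of().stream().reduce(Integer::sum).isPresent();
    }

    public static void main(String[] args) {
        System.out.println(totalLength(List.of("Alice", "Bob", "Charlie"))); // 15
        System.out.println(emptyStreamGivesEmptyOptional());                 // true
    }
}
```

For a plain sum like this one, mapToInt(String::length).sum() is usually simpler; the reduce version is shown only to illustrate the accumulator and combiner roles.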

Full runnable code:

import java.util.*;
import java.util.stream.*;

public class StreamTerminalOperationsDemo {

    public static void main(String[] args) {
        List<String> names = List.of("Alice", "Bob", "Charlie");

        // forEach(): print names longer than 3 characters
        System.out.println("Names with length > 3:");
        names.stream()
             .filter(name -> name.length() > 3)
             .forEach(System.out::println); // Alice, Charlie

        // collect(): map to uppercase and collect to list
        List<String> upper = names.stream()
            .map(String::toUpperCase)
            .collect(Collectors.toList());
        System.out.println("\nUppercase list: " + upper); // [ALICE, BOB, CHARLIE]

        // collect(): group by string length
        Map<Integer, List<String>> grouped = names.stream()
            .collect(Collectors.groupingBy(String::length));
        System.out.println("\nGrouped by length: " + grouped);
        // Example output: {3=[Bob], 5=[Alice], 7=[Charlie]}

        // collect(): join names into a string
        String joined = names.stream()
            .collect(Collectors.joining(", "));
        System.out.println("\nJoined names: " + joined); // Alice, Bob, Charlie

        // reduce(): sum of numbers
        List<Integer> nums = List.of(1, 2, 3, 4, 5);
        int sum = nums.stream()
            .reduce(0, Integer::sum);
        System.out.println("\nSum of numbers: " + sum); // 15

        // reduce(): find the longest name
        Optional<String> longest = names.stream()
            .reduce((a, b) -> a.length() > b.length() ? a : b);
        longest.ifPresent(name -> System.out.println("\nLongest name: " + name)); // Charlie
    }
}

Summary

Operation   Purpose
forEach     Perform side effects on each element
collect     Accumulate elements into containers
reduce      Aggregate elements into a single value

Terminal operations finalize stream processing and allow you to extract meaningful results from transformed data. By using forEach, collect, and reduce, you can build expressive and efficient pipelines that transform data from raw sequences into structured outcomes.

4.5 Example: Processing Employee Data with Streams

This example demonstrates how to use Java Streams to process a list of Employee objects. We filter employees by department and salary, sort the matches by salary in descending order, map employees to their email addresses, and collect the results into new lists.

Employee Class and Sample Data

import java.util.*;
import java.util.stream.*;

class Employee {
    private String name;
    private String email;
    private String department;
    private double salary;

    public Employee(String name, String email, String department, double salary) {
        this.name = name;
        this.email = email;
        this.department = department;
        this.salary = salary;
    }

    public String getName() { return name; }
    public String getEmail() { return email; }
    public String getDepartment() { return department; }
    public double getSalary() { return salary; }

    @Override
    public String toString() {
        return name + " (" + department + "), $" + salary;
    }
}

public class EmployeeStreamExample {
    public static void main(String[] args) {
        List<Employee> employees = List.of(
            new Employee("Alice", "alice@example.com", "Engineering", 95000),
            new Employee("Bob", "bob@example.com", "Sales", 55000),
            new Employee("Charlie", "charlie@example.com", "Engineering", 105000),
            new Employee("Diana", "diana@example.com", "Marketing", 60000),
            new Employee("Evan", "evan@example.com", "Engineering", 75000)
        );

        // Filter Engineering employees with salary > 80000
        List<Employee> highEarners = employees.stream()
            .filter(e -> e.getDepartment().equals("Engineering"))
            .filter(e -> e.getSalary() > 80000)
            .sorted(Comparator.comparingDouble(Employee::getSalary).reversed())
            .collect(Collectors.toList());

        System.out.println("High earning Engineers:");
        highEarners.forEach(System.out::println);

        // Extract email addresses of Marketing employees
        List<String> marketingEmails = employees.stream()
            .filter(e -> e.getDepartment().equals("Marketing"))
            .map(Employee::getEmail)
            .collect(Collectors.toList());

        System.out.println("\nMarketing team emails:");
        marketingEmails.forEach(System.out::println);
    }
}

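Explanation

- The first pipeline applies two filter() calls, keeping only Engineering employees who earn more than 80000. It then sorts the survivors by salary in descending order with Comparator.comparingDouble(Employee::getSalary).reversed() and collects them into a list. Evan is in Engineering but earns 75000, so he is filtered out.
- The second pipeline keeps only Marketing employees, uses map(Employee::getEmail) to turn each Employee into an email address, and collects the resulting strings into a list.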

Expected Output

High earning Engineers:
Charlie (Engineering), $105000.0
Alice (Engineering), $95000.0

Marketing team emails:
diana@example.com

This example highlights the expressiveness of stream pipelines for data processing, showcasing filtering, transformation, sorting, and collecting with clear and readable code.
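
The same data also lends itself to aggregation. As a sketch, the code below groups employees by department and averages their salaries. It uses a cut-down stand-in for the Employee class (department and salary only) so that it compiles on its own; the class name DepartmentStatsDemo and its methods are hypothetical, and the sample figures mirror the list above.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class DepartmentStatsDemo {

    // Minimal stand-in for the Employee class: only the fields this sketch needs.
    public static final class Employee {
        private final String department;
        private final double salary;

        public Employee(String department, double salary) {
            this.department = department;
            this.salary = salary;
        }

        public String getDepartment() { return department; }
        public double getSalary() { return salary; }
    }

    // Groups by department and averages salaries; TreeMap keeps the keys sorted.
    public static Map<String, Double> averageSalaryByDepartment(List<Employee> employees) {
        return employees.stream()
            .collect(Collectors.groupingBy(
                Employee::getDepartment,
                TreeMap::new,
                Collectors.averagingDouble(Employee::getSalary)));
    }

    // Runs the aggregation on the same departments and salaries as the example above.
    public static Map<String, Double> demo() {
        return averageSalaryByDepartment(List.of(
            new Employee("Engineering", 95000),
            new Employee("Sales", 55000),
            new Employee("Engineering", 105000),
            new Employee("Marketing", 60000),
            new Employee("Engineering", 75000)));
    }

    public static void main(String[] args) {
        Map<String, Double> averages = demo();
        System.out.println(averages.keySet());     // [Engineering, Marketing, Sales]
        System.out.println(averages.get("Sales")); // 55000.0
    }
}
```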