Basic Regex Syntax and Patterns

Java Regex

2.1 Literal characters and metacharacters

In regular expressions, literal characters are the exact characters you want to match in the text. For example, the pattern cat matches the characters 'c', 'a', and 't' in that order, exactly as they appear.

On the other hand, metacharacters are special characters that have a unique meaning in regex syntax. They do not represent themselves literally but instead perform specific functions that help define complex patterns. Understanding metacharacters is essential to unlock the full power of regular expressions.

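Here are some of the most common metacharacters and their special meanings:

. matches any single character (by default, any character except a line terminator)
* matches the preceding element zero or more times
+ matches the preceding element one or more times
? makes the preceding element optional (zero or one occurrence)
^ matches the position at the start of the input
$ matches the position at the end of the input
[ ] define a character class
( ) group part of a pattern
| separates alternatives
{ } specify a repetition count
\ escapes the next character or introduces a predefined character set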

Because these metacharacters have special functions, you must escape them with a preceding backslash (\) when you need to match them literally. For example, to match a literal dot (.), use the pattern \.; to match a literal asterisk (*), use \*. In Java source code the backslash itself must be doubled inside a string literal, so the pattern \. is written as "\\.".

Examples:
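The difference between a literal, an unescaped metacharacter, and an escaped one can be sketched in a few lines. The class name EscapeDemo and the helper found are hypothetical names chosen for this illustration; Pattern.compile, Matcher.find, and Pattern.quote are standard java.util.regex calls.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // Hypothetical helper: true if the regex matches anywhere inside the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Literal characters match themselves.
        System.out.println(found("cat", "concatenate"));   // true

        // An unescaped dot matches any character, so a.c also matches "abc".
        System.out.println(found("a.c", "abc"));           // true

        // An escaped dot matches only a literal '.'.
        // The backslash is doubled because this is a Java string literal.
        System.out.println(found("a\\.c", "abc"));         // false
        System.out.println(found("a\\.c", "a.c"));         // true

        // An escaped asterisk matches a literal '*'.
        System.out.println(found("2\\*3", "2*3 = 6"));     // true

        // Pattern.quote escapes every metacharacter in a string at once.
        System.out.println(found(Pattern.quote("1+1"), "1+1=2")); // true
    }
}
```

Pattern.quote is convenient when the text to match comes from user input and may contain metacharacters you did not anticipate.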

In summary, literal characters match themselves, while metacharacters control how the matching is performed. Learning when and how to escape metacharacters allows you to write precise and effective regex patterns.

2.2 Character classes and predefined character sets

Character classes in regex allow you to specify a set or range of characters that you want to match at a particular position in the text. They are defined using square brackets [ ]. When a regex engine encounters a character class, it matches any one character that belongs to that set.

For example, the pattern [abc] matches any single character that is either a, b, or c. So it will match "a" in "apple", "b" in "bat", or "c" in "cat". You can also specify ranges inside the brackets, such as [a-z] to match any lowercase letter, or [0-9] to match any digit.

You can combine ranges and individual characters, for example [a-zA-Z0-9] matches any uppercase letter, lowercase letter, or digit.

If you need to match any character except those inside the brackets, you can use a negated character class by starting it with a caret ^, like [^0-9], which matches any character that is not a digit.
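As a minimal sketch of the sets, ranges, and negated classes described above, the following program tests single characters against each kind of class. CharClassDemo, matchesAll, and digitsOnly are hypothetical names used only for this example.

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // Hypothetical helper: true if the entire input matches the regex.
    static boolean matchesAll(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // Hypothetical helper: removes every character that is not a digit.
    static String digitsOnly(String input) {
        return input.replaceAll("[^0-9]", "");
    }

    public static void main(String[] args) {
        // A simple set: exactly one of a, b, or c.
        System.out.println(matchesAll("[abc]", "b"));        // true
        System.out.println(matchesAll("[abc]", "d"));        // false

        // Ranges, alone and combined.
        System.out.println(matchesAll("[a-z]", "q"));        // true
        System.out.println(matchesAll("[a-zA-Z0-9]", "Q"));  // true
        System.out.println(matchesAll("[a-zA-Z0-9]", "-"));  // false

        // A negated class matches any one character NOT listed.
        System.out.println(matchesAll("[^0-9]", "7"));       // false
        System.out.println(matchesAll("[^0-9]", "x"));       // true

        // Negated classes are handy for stripping unwanted characters.
        System.out.println(digitsOnly("Room 42B"));          // 42
    }
}
```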

Predefined Character Sets

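Java regex also provides several predefined character sets, which are shortcuts for commonly used character classes:

\d matches any digit, equivalent to [0-9]
\w matches any word character, equivalent to [a-zA-Z_0-9]
\s matches any whitespace character, such as a space, tab, or newline

Their opposites match any character not in the set:

\D matches any non-digit
\W matches any non-word character
\S matches any non-whitespace character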

Examples

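For instance, the regex pattern \d{3} matches three consecutive digits, which is useful when extracting fixed-length numbers such as telephone area codes (a five-digit ZIP code would use \d{5}). Used in a search, it also matches the first three digits of a longer run, so add anchors or surrounding context when the length must be exact.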
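The sketch below applies the standard predefined sets \d, \w, and \s to a sample string. The class PredefinedSetsDemo, its helpers findAll and collapseWhitespace, and the sample text are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PredefinedSetsDemo {
    // Hypothetical helper: collects every non-overlapping match in the input.
    static List<String> findAll(String regex, String input) {
        List<String> results = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            results.add(matcher.group());
        }
        return results;
    }

    // Hypothetical helper: replaces each run of whitespace with one space.
    static String collapseWhitespace(String input) {
        return input.replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String text = "Call 555 0199, ext 42";

        // \d{3} finds three digits in a row, including inside the longer run "0199".
        System.out.println(findAll("\\d{3}", text)); // [555, 019]

        // \w+ finds runs of letters, digits, and underscores.
        System.out.println(findAll("\\w+", text));   // [Call, 555, 0199, ext, 42]

        // \s+ matches runs of spaces, tabs, and newlines.
        System.out.println(collapseWhitespace("a \t b\n c")); // a b c
    }
}
```

Note how \d{3} picks up "019" from "0199": without anchors or other context, a fixed-count pattern will happily match part of a longer number.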

Practical Uses

Character classes and predefined sets make it easy to write flexible patterns that match groups of similar characters without enumerating every option. For example, if you need to validate or extract numbers, letters, or whitespace-separated words from input, these classes help keep your regex concise and readable.

By combining character classes and predefined sets, you can build powerful regex patterns to efficiently process and analyze text in Java programs.

2.3 Quantifiers: *, +, ?, {n,m}

Quantifiers are one of the most powerful features of regular expressions. They specify how many times the preceding element (a literal, character class, or group) should be matched. Quantifiers allow you to match repeated patterns flexibly, making regex capable of handling varied text lengths and optional parts.

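Here are the main quantifiers you'll use frequently:

* matches the preceding element zero or more times
+ matches the preceding element one or more times
? matches the preceding element zero or one time, making it optional
{n} matches the preceding element exactly n times
{n,} matches the preceding element at least n times
{n,m} matches the preceding element at least n and at most m times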

Typical Usage Patterns

Quantifiers are often used to match repeated characters or strings of variable length:
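A small sketch of how each quantifier changes what a pattern accepts follows. The class QuantifierDemo, the helper matchesAll, and the sample patterns are hypothetical and chosen only to illustrate the behaviour.

```java
import java.util.regex.Pattern;

class QuantifierDemo {
    // Hypothetical helper: true if the entire input matches the regex.
    static boolean matchesAll(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // * : zero or more 'b' characters between 'a' and 'c'.
        System.out.println(matchesAll("ab*c", "ac"));        // true
        System.out.println(matchesAll("ab*c", "abbbc"));     // true

        // + : at least one 'b' is required.
        System.out.println(matchesAll("ab+c", "ac"));        // false
        System.out.println(matchesAll("ab+c", "abc"));       // true

        // ? : the 'u' is optional.
        System.out.println(matchesAll("colou?r", "color"));  // true
        System.out.println(matchesAll("colou?r", "colour")); // true

        // {n,m} : between two and four digits.
        System.out.println(matchesAll("\\d{2,4}", "7"));     // false
        System.out.println(matchesAll("\\d{2,4}", "2024"));  // true
        System.out.println(matchesAll("\\d{2,4}", "12345")); // false
    }
}
```

Each quantifier applies only to the element immediately before it, so in ab*c the asterisk repeats 'b', not the sequence "ab".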

Summary

Quantifiers extend regex patterns to handle repetition and optional elements elegantly. By combining quantifiers with literals, classes, or groups, you can create flexible expressions that match diverse inputs, from single characters to long strings with varying lengths.

2.4 Anchors: ^ and $

Unlike most regex elements that match characters, anchors are special symbols that match positions in the input string. They do not consume any characters themselves but assert where in the text a match must occur.

The two most commonly used anchors are:

^ matches the position at the start of the input
$ matches the position at the end of the input

How Anchors Affect Matching

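Anchors are crucial when you want your pattern to match text only if it appears at a specific position. Without anchors, a search (for example with Matcher.find()) looks for the pattern anywhere in the string.

For example, the pattern cat finds a match in "cat", "catalog", "concatenate", and "bobcat", because the letters appear somewhere in each string.

But if you use the anchor ^ as in ^cat, the regex engine tries to match "cat" only at the beginning of the string, so it matches "catalog" but not "concatenate" or "bobcat".

Similarly, the $ anchor enforces matching at the end of the string: cat$ matches "bobcat" but not "catalog".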
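The effect of each anchor can be checked with a short program. AnchorDemo and its helper found are hypothetical names for this sketch, and Matcher.find() is used so that the anchors, rather than the matching method, decide where a match may occur.

```java
import java.util.regex.Pattern;

class AnchorDemo {
    // Hypothetical helper: true if the regex matches somewhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // No anchor: "cat" may appear anywhere.
        System.out.println(found("cat", "concatenate"));  // true

        // ^ requires the match to begin at the start of the input.
        System.out.println(found("^cat", "concatenate")); // false
        System.out.println(found("^cat", "catalog"));     // true

        // $ requires the match to finish at the end of the input.
        System.out.println(found("cat$", "bobcat"));      // true
        System.out.println(found("cat$", "catalog"));     // false
    }
}
```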

Practical Uses

Anchors are especially important in validation tasks, where you want to ensure the entire input meets certain criteria. For instance, if you need to check whether a string contains only digits, you would use:

^\d+$

This pattern matches only strings that consist entirely of digits: ^ anchors the match at the start, \d+ requires one or more digits, and $ anchors it at the end, so nothing else may appear.

Without anchors, the same pattern \d+ would match any substring of digits inside a longer string, which might not be what you want.
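To make the contrast concrete, here is a minimal sketch comparing the anchored and unanchored patterns. DigitsOnlyDemo, isAllDigits, and containsDigits are hypothetical names introduced for this example.

```java
import java.util.regex.Pattern;

class DigitsOnlyDemo {
    private static final Pattern ANCHORED = Pattern.compile("^\\d+$");
    private static final Pattern UNANCHORED = Pattern.compile("\\d+");

    // Hypothetical helper: searches with the anchored pattern.
    static boolean isAllDigits(String input) {
        return ANCHORED.matcher(input).find();
    }

    // Hypothetical helper: searches with the unanchored pattern.
    static boolean containsDigits(String input) {
        return UNANCHORED.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(isAllDigits("12345"));     // true
        System.out.println(isAllDigits("123abc"));    // false
        System.out.println(isAllDigits(""));          // false: + needs at least one digit

        // Without anchors, any run of digits inside the text is enough.
        System.out.println(containsDigits("123abc")); // true

        // String.matches must match the whole input, so it behaves
        // like the anchored pattern even without ^ and $.
        System.out.println("12345".matches("\\d+"));  // true
        System.out.println("123abc".matches("\\d+")); // false
    }
}
```

One caveat worth checking against the Pattern documentation: by default, $ can also match just before a line terminator at the very end of the input, so strict validators sometimes prefer matches() or the \z boundary.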

In summary, anchors allow you to control the position of your matches within the text, enabling precise and reliable pattern matching — a vital tool in tasks like input validation and exact text searches.

2.5 Simple matching examples

Let’s look at some straightforward Java regex examples that combine literals, character classes, quantifiers, and anchors. Each example is self-contained and demonstrates common tasks you’ll encounter when working with regex.

Example 1: Validate a Simple Word

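This example checks whether a string is exactly the word "hello". Pattern.matches already requires the entire input to match, so the anchors here are redundant, but they make the intent explicit.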

import java.util.regex.Pattern;

public class ValidateWord {
    public static void main(String[] args) {
        String input = "hello";
        String regex = "^hello$"; // Match 'hello' exactly from start to end

        boolean matches = Pattern.matches(regex, input);
        System.out.println("Matches 'hello' exactly? " + matches);
    }
}

Example 2: Extract All Numbers from a String

This example finds and prints all numbers inside a given text.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExtractNumbers {
    public static void main(String[] args) {
        String input = "Order 123 costs $45 and 67 cents.";
        String regex = "\\d+"; // Match one or more digits

        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(input);

        while (matcher.find()) {
            System.out.println("Found number: " + matcher.group());
        }
    }
}

Example 3: Check if String Starts with a Capital Letter

This example verifies if the input starts with a capital letter.

import java.util.regex.Pattern;

public class StartsWithCapital {
    public static void main(String[] args) {
        String input = "Java is fun.";
        String regex = "^[A-Z].*"; // Start of string, then a capital letter, then any characters

        boolean matches = Pattern.matches(regex, input);
        System.out.println("Starts with capital letter? " + matches);
    }
}

Example 4: Match an Optional ‘s’ at the End

This example matches "cat" or "cats" — with the 's' being optional.

import java.util.regex.Pattern;

public class OptionalS {
    public static void main(String[] args) {
        String input = "cats";
        String regex = "^cats?$"; // 's' is optional due to '?'

        boolean matches = Pattern.matches(regex, input);
        System.out.println("Matches 'cat' or 'cats'? " + matches);
    }
}

Each of these examples illustrates how to combine regex features to solve practical problems—validating exact words, extracting numbers, checking string boundaries, and handling optional parts. Running these examples will help you become comfortable with the basics of Java regex.
