Java - Regular Expressions Character Classes

Introduction

The metacharacters [ and ] specify a character class inside a regular expression.

A character class is a set of characters.

The regular expression tries to match one character from the set.

Example

The character class "[ABC]" matches a single character: A, B, or C.

The strings "A@V", "B@V", and "C@V" match the regular expression "[ABC]@.", where the trailing . matches any single character.

The string "H@V" does not match the regular expression "[ABC]@." because @ is not preceded by A, B, or C.

The strings "man" and "men" match the regular expression "m[ae]n".
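The matches described above can be checked with a short sketch like the following. The class name SimpleClassDemo and the helper matchesWhole are illustrative names, not part of the original text; Pattern.matches is the standard java.util.regex call, and it requires the whole input to match the expression.

```java
import java.util.regex.Pattern;

class SimpleClassDemo {
    // Hypothetical helper: true only when the ENTIRE input matches the regex.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("[ABC]@.", "A@V")); // true
        System.out.println(matchesWhole("[ABC]@.", "H@V")); // false: H is not in the set
        System.out.println(matchesWhole("m[ae]n", "men"));  // true
        System.out.println(matchesWhole("m[ae]n", "min"));  // false: i is not in the set
    }
}
```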

To specify a range of characters using a character class, use a hyphen - character. For example,

  • "[A-Z]" represents any uppercase English letter;
  • "[0-9]" represents any digit between 0 and 9.
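As a rough illustration of the two ranges above, the sketch below tests single-character strings with String.matches. The class and method names (RangeDemo, isUpperLetter, isDigit) are made up for this example.

```java
class RangeDemo {
    // True when s is exactly one uppercase English letter.
    static boolean isUpperLetter(String s) {
        return s.matches("[A-Z]");
    }

    // True when s is exactly one digit from 0 to 9.
    static boolean isDigit(String s) {
        return s.matches("[0-9]");
    }

    public static void main(String[] args) {
        System.out.println(isUpperLetter("Q")); // true
        System.out.println(isUpperLetter("q")); // false: lowercase is outside A-Z
        System.out.println(isDigit("7"));       // true
        System.out.println(isDigit("x"));       // false
    }
}
```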

A ^ at the beginning of a character class negates its meaning. For example,

  • "[^ABC]" means any character except A, B, and C.
  • "[^A-Z]" represents any character except uppercase English letters.

If ^ appears anywhere in a character class other than at the beginning, it loses its special meaning and matches a literal ^ character.

For example, "[ABC^]" will match A, B, C, or ^.
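The two roles of ^ can be compared side by side in a small sketch. NegationDemo and its methods are hypothetical names used only for this illustration; String.matches and String.replaceAll are standard library calls.

```java
class NegationDemo {
    // True when s is a single character that the given class accepts.
    static boolean accepts(String charClass, String s) {
        return s.matches(charClass);
    }

    // Removes every character that is NOT an uppercase English letter.
    static String keepUppercase(String s) {
        return s.replaceAll("[^A-Z]", "");
    }

    public static void main(String[] args) {
        System.out.println(accepts("[^ABC]", "D")); // true: D is not A, B, or C
        System.out.println(accepts("[^ABC]", "A")); // false: A is excluded
        System.out.println(accepts("[ABC^]", "^")); // true: ^ is a literal here
        System.out.println(accepts("[ABC^]", "D")); // false
        System.out.println(keepUppercase("A1b-B2")); // AB
    }
}
```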

You can include two or more ranges in one character class. For example,

  • "[a-zA-Z]" matches any character a through z and A through Z.
  • "[a-zA-Z0-9]" matches any letter a through z (uppercase or lowercase) or any digit 0 through 9.
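One way to see the effect of combining ranges is to count how many characters of a string each class matches. The sketch below assumes a helper named countMatches (not from the original text) that walks the input with Matcher.find.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MultiRangeDemo {
    // Counts the characters in input that the character class matches.
    static int countMatches(String charClass, String input) {
        Matcher m = Pattern.compile(charClass).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Letters only: a and Z
        System.out.println(countMatches("[a-zA-Z]", "a-1_Z"));    // 2
        // Letters and digits: a, 1, and Z
        System.out.println(countMatches("[a-zA-Z0-9]", "a-1_Z")); // 3
    }
}
```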

Some examples of character classes are listed in the following table.

Character Class   Meaning                                                         Category
[abc]             Character a, b, or c                                            Simple character class
[^xyz]            Any character except x, y, and z                                Complement or negation
[a-p]             Characters a through p                                          Range
[a-cx-z]          Characters a through c or x through z (a, b, c, x, y, or z)     Union
[0-9&&[4-8]]      Intersection of two ranges (4, 5, 6, 7, or 8)                   Intersection
[a-z&&[^aeiou]]   A lowercase letter that is not a vowel (a lowercase consonant)  Subtraction
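The union, intersection, and subtraction forms can be tried out with a sketch along these lines. ClassOpsDemo, accepts, and consonantsOf are illustrative names; the && syntax inside a character class is Java's own.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ClassOpsDemo {
    // True when s is a single character that the given class accepts.
    static boolean accepts(String charClass, String s) {
        return s.matches(charClass);
    }

    // Collects the lowercase consonants of input using the subtraction class.
    static String consonantsOf(String input) {
        Matcher m = Pattern.compile("[a-z&&[^aeiou]]").matcher(input);
        StringBuilder sb = new StringBuilder();
        while (m.find()) {
            sb.append(m.group());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(accepts("[a-cx-z]", "y"));     // true: union of a-c and x-z
        System.out.println(accepts("[a-cx-z]", "m"));     // false
        System.out.println(accepts("[0-9&&[4-8]]", "5")); // true: inside both ranges
        System.out.println(accepts("[0-9&&[4-8]]", "9")); // false: outside 4-8
        System.out.println(consonantsOf("education"));    // dctn
    }
}
```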
