
Syntax of regular expressions

A regular expression is a pattern, written as a sequence of symbols, that describes a state machine capable of matching particular sequences of characters. The character set operator [...] applies only to Unicode characters 0 through 255 (the ASCII and Latin-1 range). However, the complete Unicode character set should be usable in the package's regular expressions. The regular expressions summarized in the table are Perl 5 regular expressions.
Syntax
Description

Metacharacters

\

quotes the next metacharacter

^

matches the beginning of the line; does not match embedded newline characters

.

matches any character except for newline characters (\n)

$

matches the end of the line or matches before the newline character placed at the end; does not match embedded newline characters

|

separates alternatives

()

grouping

[]

indicates a character class
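
The metacharacters above can be tried out in any Perl-style engine. The following is a minimal sketch that assumes Python's re module as a stand-in for the IDE's own engine; the sample strings and variable names are illustrative only.

```python
import re

text = "cat\ndog"

# '^' and '$' anchor at the start and end of the whole string by default,
# so neither matches next to the embedded newline.
starts_with_dog = re.search(r"^dog", text)   # None
ends_with_cat = re.search(r"cat$", text)     # None
ends_with_dog = re.search(r"dog$", text)     # a match object

# '$' also matches just before a newline placed at the very end.
before_final_newline = re.search(r"dog$", "cat\ndog\n")

# '.' matches any character except a newline.
dot_hits = re.findall(r".", "a\nb")          # ['a', 'b']

# '|' separates alternatives, '()' groups them, '[]' is a character class.
alt = re.fullmatch(r"gr(a|e)y", "grey")
cls = re.fullmatch(r"gr[ae]y", "gray")

# '\' quotes the next metacharacter, so '\.' matches a literal dot.
literal_dot = re.findall(r"\.", "a.b")       # ['.']

print(dot_hits, alt.group(1), literal_dot)
```
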

Standard quantifiers

*

matches 0 or more times; is equivalent to {0,}

+

matches 1 or more times; is equivalent to {1,}

?

matches 1 or 0 times; is equivalent to {0,1}; when placed after another standard quantifier, makes that quantifier match the minimum number of times possible while still allowing the rest of the pattern to match

{n}

matches exactly n times

{n,}

matches at least n times

{n,m}

matches at least n, but not more than m times
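
The difference between a standard quantifier and its minimal form (the same quantifier followed by ?) is easiest to see side by side. This is a small sketch, again assuming Python's re module rather than the IDE's engine; the sample strings are made up for illustration.

```python
import re

markup = "<b>bold</b><i>it</i>"

# '+' alone matches as much as possible: from the first '<' to the last '>'.
greedy = re.search(r"<.+>", markup).group()

# '+?' matches as little as possible while still letting '>' match.
lazy = re.search(r"<.+?>", markup).group()

# '*' behaves the same as '{0,}'.
sample = "a ab abbb"
star = re.findall(r"ab*", sample)
braces = re.findall(r"ab{0,}", sample)

# '?' makes the preceding item optional, like '{0,1}'.
optional = re.findall(r"colou?r", "color colour")

# '{n}', '{n,}' and '{n,m}' bound the repeat count.
exactly_two = re.findall(r"a{2}", "aaaaa")
two_or_more = re.findall(r"a{2,}", "a aa aaa")
two_to_three = re.findall(r"a{2,3}", "aaaaa")

print(greedy)        # <b>bold</b><i>it</i>
print(lazy)          # <b>
print(two_to_three)  # ['aaa', 'aa']
```
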

Special backslashed characters

\b

null token that matches a word boundary; a word boundary is a spot between two characters that has \w on one side of it and \W on the other side of it in either order; within character classes, \b represents a backspace rather than a word boundary

\B

null token that matches a boundary that is not a word boundary

\A

matches only at the beginning of the string

\Z

matches only at the end of the string (or before newline at the end)

\z

matches only at the end of the string

\G

matches only at pos() (for example, at the end-of-match position of prior m//g)

\n

newline character

\r

carriage return

\t

tab

\f

formfeed

\d

digit [0-9]

\D

non-digit [^0-9]

\w

matches a single word character (alphanumeric or underscore) [0-9a-z_A-Z], not a whole word

\W

non-word character [^0-9a-z_A-Z]

\s

whitespace character [ \t\n\r\f]

\S

non-whitespace character [^ \t\n\r\f]

\xnn

hexadecimal representation of character

\cD

matches the corresponding control character (here, Ctrl+D)

\nn or \nnn

octal representation of character unless it is a backreference

\1,\2,\3, etc

backreference - matches whatever the first, second, or third parenthesized group matched; if a corresponding group is not available, the BlackBerry® IDE interprets the number as an octal representation of a character

\0

matches a null character
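
A few of the backslashed characters above can be illustrated as follows. This is a sketch only, assuming Python's re module in place of the IDE's engine. Python treats \Z differently from Perl and has no \G or \cD, so those are left out here. The sample strings are hypothetical.

```python
import re

# \b matches at a word boundary, \B everywhere else.
whole_words = re.findall(r"\bcat\b", "cat concat cats cat.")   # two standalone words
inside_word = re.findall(r"\Bcat", "cat concat")               # only the one in 'concat'

# \d, \w and \s each match a single character of their class.
digits = re.findall(r"\d+", "order 66, aisle 7")
word_chars = re.findall(r"\w", "a_1!")
fields = re.split(r"\s+", "a \t b\nc")

# \xnn names a character by its hexadecimal code (0x41 is 'A').
hex_a = re.fullmatch(r"\x41", "A")

# \0 matches a null character.
null_hit = re.search(r"\0", "a\0b")

# \1 matches whatever the first parenthesized group matched,
# which makes it easy to find a doubled word.
doubled = re.search(r"\b(\w+) \1\b", "it is is fine")

print(whole_words, digits, word_chars, fields, doubled.group())
```
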

Note: Characters with backslashes, except for backreferences and boundaries, work within a character class. For example [abcd].
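
To make the note concrete, the sketch below uses backslashed characters inside a character class. It assumes Python's re module, which follows the same convention of treating \b as a backspace inside [...]; the inputs are illustrative.

```python
import re

# \d and \s keep their meaning inside a character class.
digits_or_space = re.findall(r"[\d\s]", "a1 b2")

# \t inside a class is still a tab.
tab_or_comma = re.findall(r"[\t,]", "a\tb,c")

# Inside a class, \b is a backspace character (0x08), not a word boundary.
backspaces = re.findall(r"[\b]", "a\bb")

print(digits_or_space)   # ['1', ' ', '2']
print(tab_or_comma)      # ['\t', ',']
print(backspaces)        # ['\x08']
```
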

Visit www.perldoc.com for more information.
