Compile a pattern that can will match a string if the string starts with any of the given terms. - Java Regular Expressions

Java examples for Regular Expressions:Pattern

Description

Compile a pattern that can will match a string if the string starts with any of the given terms.

Demo Code

/*//w w w . j  a v  a 2 s  .c  o  m
 * Static String formatting and query routines.
 * Copyright (C) 2001-2005 Stephen Ostermiller
 * http://ostermiller.org/contact.pl?regarding=Java+Utilities
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * See COPYING.TXT for details.
 */
//package com.java2s;

import java.util.regex.Pattern;

public class Main {
    public static void main(String[] argv) {
        String[] terms = new String[] { "1", "abc", "level", null,
                "java2s.com", "asdf 123" };
        System.out.println(getStartsWithAnyPattern(terms));
    }

    /**
     * Compile a pattern that can will match a string if the string
     * starts with any of the given terms.
     * <p>
     * Usage:<br>
     * <code>boolean b = getStartsWithAnyPattern(terms).matcher(s).matches();</code>
     * <p>
     * If multiple strings are matched against the same set of terms,
     * it is more efficient to reuse the pattern returned by this function.
     *
     * @param terms Array of search strings.
     * @return Compiled pattern that can be used to match a string to see if it starts with any of the terms.
     *
     * @since ostermillerutils 1.02.25
     */
    public static Pattern getStartsWithAnyPattern(String[] terms) {
        StringBuffer sb = new StringBuffer();
        sb.append("(?s)\\A");
        buildFindAnyPattern(terms, sb);
        sb.append(".*");
        return Pattern.compile(sb.toString());
    }

    /**
     * Build a regular expression that is each of the terms or'd together.
     *
     * @param terms a list of search terms.
     * @param sb place to build the regular expression.
     * @throws IllegalArgumentException if the length of terms is zero.
     *
     * @since ostermillerutils 1.02.25
     */
    private static void buildFindAnyPattern(String[] terms, StringBuffer sb) {
        if (terms.length == 0)
            throw new IllegalArgumentException(
                    "There must be at least one term to find.");
        sb.append("(?:");
        for (int i = 0; i < terms.length; i++) {
            if (i > 0)
                sb.append("|");
            sb.append("(?:");
            sb.append(escapeRegularExpressionLiteral(terms[i]));
            sb.append(")");
        }
        sb.append(")");
    }

    /**
     * Escapes characters that have special meaning to
     * regular expressions
     *
     * @param s String to be escaped
     * @return escaped String
     * @throws NullPointerException if s is null.
     *
     * @since ostermillerutils 1.02.25
     */
    public static String escapeRegularExpressionLiteral(String s) {
        // According to the documentation in the Pattern class:
        //
        // The backslash character ('\') serves to introduce escaped constructs,
        // as defined in the table above, as well as to quote characters that
        // otherwise would be interpreted as unescaped constructs. Thus the
        // expression \\ matches a single backslash and \{ matches a left brace.
        //
        // It is an error to use a backslash prior to any alphabetic character
        // that does not denote an escaped construct; these are reserved for future
        // extensions to the regular-expression language. A backslash may be used
        // prior to a non-alphabetic character regardless of whether that character
        // is part of an unescaped construct.
        //
        // As a result, escape everything except [0-9a-zA-Z]

        int length = s.length();
        int newLength = length;
        // first check for characters that might
        // be dangerous and calculate a length
        // of the string that has escapes.
        for (int i = 0; i < length; i++) {
            char c = s.charAt(i);
            if (!((c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z'))) {
                newLength += 1;
            }
        }
        if (length == newLength) {
            // nothing to escape in the string
            return s;
        }
        StringBuffer sb = new StringBuffer(newLength);
        for (int i = 0; i < length; i++) {
            char c = s.charAt(i);
            if (!((c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z'))) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }
}

Related Tutorials