Java Utility Methods String Tokenize

A list of utility methods for tokenizing strings.


The methods for tokenizing strings are organized by topic.


String[] getTokens(String string, String tokenSeparator)
Tokenizes the given string with the given separator string.
List<Integer> tokenEndIndices = new ArrayList<Integer>();
int fromIndex = 0;
while ((fromIndex = string.indexOf(tokenSeparator, fromIndex)) != -1) {
if ((tokenEndIndices.isEmpty()) || (string.length() > tokenEndIndices.get(tokenEndIndices.size() - 1) + 1))
fromIndex = -tokenSeparator.length();
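The excerpt above is truncated, so the following is only a minimal sketch of the same idea: scanning with indexOf for a literal separator string and cutting the tokens between matches. The class name SeparatorTokenizerSketch and the method splitOnSeparator are hypothetical and are not the original getTokens; in particular, keeping empty tokens between adjacent separators is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

class SeparatorTokenizerSketch {
    // Hypothetical helper, not the original getTokens: splits on a literal
    // separator string and keeps empty tokens between adjacent separators.
    static String[] splitOnSeparator(String string, String tokenSeparator) {
        if (tokenSeparator.isEmpty()) {
            // An empty separator would match at every index and never advance.
            throw new IllegalArgumentException("tokenSeparator must not be empty");
        }
        List<String> tokens = new ArrayList<>();
        int tokenStart = 0;
        int separatorIndex;
        while ((separatorIndex = string.indexOf(tokenSeparator, tokenStart)) != -1) {
            tokens.add(string.substring(tokenStart, separatorIndex));
            // Resume the search just past the separator that was found.
            tokenStart = separatorIndex + tokenSeparator.length();
        }
        // Whatever follows the last separator is the final token.
        tokens.add(string.substring(tokenStart));
        return tokens.toArray(new String[0]);
    }
}
```

Under these assumptions, splitOnSeparator("a::b::c", "::") yields the three tokens a, b and c. Unlike String.split, the separator is treated as plain text rather than as a regular expression.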
List getTokens(String value)
Replaces every non-alphanumeric character with a space, then splits the result on runs of whitespace.
return Arrays.asList(value.replaceAll("[^a-zA-Z0-9]", " ").split("[\\s@&.?$+-]+"));
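The one-liner above can be wrapped in a small class to check how it behaves on edge cases. This is a sketch: the class name AlphanumericTokensSketch and the List<String> return type are assumptions, and only the expression itself comes from the listing.

```java
import java.util.Arrays;
import java.util.List;

class AlphanumericTokensSketch {
    // Same expression as the listing; only the wrapper is hypothetical.
    static List<String> getTokens(String value) {
        // Step 1: every character outside a-z, A-Z, 0-9 becomes a space.
        // Step 2: split on runs of separators; after step 1 only spaces remain,
        // so the extra symbols in the character class never match.
        return Arrays.asList(value.replaceAll("[^a-zA-Z0-9]", " ").split("[\\s@&.?$+-]+"));
    }
}
```

Two details worth knowing: an input that starts with a separator, such as "-x", produces an empty first token, because String.split keeps a leading empty string after a non-empty match at the start. Trailing empty tokens are dropped.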
Vector getTokens(String vbt)
Removes fillers from the input, then walks its space-separated tokens, lowercasing each one.
String t = removeFillers(vbt);
Vector v = new Vector();
StringTokenizer st = new StringTokenizer(t, " ");
int knt = 0;
String retstr = "";
while (st.hasMoreTokens()) {
    String val = st.nextToken();
    val = val.toLowerCase();
List getTokenTypes(CommonTree tree)
Returns the list of token types collected for the given tree.
tokenTypes = new ArrayList<>();
return tokenTypes;
Integer getUnitType(String typeToken)
Returns the corresponding unit type constant for the given "type token" string.
for (Iterator i = mUnitTypeMap.keySet().iterator(); i.hasNext();) {
    String key = (String) i.next();
    if (typeToken.equals(key)) {
        return (Integer) mUnitTypeMap.get(key);
    }
}
throw new IllegalArgumentException("Type " + typeToken + " is not a known device type");
java.util.List hasSuffix(String _token)
Checks for suffix.
java.util.List<String> list = new java.util.ArrayList<String>();
if (_token.contains(".")) {
    String[] tokenArray = _token.split("\\.");
return list;
boolean isPOSTag(String token)
Very simple way of testing whether something is a part of speech tag.
if (posTagSet == null) {
return posTagSet.contains(token);
boolean isStringFunction(String token)
Checks whether the token is one of the known string functions.
return stringFunctions.contains(token);
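boolean isToken(String sentence, String searchWord)
Checks whether searchWord appears as a whole token in the sentence, using space, '.', ';' and ':' as delimiters.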
StringTokenizer sentenceToken = new StringTokenizer(sentence, " .;:");
while (sentenceToken.hasMoreTokens()) {
    if (sentenceToken.nextToken().equals(searchWord))
        return true;
}
return false;
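A short sketch of how isToken behaves may help here. The second argument of StringTokenizer is a set of single delimiter characters, not a separator string, so a character missing from that set (a comma, for instance) stays inside the token. The wrapper class name IsTokenSketch is hypothetical; the method body follows the listing.

```java
import java.util.StringTokenizer;

class IsTokenSketch {
    static boolean isToken(String sentence, String searchWord) {
        // Each of ' ', '.', ';' and ':' acts as a delimiter on its own.
        StringTokenizer sentenceToken = new StringTokenizer(sentence, " .;:");
        while (sentenceToken.hasMoreTokens()) {
            // Whole-token, case-sensitive comparison; substrings do not count.
            if (sentenceToken.nextToken().equals(searchWord)) {
                return true;
            }
        }
        return false;
    }
}
```

With this sketch, isToken("The cat sat.", "sat") is true because the trailing '.' is a delimiter, while isToken("cat,dog", "cat") is false because the comma is not, so the only token is "cat,dog".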
int maxTokenLength(String s)
Returns the length of the longest whitespace-delimited token in s, or 0 if s is null.
if (s == null)
    return 0;
int max = 0;
StringTokenizer st = new StringTokenizer(s);
while (st.hasMoreTokens()) {
    int tokenLength = st.nextToken().length();
    if (tokenLength > max)
        max = tokenLength;
}
return max;
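For comparison, the same computation can be sketched with String.split and a stream instead of StringTokenizer. This is an alternative, not the listed implementation: the class name MaxTokenLengthSketch is hypothetical, and the regex \s+ covers a slightly different whitespace set than StringTokenizer's default delimiters.

```java
import java.util.Arrays;

class MaxTokenLengthSketch {
    static int maxTokenLength(String s) {
        if (s == null) {
            return 0;
        }
        // trim() avoids an empty leading token; an all-blank input becomes "",
        // which splits into a single empty token of length 0.
        return Arrays.stream(s.trim().split("\\s+"))
                .mapToInt(String::length)
                .max()
                .orElse(0);
    }
}
```

Under these assumptions, maxTokenLength("a bbb cc") returns 3, and null, empty or blank input returns 0.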