Common Java Cookbook

Edition: 0.19

2.4. Splitting a String

2.4.1. Problem

You want to split a string on a number of different character delimiters.

2.4.2. Solution

Use StringUtils.split() and supply the set of delimiter characters to split on. The following example splits two strings on commas and spaces, limiting each result to two tokens:

import org.apache.commons.lang.ArrayUtils;
import org.apache.commons.lang.StringUtils;

String input = "Frantically oblong";
String input2 = "Pharmacy, basketball,funky";

// Split on spaces and commas, returning at most two tokens
String[] array1 = StringUtils.split( input, " ,", 2 );
String[] array2 = StringUtils.split( input2, " ,", 2 );
System.out.println( ArrayUtils.toString( array1 ) );
System.out.println( ArrayUtils.toString( array2 ) );

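This produces the following output (shown with quotes and spacing added to mark the element boundaries). Note that the last element of the second array keeps the unsplit remainder of the input:

{ "Frantically", "oblong" }
{ "Pharmacy", "basketball,funky" }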

2.4.3. Discussion

StringUtils.split( ) does not return empty strings for adjacent delimiters. To split on several delimiters, pass them together in one string; here the separator string contains a space and a comma. The example also limits the number of tokens with the third parameter to StringUtils.split( ). The input2 variable contains three possible tokens, but split( ) returns an array of only two elements: once the limit is reached, the remainder of the input, separators included, is returned as the last element.
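The two behaviors described above, skipping adjacent delimiters and capping the token count, can be sketched with the JDK alone. This is a minimal illustration, not the Commons Lang source: the class name SplitSketch and the method splitOnChars are hypothetical, and a max of 0 is assumed here to mean "no limit".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: this is NOT the Commons Lang implementation.
class SplitSketch {

    // Splits str on any character found in delimiters. Runs of adjacent
    // delimiters produce no empty tokens. If max > 0, at most max tokens
    // are returned and the last token keeps the unsplit remainder.
    static String[] splitOnChars(String str, String delimiters, int max) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        int len = str.length();
        while (i < len) {
            // Skip any run of delimiter characters.
            while (i < len && delimiters.indexOf(str.charAt(i)) >= 0) {
                i++;
            }
            if (i == len) {
                break;
            }
            // Limit reached: the rest of the input becomes the last token.
            if (max > 0 && tokens.size() == max - 1) {
                tokens.add(str.substring(i));
                break;
            }
            // Read one token up to the next delimiter.
            int start = i;
            while (i < len && delimiters.indexOf(str.charAt(i)) < 0) {
                i++;
            }
            tokens.add(str.substring(start, i));
        }
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String input = "Pharmacy, basketball,funky";
        // prints [Pharmacy, basketball, funky]
        System.out.println(Arrays.toString(splitOnChars(input, " ,", 0)));
        // prints [Pharmacy, basketball,funky]
        System.out.println(Arrays.toString(splitOnChars(input, " ,", 2)));
    }
}
```

With a limit of 2, the second element is "basketball,funky": the comma inside it is left alone because splitting stops once the limit is reached.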

J2SE 1.4 added a String.split() method, but the lack of split( ) in earlier versions was an annoyance. To split a string in the old days, one had to instantiate a StringTokenizer and iterate through an Enumeration to get the components of a delimited string. Anyone who has programmed in Perl and then had to use the StringTokenizer class will tell you that programming without split( ) is time consuming and frustrating. If you are stuck with an older Java Development Kit (JDK), StringUtils adds a split function that returns a String array. Keep this in mind when you question the need for StringUtils.split(); there are still applications and platforms that do not have a stable 1.4 virtual machine.
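For comparison, the StringTokenizer approach mentioned above looks roughly like the following sketch. The class name TokenizerSplit and the helper tokenize are hypothetical, and the list handling uses modern generics for brevity, so this is not literally pre-1.4 code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper showing the manual loop that split( ) replaces.
class TokenizerSplit {

    // Collects every token; each character in delimiters is a separator.
    static String[] tokenize(String input, String delimiters) {
        StringTokenizer tokenizer = new StringTokenizer(input, delimiters);
        List<String> tokens = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // prints [Pharmacy, basketball, funky]
        System.out.println(Arrays.toString(tokenize("Pharmacy, basketball,funky", " ,")));
    }
}
```

Like StringUtils.split( ), StringTokenizer treats a run of adjacent delimiters as a single separator, but the caller has to write the loop and build the array.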

The J2SE 1.4 String class has a split() method, but it takes a regular expression. Regular expressions are exceedingly powerful tools, but for some tasks they are needlessly complex. One regular expression that matches either a space character or a comma character is [' '','] (the single quotes are not required, and because they sit inside the character class, the expression also matches an apostrophe). There are many other ways to match a space or a comma in a regular expression, but in this example you simply want to split a string on one of two characters:

String test = "One, Two Three, Four Five";
String[] tokens = test.split( "[' '',']" );
System.out.println( ArrayUtils.toString( tokens ) );

This example prints out the tokens array:

{ "One", "", "Two", "Three", "", "Four", "Five" }

The array returned by the previous example contains blanks: String.split( ) returns an empty string between adjacent delimiters. The example also uses a rather ugly regular expression involving brackets and single quotes. Don't get me wrong, regular expressions are a welcome addition in Java 1.4, but the same requirement can be satisfied with StringUtils.split( test, " ," ), a simpler way to split a piece of text that also skips the empty strings.
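If you do stay with String.split( ), one way to avoid the empty strings is to let the expression match a run of delimiters. The sketch below assumes the simpler character class [ ,] and adds a + quantifier; the class name RegexSplit and its method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical comparison of two regular expressions passed to String.split( ).
class RegexSplit {

    // Splits on each single space or comma; adjacent delimiters yield "".
    static String[] splitEach(String text) {
        return text.split("[ ,]");
    }

    // Splits on runs of spaces and commas; adjacent delimiters are merged.
    static String[] splitRuns(String text) {
        return text.split("[ ,]+");
    }

    public static void main(String[] args) {
        String test = "One, Two Three, Four Five";
        // prints [One, , Two, Three, , Four, Five]
        System.out.println(Arrays.toString(splitEach(test)));
        // prints [One, Two, Three, Four, Five]
        System.out.println(Arrays.toString(splitRuns(test)));
    }
}
```

One difference from StringUtils.split( ) remains: if the input begins with a delimiter, String.split( ) still returns a leading empty string.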

2.4.4. See Also

Note the use of ArrayUtils.toString( ) in the solution section. See Chapter 1 for more information about ArrayUtils in Commons Lang.


Common Java Cookbook by Tim O'Brien is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Permissions beyond the scope of this license may be available at http://www.discursive.com/books/cjcook/reference/jakartackbk-PREFACE-1.html. Copyright 2009. Common Java Cookbook Chunked HTML Output. Some Rights Reserved.