Searching, Sorting, and Text Boundary Detection: Detecting Text Boundaries







/*
Java Internationalization
By Andy Deitsch, David Czarnecki

ISBN: 0-596-00019-7
O'Reilly
*/
import java.text.BreakIterator;
import java.util.Locale;

public class HangulTextBoundaryDetection {
  // A helper function to print out the boundary positions
  static void printBoundaries(String source, BreakIterator bi) {
    bi.setText(source);
    int boundary = bi.first();

    while (boundary != BreakIterator.DONE) {
      System.out.print(boundary + " ");
      boundary = bi.next();
    }
  }

  public static void main(String[] args) {
    // Create a string of six conjoining jamo (two syllables: han, geul)
    String hangul = "\u1112\u1161\u11ab\u1100\u1173\u11af";

    // Retrieve a character and a word BreakIterator object
    // that are locale-sensitive for Korean text.
    BreakIterator ci = BreakIterator.getCharacterInstance(Locale.KOREAN);
    BreakIterator wi = BreakIterator.getWordInstance(Locale.KOREAN);

    System.out.print("Character Boundaries: ");
    printBoundaries(hangul, ci);
    System.out.print("\nWord Boundaries: ");
    printBoundaries(hangul, wi);
    System.out.println();
  }
}
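
The example above prints only the boundary offsets. As a minimal sketch, assuming you also want the text that lies between consecutive boundaries, the offsets can be collected into a list and used to slice the source string. The class name BoundarySegments and the helpers boundaries and segments are hypothetical names introduced for illustration; they are not part of the original example or of the BreakIterator API.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, not part of the original example.
final class BoundarySegments {

  // Collects every boundary offset the iterator reports, from first() until DONE.
  static List<Integer> boundaries(String source, BreakIterator bi) {
    bi.setText(source);
    List<Integer> result = new ArrayList<>();
    for (int b = bi.first(); b != BreakIterator.DONE; b = bi.next()) {
      result.add(b);
    }
    return result;
  }

  // Slices the source string between each pair of consecutive boundaries.
  static List<String> segments(String source, BreakIterator bi) {
    List<Integer> offsets = boundaries(source, bi);
    List<String> result = new ArrayList<>();
    for (int i = 1; i < offsets.size(); i++) {
      result.add(source.substring(offsets.get(i - 1), offsets.get(i)));
    }
    return result;
  }

  public static void main(String[] args) {
    // Same six conjoining jamo as in the example above
    String hangul = "\u1112\u1161\u11ab\u1100\u1173\u11af";
    BreakIterator ci = BreakIterator.getCharacterInstance(Locale.KOREAN);

    System.out.println("Offsets:  " + boundaries(hangul, ci));
    System.out.println("Segments: " + segments(hangul, ci));
  }
}
```

Each segment returned for the character iterator is what BreakIterator treats as one user-perceived character, which may span several jamo. The exact offsets depend on the break rules shipped with the JDK in use, so no expected output is shown here.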

Related examples in the same category

1. BreakIterator for different locales
2. BreakIterator Demo
3. Determining the Character Boundaries in a Unicode String
4. Determining the Word Boundaries in a Unicode String
5. Determining the Sentence Boundaries in a Unicode String
6. Determining Potential Line Breaks in a Unicode String
7. Behaves similarly to BreakIterator.getWordInstance() but treats line-break delimiters as plain whitespace
8. Wrap multi-line strings (and get the individual lines)