Java String Normalize normalizeUnicode(String str)

Here you can find the source of normalizeUnicode(String str)

Description

Normalize to "Normalization Form Canonical Decomposition" (NFD). This supports proper file name retrieval from the file system, among other things.

REF: http://stackoverflow.com/questions/3610013/file-listfiles-mangles-unicode-names-with-jdk-6-unicode-normalization-issues
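The file name case can be illustrated with a minimal sketch. The helper `findByNormalizedName` and the class `FileNameLookupSketch` are hypothetical names introduced here for illustration, not part of the original source, and the array stands in for a real directory listing such as the result of `File.list()`. The idea is to compare names only after bringing both sides to NFD.

```java
import java.text.Normalizer;

class FileNameLookupSketch {
    // Hypothetical helper, not part of the original source: returns the entry
    // of `names` whose NFD form equals the NFD form of `wanted`, or null if
    // no entry matches.
    static String findByNormalizedName(String[] names, String wanted) {
        String target = Normalizer.normalize(wanted, Normalizer.Form.NFD);
        for (String name : names) {
            if (Normalizer.normalize(name, Normalizer.Form.NFD).equals(target)) {
                return name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Stand-in for a directory listing that reports a name in decomposed form
        // ('e' followed by U+0301 COMBINING ACUTE ACCENT).
        String[] listing = { "notes.txt", "re\u0301sume\u0301.pdf" };

        // The caller supplies the same name with precomposed characters (U+00E9).
        String match = findByNormalizedName(listing, "r\u00e9sum\u00e9.pdf");
        System.out.println(match != null); // true
    }
}
```

Returning the entry exactly as listed, rather than the normalized form, keeps the spelling that the listing itself reported, which is the one to pass back to the file system.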

License

Apache License

Parameter

Parameter Description
str text

Return

the string in NFD form; the input itself is returned if it is already normalized

Declaration

public static String normalizeUnicode(String str) 

Method Source Code

/**
 *
 * Copyright 2012-2013 The MITRE Corporation.
 *
 * Licensed under the Apache License, Version 2.0 (the "License"); you may not
 * use this file except in compliance with the License. You may obtain a copy of
 * the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
 * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
 * License for the specific language governing permissions and limitations under
 * the License.
 *
 * **************************************************************************
 * NOTICE This software was produced for the U. S. Government under Contract No.
 * W15P7T-12-C-F600, and is subject to the Rights in Noncommercial Computer
 * Software and Noncommercial Computer Software Documentation Clause
 * 252.227-7014 (JUN 1995)
 *
 * (c) 2012 The MITRE Corporation. All Rights Reserved.
 * **************************************************************************
 */

import java.text.Normalizer;

public class Main {
    /**
     * Normalize to "Normalization Form Canonical Decomposition" (NFD). This
     * supports proper file name retrieval from the file system, among other
     * things. In many situations we see Unicode file names: Java can list
     * them, but when the Java-provided version of the file name is used, the
     * OS/FS may not be able to find the file by the name given in a
     * particular normalized form.
     *
     * REF:
     * http://stackoverflow.com/questions/3610013/file-listfiles-mangles-unicode-names-with-jdk-6-unicode-normalization-issues
     *
     * @param str text
     * @return the string in NFD form; the same instance if already normalized
     */
    public static String normalizeUnicode(String str) {
        Normalizer.Form form = Normalizer.Form.NFD;
        if (!Normalizer.isNormalized(str, form)) {
            return Normalizer.normalize(str, form);
        }
        return str;
    }
}
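
As a usage illustration, the sketch below repeats the method body in a hypothetical class `NormalizeUnicodeDemo` (a name assumed here so that the example compiles on its own) and applies it to two spellings of the same word. One uses the precomposed character U+00E9, the other uses 'e' followed by U+0301. The two strings differ as Java strings but become equal after NFD normalization.

```java
import java.text.Normalizer;

class NormalizeUnicodeDemo {
    // Same logic as the method above, repeated so this sketch is self-contained.
    static String normalizeUnicode(String str) {
        Normalizer.Form form = Normalizer.Form.NFD;
        if (!Normalizer.isNormalized(str, form)) {
            return Normalizer.normalize(str, form);
        }
        return str;
    }

    public static void main(String[] args) {
        String composed = "caf\u00e9";    // 'é' as one code point, U+00E9
        String decomposed = "cafe\u0301"; // 'e' followed by U+0301 COMBINING ACUTE ACCENT

        // The raw strings are different char sequences.
        System.out.println(composed.equals(decomposed)); // false

        // After NFD normalization both spellings compare equal.
        System.out.println(
                normalizeUnicode(composed).equals(normalizeUnicode(decomposed))); // true

        // NFD splits U+00E9 into two chars, so the length grows from 4 to 5.
        System.out.println(normalizeUnicode(composed).length()); // 5
    }
}
```

Note that the method does not guard against `null`: `Normalizer.isNormalized` throws a `NullPointerException` for a null argument, so callers need to check beforehand if null input is possible.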

Related

  1. normalizeTibetan(String s)
  2. normalizeToAlpha(String input)
  3. normalizeUnicode(CharSequence text)
  4. normalizeUnicode(final String str)
  5. normalizeUnicode(String input)
  6. normalizeUnicodeDiacritics(String text)
  7. normalizeWhitespace(final String str)
  8. normalizeWhiteSpace(String str)
  9. normalizeWidth(String text)