Java String Encode by Charset encode(String s, Charset encoding)

Here you can find the source of encode(String s, Charset encoding)

Description

Escape a string into URI syntax

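This function applies the URI escaping rules defined in section 2 of [RFC 2396], as amended by [RFC 2732], to the string supplied as the first argument, which typically represents all or part of a URI, URI reference or IRI. The implementation departs from those rules in a few places: a space is written as "+", the "%" character is always escaped, and only letters, digits, "-", "_", "." and "*" are left unescaped.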
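
URI escaping replaces a character with one %XX sequence per octet of its encoded form, so the result depends on the charset. As a rough illustration, the sketch below converts a string to such sequences under a given charset. The names `PercentOctetsDemo` and `toPercentOctets` are hypothetical and not part of the original source; unlike `encode`, this helper escapes every character, safe or not.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only: it escapes EVERY character.
class PercentOctetsDemo {
    private static final String HEX = "0123456789ABCDEF";

    static String toPercentOctets(String s, Charset charset) {
        StringBuilder out = new StringBuilder();
        for (byte b : s.getBytes(charset)) {
            out.append('%');
            out.append(HEX.charAt((b & 0xF0) >> 4)); // high nibble
            out.append(HEX.charAt(b & 0x0F));        // low nibble
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "\u00E9" is the letter e with an acute accent.
        System.out.println(toPercentOctets("\u00E9", StandardCharsets.UTF_8));      // %C3%A9
        System.out.println(toPercentOctets("\u00E9", StandardCharsets.ISO_8859_1)); // %E9
    }
}
```

The same character yields two octets in UTF-8 and one in ISO-8859-1, which is why the `encoding` argument matters for non-ASCII input.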

License

Apache License

Parameter

Parameter   Description
s           the String to convert
encoding    The encoding to use for unsafe characters

Return

The converted String

Declaration

public static String encode(String s, Charset encoding) 

Method Source Code

//package com.java2s;
//License from project: Apache License 

import java.nio.charset.Charset;

public class Main {
    /**
     * Used to convert to hex.  We don't use Integer.toHexString, since
     * it converts to lower case (and the Sun docs pretty clearly specify
     * upper case here), and because it doesn't provide a leading 0.
     */
    private static final String hex = "0123456789ABCDEF";

    /**
     * <p>Escape a string into URI syntax.</p>
     * <p>This function is modelled on the URI escaping rules defined in
     * section 2 of [RFC 2396], as amended by [RFC 2732]. The string
     * supplied as the first argument typically represents all or part
     * of a URI, URI reference or IRI. Each unsafe character is replaced by
     * an escape sequence of the form %xx%yy..., where xxyy... is the
     * hexadecimal representation of the octets that represent the character
     * in the charset supplied as the second argument.</p>
     * <p>All characters are escaped other than lower case letters a-z,
     * upper case letters A-Z, digits 0-9, and the characters "-", "_", "."
     * and "*". These four are a subset of the characters referred to in
     * [RFC 2396] as "marks": specifically,
     * "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")". The remaining
     * marks are escaped. A space is replaced by "+" rather than "%20", and
     * the "%" character itself is always escaped, even when it is followed
     * by two hexadecimal digits (that is, 0-9, a-f, and A-F).</p>
     * <p>[RFC 2396] does not define whether escaped URIs should use
     * lower case or upper case for hexadecimal digits. To ensure that escaped
     * URIs can be compared using string comparison functions, this function
     * always uses the upper-case letters A-F.</p>
     *
     * @param s        the String to convert
     * @param encoding the charset used to turn unsafe characters into octets
     * @return the converted String, or null if s is null or empty
     */
    public static String encode(String s, Charset encoding) {
        if (s == null || s.isEmpty()) {
            return null;
        }
        int length = s.length();
        int start = 0;
        int i = 0;
        StringBuilder result = new StringBuilder(length);
        while (true) {
            while ((i < length) && isSafe(s.charAt(i))) {
                i++;
            }
            // Safe character can just be added
            result.append(s.substring(start, i));
            // Are we done?
            if (i >= length) {
                return result.toString();
            } else if (s.charAt(i) == ' ') {
                result.append('+'); // Replace space char with plus symbol.
                i++;
            } else {
                // Get all unsafe characters
                start = i;
                char c;
                while ((i < length) && ((c = s.charAt(i)) != ' ') && !isSafe(c)) {
                    i++;
                }
                // Convert them to %XY encoded strings
                String unsafe = s.substring(start, i);
                byte[] bytes = unsafe.getBytes(encoding);
                for (byte aByte : bytes) {
                    result.append('%');
                    result.append(hex.charAt(((int) aByte & 0xf0) >> 4));
                    result.append(hex.charAt((int) aByte & 0x0f));
                }
            }
            start = i;
        }
    }

    /**
     * Returns true if the given char is
     * either an uppercase or lowercase letter from 'a' to 'z', or a digit
     * from '0' to '9', or one of the characters '-', '_', '.' or '*'. Such
     * 'safe' characters don't have to be URL encoded.
     *
     * @param c the character
     * @return true or false
     */
    private static boolean isSafe(char c) {
        return (((c >= 'a') && (c <= 'z')) || ((c >= 'A') && (c <= 'Z')) || ((c >= '0') && (c <= '9')) || (c == '-')
                || (c == '_') || (c == '.') || (c == '*'));
    }
}
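
For comparison, the JDK's own `java.net.URLEncoder.encode(String, Charset)` (available since Java 10) leaves the same characters unescaped (letters, digits, "-", "_", "." and "*"), writes a space as "+", and emits upper-case hex digits. The sketch below prints its output for a few inputs. From reading the code, the method above should produce the same strings for non-empty input, but that is an expectation, not a tested guarantee. The class name `UrlEncoderComparison` and the helper `viaJdk` are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows the JDK encoder's output for a few inputs.
class UrlEncoderComparison {
    // Thin wrapper around the JDK encoder, fixed to UTF-8 (requires Java 10+).
    static String viaJdk(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(viaJdk("a b"));       // a+b
        System.out.println(viaJdk("a*b~c"));     // a*b%7Ec
        System.out.println(viaJdk("100%"));      // 100%25
        System.out.println(viaJdk("caf\u00E9")); // caf%C3%A9
    }
}
```

One visible difference: `URLEncoder.encode` returns an empty string for empty input, whereas the method above returns null when `s` is null or empty, so callers of the latter need a null check.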

Related

  1. checkEncoder(CharsetEncoder encoder)
  2. deflate(int level, String str, Charset encoding)
  3. encode(Charset charset, String string)
  4. encode(final String str, final Charset charset)
  5. encode(String charsetName, char[] chars, int offset, int length)
  6. encode(String text, Charset charset)
  7. encode(String value, Charset charset)
  8. encodeBase64(String s, Charset cs)
  9. encodeCHARSET(String string, Charset charset)