Java - File Input Output Character Set

Introduction

The java.nio.charset.Charset class represents a character set together with its character-encoding scheme, that is, a named mapping between characters and sequences of bytes.

A character is not always stored in one byte. The number of bytes used for a character depends on the encoding scheme.
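As a small illustration of this point, the sketch below counts the bytes that three sample characters occupy in UTF-8 and in UTF-16BE. The class name ByteLengthDemo and its helper methods are hypothetical names chosen for this example. It uses String.getBytes(Charset) as a shortcut for encoding.

```java
import java.nio.charset.StandardCharsets;

public class ByteLengthDemo {
  // Number of bytes the text occupies when encoded as UTF-8
  static int utf8Length(String text) {
    return text.getBytes(StandardCharsets.UTF_8).length;
  }

  // Number of bytes the text occupies when encoded as UTF-16BE
  static int utf16beLength(String text) {
    return text.getBytes(StandardCharsets.UTF_16BE).length;
  }

  public static void main(String[] args) {
    // Latin capital A, e with acute accent (U+00E9), euro sign (U+20AC)
    String[] samples = {"A", "\u00E9", "\u20AC"};
    for (String s : samples) {
      System.out.println(String.format("U+%04X", s.codePointAt(0))
          + " UTF-8: " + utf8Length(s) + " byte(s)"
          + ", UTF-16BE: " + utf16beLength(s) + " byte(s)");
    }
  }
}
```

The letter A takes 1 byte in UTF-8 but 2 bytes in UTF-16BE, while the euro sign takes 3 bytes in UTF-8 and 2 bytes in UTF-16BE.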

Examples of some character set names are US-ASCII, ISO-8859-1, UTF-8, UTF-16BE, UTF-16LE, and UTF-16.
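The names above can be turned into Charset objects at run time. The following is a minimal sketch, assuming a hypothetical helper named lookup that returns null for a name the JVM does not support. The name X-NO-SUCH-CHARSET is made up to show the unsupported case.

```java
import java.nio.charset.Charset;

public class CharsetLookupDemo {
  // Returns the Charset for a name, or null if this JVM does not support it.
  // Charset.isSupported throws IllegalCharsetNameException for a
  // syntactically illegal name (for example, one that contains a space).
  static Charset lookup(String name) {
    return Charset.isSupported(name) ? Charset.forName(name) : null;
  }

  public static void main(String[] args) {
    String[] names = {"US-ASCII", "ISO-8859-1", "UTF-8",
                      "UTF-16BE", "UTF-16LE", "UTF-16", "X-NO-SUCH-CHARSET"};
    for (String name : names) {
      Charset cs = lookup(name);
      System.out.println(name + " -> " + (cs == null ? "not supported" : cs.name()));
    }
    // The charset the JVM uses when none is specified
    System.out.println("Default: " + Charset.defaultCharset());
  }
}
```

Charset names are not case-sensitive, so lookup("utf-8") resolves to the same charset as lookup("UTF-8"). For the six names listed above, the constants in java.nio.charset.StandardCharsets avoid the lookup entirely.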

Converting a character into a sequence of bytes based on an encoding scheme is called character encoding.

Converting a sequence of bytes into characters based on an encoding scheme is called decoding.

The java.nio.charset package provides classes to encode/decode a CharBuffer to a ByteBuffer and vice versa.

  • Charset class represents the encoding scheme.
  • CharsetEncoder class performs the encoding.
  • CharsetDecoder class performs the decoding.
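The three classes can also be used together explicitly. The sketch below is one possible arrangement, not the only one: it obtains a CharsetEncoder and a CharsetDecoder from the UTF-8 Charset and configures both to report errors by throwing CharacterCodingException. The class name EncoderDecoderDemo and the methods roundTrip and isValidUtf8 are hypothetical names used for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class EncoderDecoderDemo {
  // Creates a UTF-8 decoder that throws on bad input instead of replacing it
  static CharsetDecoder strictDecoder() {
    return StandardCharsets.UTF_8.newDecoder()
        .onMalformedInput(CodingErrorAction.REPORT)
        .onUnmappableCharacter(CodingErrorAction.REPORT);
  }

  // Encodes the text to UTF-8 bytes and decodes the bytes back to text
  static String roundTrip(String text) {
    CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder()
        .onMalformedInput(CodingErrorAction.REPORT)
        .onUnmappableCharacter(CodingErrorAction.REPORT);
    try {
      ByteBuffer bytes = encoder.encode(CharBuffer.wrap(text));
      return strictDecoder().decode(bytes).toString();
    } catch (CharacterCodingException e) {
      throw new IllegalArgumentException("Text cannot be encoded as UTF-8", e);
    }
  }

  // Returns true if the bytes form a well-formed UTF-8 sequence
  static boolean isValidUtf8(byte[] data) {
    try {
      strictDecoder().decode(ByteBuffer.wrap(data));
      return true;
    } catch (CharacterCodingException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(roundTrip("Hello"));
    // 0xC3 starts a two-byte sequence, but 0x28 is not a continuation byte
    System.out.println(isValidUtf8(new byte[] {(byte) 0xC3, (byte) 0x28}));
  }
}
```

The convenience methods Charset.encode and Charset.decode replace malformed or unmappable input with a replacement value. Working with an encoder or decoder directly is useful when bad input should be detected instead.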

The following code encodes the characters of the string Hello, held in a character buffer, into bytes using the UTF-8 encoding scheme, and then decodes those bytes back into characters.

import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;

public class Main {
  public static void main(String[] args) {
    // Get a Charset object for the UTF-8 encoding
    Charset cs = Charset.forName("UTF-8");

    // Character buffer to be encoded
    CharBuffer cb = CharBuffer.wrap("Hello");

    // Encode the character buffer into a byte buffer
    ByteBuffer encodedData = cs.encode(cb);
    System.out.println("Encoded bytes: " + encodedData.remaining()); // Encoded bytes: 5

    // Decode the byte buffer back to a character buffer
    CharBuffer decodedData = cs.decode(encodedData);
    System.out.println("Decoded text: " + decodedData); // Decoded text: Hello
  }
}
