PHP Tutorial - PHP html_entity_decode() Function

Definition

The html_entity_decode() function converts a string containing HTML entities (such as &amp;) back into its original characters, reversing the operation of htmlentities().

Syntax

The html_entity_decode() function has the following syntax.

string html_entity_decode ( string html [, int options [, string charset]] )

Parameter

Parameter   Description
html        The HTML string to convert.
options     A bitmask of flags.
charset     Defines the encoding used in the conversion. The default value is ISO-8859-1 prior to PHP 5.4.0, and UTF-8 from PHP 5.4.0 onwards. You are highly encouraged to specify the correct value for your code.
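
To illustrate why the charset parameter matters, the following minimal sketch decodes the same entity into two target encodings and prints the resulting bytes in hexadecimal. The variable names are illustrative only, and the flags are passed explicitly so the result does not depend on version-specific defaults.

```php
<?php
// Decode the same entity into two different target encodings.
$entity = '&eacute;';

// UTF-8 target: U+00E9 is emitted as the two bytes C3 A9.
$utf8 = html_entity_decode($entity, ENT_QUOTES | ENT_HTML401, 'UTF-8');

// ISO-8859-1 target: the same character is the single byte E9.
$latin1 = html_entity_decode($entity, ENT_QUOTES | ENT_HTML401, 'ISO-8859-1');

echo bin2hex($utf8), "\n";   // c3a9
echo bin2hex($latin1), "\n"; // e9
```

The decoded bytes differ, so passing a charset that does not match the rest of the page can produce garbled output.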

Options

options is a bitmask of one or more of the following flags, which specify how to handle quotes, invalid code unit sequences and the document type used. The default is ENT_COMPAT | ENT_HTML401 before PHP 8.1.0, and ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401 from PHP 8.1.0 onwards.

Available flag constants

Constant Name    Description
ENT_COMPAT       Will convert double-quotes and leave single-quotes alone.
ENT_QUOTES       Will convert both double and single quotes.
ENT_NOQUOTES     Will leave both double and single quotes unconverted.
ENT_IGNORE       Silently discard invalid code unit sequences instead of returning an empty string. Using this flag is discouraged as it may have security implications.
ENT_SUBSTITUTE   Replace invalid code unit sequences with a Unicode Replacement Character U+FFFD (UTF-8) or &#xFFFD; (otherwise) instead of returning an empty string.
ENT_DISALLOWED   Replace invalid code points for the given document type with a Unicode Replacement Character U+FFFD (UTF-8) or &#xFFFD; (otherwise) instead of leaving them as is. This may be useful, for instance, to ensure the well-formedness of XML documents with embedded external content.
ENT_HTML401      Handle code as HTML 4.01.
ENT_XML1         Handle code as XML 1.
ENT_XHTML        Handle code as XHTML.
ENT_HTML5        Handle code as HTML 5.
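
The effect of the quote flags and the document type flags can be sketched as follows. This is a minimal illustration with made-up input strings; the flags and charset are passed explicitly so the result does not depend on version-specific defaults.

```php
<?php
// Input with encoded double quotes (&quot;) and single quotes (&#039;).
$input = '&quot;Hi&quot; &#039;there&#039;';

$compat   = html_entity_decode($input, ENT_COMPAT | ENT_HTML401, 'UTF-8');
$quotes   = html_entity_decode($input, ENT_QUOTES | ENT_HTML401, 'UTF-8');
$noquotes = html_entity_decode($input, ENT_NOQUOTES | ENT_HTML401, 'UTF-8');

echo $compat, "\n";   // "Hi" &#039;there&#039;
echo $quotes, "\n";   // "Hi" 'there'
echo $noquotes, "\n"; // &quot;Hi&quot; &#039;there&#039;

// The document type decides which named entities are known:
// &apos; is not an HTML 4.01 entity, but it is one in HTML 5.
$apos401 = html_entity_decode('&apos;', ENT_QUOTES | ENT_HTML401, 'UTF-8');
$apos5   = html_entity_decode('&apos;', ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $apos401, "\n"; // &apos;
echo $apos5, "\n";   // '
```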

Charset

The following character sets are supported:

Supported charsets

Charset       Aliases                        Description
ISO-8859-1    ISO8859-1                      Western European, Latin-1.
ISO-8859-5    ISO8859-5                      Little used Cyrillic charset (Latin/Cyrillic).
ISO-8859-15   ISO8859-15                     Western European, Latin-9. Adds the Euro sign, French and Finnish letters missing in Latin-1 (ISO-8859-1).
UTF-8         (none)                         ASCII compatible multi-byte 8-bit Unicode.
cp866         ibm866, 866                    DOS-specific Cyrillic charset.
cp1251        Windows-1251, win-1251, 1251   Windows-specific Cyrillic charset.
cp1252        Windows-1252, 1252             Windows-specific charset for Western European.
KOI8-R        koi8-ru, koi8r                 Russian.
BIG5          950                            Traditional Chinese, mainly used in Taiwan.
GB2312        936                            Simplified Chinese, national standard character set.
BIG5-HKSCS    (none)                         Big5 with Hong Kong extensions, Traditional Chinese.
Shift_JIS     SJIS, SJIS-win, cp932, 932     Japanese.
EUC-JP        EUCJP, eucJP-win               Japanese.
MacRoman      (none)                         Charset that was used by Mac OS.
''            (none)                         An empty string activates detection from script encoding (Zend multibyte), default_charset and current locale (see nl_langinfo() and setlocale()), in this order. Not recommended.
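
As a rough illustration of how the target charset changes the result, the sketch below decodes the Euro sign entity into several of the charsets listed above. The variable names are illustrative. An entity whose character cannot be represented in the target charset is left undecoded.

```php
<?php
$entity = '&euro;';
$flags  = ENT_QUOTES | ENT_HTML401;

// Latin-1 has no Euro sign, so the entity is left as it is.
$latin1 = html_entity_decode($entity, $flags, 'ISO-8859-1');

// Latin-9 and Windows-1252 each encode the Euro sign as a single byte.
$latin9 = html_entity_decode($entity, $flags, 'ISO-8859-15');
$cp1252 = html_entity_decode($entity, $flags, 'cp1252');

// UTF-8 encodes U+20AC as three bytes.
$utf8 = html_entity_decode($entity, $flags, 'UTF-8');

echo $latin1, "\n";          // &euro;
echo bin2hex($latin9), "\n"; // a4
echo bin2hex($cp1252), "\n"; // 80
echo bin2hex($utf8), "\n";   // e282ac
```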

Return

PHP html_entity_decode() function returns the decoded string.

Example

Encode a string as HTML entities, then decode it back to the original string.


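<?php
// Encode special characters as HTML entities.
$title = "java2s.com & PHP";
$safe_title = htmlentities($title);
print($safe_title);
print "\n";
// Decode the entities back into the original string.
$unsafe_title = html_entity_decode($safe_title);
print($unsafe_title);
?>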

The code above generates the following result.

java2s.com &amp; PHP
java2s.com & PHP