The following is the list of currently supported encodings. The first column is the encoding name, the second column is the list of aliases, the third column gives the names of its CES (character encoding scheme) and CCS (coded character set) components, and the fourth column is a short description.
Name | Aliases | CES/CCS | Short description |
big5 | csbig5, big_five, bigfive, cn_big5, cp950 | table_pcs / big5, us_ascii | The encoding for Traditional Chinese. |
cp775 | ibm775, cspc775baltic | table / cp775 | The updated version of CP 437 that supports the Baltic languages. |
cp850 | ibm850, 850, cspc850multilingual | table / cp850 | IBM 850 - the updated version of CP 437 where several Latin 1 characters replace some less often used characters, such as the line-drawing and Greek ones. |
cp852 | ibm852, 852, cspcp852 | table / cp852 | IBM 852 - the updated version of CP 437 where several Latin 2 characters replace some less often used characters, such as the line-drawing and Greek ones. |
cp855 | ibm855, 855, csibm855 | table / cp855 | IBM 855 - the updated version of CP 437 that supports Cyrillic. |
cp866 | 866, IBM866, CSIBM866 | table / cp866 | IBM 866 - the updated version of CP 855 that follows the logical Russian alphabet ordering of the alternative variant more closely and is preferred by many Russian users. |
euc_jp | eucjp | euc / jis_x0208_1990, jis_x0201_1976, jis_x0212_1990 | EUC-JP - The EUC for Japanese. |
euc_kr | euckr | euc / ksx1001 | EUC-KR - The EUC for Korean. |
euc_tw | euctw | euc / cns11643_plane1, cns11643_plane2, cns11643_plane14 | EUC-TW - The EUC for Traditional Chinese. |
iso_8859_1 | iso8859_1, iso88591, iso_8859_1:1987, iso_ir_100, latin1, l1, ibm819, cp819, csisolatin1 | table / iso_8859_1 | ISO 8859-1:1987 - Latin 1, West European. |
iso_8859_10 | iso_8859_10:1992, iso_ir_157, iso885910, latin6, l6, csisolatin6, iso8859_10 | table / iso_8859_10 | ISO 8859-10:1992 - Latin 6, Nordic. |
iso_8859_11 | iso8859_11, iso885911 | table / iso_8859_11 | ISO 8859-11 - Thai. |
iso_8859_13 | iso_8859_13:1998, iso8859_13, iso885913 | table / iso_8859_13 | ISO 8859-13:1998 - Latin 7, Baltic Rim. |
iso_8859_14 | iso_8859_14:1998, iso885914, iso8859_14 | table / iso_8859_14 | ISO 8859-14:1998 - Latin 8, Celtic. |
iso_8859_15 | iso885915, iso_8859_15:1998, iso8859_15 | table / iso_8859_15 | ISO 8859-15:1998 - Latin 9, West Europe, successor of Latin 1. |
iso_8859_2 | iso8859_2, iso88592, iso_8859_2:1987, iso_ir_101, latin2, l2, csisolatin2 | table / iso_8859_2 | ISO 8859-2:1987 - Latin 2, East European. |
iso_8859_3 | iso_8859_3:1988, iso_ir_109, iso8859_3, latin3, l3, csisolatin3, iso88593 | table / iso_8859_3 | ISO 8859-3:1988 - Latin 3, South European. |
iso_8859_4 | iso8859_4, iso88594, iso_8859_4:1988, iso_ir_110, latin4, l4, csisolatin4 | table / iso_8859_4 | ISO 8859-4:1988 - Latin 4, North European. |
iso_8859_5 | iso8859_5, iso88595, iso_8859_5:1988, iso_ir_144, cyrillic, csisolatincyrillic | table / iso_8859_5 | ISO 8859-5:1988 - Cyrillic. |
iso_8859_6 | iso_8859_6:1987, iso_ir_127, iso8859_6, ecma_114, asmo_708, arabic, csisolatinarabic, iso88596 | table / iso_8859_6 | ISO 8859-6:1987 - Arabic. |
iso_8859_7 | iso_8859_7:1987, iso_ir_126, iso8859_7, elot_928, ecma_118, greek, greek8, csisolatingreek, iso88597 | table / iso_8859_7 | ISO 8859-7:1987 - Greek. |
iso_8859_8 | iso_8859_8:1988, iso_ir_138, iso8859_8, hebrew, csisolatinhebrew, iso88598 | table / iso_8859_8 | ISO 8859-8:1988 - Hebrew. |
iso_8859_9 | iso_8859_9:1989, iso_ir_148, iso8859_9, latin5, l5, csisolatin5, iso88599 | table / iso_8859_9 | ISO 8859-9:1989 - Latin 5, Turkish. |
iso_ir_111 | ecma_cyrillic, koi8_e, koi8e, csiso111ecmacyrillic | table / iso_ir_111 | ISO IR 111/ECMA Cyrillic. |
koi8_r | cskoi8r, koi8r, koi8 | table / koi8_r | RFC 1489 Cyrillic. |
koi8_ru | koi8ru | table / koi8_ru | Obsolete Ukrainian encoding. |
koi8_u | koi8u | table / koi8_u | RFC 2319 Ukrainian. |
koi8_uni | koi8uni | table / koi8_uni | KOI8 Unified. |
ucs_2 | ucs2, iso_10646_ucs_2, iso10646_ucs_2, iso_10646_ucs2, iso10646_ucs2, iso10646ucs2, csUnicode | ucs_2 / (UCS) | ISO-10646-UCS-2. Big Endian; ZWNBSP (U+FEFF) is always interpreted as ZWNBSP (BOM isn’t supported). |
ucs_2_internal | ucs2_internal, ucs_2internal, ucs2internal | ucs_2_internal / (UCS) | ISO-10646-UCS-2 in system byte order. ZWNBSP (U+FEFF) is always interpreted as ZWNBSP (BOM isn’t supported). |
ucs_2be | ucs2be | ucs_2 / (UCS) | Big Endian version of ISO-10646-UCS-2 (in fact, equivalent to ucs_2). ZWNBSP (U+FEFF) is always interpreted as ZWNBSP (BOM isn’t supported). |
ucs_2le | ucs2le | ucs_2 / (UCS) | Little Endian version of ISO-10646-UCS-2. ZWNBSP (U+FEFF) is always interpreted as ZWNBSP (BOM isn’t supported). |
ucs_4 | ucs4, iso_10646_ucs_4, iso10646_ucs_4, iso_10646_ucs4, iso10646_ucs4, iso10646ucs4 | ucs_4 / (UCS) | ISO-10646-UCS-4. Big Endian; ZWNBSP (U+FEFF) is always interpreted as ZWNBSP (BOM isn’t supported). |
ucs_4_internal | ucs4_internal, ucs_4internal, ucs4internal | ucs_4_internal / (UCS) | ISO-10646-UCS-4 in system byte order. ZWNBSP (U+FEFF) is always interpreted as ZWNBSP (BOM isn’t supported). |
ucs_4be | ucs4be | ucs_4 / (UCS) | Big Endian version of ISO-10646-UCS-4 (in fact, equivalent to ucs_4). ZWNBSP (U+FEFF) is always interpreted as ZWNBSP (BOM isn’t supported). |
ucs_4le | ucs4le | ucs_4 / (UCS) | Little Endian version of ISO-10646-UCS-4. ZWNBSP (U+FEFF) is always interpreted as ZWNBSP (BOM isn’t supported). |
us_ascii | ansi_x3.4_1968, ansi_x3.4_1986, iso_646.irv:1991, ascii, iso646_us, us, ibm367, cp367, csascii | us_ascii / (ASCII) | 7-bit ASCII. |
utf_16 | utf16 | utf_16 / (UCS) | RFC 2781 UTF-16. The very first ZWNBSP (U+FEFF) code in the stream is interpreted as BOM. |
utf_16be | utf16be | utf_16 / (UCS) | Big Endian version of RFC 2781 UTF-16. ZWNBSP (U+FEFF) is always interpreted as ZWNBSP (BOM isn’t supported). |
utf_16le | utf16le | utf_16 / (UCS) | Little Endian version of RFC 2781 UTF-16. ZWNBSP (U+FEFF) is always interpreted as ZWNBSP (BOM isn’t supported). |
utf_8 | utf8 | utf_8 / (UCS) | RFC 3629 UTF-8. |
win_1250 | cp1250 | table / win_1250 | Win-1250 - Croatian. |
win_1251 | cp1251 | table / win_1251 | Win-1251 - Cyrillic. |
win_1252 | cp1252 | table / win_1252 | Win-1252 - Latin 1. |
win_1253 | cp1253 | table / win_1253 | Win-1253 - Greek. |
win_1254 | cp1254 | table / win_1254 | Win-1254 - Turkish. |
win_1255 | cp1255 | table / win_1255 | Win-1255 - Hebrew. |
win_1256 | cp1256 | table / win_1256 | Win-1256 - Arabic. |
win_1257 | cp1257 | table / win_1257 | Win-1257 - Baltic. |
win_1258 | cp1258 | table / win_1258 | Win-1258 - Vietnamese. |
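
The difference between utf_16 and its fixed byte order variants, utf_16be and utf_16le, can be seen in a minimal sketch. It does not use the library described here: Python's built-in `utf-16`, `utf-16-be` and `utf-16-le` codecs are assumed as stand-ins because they treat a leading U+FEFF (the no-break space code point that doubles as the byte order mark) the same way the table describes. The helper names `decode_with_bom` and `decode_fixed` are hypothetical.

```python
# Illustration only: Python codecs stand in for the utf_16, utf_16be and
# utf_16le entries of the table; this is not the API of the library itself.

BOM = "\ufeff"  # U+FEFF, the code point used as the byte order mark

# The same text, "A" preceded by U+FEFF, serialized in both byte orders.
big = (BOM + "A").encode("utf-16-be")     # bytes FE FF 00 41
little = (BOM + "A").encode("utf-16-le")  # bytes FF FE 41 00


def decode_with_bom(data: bytes) -> str:
    """utf_16 behaviour: the first U+FEFF selects the byte order and is dropped."""
    return data.decode("utf-16")


def decode_fixed(data: bytes, byte_order: str) -> str:
    """utf_16be / utf_16le behaviour: byte order is fixed, U+FEFF stays in the text."""
    codec = {"be": "utf-16-be", "le": "utf-16-le"}[byte_order]
    return data.decode(codec)


if __name__ == "__main__":
    print(repr(decode_with_bom(big)))          # BOM consumed
    print(repr(decode_with_bom(little)))       # BOM consumed
    print(repr(decode_fixed(big, "be")))       # U+FEFF kept as a character
    print(repr(decode_fixed(little, "le")))    # U+FEFF kept as a character
```

With the BOM-aware codec both byte streams decode to the single character `A`, whereas the fixed byte order codecs return U+FEFF followed by `A`. A caller that picks a fixed byte order encoding therefore has to strip a leading U+FEFF itself if the input may carry one.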