15.9 The encodings description file
The ’encoding.deps’ file in the lib/ subdirectory describes the
properties of each encoding. It simplifies adding support for new
encodings, because a lot of "glue" files can be generated from it
automatically. The ’mkdeps.pl’ Perl script uses ’encoding.deps’ to
generate these "glue" files.
The ’encoding.deps’ file is composed of sections. Each section consists
of entries, and each entry contains an encoding, CES or CCS description.
The ’encoding.deps’ file’s syntax is very simple. Currently only two
sections are defined: ENCODINGS and CES_DEPENDENCIES.
Each ENCODINGS section’s entry describes one encoding and
contains the following information.
- Encoding name (the ENCODING field). The name must be unique, and
each entry has exactly one name.
- The encoding’s CES converter name (the CES field). Only one CES
converter is allowed.
- The whitespace-separated list of CCS table names which are used by the
encoding (the CCS field).
- The whitespace-separated list of alias names (the ALIASES
field).
Note that all names in the ’encoding.deps’ file must be in normalized
form.
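The entry layout described above can be illustrated with a small reader.
This is only a sketch: the concrete syntax of ’encoding.deps’ is not shown
in this section, so the SECTION/ENTRY markers, the "FIELD: value" lines,
the "#" comments and the ALIASES field name are assumptions made for the
example, and the sample names are illustrative rather than copied from the
real file. The real processing is done by the Perl script, not by this
code.

```python
# Hypothetical sample; the layout and the names are assumptions.
SAMPLE = """\
# Illustrative content only, not the real encoding.deps.
SECTION ENCODINGS
ENTRY
ENCODING: euc_jp
CES: euc
CCS: jis_x0208_1990 jis_x0201_1976
ALIASES: eucjp
ENTRY END
SECTION END

SECTION CES_DEPENDENCIES
ENTRY
CES: euc
USED_CES: table us_ascii
ENTRY END
SECTION END
"""


def parse_deps(text):
    """Return {section name: [entry, ...]}; each entry maps a field
    name to the list of whitespace-separated names on that line."""
    sections = {}
    section = None
    entry = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop assumed '#' comments
        if not line:
            continue
        if line == "SECTION END":
            section = None
        elif line.startswith("SECTION "):
            section = line.split(None, 1)[1]
            sections[section] = []
        elif line == "ENTRY END":
            sections[section].append(entry)
            entry = None
        elif line == "ENTRY":
            entry = {}
        else:
            field, _, value = line.partition(":")
            entry[field.strip()] = value.split()
    return sections


sections = parse_deps(SAMPLE)
print(sections["ENCODINGS"][0]["ALIASES"])  # ['eucjp']
```

Every field is stored as a list, so single-valued fields such as
ENCODING and CES become one-element lists. A stricter reader could reject
entries where those lists contain more than one name.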
Each CES_DEPENDENCIES section’s entry describes the dependencies of
one CES converter. For example, the euc CES converter depends on
the table and us_ascii CES converters, since the
euc CES converter uses them. This means that both the table
and us_ascii CES converters must be linked in if the euc
CES converter is enabled.
The CES_DEPENDENCIES section defines the following:
- the CES converter name for which the dependencies are defined in this
entry (the CES field);
- the whitespace-separated list of CES converters which are needed for
this CES converter (the USED_CES field).
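The effect of these entries can be sketched as a closure computation:
starting from the enabled CES converters, keep adding the converters named
in their USED_CES lists. The sketch below uses the euc example from the
text. The function name and the dictionary layout are hypothetical, and
following dependencies of dependencies is an assumption, since this
section does not say whether such chains occur.

```python
def ces_closure(enabled, used_ces):
    """Return the enabled CES converters plus every converter they
    need, directly or through another converter's USED_CES list."""
    needed = set()
    pending = list(enabled)
    while pending:
        ces = pending.pop()
        if ces in needed:
            continue  # already handled; also guards against cycles
        needed.add(ces)
        pending.extend(used_ces.get(ces, ()))
    return needed


# CES field -> USED_CES field, as in the euc example above.
USED_CES = {"euc": ["table", "us_ascii"]}

print(sorted(ces_closure(["euc"], USED_CES)))
# ['euc', 'table', 'us_ascii']
```

Enabling only the table converter yields just {"table"}, since it has no
entry in this hypothetical dependency map.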
The ’mkdeps.pl’ Perl script automatically solves the following tasks.
- The user works with the iconv library in terms of encodings and does
not need to know anything about CES converters and CCS tables. The
script automatically generates code which enables all CES converters
and CCS tables needed by the encodings that the user has enabled.
- The CES converters may have dependencies and the script automatically
generates the code which handles these dependencies.
- The list of encoding’s aliases is also automatically generated.
- The script uses a lot of macros in order to enable only the minimum set
of code/data which is needed to support the requested encodings in the
requested directions.
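The first and third tasks can be sketched as follows: given the set of
encodings the user enabled, collect the CES converters and CCS tables they
need and build an alias lookup table. This is a sketch of the idea only.
The generated files express it with C macros and arrays, the function
names here are hypothetical, the ALIASES key is an assumed field name, and
the sample entries are illustrative rather than taken from the real file.

```python
# Hypothetical ENCODINGS entries; names are illustrative.
ENCODINGS = [
    {"ENCODING": "euc_jp", "CES": "euc",
     "CCS": ["jis_x0208_1990", "jis_x0201_1976"], "ALIASES": ["eucjp"]},
    {"ENCODING": "iso_8859_1", "CES": "table",
     "CCS": ["iso_8859_1"], "ALIASES": ["latin1", "l1"]},
    {"ENCODING": "koi8_r", "CES": "table",
     "CCS": ["koi8_r"], "ALIASES": ["koi8"]},
]


def required_components(enabled, entries):
    """Return (CES converters, CCS tables) needed by the enabled
    encodings. Several encodings may share one CES converter."""
    ces, ccs = set(), set()
    for entry in entries:
        if entry["ENCODING"] in enabled:
            ces.add(entry["CES"])
            ccs.update(entry["CCS"])
    return ces, ccs


def alias_table(enabled, entries):
    """Map every enabled encoding name and alias to the encoding name."""
    table = {}
    for entry in entries:
        if entry["ENCODING"] in enabled:
            table[entry["ENCODING"]] = entry["ENCODING"]
            for alias in entry["ALIASES"]:
                table[alias] = entry["ENCODING"]
    return table


enabled = {"iso_8859_1", "koi8_r"}
ces, ccs = required_components(enabled, ENCODINGS)
aliases = alias_table(enabled, ENCODINGS)
print(sorted(ces))        # ['table']
print(sorted(ccs))        # ['iso_8859_1', 'koi8_r']
print(aliases["latin1"])  # iso_8859_1
```

The sketch ignores CES dependencies and conversion directions. In the
example, two enabled encodings map to the single table converter, and the
euc converter and its CCS tables are left out because euc_jp is not
enabled.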
The ’mkdeps.pl’ Perl script interprets the ’encoding.deps’
file and generates the following files.
- lib/encnames.h - this header file contains macro definitions for all
encoding names.
- lib/aliasesbi.c - the array of encoding names and aliases. The array
is used to find the name of the requested encoding by its alias.
- ces/cesbi.c - this file defines two arrays (_iconv_from_ucs_ces and
_iconv_to_ucs_ces) which contain descriptions of the enabled "to UCS"
and "from UCS" CES converters and the names of the encodings which are
supported by these CES converters.
- ces/cesbi.h - this file contains the macros which determine the
set of CES converters that should be enabled, given only the set of
enabled encodings (through macros defined in the
newlib.h file). Note that one CES converter may handle several
encodings.
- ces/cesdeps.h - the CES converters dependencies are handled in
this file.
- ccs/ccsdeps.h - the array of linked-in CCS tables is defined
here.
- ccs/ccsnames.h - this header file contains macro definitions for all
CCS names.
- encoding.aliases - the list of supported encodings and their
aliases which is intended for the Newlib configure scripts in order to
handle the iconv-related configure script options.