contents
- other programs you need
- sources for other programs
- contents of fonty-rg
- preparations
- creating your own font definition file
- how a definition line looks like
- how many definitions are allowed
- how hexadecimal counting works
- what ranges we might meet
- rules for some HEX ranges
- contents of the definition file
- how font positions and limits are created
- summary
- practical hints
- (re)building fonts
- creating your userdefined acm kernel map
- how an acm map looks like
- 256 positions, simple definitions
- 512 positions, merged definitions
- using your font on console
- general procedure to use a font
- procedure for official combinations
- procedure for own mixtures
- switching between font contents
- names for files dealing with fonts
- how the whole chain works
- how your system looks like after booting
- error and other messages
- 9th column displayed incorrectly
other programs you need
the first three are usually already present on your system
- a shell working as /bin/sh
- kbd or console-tools/console-data
- perl-5.001 or later (expected as /usr/bin/perl)
where to get the sources for other programs
if you are running a distribution, search at their ftp server
if you are compiling from source here the home
contents of fonty-rg
- top directory
build.sh (shell)
wrapper for building the fonts
choose (perl)
select needed pictures for a font
compact (perl)
putting same pictures in the font together
vga (perl)
sorting characters according to pixels in the 8th column
*.psf.gz
precompiled compressed, ready-to-use console fonts
- subdirectory /charsets
cz2t.sh (shell)
remove columns from given file
LatCyrGr (contains Latin, Cyrillic, Greek)
table hex to Unicode
*.txt (contains what the filename says)
=xx U+xxxx OFFICIAL NAME: table hex to Unicode
chavo.chars (contains special mixture East Europe, Esperanto)
=xx U+xxxx OFFICIAL NAME: table hex to Unicode
graphics
U+xxxx: Unicode values for creating a box
- subdirectory /source
*.sbf
glyphs: pictures how each Unicode value looks
contains the range of values the filename says
preparations
- unpack the fonty-rg package into your desired directory
you will get a directory called fonty-rg
- unpack the fonty-1.0 package
creating your own font definition file
font definition files are simple text files in /charsets
how a definition line looks like
=HEX | U+number | # OFFICIAL NAME
------+----------+--------------------
=20 | U+0020 | # SPACE
=21 | U+0021 | # EXCLAMATION MARK
- =
is only a mark helping to catch this number in scripts
- 20 and 21
are the hexadecimal values as `man ascii` or `man iso8859_x` show
- U+
the signal that this plus the following number is a Unicode value
- 0020 and 0021
Unicode numbers, always together with U+ they make the
Unicode values associated with a picture/glyph
- #
comment sign for the following description
- SPACE and EXCLAMATION MARK
are the official names of those characters
Such comments help if you are dealing with characters for a
foreign language or seldom used ones, so you do not know their
meaning by heart.
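As a small illustration, a definition line can be taken apart with sed.
This is only a sketch: parse_def_line is a hypothetical helper and not one
of the fonty-rg scripts, and it assumes the fields in the real file are
separated by blanks or tabs (the | signs above are only table decoration).

```shell
#!/bin/sh
# parse_def_line: hypothetical helper, not part of fonty-rg.
# $1 is one line such as "=21 U+0021 # EXCLAMATION MARK".
# Prints "HEX UNICODE" for a definition line, nothing otherwise.
parse_def_line() {
    printf '%s\n' "$1" |
        sed -n 's/^=\([0-9A-Fa-f][0-9A-Fa-f]\)[[:space:]]*U+\([0-9A-Fa-f]\{4\}\).*/\1 \2/p'
}

parse_def_line '=21 U+0021 # EXCLAMATION MARK'   # prints: 21 0021
parse_def_line '# a comment line gives no output'
```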
how many definitions are allowed
Every line in such a text file which does not begin with a #
comment sign is a definition line, and you can have a maximum of
either 256 or 512 different definitions. In the end a definition
always results in a picture, the glyph. So the number of lines in your
text file is only a rough estimate of whether you are within the
limit or already exceeding it. If you want to define, for example, a
font which contains iso8859-1 and also Cyrillic letters, somewhere
in your text file you will have these 2 lines:
1 line for latin capital letter A (which is U+0041)
1 line for cyrillic capital letter A (which is U+0410)
There are separate pictures for both of them, but when you see them
you will realize that both pictures are the same and therefore
your 2 lines only count as 1 definition. Later, when you call the
build.sh script to build your font, the compact script will take
care that all lines which have the same picture are put into one
definition; in our example it will do something like this:
instead of [U+0041] picture and [U+0410] picture
do [U+0041] [U+0410] picture (both are capital A)
Usually you will not need more than 256 definitions. But in case you
have put a lot into it and suddenly realize that you are beyond
256, don't worry, then it will make a 512-table. After 512, however,
is the end with this kind of tables.
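To get the rough estimate mentioned above, you can count the definition
lines before building. The following is a sketch only: count_defs and
table_size are hypothetical names, the sample file is made up for the
demonstration, and the result is an upper bound because lines sharing one
picture are merged later by the compact script.

```shell
#!/bin/sh
# count_defs: count lines which are neither # comments nor empty.
count_defs() {
    grep -v '^#' "$1" | grep -c '[^[:space:]]'
}

# table_size: smallest table (256 or 512) that holds $1 definitions.
table_size() {
    if [ "$1" -le 256 ]; then echo 256
    elif [ "$1" -le 512 ]; then echo 512
    else echo "too many"
    fi
}

# Made-up sample file with two lines that share one picture.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
# latin and cyrillic capital A
=41 U+0041 # LATIN CAPITAL LETTER A
=B0 U+0410 # CYRILLIC CAPITAL LETTER A
EOF
n=$(count_defs "$tmp")
echo "$n definition lines, fits a $(table_size "$n") table"
rm -f "$tmp"
```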
how hexadecimal counting works
Now let's have a short look at the hexadecimal values
(the HEX in our examples above). They are counted like this:
00 ___10 ___20 from 00 to 09 and 0A to 0F (end of 0)
01 | 11 | 21 from 10 to 19 and 1A to 1F (end of 1)
.. | .. | .. from 20 to 29 and 2A to 2F (end of 2)
09 | 19 | 29
0A | 1A | 2A you can best remember if you just speak
.. | .. | .. the numbers separately, like this:
0F | 1F | 2F one E, one F, two zero, two one ....
|___| |___| |__ and so on
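If you do not want to count by hand, the shell can convert for you. A
minimal sketch with printf (hex2dec and dec2hex are hypothetical helper
names used only here):

```shell
#!/bin/sh
# hex2dec: hexadecimal digits (without 0x) to decimal.
hex2dec() { printf '%d\n' "0x$1"; }
# dec2hex: decimal number to at least two hexadecimal digits.
dec2hex() { printf '%02X\n' "$1"; }

hex2dec 1F    # prints 31
dec2hex 32    # prints 20

# Print the middle column of the table above, 10 to 1F.
i=16
while [ "$i" -le 31 ]; do
    printf '%02X ' "$i"
    i=$((i + 1))
done
echo
```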
what ranges we might meet
Counting down in this way we can now split the long row into several
parts which have special meanings/purposes, and doing this we get
ranges. (Those ranges make it just a bit easier if we want to speak
about a certain number of values which are treated equally). Ranges
which are quite famous are these:
00 # NUL --+ control characters (essential terminal
.. .. | controlling characters like bell, carriage
1F # UNIT SEPARATOR --+ return, end of transmission ...)
20 # SPACE --+
.. .. | ascii characters
7F # DELETE --+
80 # blink --+ control characters (additional terminal
.. .. | controlling characters like bold, underline,
9F # --+ reverse ...)
A0 # NO-BREAK SPACE --+
.. .. | iso8859 characters
FF # subset dependent--+
rules for some HEX ranges
For ascii and parts of iso8859 range the Unicode number is equal
to the HEX number, just put two zeros in front of it, so your lines
would look like this:
=20 U+0020 -+
=21 U+0021 |
... | the ascii range is the same everywhere
=7E U+007E |
=7F U+007F -+
=A0 U+00A0 -+
=A1 U+00A1 |
... no | in the iso range it does not always continue like this;
| here the Unicode number depends on what the subset says
| it should look like. For example
=A4 U+00A4 # CURRENCY in iso8859-1
=A4 U+20AC # EURO SIGN in iso8859-15
Unicode values for a subset can for example be
looked up in the other .txt files in /charsets
=00 U+0000 -+ the range of essential controls
... |
=1E U+001E |
=1F U+001F -+
=80 U+ -+ the range of additional controls
... |
=9F U+ -+
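For the ranges where the Unicode number is just the HEX value with two
zeros in front, the lines can be generated instead of typed. This is a
sketch under the assumption that a tab between the two fields is
acceptable; gen_range is a hypothetical helper, and the official names
still have to be added by hand.

```shell
#!/bin/sh
# gen_range: print "=HEX<TAB>U+00HEX" for every value from $1 to $2.
# Both arguments may be written as 0x.. hexadecimal numbers.
gen_range() {
    i=$(( $1 ))
    last=$(( $2 ))
    while [ "$i" -le "$last" ]; do
        printf '=%02X\tU+%04X\n' "$i" "$i"
        i=$((i + 1))
    done
}

# The ascii range, 96 lines from =20 to =7F.
gen_range 0x20 0x7F
```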
contents of the definition file
Your font file can keep whatever pictures (glyphs) you like to have
produced; but please remember that people outside are mostly using
official character maps like iso8859-1 or iso8859-15. And if you go
and mix the characters of different sets those outside will see only
strange rubbish. You can for example only replace the currency sign
by the euro sign in iso8859-1 and make your myfont.psf from that.
If you now create a text containing "1/4 euro" and send it off to
other people, those who use the -1 subset will see "1/4 currency"
and those with the -15 subset see "capital OE ligature euro".
These are called conflicting characters, because the same HEX value
has a different Unicode value (and therefore a different picture) in
another iso8859 subset.
But you can of course have both full character sets in your font with
lines for all conflicting characters. In our example here this means
you first write all definitions for the -1 subset and then write all
definitions for the conflicting characters of the -15 subset. (In that
case you would already see when creating your text that the combination
"1/4 euro" is not possible, and you would end up with "0.25 cent".)
how font positions and limits are created
Now uncompress the chavo.psf.gz font in the top directory of fonty-rg;
this is a nearly completely filled up font. Then extract the builtin
Unicode character table with this command:
psfgettable chavo.psf chavo.builtin.table
Open the chavo.builtin.table in your editor. Note, it has 259 lines,
the first 3 lines are comments so they do not count: 259-3=256 lines,
yes this is the first limit. Now scroll down with your eyes on the
first column and you will see, that this follows the counting you
already learned - but here we have in front of it 0x0.
We need the 0x0 in front here because, if this were a simple
hexadecimal notation with only 0x in front, there would be a definite
end at 0xFF (F is the last digit we have) and we would never be able to
make a font with more than 256 definitions. But the next limit was 512,
and a font going beyond 256 definitions continues like
this: 0x0FF and then 0x100, 0x101, 0x102 ..
So the first 256 have 0 as first digit, the second 256 have 1 as
first digit (the counting scheme is the same as we learned, it just
has one more digit in front).
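The three-digit notation can be reproduced with printf, which shows the
step from the first 256 positions to the second 256. A small sketch;
font_pos is a hypothetical helper name.

```shell
#!/bin/sh
# font_pos: print the font position for a decimal index 0..511
# in the 0xNNN notation used by the builtin table.
font_pos() {
    if [ "$1" -lt 0 ] || [ "$1" -gt 511 ]; then
        echo "out of range"
    else
        printf '0x%03X\n' "$1"
    fi
}

font_pos 255    # prints 0x0FF, last position of a 256 font
font_pos 256    # prints 0x100, first position of the second half
font_pos 511    # prints 0x1FF, last position of a 512 font
```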
This first column is also called the font position and it is needed
to find the picture (glyph). If your program tells the screen driver,
now spit out the word "error" you expect the screen driver does not
need ages for searching "where the heck in this file might be the o,
ah here, no this looks more like a zero, just hold on ..."
summary
With the lines which you write in your font definition text, you tell
what Unicode value should be associated with the HEX value of
that character, and the comments behind it describe its official
name (and if you know this character you also know what the glyph
looks like). Unicode values which turn out to have glyphs looking the
same will later be put together to make a single definition from your
two definition lines. So this single definition occupies only a single
font position. You can have a font with either a maximum of 256 font
positions or a maximum of 512 font positions.
practical hints
- take an already existing definition file as starting point
- copy that one to a file with your desired name
- at the bottom of your new definition file add additional lines
you might consider inserting a # comment line with description
this way it is easier to take it as template for new ones
- all iso8859 text files can be easily examined with diff
- definition lines which are identical need not be repeated
for example, you need the ascii range only once
some of the general ones from the iso range are also identical
(re)building the fonts
- change into the fonty-rg/ directory and execute
- ./build.sh
it might take a while until you see some messages
you will get the new built fonts in this directory
they are named <purpose>.psf.gz
they will overwrite the ones coming with this package
creating your userdefined acm kernel map
If your font simply mixes all the characters which you find nice looking and
you want to display them beside each other, you cannot use an acm
kernel map which only contains ranges of official characters. So you
need to make your own acm map.
There are basically two ways what the kernel makes out of the bytes
it receives from a program when looking up values in the acm map.
- it simply displays what the font has for that value
this is called direct-to-font
and it looks like: (for) 0x8F (display) 0x8F
an example is ..consoletrans/trivial
- it displays the glyph of the Unicode value for that value
this is called user-to-unicode, where user means program
and it looks like: (for) 0x8f (display the) U+008f (glyph)
examples are ..consoletrans/8859-x_to_uni.trans
We would not need all the Unicode numbers if we simply want the kernel to
display direct-to-font, so this case is less interesting for us.
how an acm map looks like
As a general rule for our acm map we can use lines like these, which
are the most flexible way of writing (as they allow comments):
0x000 U+fffd -+ this is the default for "unknown character"
... | for 256 positions
0x0ff U+ -+
0x100 U+ -+
... | for 512 positions
0x1ff U+ -+
^----------------- internal value which a program sends to be displayed
^--------------- 3 digits for 512 positions, but they do no harm for 256
^--------- Unicode value for the glyph which is printed
256 positions, simple definitions
for an acm map which deals with 256 positions and has only definition
lines which are not merged into one definition, you can do this
- take the text file you created in the /charsets directory
- replace all "=" signs with "0x0"
- delete the official name explanation
- fill up the missing counting values in the first column
these might be the essential and additional control ranges
- give the first value the Unicode value for unknown character
like this: 0x000 U+FFFD
you always see this if a requested character is not in the font
- you can specify an alternate if the font does not have the glyph
in the line in question just add a second Unicode value
like this: 0x0a4 U+20AC U+00A4
means if you don't find the euro sign display the currency sign
- HEX values which you did not define must stand alone
like this: 0x003
means no picture if the program sends these bytes
this is necessary to keep the order according to your font
- save it under a name related to your font like <your_font>.acm
the .acm will reflect that this is an acm kernel mapping
for kbd this will finally go to unimaps/
for console-tools it finally goes to consoletrans/
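The first steps of this list can be sketched in shell. This is not a
fonty-rg script: defs_to_acm is a hypothetical helper, it assumes
definition lines of the form "=HEX U+number ..." and it does not add
alternate Unicode values; those you still put in by hand.

```shell
#!/bin/sh
# defs_to_acm: print a 256 line acm skeleton from definition file $1.
# - "=" becomes "0x0" and the official name is dropped
# - HEX values without a definition stand alone
# - position 0x000 gets U+FFFD, the unknown character
defs_to_acm() {
    # upper-case the hex digits so that =a4 and =A4 are both found
    defs=$(tr 'a-f' 'A-F' < "$1")
    i=0
    while [ "$i" -le 255 ]; do
        hex=$(printf '%02X' "$i")
        uni=$(printf '%s\n' "$defs" |
            sed -n "s/^=${hex}[[:space:]]*\(U+[0-9A-F]\{4\}\).*/\1/p" |
            sed -n '1p')
        if [ "$i" -eq 0 ]; then uni='U+FFFD'; fi
        if [ -n "$uni" ]; then
            printf '0x0%s\t%s\n' "$hex" "$uni"
        else
            printf '0x0%s\n' "$hex"
        fi
        i=$((i + 1))
    done
}

# Made-up two line definition file for the demonstration.
tmp=$(mktemp)
printf '=41 U+0041 # LATIN CAPITAL LETTER A\n=A4 U+20AC # EURO SIGN\n' > "$tmp"
defs_to_acm "$tmp" | sed -n '1,4p'
rm -f "$tmp"
```

Redirect the output to <your_font>.acm and check it by eye before loading it.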
512 positions, merged definitions
If you have up to 512 definitions which are all pointing to a different
picture/glyph, you just continue according to what you did with the
first 256 positions. So: go down the whole HEX counting, leave the
Unicode value empty if you did not specify a glyph, and maybe add an
alternate Unicode value to display if the glyph is not in the font.
If you have definition lines which will be merged into one line, this
means that you want to alternately display two different character sets.
And this again means you must change the acm kernel map first.
using your font on console
for the first test you can keep the font in the fonty-rg directory
general procedure to use a font
This assumes you did not put the screen driver in utf8 mode
To get your own font working you might need the following components
- your font of course
- with kbd: setfont <your_font>.psf.gz
- with console-tools: consolechars <your_font>.psf.gz
- a screen driver translation map hex to Unicode to find the glyph
only needed if the font does not have a builtin Unicode map
- with kbd: setfont -u <fitting>.trans
- with console-tools: consolechars -u <fitting>.sfm
- a userdefined acm kernel map for the (sub)set you want to use
- with kbd: setfont -m <desired_set>.uni
- with console-tools: consolechars -m <desired_set>.acm
- the command to switch to the userdefined kernel acm map
general for defining G0 to keep it: echo -e "\033(K"
general for defining G1 to keep it: echo -e "\033)K"
- kbd: not needed for G0 (included in "-m")
- console-tools: not needed for G0, G1 defining with --g1
procedure for official combinations
if your font contains a combination of official character sets
like chavo combines the officials latin1,2,3 and koi8-r
- - load your font: consolechars or setfont <your_font>.psf.gz
- fonty-rg fonts have a hex to Unicode screen driver map builtin
means you normally do not need an external screen driver map
with kbd: -u ..consoletrans/<fitting>.trans
with console-tools: -u ..consoletrans/<fitting>.sfm
- - load the momentarily desired userdefined acm kernel map
- kbd: setfont -m ..unimaps/<desired>.uni
- console-tools: consolechars -m ..consoletrans/<desired>.acm
for example to get the koi8-r from the chavo font: koi8-r.uni|.acm
Notes:
Some kbd .uni maps like the iso0x.uni do not work; to get them to work
correctly you can download console-data and copy the .acm files
from /consoletrans to the share/kbd/unimaps/ directory.
This is unique for all ttyX terminals of the system which means
you can't activate another userdefined map at the same time
- kbd and console-tools include the G0 defining when using "-m .."
so G0 defining in an extra step is not needed
console-tools have an option "--g1" to define G1 instead of G0
procedure for own mixtures
if your font is a mixture not corresponding to official sets
like exchange in iso8859-1 only the currency by euro
- - load your font: consolechars or setfont <your_font>.psf.gz
- fonty-rg fonts have a hex to Unicode screen driver map builtin
means you normally do not need an external screen driver map
with kbd: -u ..consoletrans/<fitting>.trans
with console-tools: -u ..consoletrans/<fitting>.sfm
- then you need to load your special extracted acm kernel map
- kbd: setfont -m <your_font>.uni
- console-tools: consolechars -m <your_font>.acm
- kbd and console-tools include the G0 defining when using "-m .."
so G0 defining in an extra step is not needed
switching between font contents
You might have a font which contains characters for several
different character sets, like the chavo font is able to display
4 character sets. With the command for your userdefined acm kernel
map you already said which of those character sets should be used
in the beginning. Now we see how to get our font display another
character set.
In all we have 4 acm kernel maps which we can switch to
and 2 (kind of) variables which hold the defined acm kernel map.
Those 2 variables are G0 and G1 and their initial value is predefined,
but that is just for convenience and we can always define them new.
- predefined value for G0: latin1
defining G0 variable to hold a special acm map
G0 to ISO latin1 with `echo -en "\033(B"`
G0 to IBM PC 437 with `echo -en "\033(U"`
G0 to DEC VT100 with `echo -en "\033(0"`
G0 to userdefined with `echo -en "\033(K"`
- predefined value for G1: DEC VT100
defining G1 variable to hold a special acm map
G1 to ISO latin1 with `echo -en "\033)B"`
G1 to IBM PC 437 with `echo -en "\033)U"`
G1 to DEC VT100 with `echo -en "\033)0"`
G1 to userdefined with `echo -en "\033)K"`
A terminal always starts with the predefined G0 value; if we don't
like that value we define the variable to hold another acm map.
Once our two G0 and G1 values are ok, we can switch between them with
- switch to G0 with key press: CTRL+O (ctrl and capital O together)
- switch to G1 with key press: CTRL+N (ctrl and capital N together)
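The sequences above can be wrapped into small shell functions, which is
handy in scripts. A sketch only: the function names are hypothetical, and
printf is used instead of echo -en because not every /bin/sh understands
the -e and -n options. The switch bytes are the ones the CTRL+O and
CTRL+N key presses produce.

```shell
#!/bin/sh
# $1 is the final character from the lists above: B, U, 0 or K.
set_g0() { printf '\033(%s' "$1"; }   # define what G0 holds
set_g1() { printf '\033)%s' "$1"; }   # define what G1 holds
use_g0() { printf '\017'; }           # same byte as CTRL+O
use_g1() { printf '\016'; }           # same byte as CTRL+N

# Show the bytes instead of sending them to the terminal.
set_g0 K | od -An -tx1
set_g1 0 | od -An -tx1
```

On a real console you would call for example `set_g1 K; use_g1` to define
G1 as the userdefined map and switch to it, and `use_g0` to get back.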
final system adjustments
if your tests succeeded and you want to keep your versions
- to separate them from your system's fonts you might use /usr/local
- make directories corresponding to your system's shared ones
- copy your new font to the consolefonts/ directory
you might consider using the .psf and .psfu naming scheme
it is easier to see from the font's name whether a table is builtin
if you stripped out some builtin hex to Unicode tables
- copy your new translation maps to consoletrans/ and/or unimaps/
you might consider using the .acm and .sfm naming scheme
it is easier to see from the map's name when it is used
if you decide to work with fonty-rg more often
- copy fonty-rg's scripts to bin/
- if not yet present add this directory to your PATH variable
- in share/ make a directory called fonty-rg
- copy fonty-rg's directory /source to share/fonty-rg/
- if you like do the same for the /charsets directory
names for files dealing with fonts
- .acm map
application charset map, also screen map or console map
translation table hex|dec|oct value to Unicode value
used by the kernel to translate the bytes received from programs
stored by console-tools in ..consoletrans/*.acm
stored by kbd in ..consoletrans/*.trans or without .trans
formats depend on which program was used to create the file
- .bdf
contains glyphs (pictures) for each Unicode value
used by uni-vga package as base file to create fonts
- .sbf
contains glyphs (pictures) for each Unicode value
used by the fonty packages as base file to create fonts
- .sfm
Unicode screen font map, also unicode(-to-font) map or unimap
table with translations from hex|dec value into Unicode value
used by the screen driver for fonts without a builtin table
stored by console-tools in ..consoletrans/*.sfm
stored by kbd in ..unimaps/*.uni or without .uni
- .psf
the ready-to-use console font, uncompressed
kbd package uses this for fonts without builtin translation map
fonty-rg uses this for fonts with builtin maps
stored by console-tools and kbd in ..consolefonts/
- .psfu
special for kbd package to indicate a builtin translation map
stored by kbd in ..consolefonts/*.psfu.gz
- ?
raw font in binary format
used if you start your system with a framebuffer supporting kernel
can not be saved and reloaded with the old.font options
raw fonts can be converted by font2psf (Martin Lohner, SUSE)
how the whole chain works
kernel = lot of other things + console driver
console driver = keyboard driver + screen driver
- fingers press keys
- keyboard hardware sends scancodes to kernel
[programs for scancodes: getkeycodes and setkeycodes]
- kernel looks up keycode for the scancode in a table
- kernel sends keycode to keyboard driver
- keyboard driver looks up character for keycode in keytable
[programs for keytables: dumpkeys and loadkeys]
[programs for special keys: setmetamode]
- keyboard driver is in one of 4 modes
to send the characters to programs
1. raw mode sends scancode
for programs with an own keyboard driver (X11)
2. keycode mode sends keycode
for unknown purpose
3. ascii mode sends character as 8-bit encoding
(only 256 available)
4. utf8 mode sends character as prefixed 8-bit encoding
which makes multi-bytes
[programs for keyboard (driver) mode: kbd_mode and showkey]
- keyboard driver sends characters to program
- program works until result
- program sends characters to display to screen driver
- screen driver is in one of 2 modes
to receive characters from programs
1. utf mode interprets received bytes as utf8 sequence
converts it into (UCS-2) 16-bit sequences
looks up the glyph to display in the sfm screen font map
2. byte mode interprets received bytes as byte sequences
looks up the bytes in the acm application charset map
converts it into utf8 sequences, then into 16-bit sequences
then looks up the glyph in the sfm screen font map
[program for screen driver mode: vt-is-UTF8 (not in kbd package)]
[switch to utf mode with `echo -en "\033%G"`]
[switch to utf mode with `echo -en "\x1b%G"`]
[switch to byte mode with `echo -en "\033%@"`]
[switch to byte mode with `echo -en "\x1b%@"`]
[programs for Unicode tables in fonts: psf{get,add,strip}table]
acm application charset map is one of 4 maps,
3 of them built into kernel (also called console maps)
1. default IBM codepage 437 character set
for i386 other architectures (also called PC code)
2. DEC VT100 character set
3. ISO latin1 character set
4. user definable which is at boot time straight-to-font
U+FFFD mostly font position 0 is the replacement character
displayed if a character is not found in the sfm screen font map
control ranges with Unicode values from U+F000 to U+F1FF
(straight-to-font range) directly display what the font has
[acm maps are in /usr/src/linux/drivers/char/console.c]
[program for acm application charset map: none]
a terminal can switch between two modes with G0 and G1
1. G0 is by default ISO latin1
2. G1 is by default DEC VT100
[switch to G0 with CTRL+O]
[switch to G1 with CTRL+N]
example: on tty1=cp437 builtin with G0 switch
example: on tty1=vt100 builtin with G1 switch
example: on tty2=iso01 builtin with G0 switch
example: on tty2=iso02 user with G1 switch
but on all tty's there can only be one user-defined at the time
example: on tty3=myown user with G0 impossible
[adjust G0 to ISO latin1 with `echo -en '\033(B'`]
[adjust G0 to IBM PC 437 with `echo -en '\033(U'`]
[adjust G0 to DEC VT100 with `echo -en '\033(0'`]
[adjust G0 to userdefined with `echo -en '\033(K'`]
[adjust G1 to ISO latin1 with `echo -en '\033)B'`]
[adjust G1 to IBM PC 437 with `echo -en '\033)U'`]
[adjust G1 to DEC VT100 with `echo -en '\033)0'`]
[adjust G1 to userdefined with `echo -en '\033)K'`]
- screen driver interprets received character according to the mode it is in
- screen driver converts character to UCS-2 16-bit
- screen driver looks up glyph for the character in font map
- screen driver prints the glyph to the screen
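The two screen driver modes named in the chain above can be selected from
a script as well. A sketch with hypothetical function names; note that
inside a printf format the % sign has to be doubled, otherwise printf
takes it as the start of a conversion.

```shell
#!/bin/sh
utf8_mode() { printf '\033%%G'; }   # sends ESC % G
byte_mode() { printf '\033%%@'; }   # sends ESC % @

# Show the bytes instead of sending them to the terminal.
utf8_mode | od -An -tx1
byte_mode | od -An -tx1
```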
how your system looks like after booting
If you start your linux system and do not run special initscripts
which change settings your system will have this status:
- your keymap is the US keymap (qwerty/defkeymap)
an initscript uses loadkeys to change the map for your keyboard
- the keyboard driver is in the default ASCII mode
normally no initscript uses kbd_mode to change to utf8 mode
- so all your programs receive 8-bit ascii characters
- the screen driver is in the default byte mode
normally no initscript uses the %G echo to change to utf8 mode
- so all characters from programs will be treated as byte sequences
- the acm application character map is the default G0 ISO latin1
normally no initscript changes this
- so the cp437.uni map will be used as base for transforming
- the console font is the default8x16.psfu.gz in /consolefonts/
maybe an initscript uses setfont | consolechars to change the font
shared directories and what they contain
all those are usually in /usr/share/ (formerly in /usr/lib/)
subdirectories are up to distribution/install options
- consolefonts/
used by console-tools and kbd
with command "consolechars ..." or "setfont ..."
contains compiled fonts in the linux default psf format
- consoletrans/
used by console-tools and kbd
contains mapping tables
- console-tools
with command "consolechars -m ..."
contains acm kernel mapping tables as *.acm or *.trans
with command "consolechars -u ..."
contains font mapping tables as *.sfm
- kbd
with the command "setfont -m ..."
contains acm kernel mapping tables as *.trans or without
- unimaps/
used by kbd
with command "setfont -u ..." or "loadunimap"
contains non-builtin font mapping tables as *.uni or without
error and other messages
These are some messages which you might see and what to make of them.
9th column displayed incorrectly
- occurs during building a font
- message sent by vga script
Open one of the .sbf files containing the glyphs and look at the
lines which show a picture of a character. You will count 16
lines for the height and 9 dots for the width. We deal with a
font here which is described as 8x16 font, with 8 referring to
the width and 16 to the height. So the 8th and 9th column of
the width are the ones of interest here.
The video memory has space for 8 pixels (8 pixels = 1 byte).
The normal VGA hardware has space for 9 pixels and this is what
determines how it will look on the screen when all those
pictures are finally put next to each other. You would for
example expect letters to be separated by a little space so they
are readable, but at the same time expect a box you create to
have the lines as one piece and not interrupted by little spaces.
So from the 8 pixels of the video memory the VGA driver has to
make 9 pixels to bring everything correctly onto screen by either
adding the 9th pixels as blanks or by repeating the 8th pixels.
The "blankings" are in the graphics area (letter) and the
"repeatings" are in the pseudographics area (box), just the other
way round from what you would assume from the word "graphic".
BUT the pseudographics area is hardwired into your graphics card,
and if we have a picture which would actually need to be there but
is not, it will be treated like one of the graphics area and instead
of repeating it will blank.
Now the pictures in the glyph files have all 9 columns, they are
allowed to have 9 although the video memory will only keep 8. So
those pictures should give us an idea of what the final result might
look like. And the vga script will look at their 8th and 9th
column to find out whether it is a letter, so it can be put into
the graphics area (remember the pseudographics was hardcoded).
Knowing all this now, look at these lines and note how the last
one will give a different result on screen than the glyph will
make you believe (simply because the 8th and 9th column are not
equal). So having unequal things in column 8 and 9 in the glyph
might be a reason for such a message.
glyph says video memory VGA does on screen description
+-+ +-+ +-+
00...00.|.| 00...00.| | . append 00...00.|.| graphic (letter)
| | | | | |
...00000|0| ...00000| | 0 repeat ...00000|0| pseudographic (box)
| | | | | |
.......0|0| ........| | 0 repeat .......0|0| pseudographics
.......0|.| ........| | 0 repeat .......0|0| with different
........|0| ........| | . repeat ........|.| 8th/9th column
+-+ +-+ +-+
^----what----^---happens with the----^---9th pixel
^--if the 8th pixel is treated according to this area--^
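To find such rows yourself, you can compare column 8 and column 9 of a
glyph row. This is only a sketch and not what the vga script really does:
check_row is a hypothetical helper, and it assumes a row is written as 9
characters with 0 for a set pixel and . for a blank one, as in the drawing
above. The exact layout of the .sbf files may differ.

```shell
#!/bin/sh
# check_row: $1 is one glyph row of 9 characters (0 = pixel, . = blank).
# Prints "ok" if column 8 and 9 are equal, "mismatch" if they differ.
check_row() {
    row=$1
    if [ "${#row}" -ne 9 ]; then
        echo "bad width"
        return 0
    fi
    c8=$(printf '%s' "$row" | cut -c8)
    c9=$(printf '%s' "$row" | cut -c9)
    if [ "$c8" = "$c9" ]; then
        echo "ok"
    else
        echo "mismatch"
    fi
}

check_row '00...00..'    # letter row, prints ok
check_row '...000000'    # box row, prints ok
check_row '.......0.'    # column 8 set, column 9 blank, prints mismatch
```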
If you run into this with the koi8-r and koi8-u fonts, the
reason is a different one. With no locale specified running
build.sh will announce 234 characters in the font for both files;
but 8r has 7 of them and 8u has 3 of them displayed incorrectly.
So the difference between 8r and 8u must eliminate 4 offending
characters.
Finding the difference is done simply by diff-ing their text
source files. Generally speaking, 8r has drawings and 8u replaces
all drawings with letters. Now looking into the glyph files for
all the drawings shows that 4 of them will be recognized as
pseudographics (8/9 being equally zero like box) and the other
4 of them will be recognized as graphics (8/9 being equally dot,
so letter).
As we only deal with column 8/9 here, we can also say, 4 replace
graphic with graphic, so there is no change; and 4 replace
pseudographic with graphic and the offending ones are gone. So
pseudographics (these are the drawings which continue straight
to next one) will be displayed incorrectly (like normal letters).
Now we have a look at the definition lines in whole and we see
that all Unicode values are connected to the HEX values of the
characters which are somehow between =Cx and =Dx. These belong
to the iso range and are assumed to be letters and no drawings.