19 Eight Bit Clean Issues

19.1 Displaying Characters with the High Bit Set

There are several issues to consider here. The most important issue is how to get jed to display 8 bit characters in a “clean” way. By “clean” I mean any character with the high bit set is sent to the display device as is. This is achieved by putting the line:

      DISPLAY_EIGHT_BIT = 1;

in the jed.rc (.jedrc) startup file. European systems might want to put this in the file site.sl for all users. The default is 1 so unless its value has been changed, this step may not be necessary.

There is another issue. Suppose you want to display only those 8 bit characters whose extended ASCII codes are greater than or equal to some value, say 160. This is done by setting DISPLAY_EIGHT_BIT = 160;. I believe that the ISO Latin character sets assume this. This is the default value for Unix and VMS systems.
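
As a minimal sketch, a startup file fragment for a terminal that uses an ISO Latin character set might look like the following. The value 160 is the one discussed above; the comments are illustrative only.

      % Send characters with codes 160 and above to the display as is.
      DISPLAY_EIGHT_BIT = 160;

      % Alternatively, send every character with the high bit set as is:
      % DISPLAY_EIGHT_BIT = 1;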

19.2 Inputting Characters with the High Bit Set

Inputting characters with the high bit set into jed is another issue. How jed interprets this bit is controlled by the variable META_CHAR. What happens is this: When jed reads a character from the input device with the high bit set, it:

Checks the value of META_CHAR. If this value is -1, jed simply inserts the character into the buffer.
For any other value of META_CHAR in the range 0 to 255, jed returns two 7-bit characters. The first character returned is META_CHAR itself. The next character returned is the original character but with the high bit stripped.

The default value of META_CHAR is -1, which means that when jed sees a character with the high bit set, it leaves the character as is. Please note that a character with the high bit set cannot be the prefix character of a keymap. It can be part of the keymap, but not the prefix.
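
To illustrate the second case, the fragment below sets META_CHAR to 27, the code of the Esc character. It is only a sketch of one possible setting: whether it is useful depends on how the terminal encodes its Alt or Meta key, which is assumed here and not guaranteed.

      % Treat the high bit as an Esc prefix.  A character with code 225
      % (the letter a, code 97, with the high bit set: 97 + 128) is then
      % read as the two characters Esc a.
      META_CHAR = 27;

With such a setting, a character with the high bit set is never inserted directly, since it is always split into two 7-bit characters.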

Some systems only handle 7-bit character sequences and, as a result, jed will only see 7-bit characters. jed is still able to insert any character in the range 0-255 on a 7-bit system. This is done through the use of the quoted_insert function which, by default, is bound to the backquote key `. If the quoted_insert function is called with a digit argument (repeat argument), the character with the value of the argument is inserted into the buffer. Operationally, one hits Esc, enters the extended ASCII code, and hits the backquote key. For example, to insert character 255 into the buffer, simply press the following five keys: Esc 2 5 5 `.

19.3 Upper Case - Lower Case Conversions

The above discussion centers around input and output of characters with the high bit set. How jed treats them internally is another issue, and new questions arise. For example, what is the uppercase equivalent of a character with ASCII code 231? This may vary from language to language. Some languages even have characters whose uppercase equivalent corresponds to multiple characters. For jed, the following assumptions have been made:

Each character is only 8 bits.
Each character has a unique uppercase equivalent.
Each character has a unique lowercase equivalent.

It would be nice if a fourth assumption could be made:

The value of the lowercase of a character is greater than or equal to its uppercase counterpart.

However, apparently this is not possible since most IBMPC character sets violate this assumption. Hence, jed does not assume it. Suppose X is the uppercase value of some character and Y is its lowercase value. Then, to make jed aware of this fact and use it in case conversions, it may be necessary to put a statement of the form:

     define_case (X, Y);

in the startup file. For example, suppose 211 is the uppercase of 244. Then, the line

      define_case (211, 244);

will make jed use this fact in operations involving the case of a character.
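
When a character set places its uppercase and lowercase letters a fixed distance apart, a loop saves writing one define_case line per letter. The sketch below assumes the ISO Latin 1 layout, in which codes 192 to 214 are uppercase letters and the corresponding lowercase letters lie 32 positions higher. The variable name ch is arbitrary, and the range would have to be adapted for any other character set.

      % Pair each uppercase code 192..214 with the lowercase code 32 above it.
      variable ch;
      for (ch = 192; ch <= 214; ch++)
        define_case (ch, ch + 32);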

This has already been done for the ISO Latin 1 character set. See the file iso-latin.sl for details. For MSDOS, this will not work. Instead use the files dos437.sl and dos850.sl. By default, jed’s internal lookup tables are initialized to the ISO Latin set for Unix and VMS systems and to the DOS 437 code page for the IBMPC. To change the defaults, it is only necessary to load the appropriate file. For example, to load dos850.sl definitions, put

      evalfile ("dos850"); pop ();

in the startup file (e.g., site.sl). In addition to uppercase/lowercase information, these files also contain word definitions, i.e., which characters constitute a “word”.