jBASE Configuration and Properties

This section describes the configuration that applies to jBASE internationalization, including the environment variables, function and jQL changes, sort order, and error message files.

jBASE 4.1 supports code page conversion, collation sequences, international dates and times, and number and currency formatting for internationalization. The internationalization configuration depends on the user ID and/or the following jBASE environment variables:

  • JBASE_CODEPAGE
  • JBASE_LOCALE
  • JBASE_TIMEZONE

The user ID configuration and environment variables have no effect if:

  • The account in which the application executes is not configured for international mode, or
  • The environment variable JBASE_I18N is not set

Application providers are responsible for handling all directionality issues. In international mode, jBASE library functions such as length (LEN), string comparisons (LT, LE, GT, GE), and collation-order statements (LOCATE, SORT) operate on characters rather than bytes and take the currently configured user locale into account.
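The difference between character-based and byte-based length can be illustrated with a minimal sketch. It is written in Python rather than jBASE BASIC, and the sample string is an assumption chosen for illustration; it only shows why the two counts diverge for UTF-8 data.

```python
# Illustration only: Python, not jBASE BASIC.
text = "Z\u00fcrich"                 # six characters, one of them non-ASCII
utf8_bytes = text.encode("utf-8")    # U+00FC occupies two bytes in UTF-8

char_count = len(text)               # character-based length
byte_count = len(utf8_bytes)         # byte-based length
print(char_count, byte_count)
```

A function that counts bytes would report one more than the number of characters here, which is the kind of discrepancy international mode is meant to remove.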

Environment Variables

The environment variables involved in internationalization are as follows.

Function Changes for International Mode

Certain jBASE library functions are modified to process data as UTF-8 encoded multi-byte sequences, so their results are based on characters rather than bytes. Some functions change their internal behavior depending on whether international mode is active or the JBASE_I18N variable is set.

JQL Changes for International Mode

The modifications to the jBASE jQL processor required for complete internationalization are as follows.

Pure Numeric keys

Keys are sorted from the largest negative number to the largest positive number. A single leading minus (-) or plus (+) sign may be present. Leading zeros before a decimal point and trailing zeros after a decimal point are ignored for sorting purposes. Nulls sort either before all numeric keys or as zero, depending on the emulation option. If international mode is set, the characters that the Unicode 3.0 specification (section 4.6) defines as decimal digits are sorted as numbers.
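The ordering rules above can be sketched as a sort key. This is a Python illustration under stated assumptions, not the jQL implementation: the helper name `numeric_sort_key` and the `null_as_zero` flag (standing in for the emulation option) are hypothetical.

```python
from decimal import Decimal

def numeric_sort_key(key, null_as_zero=False):
    """Return a sortable tuple for a pure numeric key (hypothetical helper)."""
    if key == "":
        # Null key: either treated as zero or placed before every number.
        return (1, Decimal(0)) if null_as_zero else (0, Decimal(0))
    # Decimal compares by value, so leading and trailing zeros do not matter.
    # It also accepts a single leading sign and Unicode decimal digits.
    return (1, Decimal(key))

keys = ["10", "-2.50", "+3", "", "007", "3.0"]
nulls_first = sorted(keys, key=numeric_sort_key)
nulls_as_zero = sorted(keys, key=lambda k: numeric_sort_key(k, null_as_zero=True))
print(nulls_first)
print(nulls_as_zero)
```

Keys with equal numeric value, such as "+3" and "3.0", keep their input order because Python's sort is stable; how jBASE orders such ties is not stated in this section.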

Mixed Alpha Numeric Sorting

A field can contain alphabetic, alphanumeric, and pure numeric values, for example a field holding a supplier's part number, and such a field needs a meaningful sort order. In this case, each candidate key is split into alternating numeric and non-numeric parts. Sign characters (+ or -) are valid only as the first character of the key; in any other position they are treated as non-numeric. A numeric part is processed in the same manner as a pure numeric key. Non-numeric parts are handled according to the status of international mode, as follows.

Status   Action
True     Passes non-numeric parts through the collation algorithm to produce collation key parts
False    Sorts the non-numeric parts left to right
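A minimal sketch of the splitting step, written in Python under stated assumptions: the helper name `mixed_sort_key`, the regular expression, and the `collate` parameter are hypothetical and are not the jQL implementation. The default `collate` leaves text unchanged, which models the non-international case; in international mode a locale-aware transform (for example Python's `locale.strxfrm`) could be passed instead.

```python
import re
from decimal import Decimal

# A numeric part is a digit run with an optional fraction; a sign counts as
# part of the number only when it is the first character of the key.
_PART = re.compile(r"(?:^[+-])?\d+(?:\.\d+)?|\D+")

def mixed_sort_key(key, collate=lambda s: s):
    """Split a key into alternating numeric and non-numeric parts."""
    parts = []
    for part in _PART.findall(key):
        if part[-1].isdigit():
            parts.append((0, Decimal(part)))   # compared by numeric value
        else:
            parts.append((1, collate(part)))   # compared as (collated) text
    return tuple(parts)

part_numbers = ["A10", "A2", "A2B", "B1", "10", "9"]
ordered = sorted(part_numbers, key=mixed_sort_key)
print(ordered)
```

Tagging each part with 0 or 1 keeps numbers and text from being compared with each other directly; placing numeric parts ahead of text at the same position is a choice made for this sketch only.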

Data Conversion

When programs execute in international mode, all variable contents are processed as UTF-8 encoded sequences. By default, all data must be held as UTF-8 encoded byte sequences. This means that data imported into an account configured for international mode must be converted from its current code page to UTF-8. If all the data consists of bytes in the range 0x00-0x7F (ASCII), conversion is not necessary because these values are already valid UTF-8. Values outside the 0x00-0x7F range, however, must be converted to UTF-8 to avoid ambiguity between code pages.

For example, the hex value 0xE0 in the Latin2 code page (ISO-8859-2) represents LATIN SMALL LETTER R WITH ACUTE, whereas the same hex value in the Latin1 code page (ISO-8859-1) represents LATIN SMALL LETTER A WITH GRAVE.

To avoid this clash of code pages, the Unicode specification assigns each of these characters a unique value within the specification's 32-bit value sequence.

EXAMPLE:

Unicode Value   Represents
0x00E0          LATIN SMALL LETTER A WITH GRAVE
0x0155          LATIN SMALL LETTER R WITH ACUTE
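The clash and its resolution can be checked with a short Python sketch that uses only standard library codecs. It is an illustration and not part of jBASE: the same byte decodes to two different Unicode values depending on the code page, and each value then has its own UTF-8 encoding.

```python
import unicodedata

raw = bytes([0xE0])                      # one byte, ambiguous without a code page

latin1_char = raw.decode("iso-8859-1")   # interpreted as Latin1
latin2_char = raw.decode("iso-8859-2")   # interpreted as Latin2

for ch in (latin1_char, latin2_char):
    # Print the Unicode value, the character name, and the UTF-8 bytes.
    print(f"0x{ord(ch):04X}", unicodedata.name(ch), ch.encode("utf-8").hex())
```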

NOTE: UTF-8 is an encoding of 32-bit Unicode values with special properties that make it effective to use on Unix and Windows platforms.

Converting all data from the original code page to UTF-8 is a one-time investment. It also removes the need for on-the-fly conversion when reading from or writing to files, which would add significant and unnecessary overhead to all application processing.

File Conversion

The first requirement before configuring an account and application for international mode is to convert the file data from the original code page into UTF-8 encoded byte sequences.
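The conversion step can be sketched for a single record as follows. This is a Python illustration, not the jBASE conversion utility; the helper name `to_utf8` and the sample records are hypothetical, and the source code page is assumed to be known in advance.

```python
def to_utf8(data: bytes, source_codepage: str) -> bytes:
    """Re-encode bytes from source_codepage as UTF-8 (hypothetical helper)."""
    if data.isascii():
        # Pure 7-bit ASCII is already valid UTF-8, so no conversion is needed.
        return data
    # Decode with the original code page, then encode as UTF-8.
    return data.decode(source_codepage).encode("utf-8")

ascii_record = to_utf8(b"PART-0042", "iso-8859-1")
latin1_record = to_utf8(b"caf\xe9", "iso-8859-1")
print(ascii_record, latin1_record)
```

Running such a conversion twice on the same data would corrupt it, because the UTF-8 output would be decoded again as the original code page, so each file should be converted exactly once.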

Error Message Files

In international mode, the configured locale determines which locale-specific error message file is used instead of the default error message file.

The correct error message file is located using the full locale specification. The search starts with the full locale definition, using all three components (LanguageCode_CountryCode_Variant). If no file is found, the search continues with the first two components (LanguageCode_CountryCode), and if that also fails, with the first component only (LanguageCode).

For example, for the locale fr_FR_EURO, the following files are searched for in order until one is found:

  • jbcmessages_fr_FR_EURO
  • jbcmessages_fr_FR
  • jbcmessages_fr
  • jbcmessages
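The fallback order can be sketched as a function that generates the candidate names. This is a Python illustration of the search order described above, not jBASE code; the helper name `message_file_candidates` is hypothetical.

```python
def message_file_candidates(locale_name, base="jbcmessages"):
    """List message file names from most to least specific (hypothetical helper)."""
    # A locale has at most three components: language, country, variant.
    parts = locale_name.split("_", 2) if locale_name else []
    candidates = [
        "_".join([base] + parts[:count])
        for count in range(len(parts), 0, -1)
    ]
    candidates.append(base)   # the default file is always the last resort
    return candidates

search_order = message_file_candidates("fr_FR_EURO")
print(search_order)
```

A caller would try each name in turn and stop at the first file that exists.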

Spooling

The jBASE spooler files hold spooler jobs as UTF-8 encoded byte sequences only if the jobs were generated by a program executing in international mode, that is, as determined by the account definition. Otherwise, spooler jobs are created in the normal Latin1 (ISO-8859-1) code page, as before.

Printing

You can configure the CODEPAGE parameter in the FORM TYPE configuration file, located in the config subdirectory of the jBASE release (see jspform_deflt), to specify the code page used for conversion when de-spooling a print job. The syntax of the parameter is as follows:

CODEPAGE codepage      

In the above syntax, codepage is the name of the code page to be used to convert the print job from the internal format of UTF-8 encoded byte sequences to the required code page for the printer device. For example:

CODEPAGE shift-jis

This code page parameter will convert the UTF-8 byte sequence in the print job to shift-jis for Japanese.

NOTE: The internal format MUST always be UTF-8 when the CODEPAGE parameter is used; otherwise, fatal conversion errors can occur. If the CODEPAGE parameter is not specified, the output is not converted: a spool job generated by a process executing in international mode is output in UTF-8, and a job generated by a process executing in normal mode is output in ISO-8859-1 (Latin1).
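The conversion that CODEPAGE requests, and the failure described in the note, can be sketched as follows. This is a Python illustration using the standard shift_jis codec, not the jBASE de-spooler; the helper name `despool` and the sample jobs are hypothetical.

```python
def despool(job: bytes, codepage: str) -> bytes:
    """Convert a UTF-8 print job to the printer's code page (hypothetical helper)."""
    # The job must be valid UTF-8; anything else raises UnicodeDecodeError.
    text = job.decode("utf-8")
    return text.encode(codepage)

job = "\u65e5\u672c\u8a9e".encode("utf-8")   # three kanji, 9 bytes in UTF-8
printer_bytes = despool(job, "shift_jis")     # 6 bytes in Shift JIS

try:
    despool(b"caf\xe9", "shift_jis")          # Latin1 bytes, not valid UTF-8
    conversion_failed = False
except UnicodeDecodeError:
    conversion_failed = True

print(len(job), len(printer_bytes), conversion_failed)
```

The second call fails because the job holds Latin1 bytes rather than UTF-8, which mirrors the case of a job spooled in normal mode being sent through a CODEPAGE conversion.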

Whenever possible, printers should be configured to support UTF-8, thereby eliminating the code page conversion and reducing unnecessary conversion overheads on the system.

Copyright © 2020- Temenos Headquarters SA

Published on :
Wednesday, October 12, 2022 6:57:34 PM IST
