IBM Cloud Docs
Configuring support for Arabic

Read these guidelines to understand how Knowledge Studio handles Arabic character shaping and numeric shaping in Arabic documents.

About this task

With respect to character shaping, the Arabic alphabet has no capital letters, but a letter can change shape depending on its position in the text string and on the letters around it. Operating systems and code page conversion programs handle letter shaping in different ways. Unshaped storage is the standard on Windows systems, and Knowledge Studio assumes that Arabic text is stored unshaped. To upload shaped text into Knowledge Studio, you must first convert the text to unshaped form by using a standard tool, such as the International Components for Unicode (ICU) API (see the ArabicShaping class at http://icu-project.org/apiref/icu4j/com/ibm/icu/text/ArabicShaping.html).
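
The conversion from shaped to unshaped text can be sketched as follows. This is a minimal illustration, not the ICU ArabicShaping API: it swaps in Unicode compatibility normalization (NFKC) from the Python standard library, and applies it only to characters in the Arabic Presentation Forms blocks so that other text is left alone. The function names `is_presentation_form` and `unshape` are hypothetical. Some presentation forms (for example, isolated diacritics and word ligatures) expand to several characters under this approach, so a dedicated tool such as ICU is likely the better choice when you need finer control.

```python
import unicodedata

# Arabic Presentation Forms-A and -B: the Unicode blocks that hold
# positional (shaped) glyph variants and ligatures.
PRESENTATION_RANGES = ((0xFB50, 0xFDFF), (0xFE70, 0xFEFF))


def is_presentation_form(ch: str) -> bool:
    """Return True if ch lies in an Arabic presentation-forms block."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PRESENTATION_RANGES)


def unshape(text: str) -> str:
    """Replace shaped presentation forms with their base letters.

    Assumption: NFKC compatibility mapping is an acceptable stand-in
    for a dedicated unshaping tool. Characters outside the presentation
    blocks are copied unchanged.
    """
    return "".join(
        unicodedata.normalize("NFKC", ch) if is_presentation_form(ch) else ch
        for ch in text
    )


# Shaped input: BEH initial form, YEH medial form, TEH final form.
shaped = "\uFE91\uFEF4\uFE96"
unshaped = unshape(shaped)  # base letters U+0628, U+064A, U+062A
```

Restricting the normalization to the presentation-form ranges is a deliberate choice here: running NFKC over the whole document would also rewrite unrelated compatibility characters, such as full-width Latin letters.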

Important: In some cases, lack of proper Arabic character shaping might cause content to be displayed incorrectly in the ground truth editor.

With respect to numeric shaping, Knowledge Studio treats numeric shaping as a storage-level property, similar to how Arabic content is handled on the iOS platform. Because much Arabic content is created on platforms such as Windows, which treat numeric shaping as a display-level property, you must either convert the content so that numeric shaping becomes a storage-level property or use the Firefox browser when you work in Knowledge Studio. Firefox lets you set numeric shaping preferences explicitly at the browser level and enforces the chosen display for all content shown in the browser.
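
Converting content so that numeric shaping is a storage-level property means storing the digit code points that you want to see. The following is a minimal sketch of such a conversion, assuming the content uses only European digits (U+0030 to U+0039) and Arabic-Indic digits (U+0660 to U+0669). The function names are hypothetical, and the blanket replacement also rewrites digits inside embedded Latin-script fragments such as URLs, which may not be what you want.

```python
# European digits and the corresponding Arabic-Indic digits, in order.
EUROPEAN_DIGITS = "0123456789"
ARABIC_INDIC_DIGITS = "".join(chr(0x0660 + i) for i in range(10))

# Translation tables for str.translate, one per direction.
TO_ARABIC_INDIC = str.maketrans(EUROPEAN_DIGITS, ARABIC_INDIC_DIGITS)
TO_EUROPEAN = str.maketrans(ARABIC_INDIC_DIGITS, EUROPEAN_DIGITS)


def to_arabic_indic(text: str) -> str:
    """Store every European digit as its Arabic-Indic counterpart."""
    return text.translate(TO_ARABIC_INDIC)


def to_european(text: str) -> str:
    """Store every Arabic-Indic digit as its European counterpart."""
    return text.translate(TO_EUROPEAN)


sample = "\u0639\u0627\u0645 2024"  # Arabic word followed by "2024"
stored = to_arabic_indic(sample)    # digits become U+0662 U+0660 U+0662 U+0664
```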

Procedure

To configure numeric shaping in the Firefox browser:

  1. In the browser URL field, enter about:config. If you are shown a warning from Firefox, click the action to disregard the warning and continue. For information about editing about:config properties, see http://kb.mozillazine.org/About:config_entries.

  2. Type bidi in the search filter field.

  3. Select the bidi.numeral property, which controls how numerals are displayed, and press Enter.

  4. Change the value of this property to one of the following values, as required. For example, enter 3 and then click OK.

    • 0: Nominal numerals (the default value)
    • 1: Regular context numerals
    • 2: Hindi context numerals
    • 3: Arabic numerals
    • 4: Hindi numerals

    Important: When the bidi.numeral property is used, Firefox completely ignores the code point of specific digit characters in the content of a web page.
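
As an alternative to editing the value by hand, the same preference can be placed in a user.js file in the Firefox profile directory, which Firefox applies at startup. The following sketch, assuming that approach suits your environment, generates the preference line for one of the values listed above. The helper name `user_pref_line` is hypothetical, and the descriptions are copied from the list in step 4.

```python
# Values of the bidi.numeral preference, as listed in step 4.
BIDI_NUMERAL_MODES = {
    0: "Nominal numerals (the default value)",
    1: "Regular context numerals",
    2: "Hindi context numerals",
    3: "Arabic numerals",
    4: "Hindi numerals",
}


def user_pref_line(mode: int) -> str:
    """Return the user.js line that sets bidi.numeral to the given mode."""
    if mode not in BIDI_NUMERAL_MODES:
        raise ValueError(f"Unsupported bidi.numeral value: {mode!r}")
    return f'user_pref("bidi.numeral", {mode});'


# Line to append to user.js for the example value used in step 4.
line = user_pref_line(3)
```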

Related reference

Language support