Using Input Methods on the Java™ Platform
by Naoto Sato
This is an updated version of an article that was originally
published in September 2002. You can find the original
article on The Swing Connection.
Do you know how many characters are defined in Unicode Standard 4.0,
the supported version in the Java 2 runtime environment version 1.5.0?
It is 96,382! [1] You may wonder how these characters are input into
an application written for the Java platform. There are
input methods for this purpose. Input methods allow the
user to input text where each character may not be directly
represented by a single keystroke. A user may compose text
in advance from a series of keystrokes and insert the final
desired text into the document when the composition is complete.
In the Java 2 development environment, we provide the Input Method
Framework for the collaboration between text components
and input methods. By using this framework, Swing text components
can handle the input method composition on-the-spot,
or inline; in other words, the text being composed
is immediately visually and logically inserted into the
text backing store. Swing text components accomplish this
on-the-spot editing style using the client
API in the Input Method Framework. The engine
SPI in the Input Method Framework allows you to plug
your favorite input methods into any Java runtime environment.
Like other Java technology-based applications, an input
method can be deployed on any platform where the Java runtime
environment is available. Furthermore, a Java technology-based
input method gives you a common user interface across platforms,
a feature that platform-native input methods seldom provide.
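To make the on-the-spot style more concrete, the following is a minimal sketch of how an application could observe composition through the client API. It assumes a listener registered with `addInputMethodListener` on a text component; the class name `CompositionLogger` and the helper `split` are hypothetical names introduced only for this illustration, and this is not how Swing itself is implemented.

```java
import java.awt.event.InputMethodEvent;
import java.awt.event.InputMethodListener;
import java.text.AttributedCharacterIterator;
import java.text.CharacterIterator;

// Hypothetical helper class, introduced for illustration only.
public class CompositionLogger implements InputMethodListener {

    /**
     * Splits the text carried by an input method event into its committed
     * part (the first committedCount characters) and its composed part
     * (the remainder, which is still being edited by the input method).
     */
    static String[] split(AttributedCharacterIterator text, int committedCount) {
        StringBuilder committed = new StringBuilder();
        StringBuilder composed = new StringBuilder();
        if (text != null) { // the event text may be null
            int n = 0;
            for (char c = text.first(); c != CharacterIterator.DONE; c = text.next(), n++) {
                (n < committedCount ? committed : composed).append(c);
            }
        }
        return new String[] { committed.toString(), composed.toString() };
    }

    @Override
    public void inputMethodTextChanged(InputMethodEvent e) {
        String[] parts = split(e.getText(), e.getCommittedCharacterCount());
        System.out.println("committed=\"" + parts[0] + "\" composed=\"" + parts[1] + "\"");
    }

    @Override
    public void caretPositionChanged(InputMethodEvent e) {
        // Caret movement inside the composed text is not needed for this sketch.
    }
}
```

A listener of this kind could be attached with `textField.addInputMethodListener(new CompositionLogger())`, where `textField` is any Swing text component in your application.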
In this article, you will learn how to use input methods
in your Swing text components. The information in this article
is based on the Java 2 runtime environment, version 1.4.0
or higher, and the operating systems listed in the Supported
Locales document (1.4.2 version | 1.5.0 version).
Installing Input Methods
Installing an input method is straightforward. An input method
is provided in JAR archive form, and you only need to place
it in the extension directory. This is usually lib/ext,
but you can also specify the extension directory at runtime
by setting the java.ext.dirs system property.
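As a quick check that the JAR file landed in the right place, the sketch below lists the JAR files found in the directories named by the java.ext.dirs system property. The class name `ExtDirs` and the method `extensionJars` are hypothetical, and the code assumes a runtime that still defines java.ext.dirs; on runtimes where the property is absent, it simply reports nothing.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, introduced for illustration only.
public class ExtDirs {

    /** Returns the .jar files found in each directory of a path-separated list. */
    static List<File> extensionJars(String extDirs) {
        List<File> jars = new ArrayList<>();
        if (extDirs == null || extDirs.isEmpty()) {
            return jars; // property not set on this runtime
        }
        for (String dir : extDirs.split(File.pathSeparator)) {
            File[] files = new File(dir).listFiles((d, name) -> name.endsWith(".jar"));
            if (files != null) { // null when the directory does not exist
                for (File f : files) {
                    jars.add(f);
                }
            }
        }
        return jars;
    }

    public static void main(String[] args) {
        for (File jar : extensionJars(System.getProperty("java.ext.dirs"))) {
            System.out.println(jar);
        }
    }
}
```

If CityIM.jar was copied correctly, it should appear in the printed list.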
We provide a sample input method named City Input Method,
available here, to use in this article. Copy the CityIM.jar
file to the extension directory and the installation is done!
Selecting Input Methods
Once an input method is installed in the extension directory,
you will notice an extra menu item in the System
menu on Solaris™ or Microsoft Windows when you run an application that
uses a Swing text component, as shown here:
After selecting Select Input Method, the following
popup appears:
This menu contains a list of all input methods available in
this runtime environment. Input methods that are provided
by the underlying operating system are listed in the System
Input Methods submenu. Input methods listed below the
separator line are Java technology-based input methods. In
this example, City Input Method demonstrates that it
can support multiple languages. The specific languages supported
are displayed as submenu items.
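The list of Java technology-based input methods and their languages can also be approximated in code. The sketch below assumes that input methods are packaged as service providers of `java.awt.im.spi.InputMethodDescriptor`, as the engine SPI describes, and looks them up with `ServiceLoader`. The class name `ListInputMethods` is hypothetical, and the lookup here is only an approximation of what the runtime does when it builds the menu.

```java
import java.awt.AWTException;
import java.awt.im.spi.InputMethodDescriptor;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.ServiceLoader;

// Hypothetical helper class, introduced for illustration only.
public class ListInputMethods {

    /** Returns one line per (input method, supported locale) pair that the loader can see. */
    static List<String> describeInputMethods() {
        List<String> lines = new ArrayList<>();
        for (InputMethodDescriptor d : ServiceLoader.load(InputMethodDescriptor.class)) {
            // A null input locale asks for the input method's general display name.
            String name = d.getInputMethodDisplayName(null, Locale.getDefault());
            try {
                for (Locale locale : d.getAvailableLocales()) {
                    lines.add(name + " - " + locale.getDisplayName());
                }
            } catch (AWTException e) {
                lines.add(name + " (locales unavailable: " + e.getMessage() + ")");
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> lines = describeInputMethods();
        if (lines.isEmpty()) {
            System.out.println("No Java input methods found.");
        }
        for (String line : lines) {
            System.out.println(line);
        }
    }
}
```

Input methods provided by the underlying operating system are not reported by this lookup; they are reached through the platform toolkit instead.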
If you set the user locale to Japanese, you will see Japanese
menu items, provided that the input methods supply translations.
Using Input Methods
To use an input method, select it in the popup menu. Let's
select City Input Method in the Japanese locale; this
menu item is highlighted in the previous picture. A small
popup window appears at the bottom-right corner of the screen.
This tells you that City Input Method - Japanese is
now selected.
Now, type s, f, and o from the keyboard.
You will see sfo with a (dotted) underline, which means
that sfo is still in composition mode. This type of
editing is often known as pre-composing. Because City
Input Method converts airport codes, such as SFO, into city
names, pressing the space bar displays candidate city names
in several languages, as shown here:
Once you determine the candidate you prefer, commit
that pre-composed string into the text backing store. This
is typically done by pressing the Return key.
In the example, it looks like this:
Input Method Selection by a Hot Key
For platforms that do not have the Select Input Method
menu item in the system menu (e.g. Linux), or for applets
that are running inside a browser, we provide an alternative
way to select an input method by pressing a user-defined
hot key. This way of selecting an input method is also useful
for platforms that have the menu item in the system menu,
e.g., Solaris CDE Desktop and Microsoft Windows. If you
press the hot key, the same popup menu discussed previously
is displayed.
To define a hot key combination, download the Input
Method Hot Key tool by clicking the button below.
You can then run it as follows:
java -jar InputMethodHotKey.jar [-system]
This pops up a window like this:
After you set up your favorite hot key, press that combination
on any Swing text component. You will see the same popup
menu for selecting an input method. For multi-user platforms,
the -system option is provided. If you set
a hot key with the -system option, that hot
key is active for all users.
Other Sample Input Methods
We've seen how the City Input Method works in the
Swing text component. We also provide the following useful,
but unsupported, input methods:
Code Point Input Method
The Code Point Input Method is a simple input method
that allows Unicode characters to be entered via their hexadecimal
code point values. A user enters the hexadecimal code point
value using the \uxxxx notation for character
literals.
In general, the input method passes characters through unchanged.
However, when the user types a \, the input
method enters composition mode. In composition mode, the
user types the desired code point using the \uxxxx
notation, where each x is a hexadecimal digit from the set [0-9a-fA-F].
When a valid sequence is entered, it is converted to the
corresponding Unicode character and committed. The input
method then returns to pass-through mode until another \
character is entered.
While in composition mode, the user can use the left arrow,
right arrow, backspace and delete keys to edit the sequence.
The \u characters can only be deleted if there
are no hex digits present in the composition sequence. Deleting
the \u returns the input method to pass-through
mode.
Since the \ character triggers composition
mode to begin, a user must type two \ characters
in order for a single \ to be added to the
text. When a single \ has been entered, if
the next character is not a u, both the \
and the subsequent character are committed and the input
method returns to pass-through mode.
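The rules above can be sketched as a small function that turns a string of typed characters into the text that would be committed. This is only an illustration of the described behavior, not the code of the Code Point Input Method itself; the names `CodePointSketch` and `commit` are hypothetical, editing keys are ignored, and the sketch assumes that an incomplete or invalid sequence simply stays uncommitted.

```java
// Hypothetical helper class, introduced for illustration only.
public class CodePointSketch {

    static boolean isHex(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    /** True if s holds a backslash, the letter u, and four hex digits starting at index i. */
    static boolean isEscapeAt(String s, int i) {
        if (i + 6 > s.length() || s.charAt(i + 1) != 'u') {
            return false;
        }
        for (int k = i + 2; k < i + 6; k++) {
            if (!isHex(s.charAt(k))) {
                return false;
            }
        }
        return true;
    }

    /** Returns the text committed after the given keystrokes are typed. */
    static String commit(String typed) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < typed.length()) {
            char c = typed.charAt(i);
            if (c != '\\') {
                out.append(c);                 // pass-through mode
                i++;
            } else if (i + 1 >= typed.length()) {
                break;                         // lone backslash: composition not finished
            } else if (typed.charAt(i + 1) == '\\') {
                out.append('\\');              // two backslashes yield a single one
                i += 2;
            } else if (typed.charAt(i + 1) != 'u') {
                out.append(c).append(typed.charAt(i + 1)); // both characters are committed
                i += 2;
            } else if (isEscapeAt(typed, i)) {
                out.append((char) Integer.parseInt(typed.substring(i + 2, i + 6), 16));
                i += 6;                        // valid sequence converted and committed
            } else {
                break;                         // assumption: incomplete sequence stays uncommitted
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(commit("a\\u0041b"));
    }
}
```

For instance, typing a, then a backslash followed by u0041, then b commits the three characters a, A, and b.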
The Code Point Input Method can be downloaded by
clicking the button below.
A newer version of the Code Point Input Method is now included in
the Java 2 SDK version 1.5.0 as a demo program. It allows users
to input supplementary characters, that is, characters whose code
points lie outside the Basic Multilingual Plane, between U+10000
and U+10FFFF. For more detail, please refer to the README file
in the Java 2 SDK.
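On the application side, a supplementary character occupies two `char` values (a surrogate pair) in a Java string. The following sketch, with the hypothetical class name `SupplementarySketch`, uses the code point methods added to `java.lang.Character` and `java.lang.String` in version 1.5.0 to build and inspect such a character.

```java
// Hypothetical helper class, introduced for illustration only.
public class SupplementarySketch {

    /** Builds a string holding the single character with the given code point. */
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String s = fromCodePoint(0x20000); // a code point outside the Basic Multilingual Plane
        System.out.println("char count:       " + s.length());
        System.out.println("code point count: " + s.codePointCount(0, s.length()));
        System.out.println("supplementary:    " + Character.isSupplementaryCodePoint(s.codePointAt(0)));
    }
}
```

Text-handling code that counts or iterates over characters should therefore work with code points rather than individual `char` values when supplementary characters may be entered.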
Indic Input Method
This input method archive contains input methods for several
writing scripts used in India. Besides Devanagari, which has been
a supported writing system since the Java 2 runtime environment,
version 1.4.0, it also contains input methods for the following
writing scripts: Bengali, Gujarati, Gurmukhi, Kannada, Malayalam,
Oriya, Tamil, and Telugu. Each input method maps the US 101/104
keyboard layout to the INSCRIPT layout used for its writing
script. Here are the diagrams for each keyboard layout, and the
mapping table from the Latin alphabet to each writing script.
You can download the Indic input method by clicking the
button below.
Thai Input Method
This input method implements the input-sequence checking
for Thai, as defined in the Thai API Consortium's "WTT"
Input/Output Methods document. This input method also
maps the US 101/104 keyboard layout to the Thai TIS 820-2538
layout. Here is the mapping table from the Latin alphabet to
the Thai writing script.
The Thai input method is available by clicking the button
below.
Further Information
For further information on using the Input Method Framework,
see the documentation here. Also, you may find this Java
internationalization forum useful.
Finally, you can send feedback to the Java Internationalization team.
[1] The number is the sum of graphic characters and format characters.