Keyboard Support

Contact and Search

Keyman.com Homepage

Header bottom

Keyman.com

On this page

Control keys and control characters


This document outlines current best practices for circumstances in which it can be helpful to process control keys such as Tab, Enter, and Backspace (not the Ctrl key, which is a modifier key), or work with control characters such as Carriage Return (CR, U+000D), Line Feed (LF, U+000A), or Tab (U+0009).

Control keys

When a keyboard does not have any reference to control keys, Keyman does not attempt to interpret or modify key events for these keys, and passes the event on to the application, which will handle them as normal.

For example, in Microsoft Word, Tab may insert a tab into the document, or move to the next cell in a table -- and Keyman cannot know which of these will happen. Or in a form, Enter may submit the form, rather than inserting a new line character into the current field.

Handling a control key can be helpful if you want to process a change at the end of a word which might be the last word in a line or in a text field, where the user presses Enter or Tab.

Now, Keyman does not allow keys in the output part of a rule. At first glance, this would seem to make it impossible to handle a key event from a control key and still get its default behavior in the application. However, we can make use of a pattern using multiple groups: an empty final group will pass the key event on to the application.

The following example for Greek illustrates this pattern; compare how Space and Enter have been handled:

+ 's' > $medialSigma

c Transform medial sigma at end of word to final sigma
$medialSigma + [K_SPACE] > $finalSigma ' '

c If Enter key is pressed after a medial sigma, transform that to
c a final sigma, then pass the Enter key on to the application
$medialSigma + [K_ENTER] > $finalSigma use(emit)

c ...

group(emit) using keys
c this empty group will emit the original keystroke

Backspace

It can also be helpful in Keyman to handle the Backspace key in a number of situations: decomposing characters, reordering, or deleting multiple components together. Keyman allows this with rules matching [K_BKSP].

For example, in Khmer, subscript consonants are encoded with a subscript ("coeng") marker U+17D2, followed by the consonant character. The Khmer Angkor keyboard treats these two characters as a single element in its touch keyboard, so the user will never even see the coeng marker character. When Backspace is pressed, the keyboard deletes both characters together:

platform('touch') U+17D2 any(consonant) + [K_BKSP] > nul

Pressing the Backspace key at the start of a text block in a complex document may delete a break (or formatting) before the text block; for example, it may delete a bullet from a bullet list. You should avoid overriding this behavior, which is application-specific.

nul + [K_BKSP] > nul     c DO NOT DO THIS
+ [K_BKSP] > nul         c DO NOT DO THIS EITHER

Control characters in output

It is best to never emit control characters from a Keyman rule. Applications may not be expecting control characters from the keyboard and there may be unexpected consequences, such as inserting a new line character into a single line text field. See the Control keys section for a strategy for emitting default key events instead.

This does mean that you cannot really change the behavior of control keys with Keyman (apart from suppressing the key behavior altogether). This is outside the current scope of Keyman's design.

Control characters in context

While Keyman keyboards should not emit control characters, there are use cases for detecting control characters in the context.

In a Polytonic Greek keyboard, you may wish to automatically insert breathing marks at the start of words. In a touch keyboard, you may want to recognize the start of a sentence and switch to a Shift layer.

However, you do need to be careful here. When recognizing the control character in the context, it is easy to accidentally emit the control character in the output. For example, the following rule in a Hindi keyboard may appear harmless, but actually deletes and re-inserts the LF character, which can be problematic in some applications:

  c linebreak, add full svara.
  U+000A U+094D > U+000A U+0905 c a    c DO NOT DO THIS

Instead, we can make use of how Keyman applies the context statement. When Keyman encounters the context statement at the start of the output, it does not delete and re-insert the characters found there. So we can process the required character change in a separate group, without accidentally touching the control character:

  U+000A U+094D > context use(full-svara-after-lf)

  group(full-svara-after-lf)

  c here we know that we have an LF immediately prior
  U+094D > U+0905 c a

This behavior is present from v6.0 of Keyman, but may be improved in a future version, so that working around this in the keyboard becomes unnecessary issue #14718.

Using control characters in the context in a readonly group is safe; no special handling is required.

Non-breaking space (NBSP) in context

Non-breaking space (NBSP, U+00A0) is used in the digital world to prevent automatic line breaks from being inserted between two words. Said another way, it prevents two words being separated on to two separate lines. In some applications NBSP are returned in the context to Keyman (for example, see issue with Firefox #14945). Therefore in this case NBSP should be considered in a rule that detects a blank space.

A simple example: when e is pressed, output an after a space, and output ε otherwise:

store(spaceChrs)  U+0020 U+00A0 
c The nbsp Unicode Character is U+00A0 and space is U+0020.
any(spaceChrs) + [K_E] > context U+1f10 
+ [K_E] > U+03B5

A example taken from the greek_tonzio keyboard. This keyboard uses stores; see store():

c Characters usually found before words  These characters are Space, parenthesis, quotes etc and to this we add the nbsp 
store(beforeWordChars)  U+0020 U+0028 U+002D U+0027 \
                        U+0022 U+003A U+007B U+00AB \
                        U+00B6 U+00A9 U+000A U+00A0 

c These are the characters [ ἀ ἐ ἠ ἰ ὀ ὑ ὠ Ἀ Ἐ Ἠ Ἰ Ὀ Ὑ Ὠ]                        
store(fwnWithBreathing) U+1F00 U+1F10 U+1F20 U+1F30 U+1F40 U+1F51 U+1F60 \
                        U+1F08 U+1F18 U+1F28 U+1F38 U+1F48 U+1F59 U+1F68

Then we can have a rule that adds the breathing when the encountering a "before word" character:

c ------- GENERAL RULE: AUTOMATIC PSILI AT BEGINNING OF WORD STARTING WITH VOWEL ------
any(beforeWordChars) + any(vowels) > context index(fwnWithBreathing,2)

References