# Observations on how to insert Unicode and Emojis into text, using a KDE 4 / Plasma 5.8-based Linux computer

One of the earliest ‘inventions’ on the Internet was the ‘Smiley’, which was simply typed into emails and which, when viewed as text, evoked whichever face it represented. But graphical user interfaces (GUIs) were replacing simple text even in the 1990s, and the first, natural thing which developers coded into email clients was the ability to convert typed, text-based smilies into actual images, flowed with the text. Also, simple colon-parenthesis sequences were joined by other, more varied sequences, which some email clients could convert into fancier images than simple smiling faces.

Actually, the evolution of the early Internet was slightly more complex than that, and I have even forgotten some of the real terms that were used to describe that history.

But there is an even more recent shift in the language of the Internet, which creates a distinction between Smilies and ‘Emojis’. In this context, even many ‘Emoticons’ were really just smilies. Emojis distinguish themselves in that these pictograms are represented as part of the text, in the form of Unicode values. There is such a large supply of those values that some of them can represent pictograms, instead of always representing characters of the Earth’s many writing systems, including Chinese, Korean, Cyrillic, etc. What some readers might ask next could be, ‘Traditionally, text was encoded as 7-bit or 8-bit ASCII. How can 16-bit or 32-bit Unicode characters simply be inserted into that?’ And the short answer is, through either UTF-8 or UTF-16 Encoding. With UTF-8, a body of text still mainly consists of 8-bit codes, but the half of the possible byte values which ASCII never uses (those with the most-significant bit set) is reserved for multi-byte sequences. Such sequences can be recognized as special, because their 8-bit values do not correspond to valid ASCII characters, and each complete sequence of 2 to 4 bytes encodes one Unicode character. UTF-16 instead works in 16-bit units, and encodes every character above U+FFFF as a pair of them.
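To make that short answer more concrete, here is a minimal Python 3 sketch (Python is only my choice for illustration, and the two sample characters are arbitrary). It shows the bytes which UTF-8 and UTF-16 produce for one Emoji, compared with one plain ASCII letter.

```python
# Illustrative sketch: how one Emoji and one ASCII letter are stored
# under UTF-8 and UTF-16. The sample characters are arbitrary choices.

def hex_bytes(data: bytes) -> str:
    """Format a byte string as space-separated, upper-case hex pairs."""
    return " ".join(f"{b:02X}" for b in data)

emoji = "\U0001F910"   # one code point above U+FFFF
letter = "A"           # plain 7-bit ASCII

utf8_letter = letter.encode("utf-8")
utf8_emoji = emoji.encode("utf-8")
utf16_emoji = emoji.encode("utf-16-be")   # big-endian, no byte-order mark

print("UTF-8  letter:", hex_bytes(utf8_letter))
print("UTF-8  emoji :", hex_bytes(utf8_emoji))
print("UTF-16 emoji :", hex_bytes(utf16_emoji))

# Every byte of the UTF-8 sequence has its high bit set, which is what
# keeps it from being mistaken for an ASCII character.
all_high = all(b >= 0x80 for b in utf8_emoji)
print("All UTF-8 bytes >= 0x80:", all_high)
```

The UTF-16 result consists of two 16-bit units (a so-called surrogate pair), which is how that encoding reaches beyond its first 65,536 values.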

One fact which is good to know about these Emojis is that they are often proprietary, which means that they are often either the intellectual property of an IT company, or licensed as part of an Open-Source project. But the actual aspect of them which can be proprietary is the way in which Unicode values are rendered to images.

What that means is that, for example, I can put the following code into my blog: 🤐 . That is also referred to as Unicode character ‘U+1F910’. Its length extends beyond 16 bits by 1 bit, and the next 4 most-significant bits are all 1’s, as expressed by the hexadecimal digit ‘F’. Its official Unicode name is ‘ZIPPER-MOUTH FACE’, so it’s supposed to be a pictogram of a face whose mouth has been zipped shut. But for my blog, the use of such a code can be a hazard, because it will not display the same way on Android devices as it displays on iOS devices. And, on certain Linux computers, it might not be rendered at all, instead just resulting in the famous rectangle that seems to have dots or numbers inside it. This latter result will appear when the client-program could not find a Font with which to convert this code into an image. (:3)
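The properties of this one code point can be checked with a short Python 3 sketch. It relies only on the standard ‘unicodedata’ module. The name it reports comes from the Unicode database bundled with the Python build in use, so an older build might not know this character, and the fallback string in the code is just a placeholder of mine.

```python
import unicodedata

ch = "\U0001F910"
code_point = ord(ch)

label = f"U+{code_point:04X}"           # the conventional 'U+' notation
bits = code_point.bit_length()          # number of significant bits
next_four = (code_point >> 12) & 0xF    # the 4 bits just below bit 16
# The second argument is a placeholder returned if the name is unknown.
name = unicodedata.name(ch, "<not in this Python's Unicode database>")

print(label, "needs", bits, "bits")
print("Hex digit below the leading 1:", format(next_four, "X"))
print("Official name:", name)
```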

Those fonts are what’s proprietary. And, they also provide some consistency in style, between Android devices, OR between iOS devices, OR between Windows devices, etc.

Well, I began this posting by musing about the early days of the Internet. During those days, some users (myself included 😊) did some things which were truly foolish. These included putting background images into our HTML-composed emails, and decorating documents with (8-bit) dingbat fonts, just because it was fun to pass around documents that were fancier than Plain Old Text (POT). I don’t think there is really anything wrong with readers who still put background images into their emails. What I mean is that many of my contacts today prefer emails which are not even HTML.

This earlier practice, of using dingbat fonts etc., tended to play favourably into the hands of the tech giants, because the resulting documents could only be viewed by certain applications. And so today, I needed to ask myself how often the use of Emojis can actually result in a document which the recipient cannot read. My conclusion is that today, such an indecipherable outcome is actually rare. So, to put a long story short: Commercialism is back, riding on the desire of younger people to put more-interesting content into their messages, perhaps without some of them being aware that, when they insert Emojis, they are enlisting themselves as the software-disciples of one larger group or another. But that larger group mainly seems to be drawing its profits from the ability of certain software to insert the images, rather than from the ability of only certain software to render them at the receiving end (at all). Everybody knows that, even though the input methods on our smart-phones don’t lead to massively good prose, they almost always offer a rich supply of Smilies, plus Emojis, all displayed to the sender using his or her own font, but later displayed to the recipient using a potentially different font.

The way Linux computers can be given such fonts is through the installation of packages such as ‘fonts-symbola’ and ‘ttf-ancient-fonts’, or of ‘fonts-noto’… The main drawback of the open-source ‘Symbola’ font, for example, is simply that it often gives a more boring depiction of the same Unicode character than the depiction which the true Colour Noto Font from Google would give.

One interesting way in which Linux users are already in on the party is the fact that actual Web-browsers are usually set to download fonts as they are needed for the display of Web-pages, even under Linux. Yet, email clients do not fall into that category of applications, and whether they render Emojis depends on whether these font packages are installed.

Hence, if the ability to send Emojis from a Linux computer is where it’s at, then this is going to be the subject of the rest of my posting. I can put two and two together, you know…

(Updated 7/31/2020, 15h10… )

(As of 7/28/2020: )

How easy it is to accomplish this essentially depends on how recent the Linux computer is. The Debian 8 / Jessie machine I’m hosting this blog from presents considerable challenges, and mine is running KDE 4 on top of everything else. My Debian 9 / Stretch computer, which has Plasma 5.8 as its desktop manager, fares only slightly better. And, one of the trendiest features of Plasma version 5.18 is easy insertion of Emojis. I am two generations of Linux computers behind the game, yet determined to solve this problem somehow.

The KDE or Plasma Desktop Managers have traditionally been slightly handicapped, in comparison with GNOME, when it comes to Emojis. One of the reasons is that, consistently with what I just wrote, one way to insert them is to insert arbitrary Unicode characters. This runs counter to what Linux users generally do, which is to use their Compose Key, with which they can compose symbols according to easily remembered sequences that combine geometrically into the required character. Hence, if I want to type a U-Umlaut, a part of the German language, I press the Compose Key, and then type " … U -> Ü. If I want to type a C-cédille, a part of the French language, I press the Compose Key, and then type , … c -> ç. Obviously, this will not work for Emojis.

A semi-standard way that exists under Linux, to enter an actual Unicode value as text, is to press <Ctrl>+<Shift>+U, and then to complete the hexadecimal Unicode value. When the <Ctrl>+<Shift> combination is released, the Unicode character should replace the sequence, which is also supposed to be highlighted while it’s being typed in. The problem with this is the fact that it’s application-specific or, more correctly, specific to the GTK-2 or GTK-3 GUI libraries. Additionally, it’s well-supported under GNOME, and mainly not so under KDE or Plasma. On both my KDE and Plasma 5.8 -based machines, this system actually works in the Firefox browser, as well as within the GVim text editor, but hardly anywhere else. It does not work with LibreOffice (See Below.)

But:

• Actually entering a Unicode value first requires that the user know what the correct code is for a given Emoji, and
• Some better alternatives present themselves for entering the Unicode values that are specifically Emojis.
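As for the first point, one hedged way to find the hexadecimal code, assuming that the official Unicode name of the character is already known, is Python 3's standard ‘unicodedata.lookup()’ function. The helper name ‘hex_code_for’ below is my own invention, and the sketch is only an illustration, not part of any of the tools discussed in this posting.

```python
import unicodedata

def hex_code_for(name: str) -> str:
    """Return the hexadecimal code point for an official Unicode name.

    unicodedata.lookup() raises KeyError if the name is unknown to the
    Unicode database bundled with this Python build.
    """
    return format(ord(unicodedata.lookup(name)), "X")

# These are the digits one would type after <Ctrl>+<Shift>+U:
print(hex_code_for("ZIPPER-MOUTH FACE"))
print(hex_code_for("LATIN CAPITAL LETTER U WITH DIAERESIS"))
```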

One way to enter Emojis is specific to the IBus Input Method. But in reality, IBus tends to be incompatible with either KDE or Plasma. If a Plasma user tries to install and activate IBus, he’s likely to break something. (:2) If the user actually needs advanced input methods, in order to enter non-European characters, then ‘fcitx’ and ‘scim’ are more compatible with Plasma. And out of those two, ‘fcitx’ actually has additional packages that:

• Allow configuration from within KDE / Plasma (:1), and
• Allow Emoji Tables to be invoked from key-combinations.

(Updated 7/30/2020, 9h20: )

But, let us suppose that we do not want to install any of the 3 mentioned Input Methods, but want to be able to add Emojis to our text easily. I can offer three solutions:

1. Mojibar, Or
2. Emoji-Keyboard (Debian 9 / Stretch Minimum), Or
3. Unicodemoticon (Debian 9 / Stretch Minimum).

All these programs are on-screen lookup tables, which the user can search with keywords, and from which the user can select and copy an Emoji to the clipboard (using the mouse). It is then up to the user to paste the text on the clipboard into his document.
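The keyword-search idea behind such lookup tables can be sketched in a few lines of Python 3. This is only my illustration of the principle, not the code of any of the three programs. The scan range is an assumption of mine, and it also sweeps in some symbols which are not Emojis. The sketch does not touch the clipboard.

```python
import unicodedata

# ASSUMPTION: scan from the "Miscellaneous Symbols and Pictographs" block
# onward, which is where most pictographic Emojis are located.
SCAN_START, SCAN_END = 0x1F300, 0x1FAFF

def search_emoji(keyword: str):
    """Return (character, name) pairs whose Unicode name contains keyword."""
    wanted = keyword.upper()
    hits = []
    for cp in range(SCAN_START, SCAN_END + 1):
        ch = chr(cp)
        name = unicodedata.name(ch, "")   # "" for unassigned code points
        if wanted in name:
            hits.append((ch, name))
    return hits

for ch, name in search_emoji("zipper"):
    print(f"U+{ord(ch):04X}", name)
```

The real programs search richer keyword lists than the official character names, which is why they find an Emoji under everyday words as well.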

At the time I’m writing this, Programs (1) and (2) have severe bugs.

“Mojibar” is actually compatible with KDE 4, as well as with any other X-server-based system, but has as its main drawback that its UI is so sparse that some users might just like to shoot it out the window. For example, there is no way to close down the program. My solution to that was to create a desktop file that opens Mojibar together with a Terminal Window, which I can minimize just as easily as I can click to close it. And, under KDE, closing the Terminal Window does not even cause the program it’s bracketing to become a zombie! 😲 Mojibar can be invoked by right-clicking on its Notification Tray Icon, which causes a small grey square to appear, which looks as though the program wasn’t running properly. It is. Left-clicking on that grey square opens the dialog of the program.

(Edit 7/30/2020, 2h15: )

Mojibar will display its Emojis in whatever font the desktop environment has available, which in certain cases will not include many Emojis. However, the Colour Noto Fonts exist as something separate from Black and White Noto Fonts, in that the colour fonts are pixel fonts, which Debian Linux will refuse to rescale by default. This means that:

• The Noto Fonts installed from the package manager are only the black and white ones, and
• The same fonts probably do not include Emojis, and
• Adding Colour Noto Fonts explicitly as user, from their TTF Files, using the GUI commands to do so, will only cause those to be rendered at preset resolutions, and
• Because, on my Debian / Jessie computers, the only applicable (black and white) Emoji Font is the ‘Symbola’ Font, this application is likely drawing most of its Emojis from that one.

Documentation for Mojibar suggests, that the user set a specific Font Policy, to allow the separately sourced, Colour Noto Fonts to be rescaled. But, aside from what the desktop manager normally does, certain other applications (than Mojibar) will also supply their own fonts as resources, which in turn will be scaled to a larger size, than fonts which generically get included in documents.

I find the advice given in that documentation to be highly risky, and choose not to apply it.

(End of Edit, 7/30/2020, 2h15.)

“Emoji-Keyboard” comes to Linux users as an AppImage, but must be invoked with the option ‘--no-sandbox’ to avoid one type of crash. The other type of crash must then be avoided by changing its settings so that, instead of replacing text which the user is typing, it ‘only’ copies the Emoji to the clipboard. This program displays JavaScript Error Messages when starting.

How good the Emojis look on the recipients’ computers depends on their software, and not on the sender’s software.

“Unicodemoticon” is a Python 3 program that strikes me as more viable than the first two above…

This ‘pip3’ installation in root mode creates the executable ‘/usr/local/bin/unicodemoticon’, which needs to be started once as a command. If it is started from a terminal window, the command-line should be followed by the ‘ &’ character, because the terminal window can then be closed, with the Notification Tray Icon hidden by default. The behaviour of the Tray Icon is clean, and when the program is maximized, its behaviour is semi-clean.

What confuses most about this program is the fact that each Search may succeed, but will only replace a limited number of the leading search results from any previous search. Hence, if I carry out a second Search for ‘food’, doing so will only change the first 2 result fields shown above, so that the remaining 25 result fields will keep the results from the ‘cat’ search. The side effect of that, for unaccustomed users, is the false perception that the second search did not succeed. The results of the earlier searches can only be overwritten completely if later searches give at least as many results (or if the program is quit and restarted).

Clicking on a search result field copies its contents to the clipboard fine, but then also causes the main panel of the window to go blank. When hovering over the tabs, there are some artifacts in updating the main panel. Hovering over the Search tab again, restores the previous contents of the main panel – with the search results of whichever search produced the most results, still preserved.

The apparent rendering artifacts consist of “Tab Previews” (rectangles with dots below the tabs), that seem to serve no purpose, but that the author felt would look nice. One problem with those previews is the fact that under some circumstances, they do not toggle off again, and just stay put. Even when that happens, the window is still usable, but getting rid of ‘stuck’ previews requires a restart of this program.

Hovering over each search result field produces a description as well as the Emoji.

I like this solution best out of the three. If its functioning remains stable, I will create a startup-entry for it.

(Update 7/29/2020, 10h40: )

1:)

In a way similar to ‘fcitx’, ‘scim’ used to have a package called ‘skim’, that eased its integration into KDE 3. However, that package no longer exists. Yet, the ‘scim’ package itself offers a GUI front-end, to configure and use it, without being specific to one desktop manager. This front-end is ‘GTK+’ -based.

However, I do not see specific support for Emojis, within ‘scim’.

One distinction which needs to be made, if installing a package such as ‘scim-qt-immodule’, is that this module provides a back-end for entering text in applications that use the ‘Qt’ GUI library, and does not offer any configuration utilities. There are similarly named, related packages for altering text-input into ‘GTK’ applications and other applications…

2:)

What certain third-party Debian packages do is to pull in ‘IBus’ packages as dependencies, and this mainly happens on the assumption that the Linux computers are going to be GNOME-based Ubuntu computers (that normally ship with IBus installed and activated). What will prevent Plasma-based Debian users from breaking their systems with this is the fact that simply having these IBus-related packages installed does not, by itself, activate IBus.

In order for IBus to become active on a Debian-based computer, 3 environment variables need to be set, either in the file ‘/etc/profile’ or in a file in the sub-directory ‘/etc/profile.d’. In addition to that, the actual IBus Daemon needs to be set to start with every user log-in (as a user process, not as root). The user would need to reboot, and then determine whether doing this broke anything.

One assumption which differs under Ubuntu, from the assumptions under Debian, is the assumption of a single-user system. Thus, the additional question would need to be asked under Debian, of whether the IBus Daemon will function correctly, if it has been launched from more than one user-session concurrently.

(Paragraph deleted 7/29/2020, 15h05 because in error.)

(Update 7/29/2020, 14h50: )

When using IBus:

If the package ‘ibus-table-emoji’ is installed, until very recent versions, the key-combination <Ctrl>+<Shift>+E, then <Space>, would bring up a keyboard-based Emoji picker. However, in the most recent versions, the user needs to cause ‘ibus emoji’ to be executed somehow, to bring up an Emoji table, from which Emojis can be copied to the clipboard. (:4)

Erratum 7/29/2020, 22h55:

It seems that the Emoji support for IBus under Debian only came after Debian 9 / Stretch. The way I can tell that I don’t have it is that the command ‘ibus emoji’ produces an error message, stating ‘emoji’ is not a valid input to ‘ibus’. If the feature were enabled, then pressing <Ctrl>+<Shift>+E would cause an underlined @ to appear, as a kind of prompt for the name of one Emoji. After typing that, the user is expected to type <Space>, causing that one Emoji to replace the underlined character sequence.

When using ‘fcitx’:

If the package ‘fcitx-table-other’, or the package ‘fcitx-table-emoji’, is installed, the key-combination of interest is <Ctrl>+<Alt>+<Shift>+U .

When using Plasma 5.18:

If the correct packages are installed, the key combination <Meta>+ . is used to bring up the Emoji picker.

When using LibreOffice 5:

Even if no special Input Method is installed, the key-combination <Ctrl>+<Shift>+U works in principle. But in practice, entry is limited to 16-bit Unicode characters. Because the Emojis in question are 17-bit Unicode characters, they would require that 5 hexadecimal digits be entered, which LibreOffice cuts short after 4.
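Whether a given character would survive such a 4-digit limit can be checked ahead of time. The following Python 3 sketch simply counts the hexadecimal digits in a code point. The three sample characters are my own choices, and the sketch says nothing about how LibreOffice is implemented.

```python
def hex_digits_needed(ch: str) -> int:
    """Count the hexadecimal digits in the code point of one character."""
    return len(format(ord(ch), "X"))

# Arbitrary samples: an accented letter, an older smiley from the 16-bit
# range, and an Emoji from beyond it.
samples = {
    "U+00DC": "\u00DC",
    "U+263A": "\u263A",
    "U+1F910": "\U0001F910",
}

for label, ch in samples.items():
    n = hex_digits_needed(ch)
    verdict = "fits in 4 digits" if n <= 4 else "needs more than 4 digits"
    print(label, "->", n, "digit(s),", verdict)
```

So a few older pictograms, such as U+263A, stay within the limit, while the Emojis from the newer blocks do not.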

(Update 7/30/2020, 0h05: )

When using IBus together with LibreOffice:

There seems to be a known compatibility issue, noticeable under the KDE / Plasma Desktop Manager(s), but not under GNOME. If it is affecting the reader, the suggested workaround is to uninstall ‘libreoffice-kde’ and to install ‘libreoffice-gtk2’. Always install LibreOffice packages belonging to the same version.

(Update 7/30/2020, 6h20: )

3:)

Actually, that summary of the Emoji display style was almost correct. Web-sites are allowed to state a non-default font using Cascading Stylesheets, and, if the Web-browser cannot find the font locally, it will download that font off the Web.

Actually, there is a technical difference between merely declaring the font in the CSS, and publishing that font as a Web-font. If the font was merely declared, and the browser did not have it available locally, then what the browser would do next is either to display the requested character using one of the fallback fonts that render it, via “font stacking”, or to display the character using the default font set in the browser, or to display the generic sign of an unidentified character.
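The fallback logic just described can be modelled with a toy Python 3 sketch. This is a deliberately simplified model of my own, not how any browser is actually coded. The font names and their coverage sets are hypothetical.

```python
from typing import Dict, List, Optional, Set

def pick_font(ch: str, stack: List[str],
              coverage: Dict[str, Set[str]]) -> Optional[str]:
    """Return the first font in the stack that has a glyph for ch.

    None stands for the generic 'unidentified character' rectangle.
    """
    for font in stack:
        if ch in coverage.get(font, set()):
            return font
    return None

# HYPOTHETICAL fonts and glyph coverage, for illustration only.
coverage = {
    "TextOnlyFont": {"A", "B"},
    "SymbolFont": {"A", "\U0001F910"},
}
# Order of preference, with the browser's default font assumed to be last.
stack = ["TextOnlyFont", "SymbolFont"]

print(pick_font("A", stack, coverage))
print(pick_font("\U0001F910", stack, coverage))
print(pick_font("\u263A", stack, coverage))
```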

The CSS needs to include explicit declarations, of where the browser may download fonts from, which could either be the host of the current Web-page, or a “font repository” external to the present page (that declares itself to be the source, in its published stylesheets).

Some users may be suspicious of this latter process, because in the earlier years of Web-browsers, those were sometimes hacked by maliciously crafted fonts. Those users can usually disable the feature altogether. But I would think that most Web-browsers coded after about 2010 are no longer vulnerable to such an attack, so I’d suggest that with mainstream browsers, such as Firefox or Chrome, it makes sense to leave the feature enabled.

What can really go wrong, is that due to Web-authoring errors in the stylesheets, an unauthentic font could end up getting loaded, and thus ruining the view of the requested page.

My blog is written with the “Twentyfifteen” theme, which also decides that it is to make use of Google’s Noto Fonts. From what I read, Google is allowing everybody to use that font family for free, and also acts as a font repository. It’s only if I were to change to some other theme, or to select by name a font whose owners do not give this permission, that the process could get dicey (for me). To do this correctly, I’d first obtain a paid account with a font repository, preferably one that has impressive Emojis in the subscribed font-family. And at that point, the use of (non-free) Emojis would pay off for the operators of such a font repository. The appearance of my site’s Emojis would change.

As long as I stick with this theme, the fonts should have a corresponding, uniform appearance, in all browsers that can display the Noto Fonts in question.

BTW, This font-family’s text fonts can also be installed under Debian Linux, from the package manager. And if it is so, then major browsers should be able to display it, without having to download it.

(A clarification on 7/31/2020, 15h10: )

What I am trying to say implies that, because this blog is using the “Twentyfifteen” Theme, like any other WordPress blog that does so, its HTML includes a mere link to Google’s Font Repository. Like most Web-authors, I am not licensed to host any fonts from my own server.

(Update 7/29/2020, 19h00: )

4:)

The whole IBus usage of the package ‘ibus-table-emoji’ was not working for me. And so, what I did instead was, to navigate my terminal window to my programs folder, and then, to issue the command:


$ git clone https://github.com/salty-horse/ibus-uniemoji.git



What this command did was to clone a piece of software from GitHub, which I was able to compile, due to having installed the development packages for IBus. After having done so, as root, I was also able to give the command ‘make install’. This added a new Input Method to IBus, so that, in the IBus preferences window, I was able to add an Emoji Input Method. When this input method, belonging to IBus, is selected, all text typed is used to select Emojis. And I left the default, of the key-combination <Super>+Space switching between IBus Input Methods, so that in theory, I could switch back and forth quickly between entering text and entering Emojis.

Hence, I added a custom-compiled Input Method within IBus, even though I have no serious intention of using IBus. Activating such an Input Method really only makes sense for users who need to enter non-Western text on a daily basis. It’s overkill, just for adding Emojis to some documents.

However, I was able to test this setup, using the KDE Text Editor called ‘KWrite’. There is nothing that prevents me from starting the IBus daemon, and then, setting the environment variables, and then, just launching one application from the command-line, with these environment variables set to non-default values. The result was as shown here:

What this experiment shows is that in fact, I could activate IBus if I wanted to. This only leaves the question of, how many individual programs might be misbehaving, after I reboot.
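The one-application test described above can be sketched in Python 3. The three variable names and values below are an assumption on my part: they are the ones commonly documented for IBus, and this posting does not name them. The helper names are hypothetical, and the sketch assumes that the IBus daemon has already been started separately.

```python
import os
import subprocess

# ASSUMPTION: the variable names and values commonly documented for IBus.
IBUS_ENV = {
    "GTK_IM_MODULE": "ibus",
    "QT_IM_MODULE": "ibus",
    "XMODIFIERS": "@im=ibus",
}

def ibus_env(base=None):
    """Return a copy of an environment with the IBus variables added."""
    env = dict(os.environ if base is None else base)
    env.update(IBUS_ENV)
    return env

def launch_with_ibus(command):
    """Start one program with the IBus variables set for that process only."""
    return subprocess.Popen(command, env=ibus_env())

# Example (not executed here): launch_with_ibus(["kwrite"])
```

Because only the child process receives the modified environment, the rest of the desktop session keeps its default settings, which is what makes this kind of experiment reversible.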

Further, while KWrite did not, by itself, respond to <Ctrl>+<Shift>+U…, once this application was running with the IBus environment variables set, pressing <Ctrl>+<Shift>+U, then releasing those keys, then typing in the hexadecimal digits, and then pressing <Enter>, caused the highlighted Unicode value to be replaced with the corresponding character.

(Update 7/30/2020, 6h05: )

One of the concepts which I’ve touched on in this posting – even though this was not the main subject – was, that an attempt could be made to allow IBus to work together with Plasma 5.8 . I don’t really plan to go ahead with this. But in case the reader wishes to do that, I have an observation to offer, of how quickly a ‘point of failure’ can become evident, even though I never made a full switch, by setting the environment variables globally, and actually rebooting.

Normally, Plasma 5 has a Keyboard Switching Applet in the Notification Tray, and the following is an example of what it might look like, when 3 keyboard-layouts were configured, and when right-clicked:

By default, when the IBus daemon is started, this context menu changes, so that the list of keyboard layouts has a top half, a divider, and a bottom half. This is the peculiar way the Plasma 5 Layout Switcher has of stating that, out of the 3 layouts shown, only one is available for switching to by way of the <Ctrl>+<Alt>+K key-combination. After the IBus daemon is shut down again, each of the 3 layouts shown needs to be left-clicked in this applet before that divider disappears, and before switching is possible by way of the Plasma 5 key-combination again.

This happened because at one point, IBus selected a layout, independently of what this applet selected.

IBus has a relevant setting, which is opposite to its default as shown below:

If this setting is selected, before IBus is restarted, but after the Tray Applet has been restored, subsequent IBus restarts no longer mess up the Plasma 5 Tray Applet.

Beware, beware.

Dirk