Observations, on how to insert Unicode and Emojis into text, using a KDE 4 / Plasma 5.8 -based Linux computer.

One of the earliest ‘inventions’ on the Internet was the ‘Smiley’, which was simply typed into emails and which, when viewed as text, evoked the face it represented. But graphical user interfaces – GUIs – began replacing plain text as early as the 1990s, and one of the first, natural features which developers coded into email clients was the ability to convert typed, text-based smilies into actual images, flowed with the text. Also, the simple colon-parenthesis sequence was joined by other, more varied sequences, which some email clients could convert into fancier images than just smiling faces.

Actually, the evolution of the early Internet was slightly more complex than that, and I have even forgotten some of the real terms that were used to describe that History.

But there is an even more recent shift in the language of the Internet, which creates a distinction between Smilies and ‘Emojis’. In this context, even many ‘Emoticons’ were really just smilies. Emojis distinguish themselves in that these pictograms are represented, as part of the text, by Unicode values, of which there is such a large supply that some of them can stand for pictograms, instead of only representing characters of the Earth’s many scripts, including Chinese, Korean, Cyrillic, etc. What some readers might ask next could be: ‘Traditionally, text was encoded as 7-bit or 8-bit ASCII; how can 16-bit or 32-bit Unicode characters simply be inserted into that?’ And the short answer is: through either UTF-8 or UTF-16 Encoding. Hence, in a body of text that mainly consists of 8-bit codes, half of which are not normally used, sequences of bytes can be embedded which are recognized as special, because their 8-bit values do not correspond to valid ASCII characters, and which together encode a single Unicode character.
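For readers who like to see this concretely, the following short Python snippet (purely illustrative; the example string is my own) shows how UTF-8 mixes plain ASCII bytes with a multi-byte Emoji sequence:

```python
# A minimal sketch of how UTF-8 embeds Unicode characters into a byte
# stream that is mostly ASCII.
text = "Hi 😊"                        # ASCII characters plus U+1F60A
data = text.encode("utf-8")

ascii_bytes = [b for b in data if b < 0x80]   # valid 7-bit ASCII
multi_bytes = [b for b in data if b >= 0x80]  # parts of a multi-byte sequence

print([hex(b) for b in data])
print(len(multi_bytes))               # the Emoji needs 4 such bytes
print(data.decode("utf-8") == text)   # and the text round-trips losslessly
```

Every byte of the Emoji’s sequence has its high bit set, which is exactly why legacy, ASCII-only software can skip over it without mistaking any part of it for a letter.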

One fact which is good to know about these Emojis is that their renderings are often proprietary, meaning that they are either the intellectual property of an IT company, or else part of an Open-Source project. But the actual aspect which can be proprietary is the way in which the Unicode values are rendered to images, not the values themselves.

What that means is that, for example, I can put the following code into my blog: 🤐 . That is also referred to as Unicode character ‘U+1F910’. Its length extends beyond 16 bits by 1 bit, and the next 4 most-significant bits are all 1’s, as expressed by the hexadecimal digit ‘F’. It’s supposed to be a pictogram of a face whose mouth has been zipped shut – officially, the ‘Zipper-Mouth Face’. But for my blog, the use of such a code can be a hazard, because it will not display on Android devices the same way it displays on iOS devices. And, on certain Linux computers, it might not be rendered at all, instead just resulting in the famous rectangle that seems to have dots or numbers inside it. This latter result forms when the client program cannot find the correct Font, with which to convert this code into an image. (:3)
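The bit-arithmetic above can be verified in a few lines of Python (shown here only as an illustration; any language with Unicode strings would do):

```python
cp = ord("🤐")                      # the code point U+1F910
print(hex(cp))                      # 0x1f910
print(cp.bit_length())              # 17: one bit more than 16
print("🤐".encode("utf-8").hex())   # f09fa490: a four-byte UTF-8 sequence
```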

Those fonts are what’s proprietary. And they also provide some consistency in style – among Android devices, OR among iOS devices, OR among Windows devices, etc.

Well, I began this posting by musing about the early days of the Internet. During those days, some users – myself included 😊  – did some things which were truly foolish, which included putting background images into our HTML-composed emails, and decorating documents with (8-bit) dingbat fonts, just because it was fun to pass around documents fancier than plain old text. I don’t think there is really anything wrong with readers who still put background images into their emails. But many of my contacts today prefer emails which are not even HTML.

This earlier practice of using dingbat fonts, etc., tended to play favourably into the hands of the tech giants, because the resulting documents could only be viewed with certain applications. And so today, I needed to ask myself how often the use of Emojis can actually result in a document which the recipient cannot read. My conclusion is that today, such an indecipherable outcome is actually rare. So, to put a long story short: Commercialism is back, riding on the desire of younger people to put more interesting content into their messages, and perhaps without some of the younger people being aware that when they use Emojis, they are including themselves among the software disciples of one larger group or another. But that larger group mainly seems to draw its profits from the ability of certain software to insert the images, rather than from the ability of only certain software to render them at the receiving end (at all). Everybody knows that even though the input methods on our smartphones don’t lead to massively good prose, they almost always offer a rich supply of Smilies plus Emojis, all displayed to the sender using his or her own font, but later displayed to the recipient using a potentially different font.

The way Linux computers can be given such fonts is through the installation of packages such as ‘fonts-symbola’ and ‘ttf-ancient-fonts’, or of ‘fonts-noto’… The main drawback of the open-source ‘Symbola’ font, for example, is simply that it often gives a more boring depiction of the same Unicode character, than the depiction which the true Colour Noto Font from Google would give.

One interesting way in which Linux users are already in on the party is the fact that actual Web browsers are usually set to download fonts as they are needed, even under Linux, for the display of Web pages. Yet email clients do not fall into that category of applications, and whether they render Emojis depends on whether these font packages are installed.
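As a small aside, whether a given Emoji could be rendered at all can actually be checked programmatically, via fontconfig. The following Python sketch (an illustration of mine, assuming the standard ‘fc-list’ tool of fontconfig is installed, as it is on most Linux desktops) lists the installed font families that contain a glyph for U+1F910:

```python
import shutil
import subprocess

def fonts_covering(codepoint: int) -> list[str]:
    """Installed font families whose character set includes the code point,
    found via fontconfig's charset query; [] if fc-list is unavailable."""
    if shutil.which("fc-list") is None:
        return []
    result = subprocess.run(
        ["fc-list", f":charset={codepoint:x}", "family"],
        capture_output=True, text=True, check=True)
    # fc-list prints one family (or comma-separated aliases) per line.
    return sorted({line.strip() for line in result.stdout.splitlines()
                   if line.strip()})

print(fonts_covering(0x1F910))
```

If this prints an empty list, then the ‘rectangle’ placeholder is what an email client on that computer would fall back to for this Emoji.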

Hence, if the ability to send Emojis from a Linux computer is where it’s at, then this is going to be the subject of the rest of my posting. I can put two and two together, you know…

(Updated 7/31/2020, 15h10… )

Continue reading Observations, on how to insert Unicode and Emojis into text, using a KDE 4 / Plasma 5.8 -based Linux computer.

Finally getting the LanguageTool Grammar-Checker to work under Debian.

One category of software which Linux has traditionally been lacking is grammar-checking software that runs off-line. And one such package that I’ve been keen on getting to work, under Debian / Stretch as well as under the older Debian / Jessie, is LanguageTool. Why this LibreOffice extension? Because I would mainly want German and English support, while the product also offers some French support – the last of which can always come in handy in a Canadian province which is officially French-speaking. The need can always arise to write some letter in French, in which the grammar will need to be correct.

There do exist other proprietary solutions, but none which add full support for German.

When I write that I’ve been ‘keen on getting (this) to work’, I’m referring to the fact that this extension can be a bit temperamental. As I was activating version 4.3 on my Debian / Jessie laptop ‘Klystron’, I noticed that I had versions 2.3 and 3.8 already downloaded there, left on my hard drive from past, forgotten failures. So, what I learned while installing v4.3 on ‘Plato’, my main, Debian / Stretch computer, would be put to the test. If the attempt to install v4.3 on ‘Klystron’ also worked on my first try, then the observations I had made while previously getting the same version working on ‘Plato’ would be valid. And my result was that I could get v4.3 to work on ‘Klystron’ as well, on the first try! :-D

( Screen-Shot ‘screenshot_20181222_212045’, from the computer ‘Plato’. )

During an earlier, failed attempt to get this extension working on ‘Plato’, I had gotten a series of error messages when clicking on ‘Tools -> LanguageTool -> Options…’, that amounted to Java Null-Pointer Exceptions. Usually, such exceptions would indicate some serious programming error. But what I found was that I could get the extension to work, if I followed 4 basic guidelines:

(Updated 12/25/2018, 14h20 … )

Continue reading Finally getting the LanguageTool Grammar-Checker to work under Debian.

The failings of low-end consumer software, to typeset Math as (HTML) MathML.

One of the features which HTML5 has, and which many Web browsers support, is the ability to typeset Mathematical formulae, known as ‘MathML’. Actually, MathML is an XML-based markup language in its own right, which also happens to be supported when embedded into HTML.
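As a purely illustrative fragment (my own, minimal example, not taken from any particular exporter), this is what a small piece of Math – the expression b² − 4ac – looks like, when typeset as MathML embedded directly in HTML5:

```html
<!-- A minimal sketch: the discriminant b² − 4ac as inline MathML. -->
<p>The discriminant is
  <math xmlns="http://www.w3.org/1998/Math/MathML">
    <msup><mi>b</mi><mn>2</mn></msup>
    <mo>&#x2212;</mo>
    <mn>4</mn><mo>&#x2062;</mo><mi>a</mi><mo>&#x2062;</mo><mi>c</mi>
  </math>.
</p>
```

A browser which supports MathML will typeset this sharply at any zoom level, with no image file involved – which is exactly what makes the format attractive.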

Wikipedia uses some such solution, partially because they need their formulae to look as sharp as possible at any resolution, but also because they’d only have so much capacity to store many, many image files. In fact, Wikipedia uses a number of lossless techniques, to store sharp images as well as formulae. ( :1 )

But from a personal perspective, I’d appreciate a GUI which allows me to export MathML. It’s fine to learn the syntax and code the HTML by hand, but in my life, the number of syntax variations I’d need to invest time in learning would be almost as great as the total number of software packages I have installed, since each software package potentially uses yet another syntax.

What I find, however, is that if our software is open-source, very little of it will actually export to MathML. It would be very nice if we could get our Linux-based LaTeX engines to export to this format, in a way that specifically preserves Math well. But what I find is that even though I possess a powerful GUI to help me manage various LaTeX renderings – that GUI being named “Kile” – it relies on a simple command-line tool named ‘latex2html’. Whatever that command-line tool outputs is what all of Kile will output, if we tell it to render LaTeX specifically to HTML. ‘latex2html’, in turn, depends on ‘netpbm’, which counts as very old, legacy software.

One reason ‘latex2html’ will fail us is the fact that in general, its intent is to render LaTeX, but not Math in any specific way. And so, just possessing the .TEX Files will not guarantee a Linux user that his resulting HTML will be stellar. ‘latex2html’ will generally output PNG Images, and will embed those images in the HTML File, on the premise that aside from the rasterization, PNG Format is lossless. Further, if the LaTeX code was generated by “wxMaxima”, using its ‘pdfLaTeX’ export format, we end up with incorrectly-aligned syntax, just because that dialect of LaTeX has been optimized by wxMaxima for use in generating .PDF Files next.

(Updated 05/27/2018 : )

Continue reading The failings of low-end consumer software, to typeset Math as (HTML) MathML.

Getting the Orca Screen-Reader to work under Plasma 5

In case some readers might not know: even though computing is heavily visual, certain advanced desktop-managers can be set up for impaired people to use – a capability which falls under the category of “Accessibility”. This includes the ability of the computer to speak a robotic description of what’s happening on the screen, in case a user can hear, but not see properly.

There are some users who feel they should stick with Windows, because Accessibility can be so hard to set up under Linux.

There are other users who are sorry they ever clicked on “Accessibility”, because now they cannot turn it off.

If a visually-impaired user wants Accessibility set up on a Linux computer, I’d definitely suggest letting a trusted other person set it up, because the setup itself involves complicated steps, during which Accessibility is not yet working – so the end-user would not benefit from Accessibility while trying to set it up.

Some regular users find screen-readers a trial for their patience, because of the fast, robotic voice, until they manage to shut the reader down again. Personally, I only find screen-readers trying if I happen to have set one up late at night, because the voice, droning on until I manage to disable it again, could cause some sort of noise complaint from my neighbors. In the middle of the day, I don’t find these experiments trying.

I guess that a good question which some people might ask me, would be why I even do such an experiment, given that I’m not visually impaired and don’t need it. And what I do is set everything up until it works, and then disable it again.

On my recently-installed Debian / Stretch computer named ‘Plato’, which is also running Plasma 5 as its desktop-manager, I just did manage to get this feature to work, and then to disable it again.

(Updated 15h50, 1/17/2018 : )

The first thing I had to do was install a long list of packages. The list below includes what was installed, but it should not really be necessary to give the package-manager a manual command to install everything here, because some of these packages will follow as dependencies of other packages. But here is a roundabout list:

Continue reading Getting the Orca Screen-Reader to work under Plasma 5