tex4ht / mk4ht Broken (Problem Solved).

As of 05/26/2018 :

I have just been experimenting with ‘LyX’, a GUI front-end to LaTeX that tries to be a WYSIWYG LaTeX editor.

LyX tries to give editing capabilities for LaTeX documents, using an editor style similar to most word processors. Mind you, this task cannot always succeed 100%, because by its nature, LaTeX will encode the logical structure of a document-to-be-typeset, while conventional word processors try to control the appearance of documents.

And so one feature that LyX does have, is to import and export documents of various formats, most of which revolve around different LaTeX coding-styles, or around ‘rendering’ our LaTeX document to such formats as PDF or DVI, just because those two output-formats have emerged as standard publishing formats. DVI is TeX’s original, device-independent output format, and is mostly of legacy interest today.

And so what some people will want to do, is convert documents from LaTeX either to OpenDocument format, or even to MS Word format. These formats appear in the Export Menu if the user has command-line tools such as ‘mk4ht’ installed.

What can frustrate people who are new to Linux is that a command-line tool may itself be defective. In my own experience, trying to get ‘mk4ht’ to work can be futile when it does not work out-of-the-box. Fiddling with the GUI of LyX is then also to no avail, because the GUI can only work as well as the command-line back-end that it has detected.

So instead of trying to repair ‘mk4ht’ – which, if it was working, could just as easily be tested from the command-line:


mk4ht oolatex somefile.tex

 

(Edited 05/27/2018 : )

I would propose that any readers of this blog who have run into such a problem, and who are running Debian / Stretch, try instead to install a Debian package called “pandoc”, as well as “pandoc-citeproc”. When LyX recognizes these programs as installed, they will become available as ways to export to or import from .DOCX as well as .ODT formats.
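
Before touching LyX, it may be worth checking from a terminal which of these tools are actually on the PATH. The following is only a minimal sketch; ‘report_tool’ is a helper name I made up for the illustration, and the package names in the closing comment are the ones named above.

```shell
#!/bin/sh
# Report whether a given command-line tool can be found on the PATH.
# report_tool is a hypothetical helper, not part of LyX, tex4ht or pandoc.
report_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# The tools discussed in this posting.
for tool in mk4ht pandoc pandoc-citeproc; do
    report_tool "$tool"
done

# On Debian / Stretch, the missing pandoc tools could then be installed with:
#   sudo apt-get install pandoc pandoc-citeproc
```

After installing anything, LyX still needs to detect the new programs, which is covered further down.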

(Updated 09/26/2018, 21h50 … )

(As of 05/28/2018, 1h00 : )

However, as recently as Debian / Jessie, LyX did not yet ship with a converter configured to use ‘Pandoc’. In this case, ‘Pandoc’ can still be invoked from the command-line, since it is a command-line tool, provided that we have first exported our document from LyX to a LaTeX File, which is again named ‘somefile.tex’ in the example below:

 


pandoc -s -f latex -t odt -o somefile.odt somefile.tex
pandoc -s -f latex -t docx -o somefile.docx somefile.tex
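
If there are several .TEX Files to convert, the two commands above could be wrapped in a small loop. This is a sketch under my own assumptions: ‘out_name’ and ‘convert_tex’ are made-up helper names, and the loop simply takes every .tex file in the current directory.

```shell
#!/bin/sh
# Derive the output file-name by swapping the .tex suffix for the target one.
# out_name and convert_tex are hypothetical helper names.
out_name() {
    # $1 = input file, $2 = target extension (odt or docx)
    echo "${1%.tex}.$2"
}

convert_tex() {
    # $1 = input .tex file, $2 = target format (odt or docx)
    if ! command -v pandoc >/dev/null 2>&1; then
        echo "pandoc is not installed" >&2
        return 1
    fi
    # Same options as in the two commands above.
    pandoc -s -f latex -t "$2" -o "$(out_name "$1" "$2")" "$1"
}

# Convert every .tex file in the current directory to both formats.
for f in ./*.tex; do
    [ -e "$f" ] || continue          # no .tex files here: nothing to do
    convert_tex "$f" odt  || break   # stop at the first failure
    convert_tex "$f" docx || break
done
```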


 

Finally, it can happen that LyX fails to display these options as available in its Export / Import Menus, even under Debian / Stretch, simply because LyX may fail to refresh the user-settings. And so one thing we can do under Linux is to close LyX, and then give the commands:

 


cd ~
mv .lyx .lyx.bak

 

And then, restart LyX. If we do this, we will lose any customizations we have given LyX, including Converters and File-Types we have defined ourselves, and including the Recent Documents list. This wipes the user-space data belonging to LyX itself, but does not delete any documents we may already have saved. And once we have done this, we should be able to see the ‘Pandoc’ importers and exporters, that would allow us to exchange documents with LibreOffice and/or MS Word. Further, it might be the case that our user-space configuration of LyX has gotten buggy, in which case this Linux-trick can just reset it.
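
One caveat with the plain ‘mv’ above: if a directory named ‘.lyx.bak’ already exists from an earlier attempt, ‘mv’ will move ‘.lyx’ inside it instead of replacing it. A slightly more careful variant gives each backup a dated name. This is only a sketch; ‘backup_lyx_dir’ is a helper name I made up.

```shell
#!/bin/sh
# Move a LyX user directory aside under a dated name, so that an earlier
# backup is neither overwritten nor nested into.
# backup_lyx_dir is a hypothetical helper name.
backup_lyx_dir() {
    dir=${1:-$HOME/.lyx}
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    target="$dir.bak.$(date +%Y%m%d-%H%M%S)"
    # Print the name of the backup, so that it can be restored later.
    mv "$dir" "$target" && echo "$target"
}

# Usage, with LyX closed:
#   backup_lyx_dir
# To undo, close LyX again, remove the freshly created ~/.lyx,
# and move the backup directory back to ~/.lyx .
```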

Alternatively, in the LyX GUI, we can use the menu-command ‘Tools -> Reconfigure’ to refresh the list of recognized programs, and thereby keep our existing customizations.

Tada!

NB:

For users on a Debian / Jessie system, who would rather not resort to the command-line each time, a capability exists within ‘LyX’ to define custom conversion filters, under ‘Tools -> Preferences -> File Handling’ :

[Screenshot: screenshot_20180527_122016]

Here I’m showing the conversion-rule that ships with LyX under Debian / Stretch. It can also be added to LyX under Debian / Jessie. The detail to look out for, however, is to create a custom File Format first, which has a slightly different name from the built-ins, so that when the user next clicks in the Export Menu to export to his custom File Format, the program will also know unambiguously which conversion rule to apply.

(Update 05/27/2018, 17h30 : )

I have received a comment from an interested reader, who says that with the TeX distribution called “TeX Live”, which I’m using, ‘mk4ht’ should not be broken. And, because more-detailed information was asked for at some point, I dug a little deeper into what happens. I could be starting with a file-name such as ‘tc2techstd.tex’, and give the command shown above. Lengthy output ensues, which essentially states that ‘mk4ht’ tries to create the directory:

 


sxw-tc2techstd.dir/Pictures

 

Before creating the directory:


sxw-tc2techstd.dir

 

The result is that numerous temporary files are dropped into the PWD, and that an .ODT File is generated, which evokes an error-message from LibreOffice when I try to open it. So, I next gave the commands:

 


mkdir sxw-tc2techstd.dir
mk4ht oolatex tc2techstd.tex
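
The same workaround can be written so that it works for any file-name. This is a sketch that assumes the temporary directory always follows the pattern seen in my output, ‘sxw-’ plus the base-name plus ‘.dir’; the helper names ‘workdir_for’ and ‘run_oolatex’ are made up.

```shell
#!/bin/sh
# Name of the temporary directory that mk4ht tried to use in my case,
# assuming the pattern sxw-<basename>.dir holds for other files too.
# workdir_for and run_oolatex are hypothetical helper names.
workdir_for() {
    base=$(basename "$1" .tex)
    echo "sxw-$base.dir"
}

run_oolatex() {
    # Pre-create the directory, so that its Pictures sub-directory can be made.
    mkdir -p "$(workdir_for "$1")" || return 1
    mk4ht oolatex "$1"
}

# Usage:
#   run_oolatex tc2techstd.tex
```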

 

This results in cleaner-looking output, as well as the automated removal of the temporary directory, ‘sxw-tc2techstd.dir’. But that’s the only benefit. After that, when I try to open the .ODT File that is generated in the PWD, this time, LibreOffice does not give me an error message, but only displays a blank page. This resulting blank page stays the same, when different .TEX Files are used, and when the corresponding, temporary directory is first created by me manually.
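
Since an .ODT File is a ZIP archive that holds, among other things, a ‘content.xml’, a quick look inside can help tell a damaged archive apart from one that is intact but nearly empty. The following is a sketch, which assumes that ‘unzip’ is installed; ‘check_odt’ is a helper name I made up.

```shell
#!/bin/sh
# Test an .odt file as a ZIP archive and report the size of its content.xml.
# check_odt is a hypothetical helper name.
check_odt() {
    [ -f "$1" ] || { echo "not a file: $1" >&2; return 2; }
    command -v unzip >/dev/null 2>&1 || { echo "unzip is not installed" >&2; return 2; }
    # 'unzip -t' tests the integrity of the archive without extracting it.
    unzip -tq "$1" >/dev/null 2>&1 || { echo "$1: archive is damaged"; return 1; }
    # 'unzip -p' writes one member to standard output; count its bytes.
    size=$(unzip -p "$1" content.xml 2>/dev/null | wc -c | tr -d ' ')
    echo "$1: content.xml is $size bytes"
}

# Usage:
#   check_odt tc2techstd.odt
```

A very small ‘content.xml’ would at least be consistent with the blank page that LibreOffice displays.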

(Update 05/27/2018, 22h00 : )

This problem seems to be an exact match for this already-identified problem.

I am also using the Debian default Java, and not Oracle’s (formerly Sun’s) non-free Java.

(Update 05/28/2018, 1h00 : )

I have now installed the official Oracle JDK, v8u172, but the problem needed additional work to go away:

 


In the file:

/usr/share/texlive/texmf-dist/tex4ht/base/unix/tex4ht.env

The following stanza needed to be fixed (it is shown here with the fix applied):

<ooxtpipes>
.4oo mv %%0.4oo %%0.tmp
.4oo java -classpath /usr/share/texlive/texmf-dist/tex4ht/bin/tex4ht.jar xtpipes -i /usr/share/texlive/texmf-dist/tex4ht/xtpipes/ -o %%0.4oo %%0.tmp
.4om mv %%1.4om %%1.tmp
.4om java -classpath /usr/share/texlive/texmf-dist/tex4ht/bin/tex4ht.jar xtpipes -i /usr/share/texlive/texmf-dist/tex4ht/xtpipes/ -o %%1.4om %%1.tmp
</ooxtpipes>

The fix was to replace each occurrence of the following prefix with a full path-name:

%%~/ ...
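
Rather than editing by hand, the same substitution could be made with ‘sed’. This is a sketch under a loud assumption: that the unfixed lines contain ‘%%~/’ directly followed by ‘texmf-dist/…’, so that putting the TeX Live root in its place yields the paths shown above. ‘expand_prefix’ is a helper name I made up.

```shell
#!/bin/sh
# Print a copy of a tex4ht.env file, in which every "%%~/" prefix has been
# replaced by an absolute TeX Live root.
# ASSUMPTION: the unfixed lines look like "%%~/texmf-dist/tex4ht/...".
# expand_prefix is a hypothetical helper name.
expand_prefix() {
    # $1 = path to tex4ht.env, $2 = TeX Live root (default: /usr/share/texlive)
    root=${2:-/usr/share/texlive}
    sed "s|%%~/|$root/|g" "$1"
}

# Usage, keeping a copy of the original file:
#   env=/usr/share/texlive/texmf-dist/tex4ht/base/unix/tex4ht.env
#   sudo cp "$env" "$env.orig"
#   expand_prefix "$env.orig" | sudo tee "$env" >/dev/null
```

Comparing the two files with ‘diff’ afterwards shows exactly which lines were changed.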



 

‘mk4ht’ Works Now! Yaay! :-)


 

(Update 09/26/2018, 21h50 : )

The status of this bug has good news and bad news.

The good news:

There is no longer any reason to install the proprietary Java JRE. The state of Debian Java 8 is such, that the Open Source JRE will handle ‘xtpipes’ well enough.

The bad news:

Depending on the nature of the .TEX-File, a malfunction can take place when exporting to .ODT-Format, that mimics the above error.

I.e., even though the fix I applied above has not been rolled back, certain .TEX-Files exhibit this error again, due to recent updates to LaTeX itself under Debian / Stretch. This is particularly the case if I used ‘Writer2Latex’ to obtain the .TEX-File in question from an existing .ODT Document (which tends to produce copious and overbearing .TEX-Files in its effort to capture the formatting of the .ODT-File), and if I then try to use ‘mk4ht’ to convert that back into .ODT Format.

However, when it comes to simpler .TEX-Files, that I did create myself using LyX, the fix above still holds.

I think that this partial regression is due to the following file having been recompiled (by package maintainers):

 


/usr/share/texlive/texmf-dist/tex4ht/bin/tex4ht.jar
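
To see when this file was last modified, and which Debian package it belongs to, something like the following could be used. It is only a sketch; ‘describe_file’ is a helper name I made up, and the ‘dpkg -S’ step is skipped on systems without ‘dpkg’.

```shell
#!/bin/sh
# Show the modification time of a file, and the package that owns it.
# describe_file is a hypothetical helper name.
describe_file() {
    if [ ! -e "$1" ]; then
        echo "$1: not present"
        return 0
    fi
    ls -l "$1"                       # the listing includes the modification time
    if command -v dpkg >/dev/null 2>&1; then
        # 'dpkg -S' searches for the package that installed the given path.
        dpkg -S "$1" 2>/dev/null || echo "$1: not owned by any package"
    fi
}

describe_file /usr/share/texlive/texmf-dist/tex4ht/bin/tex4ht.jar
```

A modification date later than the date of my fix would be consistent with the suspicion above, but would not prove it.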

 

Dirk

 


6 thoughts on “tex4ht / mk4ht Broken (Problem Solved).”

  1. Thanks, a short bug report to Debian BTS would have been helpful. Karl Berry pointed me here. I will fix the Debian package (texlive-bin) which provides the binary part of tex4ht to correctly expand %%~

    Should be in Debian/unstable in a few days.

    Norbert

  2. What kind of error did you get with mk4ht oolatex? It used to be broken on Miktex, but it should work in the current version. There should be no issues in TeX Live.

    1. I was trying the repository-version, from Debian 9. There was no error message when exporting, but when I tried to open the .ODT File in LibreOffice, I got the error-message that the ODT was corrupted, and not repairable. Just to be sure this was not the fault of LyX, I also tried the command-line, as you just suggested, with identical results as far as I could tell.

      But in all honesty, I’m accustomed to having issues, if trying to export .TEX to .ODT with mk4ht.

      Dirk

      1. It is really strange that this issue happens on Debian, as its version of TeX Live should be pretty much identical to the official version, which doesn’t show such issues. We had similar issues with tex4ht.env on Miktex, but never got any reports from Debian. Thanks!

        1. In retrospect, I don’t even know that I needed to switch to the Oracle JDK. It may be that the default, OpenJDK solution for Debian handles ‘xtpipes’ by now. What matters more, is that in the file

          /usr/share/texlive/texmf-dist/tex4ht/base/unix/tex4ht.env

          The variable by name ‘%%~/’ evaluates to just ‘//’ . This will cause the software not to find the correct folders, which make use of ‘xtpipes’. I needed to replace all those sequences with:

          /usr/share/texlive

          All the software-updating in the world won’t fix this, if the scripts cannot state the correct directory, in which Java is to find the class-libraries etc.

          Dirk

          1. This seems like a bug in the tex4ht binary in Debian. I’ve already posted a bug report to the tex4ht issue tracker, I hope we can fix this.
