pnmtofiasco Bug (Solved)

One of the things which some nerds do, is not just to present a photo of themselves, but to present a photo of themselves, which has been distorted in some specific way. In this vein, some of my readers may already have wondered, how I created the following image of myself:

dirk-fract1

This photo was actually obtained by using a Linux program that no longer works. That program started with a pixel-map, and compressed it into a so-called fractal representation, which is also known as ‘Fiasco’ format.

It may be common knowledge by now, that fractals can represent complex geometries, which take up very little data that way, but which exhibit self-similarity. And, fractals can be converted into pixel-maps to varying depths of recursion. But what some people did a long time ago, was explore, how pixel-maps can be compressed as fractals instead.

(Updated 3/10/2019 … )

(As of 05/17/2018 : )

And the approach which was taken, was just to create a tile, which reduces an image by a factor such as 8 or by 16, and then to perform a regression test of sorts, to see how well the pixel-values in different parts of the image, resemble the pixel-values of the tile. On that basis, the degree was computed, to which an image could be reconstructed first as a grid of colors, corresponding to a regular grid of adjacent tile-widths, and by which the tile could be added to the image-colors already there. (Positively or negatively.) These operations would then be applied subtractively, when encoding, but applied additively, when decoding. ( :1 )

(Edit 05/25/2018 :

Even though correlations which exceed ±1.0 or which are negative are sometimes computable, if it was to happen that the pixel-values inside one of the blocks, resemble the negative of the entire image to some degree, the programmers may well have felt that this does not correspond to the spirit of a fractal, since fractals are more about geometry, than they are about statistical correlations. Thus, the original programmers may have clamped the range of allowable correlations to [ 0.0 .. 1.0 ] . And if they did, a practical benefit of doing so would have been, to ease storing the correlations as 8-bit values, while still preserving overall 8-bit accuracy. If greater range was needed, then either to store the correlation as a larger field of bits, or to reduce eventual accuracy, would also become necessary. )

(Edit 05/18/2018 :

It may or may not be that ‘pnmtofiasco’ halves the size of the tile further, the number of times specified by the ‘-z’ command-line parameter, which is also the number of optimization passes and defaults to zero. In any case it’s difficult to estimate how many times the image was initially tiled, based on visual inspection of the decoded image, because if, for the sake of argument, the tiling was initially 8×8, then the color-blocks would also initially have been 8×8, which would next be reduced in size 8x, just to regenerate the tile when decoding the image. And then, if the tile was applied 8×8 times, this would lead to 64×64 apparent blocks. Further, If the tile-size was halved an additional number of times, again it would lead to smaller, apparent color-blocks, just due to self-similarity being simulated to some finite depth.

Apparently, this process is applied, wherever correlations in the encoded image exceed some correlation that follows from the chosen quality-level.

Also, I assume that if several passes are to be made over the image, the algorithm won’t just halve the size of an existing tile when decoding. Instead, the tile would be recomputed from the emerging image, at reduced pixel-sizes, with each pass. This means that when the Fiasco-formatted image is decoded back into a pixel-based image, the latter can be given a much higher resolution than the original image had – “Magnified” – just so that the complexity of the fractals can be appreciated.

The way I just described the encoding procedure so far, would not match exactly to the way the Fiasco-formatted image will be decoded again. The reason is the fact that when decoding, a reduced tile of the entire image is not available, as it was when encoding. To make up for that, ‘pnmtofiasco’ supports the ‘–prediction’ flag, which means that the encoder will use the tile which the decoder would use, to compute correlation and then subtract the product of that from the source image, instead of using the reduced version of the full image.

(As of 05/17/2018 : )

The resulting file-size, to store the 800×800 image above, is ‘only’ 1.3kB !

But, because Web-browsers that visit my site do not know how to display the .WFA File, what I did next was to convert that back into a JPEG File. And another thing I did with this one, was change the parameters, to exaggerate the ways in which fractals are inaccurate, and fractal-like.

But, to create the .WFA File, we’d need a Debian package installed which is named ‘netpbm’, and which contains literally hundreds of conversion tools which run from the command-line. And the command we’re looking for is ‘pnmtofiasco’.

Well, that was then, this is now. When I type in the correct command, given a .PPM File which I know is not corrupted, this is what I get:

 


dirk@Plato:~$ cd ~/Pictures/Fractals
dirk@Plato:~/Pictures/Fractals$ ls
logo-dirk.gif  logo-dirk.ppm
dirk@Plato:~/Pictures/Fractals$ which pnmtofiasco
/usr/bin/pnmtofiasco
dirk@Plato:~/Pictures/Fractals$ pnmtofiasco -z 2 -t 3 -i logo-dirk.ppm -o dirk-fract1.wfa
*** buffer overflow detected ***: pnmtofiasco terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x70bfb)[0x7fda7b193bfb]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x37)[0x7fda7b21c1f7]
/lib/x86_64-linux-gnu/libc.so.6(+0xf7330)[0x7fda7b21a330]
pnmtofiasco(+0x27d4c)[0x56173826bd4c]
pnmtofiasco(+0x27f85)[0x56173826bf85]
pnmtofiasco(+0x57b2)[0x5617382497b2]
pnmtofiasco(+0x2e72)[0x561738246e72]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7fda7b1432e1]
pnmtofiasco(+0x324a)[0x56173824724a]
======= Memory map: ========
561738244000-561738275000 r-xp 00000000 08:01 3027143                    /usr/bin/pnmtofiasco
561738474000-561738475000 r--p 00030000 08:01 3027143                    /usr/bin/pnmtofiasco
561738475000-561738476000 rw-p 00031000 08:01 3027143                    /usr/bin/pnmtofiasco
561738476000-5617384ab000 rw-p 00000000 00:00 0 
561738513000-561738580000 rw-p 00000000 00:00 0                          [heap]
7fda7aec5000-7fda7aedb000 r-xp 00000000 08:01 15731552                   /lib/x86_64-linux-gnu/libgcc_s.so.1
7fda7aedb000-7fda7b0da000 ---p 00016000 08:01 15731552                   /lib/x86_64-linux-gnu/libgcc_s.so.1

(...)


 

I get this result, regardless of whether the command-line parameters are given or not. I.e., the same thing happens without the ‘-z’ or the ‘-t’ .

This bug has been known to exist since 2012, and nobody has ever touched it.

 


(Update 3/10/2019 : )

I just received a Comment on this subject from a reader named Murtaza, who wanted to know how to use Fractal Compression. And as far as I can tell, the original bug has been fixed by the developers, If we download the source-code from Source-Forge. The following page describes how to do so:

http://netpbm.sourceforge.net/getting_netpbm.php

On a Debian / Stretch machine, the “Stable” version can be downloaded and compiled, and when the ‘./configure’ command is run, the Option can be chosen to use ‘Static’ library linkage. This would not be the default, but if chosen, will allow the binary files ‘pnmtofiasco’ and ‘fiascotopnm’ to be copied by themselves to ‘/usr/local/bin’. After that, the new and improved Fractal Compression Utilities can be run from the command-line.

But I must warn the readers that the new version of these programs isn’t geared to producing intentionally-distorted images, but rather, to producing accurate images as much as possible, while achieving the compression levels associated with Fractal Compression. One really needs to change the settings now, to obtain distorted images. :-P

(End of Update 3/10/2019.)

(Edit 05/19/2018 : )

I’ve decided to have a try, at examining the source code for ‘pnmtofiasco’. And what I found, is that there are issues with it, which are technical in nature, and which were simply careless on the part of the original programmers.

I was able to modify ‘pnmtofiasco’ in such a way, that it compiles and works. The original programmers had written in straight C, but always allocated the smallest amount of memory in bytes, that their code needed, to work. Well in one specific place, they made a slight error in their Math, and wrote a ‘calloc()’ function call, that allocated slightly less memory than was needed. I was able to modify their code, to allocate slightly more memory than it really needs, but always to make sure their code gets enough memory to work, and without changing anything else about their code.

I changed Line 243 in the source-file ‘pnmtofiasco.c’ .


 

The reason I needed to do this, is the fact that on my machine,

sizeof (char *)

Evaluates to 4 (bytes), while

sizeof (void *)

Evaluates to 8 (bytes). But, the original code inserted one or more characters strings, and then appended a ‘NULL’ to the array, which therefore had twice the size of the regular entries. For that reason, I also needed to go back and change Line 252. So now I can give the command

pnmtofiasco -i logo-dirk.ppm

All by itself, and watch the character garbage come out, which would be a .WFA File! :-)


 

The result is that I can create .WFA Files again. ( :2 )


 

But in order for those to be of any value to the user, there needs to be a way to convert .WFA Files back into pixel-maps of some kind. This is usually done with another ‘netpbm’ program named ‘fiascotopnm’. What I found, was that in its original form, that program also segfaults. But, there was another easy fix.

‘fiascotopnm’ analyzes the output-file’s intended name, and tries to separate it into a basename, and a suffix, according to what was typed on the command-line. This was meant to serve two purposes:

  1. If the user had made the mistake of giving the file the suffix ‘.pgm’ and it was a color-file, or, if the user had made the mistake of giving the file the suffix ‘.ppm’ and it was a gray-scale file, ‘fiascotopnm’ would reverse this mistake and save it out to the correct name, with the corrected suffix,
  2. If a video-stream was to be output to sequentially-numbered file-names, ‘fiascotopnm’ needs to know their base-name, so that the program can assemble systematically-concatenated, numbered output-files, corresponding to one frame each.

Well something initially goes wrong, when the program tries to extract the basename and the suffix from the intended output-file. I’m not sure exactly what goes wrong there. But, the memory-allocation function to store the correct file-name, uses the length of the basename, as well as the length of the recognized suffix, to decide how many bytes could be used to store eventual output-file-names. If at that point, either the base-name or the suffix are garbled, according to what the earlier function in the program tried to determine, a buffer-overrun will also take place.

And so, without analyzing exactly why the code does not work 100% as intended, it was easy for me to replace the memory-allocation with one, that just allocates the number of bytes which the originally-typed output-file-name took up, which adds another (1 + 20) characters, and which next uses that, as the buffer to store output-file-names. Further, in line with this hypothesis, I changed the code, for writing 1 output-frame, to code which does not try to assemble a basename, a dot, and a suffix, but which rather just string-copies the originally-typed output-file-name to the buffer.

And the result was, that I can now convert .WFA Files back into pixel-maps as I wish! :-D  There are now two problems left with ‘fiascotopnm’ as it stands:

  1. If I name my PNM File incorrectly, it will stay named that way,
  2. Consistently with my theory of what goes wrong, outputting a video-stream as sequentially-numbered files, should still not work.

But, on one of my computers, at least Fiasco-compression and decompression, for single frames, works again! :-)


 

(Edit 05/19/2018, 16h15 : )

The fact was knawing at me, that I had a version of ‘fiascotopnm’, which would work almost all the time, but only for single frames, when in fact this code is supposed to work for multi-frame streams as well. And as I had written, the original problem seemed to lie, in a C function named ‘get_output_template()’, which extracted the base-name and the suffix, for the file to store output pixel-maps to.

This code used to work! Why was it suddenly not working? Well I found out. Here is a copy of what this function looks like:

 

static void
get_output_template (const char *image_name, const char *wfa_name,
                     bool_t color, char **basename, char **suffix)
/*
 *  Generate image filename template for output of image sequences.
 *  'wfa_name' is the filename of the WFA stream.
 *  Images are either saved with filename 'basename'.'suffix' (still images)
 *  or 'basename'.%03d.'suffix' (videos).
 *
 *  No return value.
 *
 *  Side effects:
 *  '*basename' and '*suffix' is set.
 */
{
    if (!wfa_name || streq (wfa_name, "-"))
        wfa_name = "stdin";       
    /*
     *  Generate filename template
     */
    if (!image_name || streq (image_name, "") || streq (image_name, "-"))
    {
        *basename = strdup (wfa_name);
        *suffix   = NULL;
    }
    else
    {
        *basename = strdup (image_name);
        *suffix   = strrchr (*basename, '.');
    }
    
//    if (*suffix)         /* found name 'basename.suffix' */
//    {
//        **suffix = 0;         /* remove dot */
//        (*suffix)++;
//        if (**suffix == 0)
//            *suffix = strdup (color ? "ppm" : "pgm");
//    }
//    else             /* no suffix found, generate one */
//        *suffix = strdup (color ? "ppm" : "pgm");

    if (*suffix)         /* found name 'basename.suffix' */
    {
        strcpy(*suffix, "");         /* remove dot */
        *suffix = strdup (color ? "ppm" : "pgm");
    }
    else             /* no suffix found, generate one */
        *suffix = strdup (color ? "ppm" : "pgm");
}


 

I commented out the part of the code, which was malfunctioning. The offense took place on line 34, as presented above. The problem there was, that the code assigned the integer zero, to one character of a string. This happened to work, because the code was previously being compiled to run on a 32-bit system, and because the suffix which was already present, was already a 3-byte suffix! Hence, what C will do on a 32-bit system, is copy a 4-byte integer zero, to the point where the dot was, and past that point.

On a 64-bit system, the assignment will copy an 8-byte integer representation of zero, which will go right past the string, which ‘strdup()’ just allocated memory for! Hence, corruption will take place on a 64-bit system.

My corrected code will just use ‘strcpy()’ instead, to remove the dot, will assume that nothing more will follow it, and will just recreate the correct suffix, on that assumption.

The code works, and I should now also be able to output sequentially-numbered .PPM Files, in the case of input streams.

But, this type of problem – a 64-bit system instead of a 32-bit system – may also be a reason why the tight allocators used elsewhere, may have failed. I just glossed over that issue in the other example, by writing a more-generous use of ‘calloc()’ .

(Edit 19h00 : )

I just reverified that the above code-correction was correct. I did so, by making sure that I was using it, recompiling, copying to the root file system, and testing again.

A fact which is worth mentioning, is that on 64-bit CPUs, 8-bit operations are deprecated. In fact, many 16-bit operations are also deprecated, even though “Discipulus” creates Genetic Algorithms, which consist of 16-bit instructions – some of which are no longer available. Addresses are expected to be 32-bit aligned for optimal performance.

The reason why the legacy C functions like ‘strcpy()’ still exist, is because for 64-bit machines, they have been hand-coded in assembler, to preserve the 8-bit logic of their operation. In fact, legacy, 8-bit ASCII is partially deprecated.

Line 34 of the code above, could also be rewritten like so:

 

        **suffix = (unsigned char) ( 0 );         /* remove dot */

 

One negative side-effect of that would be, that ‘suffix’ will be sent back as starting on an odd byte-address 50% of the time, which would at the least cause performance penalties…

The reason why omitting this causes problems, has to do with the additional fact that unlike its successor, C++, C is not “type-safe”. That means that unlike how it is for C++, the data-type and unit-width of what is being computed, is determined entirely by what has been stated explicitly, on the right-hand size of the assignment operator. It then gets assigned to what’s on the left – without taking into consideration, what’s on the left, as long as there is a loose correspondence. In other words, because a ‘char’, a ‘short int’, an ‘int’, and a ‘long int’ are all integers, the compiler will allow one to be assigned to the other, but will not perform any implicit type-conversions in the background.

And then, in order for the constant literal ‘0’ to be equivalent to ‘NULL’, they both need to have 64-bit definitions, while ‘\0′ is defined as an 8-bit entity.


 

(Update 05/21/2018 : )

One concept which I mentioned above somewhat loosely, was that a correlation can be determined, between a reference tile, and the corresponding pixels in a target area in the image. Of course, in order to write working code, we don’t just need such loose descriptions, but Mathematically correct algorithms. And so, I turn to my reference-sheet on Regression Analysis:

http://dirkmittler.homeip.net/Regression-Analysis.pdf


(Additional comments deleted 3/10/2019.)

Dirk

 

 

Print Friendly, PDF & Email

3 thoughts on “pnmtofiasco Bug (Solved)”

  1. I just discovered fractal compression and wanted to play with fiasco but alas couldn’t find any other libraries and this one is afflicted by the bug. I guess i *could* go into the source code for netpbm but do you know of any other libraries to do fractal compression? I may just end up developing my own (and my own format then) but it would be cool to play around a bit first

    1. Hello Murtaza,

      I deleted the second of your comments because I wouldn’t want other people to see your contact info.

      Unfortunately, unless you’re also willing to get your hands dirty with source code, there are few ways to achieve fractal compression. The images look pretty messed up, and similar degrees of bit-compression can be achieved by other means, that don’t damage the appearance as much. For that reason, fractal compression with the aim of actually compressing file-size, has been neglected by the mainstream, and you’ll find few if any libraries.

      AFAICT, Version 10 of the libraries still have this bug.

      Dirk

Leave a Reply to Dirk Mittler Cancel reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

 

This site uses Akismet to reduce spam. Learn how your comment data is processed.