Wavelet Decomposition of Images

One type of wavelet which has continued to be of some interest to computational signal processing is the Haar Wavelet. It has a low-pass and a high-pass version, which complement each other. This would be the low-pass Haar Wavelet:

[ +1 +1 ]

And this would be the high-pass version:

[ +1 -1 ]

These wavelets are intrinsically flawed, in that if they are applied to audio signals, each of them will have a poor frequency response. But they do have an important Mathematical property: from the low-pass and the high-pass component together, the original signal can be reconstructed fully.
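The reconstruction property can be checked with a few lines of Python. This is only a minimal sketch: it assumes the two wavelets are applied to non-overlapping pairs of samples, and the function names `haar_analyze` and `haar_reconstruct` are made up for illustration.

```python
def haar_analyze(signal):
    """Split an even-length sequence into pairwise sums (low-pass, [ +1 +1 ])
    and pairwise differences (high-pass, [ +1 -1 ])."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    low = [signal[i] + signal[i + 1] for i in range(0, len(signal), 2)]
    high = [signal[i] - signal[i + 1] for i in range(0, len(signal), 2)]
    return low, high


def haar_reconstruct(low, high):
    """Invert haar_analyze: for a pair (a, b) with s = a + b and d = a - b,
    a = (s + d) / 2 and b = (s - d) / 2."""
    signal = []
    for s, d in zip(low, high):
        signal.append((s + d) / 2)
        signal.append((s - d) / 2)
    return signal


samples = [4, 6, 10, 2]
low, high = haar_analyze(samples)
restored = haar_reconstruct(low, high)
```

Neither `low` nor `high` is enough by itself, but the two together give back every original sample.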

Now, there is also something called a wavelet transform, but I seldom see it used.

If we wanted to extend the Haar Wavelet to the 2D domain, then a first approach might be to apply it twice, one-dimensionally, once along each axis of an image. But taken naively, this would give the following low-frequency component:

[ +1 +1 ]

[ +1 +1 ]

And only the following high-frequency component:

[ +1 -1 ]

[ -1 +1 ]

This conflicts with common sense, because in order to be able to reconstruct the original signal (in this case an image) we’d need to arrive at 4 reduced values, not 2, because each 2×2 patch of the original signal had 4 distinct values.


And so closer inspection should reveal that, next to its one low-frequency component, the wavelet reduction of images has 3 distinct high-frequency components: ( :1 )

[ +1 +1 ]

[ -1 -1 ]

And,

[ +1 -1 ]

[ +1 -1 ]

And finally,

[ +1 -1 ]

[ -1 +1 ]
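That one low-frequency and three high-frequency values are enough to recover a 2×2 patch can be verified directly. The following is a minimal sketch; the labels `"vertical"`, `"horizontal"` and `"diagonal"` are my own names for the three high-frequency patterns, not established terms from this posting.

```python
# The four 2x2 patterns: one low-frequency, three high-frequency.
KERNELS = {
    "low":        [[+1, +1], [+1, +1]],
    "vertical":   [[+1, +1], [-1, -1]],  # top row minus bottom row
    "horizontal": [[+1, -1], [+1, -1]],  # left column minus right column
    "diagonal":   [[+1, -1], [-1, +1]],
}


def analyze_patch(patch):
    """Reduce a 2x2 patch to four values, one per pattern."""
    return {
        name: sum(k[r][c] * patch[r][c] for r in range(2) for c in range(2))
        for name, k in KERNELS.items()
    }


def reconstruct_patch(coeffs):
    """Rebuild the patch. The patterns are mutually orthogonal and each has
    a squared norm of 4, hence the division by 4."""
    return [
        [sum(KERNELS[n][r][c] * coeffs[n] for n in KERNELS) / 4 for c in range(2)]
        for r in range(2)
    ]


patch = [[1, 2], [3, 4]]
coeffs = analyze_patch(patch)
restored = reconstruct_patch(coeffs)
```

With only the `"low"` and `"diagonal"` values, the patch could not be recovered; all four are needed.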

And so I would hazard a guess that when images are put through Wavelet-Reduction, what gets done for the sake of simplicity is that the low-frequency component of the entire image is computed, and is then subtracted from the original samples, to result in their high-frequency component. ( :2 )

There are actually two forms of the low-frequency component imaginable:

  1. As provided in graphics-editing software such as GIMP, the lower-frequency layers have the same resolution as the original image, and are produced as convolutions,
  2. As used in signal-compression, the lower-frequency layers would actually have their resolution halved in each direction, and each of the resulting pixels would be computed independently.
  • (Edit 12/27/2017 : Which type of low-frequency result is computed will then also affect the high-frequency result. )
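The difference between the two forms can be sketched as follows. A plain 2×2 box average stands in for the low-pass filter here; that is an assumption made for brevity, and not a claim about the kernel that GIMP or any codec actually uses. All function names are hypothetical.

```python
def low_same_resolution(img):
    """Form 1: a convolution. Every pixel is averaged with its right, lower
    and lower-right neighbours (edges clamped), so the size is unchanged."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(h):
        r2 = min(r + 1, h - 1)
        row = []
        for c in range(w):
            c2 = min(c + 1, w - 1)
            row.append((img[r][c] + img[r][c2] + img[r2][c] + img[r2][c2]) / 4)
        out.append(row)
    return out


def low_half_resolution(img):
    """Form 2: each non-overlapping 2x2 block becomes one pixel.
    Assumes an even width and height."""
    h, w = len(img), len(img[0])
    return [
        [(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4
         for c in range(0, w, 2)]
        for r in range(0, h, 2)
    ]


def residual(img, low):
    """High-frequency layer as 'original minus low-frequency layer'.
    Only meaningful when both have the same resolution (Form 1)."""
    return [[p - q for p, q in zip(ri, rl)] for ri, rl in zip(img, low)]


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
low_full = low_same_resolution(image)
high_full = residual(image, low_full)
low_half = low_half_resolution(image)
```

With Form 1, adding `low_full` and `high_full` pixel by pixel restores the image trivially. With Form 2, the low-frequency layer has only a quarter of the pixels, so a simple subtraction no longer applies without first scaling it back up.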

A good question the reader might ask would be, ‘Of what practical benefit can wavelet-reduction of images be?’

In practice, this methodology results in a series of layers, starting from the highest-frequency, proceeding to the lowest-frequency, but ending in some layer below which no further reduction takes place.
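As a rough illustration of such a series of layers, here is a one-dimensional sketch that keeps splitting the low-frequency result until a chosen number of levels is reached. It assumes averaging Haar steps (sums and differences divided by 2), which is a normalization I picked for convenience; the names `decompose` and `recompose` are hypothetical.

```python
def haar_step(signal):
    """One level: pairwise averages (low) and pairwise half-differences (high)."""
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return low, high


def decompose(signal, levels):
    """Return (layers, low): high-frequency layers, finest first, plus the
    final low-frequency layer below which no further reduction takes place."""
    layers = []
    low = list(signal)
    for _ in range(levels):
        if len(low) < 2 or len(low) % 2:
            break  # cannot be halved any further
        low, high = haar_step(low)
        layers.append(high)
    return layers, low


def recompose(layers, low):
    """Undo decompose, starting from the coarsest layer."""
    for high in reversed(layers):
        finer = []
        for s, d in zip(low, high):
            finer += [s + d, s - d]
        low = finer
    return low


signal = [4, 6, 10, 2, 8, 8, 0, 4]
layers, lowest = decompose(signal, levels=3)
restored = recompose(layers, lowest)
```

Each layer has half as many values as the one before it, so the total number of stored values stays the same as in the original signal.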

And one place this does get used is in JPEG-2000 Images, where the user can specify, for each layer, how coarsely its wavelet coefficients are to be quantized. (JPEG-2000 is based on a wavelet transform, rather than on the Cosine Transform of the older JPEG format.) In other words, it allows the user to specify that for the highest frequencies, he or she would like the coarsest quantization, because supposedly, Humans perceive those spatial frequencies less subtly, while allowing finer quantization, meaning less of it, for the lower frequencies, in which Humans can discern subtler shades.
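The idea of giving each layer its own quantization can be sketched like this. The step sizes and the coefficient values are invented for illustration; they are not taken from the JPEG-2000 specification or from any real encoder.

```python
def quantize(values, step):
    """Map each coefficient to the index of the nearest multiple of step."""
    return [round(v / step) for v in values]


def dequantize(indices, step):
    """Recover approximate coefficients from the stored indices."""
    return [i * step for i in indices]


# Hypothetical layers, finest (highest-frequency) first.
layers = [[13, -7, 3, 40], [21, -9], [6]]
# Hypothetical step sizes: coarsest quantization for the finest layer.
steps = [8, 4, 2]

quantized = [quantize(layer, step) for layer, step in zip(layers, steps)]
restored = [dequantize(q, step) for q, step in zip(quantized, steps)]
```

The finest layer ends up with the largest errors (up to half of its step size), while the coarser layers are preserved more faithfully.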

In short, through Wavelet-Reduction, the JPEG-2000 format gives the user more control, and also allows him or her to store very-high-resolution images, in which ‘an ant’ can be discerned, and yet nevertheless to compress those images considerably.

(Edit 12/27/2017 : )

1: )

If the goal were to reduce every 2×2 patch of samples into single values, losslessly, then:

  • A 2-plane intermediate result would need to be computed, in which each row of the original image is passed once through the low-pass wavelet, and once, separately, through the high-pass wavelet,
  • A 4-plane final result would need to be computed, where each column, of each plane of the intermediate result, is again passed once through the low-pass wavelet, and once through the high-pass wavelet, giving the same results I showed above.
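The two steps above can be sketched as follows, assuming an image with even width and height, stored as a list of rows. The function and plane names are my own.

```python
def split_rows(plane):
    """Pass each row through the low-pass and, separately, the high-pass
    Haar wavelet. Each output plane has half the width."""
    low = [[row[c] + row[c + 1] for c in range(0, len(row), 2)] for row in plane]
    high = [[row[c] - row[c + 1] for c in range(0, len(row), 2)] for row in plane]
    return low, high


def transpose(plane):
    return [list(col) for col in zip(*plane)]


def split_columns(plane):
    """Same as split_rows, but along the columns."""
    low_t, high_t = split_rows(transpose(plane))
    return transpose(low_t), transpose(high_t)


def decompose_2d(image):
    """Return four planes, each with half the width and half the height."""
    row_low, row_high = split_rows(image)           # 2-plane intermediate result
    low_low, low_high = split_columns(row_low)      # [+1 +1; +1 +1], [+1 +1; -1 -1]
    high_low, high_high = split_columns(row_high)   # [+1 -1; +1 -1], [+1 -1; -1 +1]
    return low_low, low_high, high_low, high_high


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
low_low, low_high, high_low, high_high = decompose_2d(image)
```

For this smooth test image, the diagonal plane `high_high` comes out as all zeroes, while `low_low` holds the sum of each 2×2 patch.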

But I do not know about practical examples of wavelet reduction of images, that lead to 4 planes per stage.


 

Now I suppose there’s one aspect I might point out, about how consumer graphics applications encode signed values in standard pixel-formats, which are unsigned by default:

When it’s known that a pixel-channel is supposed to be signed, a value of 128/255 is taken to represent zero, so that the signed result can be recovered as:

( v – 128 ) / 127.0
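In code, the formula above and a possible inverse look like this. The decoding follows the formula directly; the encoding direction, with its rounding and clamping, is my own assumption about how an application might store such values.

```python
def decode_signed(v):
    """Map a stored 8-bit channel value (0..255) to a signed value,
    with 128 representing zero."""
    return (v - 128) / 127.0


def encode_signed(x):
    """Assumed inverse: scale, round, offset by 128, clamp to 0..255."""
    return max(0, min(255, round(x * 127.0) + 128))


zero = decode_signed(128)      # 0.0
maximum = decode_signed(255)   # 1.0
minimum = decode_signed(0)     # slightly below -1.0, because 128 > 127
```

One consequence of centering on 128 is that the range is not quite symmetrical: a stored value of 0 decodes to a little less than -1.0.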

2: )

If the true purpose of wavelet decomposition was data-reduction, then Software Engineers might find that having to compress only 3 samples is better than having to compress 4 of them. But in that case, people should also recognize that in whichever directions the signal has been put through the high-pass wavelet, its frequencies will also be inverted. Those ‘layers’ of the image would no longer be spatially correct. And so in general, Humans would not be able to recognize features belonging to the corresponding high-frequency components, just by looking at them.

Dirk

 
