The Computing world is replete with Codecs for Sound as well as for Video. My main subject, in the part of the blog which you commented on, was my own attempt to understand why any of them work at all. If the sampling windows need to overlap by half, and each sampling window needed as many frequency-coefficients as it has time-domain samples, then the first stage of our compression scheme would already double the number of data-points, which is the opposite of compression. That is the problem which the Modified Discrete Cosine Transform (MDCT) solves: each window of 2N overlapping time-domain samples yields only N frequency-coefficients, so an Audio Stream can be converted from time-domain into frequency-domain while preserving the number of data-points.
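To make the data-point bookkeeping concrete, here is a minimal sketch of an MDCT round trip in plain Python. It assumes the textbook MDCT definition with a rectangular window and a 1/N scaling on the inverse; real codecs use smoother windows and fast algorithms. The helper names (`mdct`, `imdct`, `encode`, `decode`) are my own, chosen only for illustration.

```python
import math

def mdct(block):
    """Map 2N overlapping time-domain samples to N frequency coefficients."""
    n_half = len(block) // 2  # this is N
    return [
        sum(x * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for n, x in enumerate(block))
        for k in range(n_half)
    ]

def imdct(coeffs):
    """Map N coefficients back to 2N samples that still contain aliasing."""
    n_half = len(coeffs)
    return [
        sum(c * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for k, c in enumerate(coeffs)) / n_half
        for n in range(2 * n_half)
    ]

def encode(signal, n_half):
    """Slide a 2N-sample window in steps of N (50% overlap) over the signal."""
    # Pad N zeros on each side so the first and last samples are covered twice.
    padded = [0.0] * n_half + list(signal) + [0.0] * n_half
    return [mdct(padded[i:i + 2 * n_half])
            for i in range(0, len(padded) - n_half, n_half)]

def decode(frames, n_half):
    """Overlap-add the inverse transforms; the aliasing terms cancel."""
    out = [0.0] * (n_half * (len(frames) + 1))
    for j, coeffs in enumerate(frames):
        for n, y in enumerate(imdct(coeffs)):
            out[j * n_half + n] += y
    return out[n_half:-n_half]  # drop the padding again

N = 8
signal = [math.sin(0.3 * t) + 0.5 * math.cos(1.1 * t) for t in range(32)]
frames = encode(signal, N)
restored = decode(frames, N)

print(len(signal), "samples ->", sum(len(f) for f in frames), "coefficients")
print("max error:", max(abs(a - b) for a, b in zip(signal, restored)))
```

Each window spans 16 samples but produces only 8 coefficients, so apart from one extra block caused by the padding at the edges, the number of coefficients tracks the number of samples. No single block can be inverted on its own; the signal only comes back once neighbouring blocks are overlap-added.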

In the case of Video, a Discrete Cosine Transform is usually also used, only as a 2D transform applied to blocks of pixels instead of a one-dimensional transform. In the 2D case, there is no assumed overlap between blocks. And then, a method of inter-frame interpolation, or prediction, is applied. Some of the frames are encoded entirely on their own, much like JPEGs, those becoming the key-frames (aka intra-coded frames, or I-frames), and the rest of the frames are either forward-predicted (P-frames) or bi-directionally predicted (B-frames) with respect to previously decoded reference frames. But a common mistake which some people make, is to expect that the interpolation would be a simple pixel-wise subtraction of one frame from another.

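The method of interpolation is usually based on a macro-block structure: for each block of the current frame, the encoder searches a reference frame for the best-matching block, stores the displacement as a motion vector, and encodes only the remaining difference. So it is really a motion-following methodology, usually called motion compensation.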
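The difference between motion-following and plain subtraction can be sketched with an exhaustive block-matching search. This is only an illustration under stated assumptions: the frames are tiny synthetic gradients, the cost is a sum of absolute differences (SAD), and the names `sad` and `best_motion_vector` are hypothetical. Real encoders use faster searches, sub-pixel precision and rate-aware cost functions.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(size) for x in range(size))

def best_motion_vector(cur, ref, bx, by, size=4, search=2):
    """Try every displacement within +/- `search` pixels; return (cost, dx, dy)."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that would read outside the reference frame.
            if by + dy < 0 or by + dy + size > h:
                continue
            if bx + dx < 0 or bx + dx + size > w:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best

# Reference frame: a synthetic 12x12 gradient.
ref = [[10 * x + y for x in range(12)] for y in range(12)]
# Current frame: the same picture moved 2 pixels to the right.
cur = [[ref[y][max(x - 2, 0)] for x in range(12)] for y in range(12)]

print("pixel-wise residual:", sad(cur, ref, 4, 4, 0, 0, 4))
print("best (cost, dx, dy):", best_motion_vector(cur, ref, 4, 4))
```

With zero displacement the block leaves a residual of 320, whereas the search finds the displacement (-2, 0) at which the residual is 0. The encoder then only needs to store the motion vector plus whatever residual remains, and that residual is what gets passed through the 2D transform.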

Long story short, many of the lossy, frequency-domain-based methods of stream-compression are based on almost the same principles, over and over again. If somebody is truly interested in Computing, then somewhere along the line it becomes important to understand the underlying system.

