One fact which I have described in my blog is that when audio engineers set the sampling rate at 44.1kHz, they were taking into account a maximum perceptible frequency of 20kHz. But if a signal were converted from analog to digital format, or the other way around, directly at that sampling rate, the analog filters would have almost no room between the 20kHz they must pass and the Nyquist frequency of 22.05kHz, and strong aliasing would be the main result. And so a concept was introduced called ‘over-sampling’, in which the sample-rate was at first quadrupled, and by now can simply be doubled, so that all the analog filters still have to do is suppress frequencies which are twice as high as the frequencies which they need to pass.
The interpolation of the added samples is carried out digitally, by a low-pass filter, the highest-quality variety of which would be a sinc-filter.
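As a minimal sketch of what such an interpolation looks like, the Python below doubles a sample-rate by inserting zeros and then applying a windowed sinc-filter. The function names, the Hann window and the kernel half-width of 32 taps are my own assumptions for illustration, and are not taken from any particular device or library.

```python
import math

def halfband_kernel(half_width=32):
    """Hann-windowed sinc taps for 2x interpolation (illustrative choice)."""
    taps = []
    for n in range(-half_width, half_width + 1):
        x = math.pi * n / 2.0          # n counts samples at the doubled rate
        sinc = 1.0 if n == 0 else math.sin(x) / x
        window = 0.5 + 0.5 * math.cos(math.pi * n / (half_width + 1))
        taps.append(sinc * window)
    return taps

def oversample_2x(samples, half_width=32):
    """Insert a zero after every sample, then low-pass with the windowed sinc."""
    stuffed = []
    for s in samples:
        stuffed.extend((s, 0.0))
    taps = halfband_kernel(half_width)
    out = []
    for i in range(len(stuffed)):
        acc = 0.0
        for k, h in enumerate(taps):
            j = i + k - half_width     # centred kernel, so no added delay
            if 0 <= j < len(stuffed):
                acc += h * stuffed[j]
        out.append(acc)
    return out
```

Because the sinc function is zero at every whole-sample offset except the centre, the original samples pass through unchanged, and only the new in-between samples are computed.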
All of this fun and wonderful technology has a main weakness. It actually needs to be incorporated into the devices, in order to have any bearing on them. That MP3-player, which you just bought at the dollar-store? It has no sinc-filter. And therefore, whatever a sinc-filter would have done, gets lost on the consumer.
Further, if a device is instructed to play an audio file with a sample-rate of 48kHz, but output the sound at 44.1kHz, then the correct way to accomplish that is first to interpolate the samples to a virtual output sample-rate of 88.2kHz. Then, a digital low-pass filter needs to be applied to that, which does not pass frequencies higher than 22.05kHz. Only then, should the stream be sub-sampled down to 44.1kHz. (Revised 04/12/2017, see below… )
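The three steps just described can be sketched in Python as follows. This is only an illustration under my own assumptions: the function names are hypothetical, the interpolation is linear, and the low-pass filter is a Hann-windowed sinc with a half-width of 32 taps.

```python
import math

def linear_resample(samples, rate_in, rate_out):
    """Linearly interpolate the samples onto a grid at rate_out."""
    n_out = int((len(samples) - 1) * rate_out / rate_in) + 1
    out = []
    for k in range(n_out):
        pos = k * rate_in / rate_out           # position in input-sample units
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
    return out

def halfband_lowpass(samples, half_width=32):
    """Low-pass at a quarter of the sample-rate, i.e. 22.05kHz at 88.2kHz."""
    taps = []
    for n in range(-half_width, half_width + 1):
        x = math.pi * n / 2.0
        sinc = 1.0 if n == 0 else math.sin(x) / x
        window = 0.5 + 0.5 * math.cos(math.pi * n / (half_width + 1))
        taps.append(sinc * window)
    total = sum(taps)
    taps = [t / total for t in taps]           # unity gain at DC
    out = []
    for i in range(len(samples)):
        acc = 0.0
        for k, h in enumerate(taps):
            j = i + k - half_width
            if 0 <= j < len(samples):
                acc += h * samples[j]
        out.append(acc)
    return out

def play_48k_at_44k1(samples):
    at_88k2 = linear_resample(samples, 48000, 88200)   # step 1: interpolate
    filtered = halfband_lowpass(at_88k2)               # step 2: low-pass
    return filtered[::2]                               # step 3: sub-sample
```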
But the designers of many smart-phones decided that this was too much effort and money. And so they cut every corner, and produced their output sample-rate of 44.1kHz using a direct interpolation of every output sample, with no low-pass filter.
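The effect of skipping the low-pass filter can be checked numerically. The sketch below is my own illustration, and makes no claim about how any specific phone is programmed. It interpolates a 23kHz test tone directly from 48kHz to 44.1kHz, and then measures how much of it shows up at the folded frequency of 21.1kHz. The function names and the test frequency are assumptions chosen for the example.

```python
import math

def naive_resample(samples, rate_in, rate_out):
    """Linear interpolation straight to the output rate, with no low-pass filter."""
    n_out = int((len(samples) - 1) * rate_out / rate_in) + 1
    out = []
    for k in range(n_out):
        pos = k * rate_in / rate_out
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
    return out

def tone_amplitude(samples, freq, rate):
    """Amplitude of one frequency component, found by correlating with that tone."""
    re = 0.0
    im = 0.0
    for n, s in enumerate(samples):
        phase = 2.0 * math.pi * freq * n / rate
        re += s * math.cos(phase)
        im -= s * math.sin(phase)
    return 2.0 * math.hypot(re, im) / len(samples)

# A 23kHz tone is legal in a 48kHz stream, but lies above the 22.05kHz
# Nyquist frequency of a 44.1kHz stream, so it should have been filtered out.
source = [math.sin(2.0 * math.pi * 23000 * n / 48000) for n in range(4801)]
output = naive_resample(source, 48000, 44100)[:4410]   # 0.1 seconds of output
alias = tone_amplitude(output, 44100 - 23000, 44100)
print("amplitude found at 21.1kHz:", alias)
```

A correct conversion would leave close to nothing at 21.1kHz. With direct interpolation, a substantial fraction of the original tone lands there instead.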
In fact, if they were to perform a correct down-conversion, in practice they would have needed a sound-chip which accepts a digital stream at 88.2kHz, so that a low-pass filter, presumably in the sound-chip, would also complete the down-conversion.
Now, a Daubechies Wavelet, or more correctly, its low-pass component, which is also known as its Scaling Function, can perform this task with less computation than a sinc-filter. And this approach would still be much better than a mere linear interpolation.
This argument can be taken further. It is easy to criticize a dollar-store MP3-player. But we may not need to stoop so low, in order to find deficiencies in more-expensive equipment.
We could be buying an expensive A/D converter designed as our USB-sound-device. And we might be asking ourselves whether we should sample our audio at 88.2kHz, or even at 192kHz. If the device indicates an availability of 32-bit sample-depth under Linux, this is clearly just a malfunction, brought on by the fact that Linux applies one-size-fits-all device-drivers. We need to ignore the least-significant 8 bits in that case…
The question of whether we need to sample our audio at a higher sampling rate than 44.1kHz is answered by how untrusting we are of the circuitry, or lack thereof, in our USB-sound-device. If we can assume that this device applies a sinc-filter in hardware, we might not need to sample any faster than 44.1kHz or 48kHz. But some devices only draw as much current as a USB 2 port can supply them with, because they are not self-powered. And at that point pessimism may set in, as to whether the hardware even applies a proper low-pass filter.
And if the A/D device does not contain adequate low-pass filters, this would be the real reason why people choose to run it at 88.2kHz or at 192kHz. Because in that case, we need to supply that low-pass filter from our CPU instead.
If I connect my own USB-sound-device to my computer, and set its sample-rate to 192kHz, it is not my assumption that I will suddenly be able to capture an 80kHz sine-wave from the analog input. The reason is my assumption that the same resistors and capacitors are still connected there, as were connected when I selected a sample-rate of 44.1kHz.
The question of whether or not I will also be capturing distortion-products, depends largely on whether the microphone I use as signal-source, outputs much of a signal at frequencies higher than 40kHz. It might go up to 30kHz, but not to 80kHz.
And the requirements on recording audio are much more strict than they are on playing it back. I will know when I need to instruct my CPU to down-convert the audio-track, let us say before giving it to one of my friends. But if aliasing ever got into my recording procedure, then the filtering applied by my friend would no longer be able to repair this damage.
(Edit 03/08/2017 : )
I could be an optimist about my USB-sound-device, and speculate that its actual A/D converter constantly runs either at 192kHz or at 176.4kHz, and that although we can ask for digital streams at those rates, when we ask for a stream at 88.2kHz, it has already been passed through a digital low-pass filter, one that works numerically. That way, the analog filters in the device can stay fixed, to pass 40kHz but suppress 80kHz.
If that were true, then the only weakness of the device might be, that it uses the Daubechies Wavelet / Scaling Function, instead of using a Sinc-Filter in real time.
But the library which some of my software uses, and which gets applied by my CPU, does in fact use a Sinc-Filter. Yet in such a case, that would be beside the point, and I would have been a bit over-cautious.
If this were true, then my device would be drawing more current when set to deliver samples at 48kHz, than it does to deliver 192kHz. Guess what. When it overloaded the USB port of my tablet, it was set to 48kHz. But, when I first tested it on my laptop at 192kHz, nothing overloaded.
The USB ports on my laptop are USB 3 however, and within that specification the designers have allowed for more current than 500mA to be drawn.
(Content Deleted 03/21/2017 Because Inaccurate. )
(Edit 04/13/2017 : )
Actually, if 48kHz-sampled sound is to be down-sampled to 44.1kHz, then a higher-quality approach is possible, and encouraged. The problem with the method I wrote above is, that the linear interpolation between 48kHz samples will seem to produce noise-components, which can easily extend below the cutoff-frequency of 22.05kHz.
These sound-products will seem to appear as modulations of a frequency of 44.1kHz, because the interpolated sample-rate is 88.2kHz. But, because the input stream has a higher Nyquist-Frequency than the output stream, it can carry intended frequency-components as high as 24kHz. This would cause unintended frequency-components to appear in the output stream, as low as 20.1kHz (44.1kHz minus 24kHz). For some people, an approach that misinterprets legal signal-content in this way is just not good enough.
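The figure of 20.1kHz follows from the usual rule for frequency folding, which can be written as a small helper. The function name is my own, chosen for illustration.

```python
def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency at which a tone appears once it is sampled at sample_rate_hz."""
    folded = freq_hz % sample_rate_hz
    if folded > sample_rate_hz / 2:
        folded = sample_rate_hz - folded   # reflect about the Nyquist frequency
    return folded

print(alias_frequency(24000, 44100))   # 20100
```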
The higher-quality approach would be, first to up-sample the stream to 96kHz, after which a half-band filter can be applied, with a cutoff-frequency of 24kHz. After that, the interpolation can be made to 88.2kHz, and a half-band filter applied again, to cut off everything above 22.05kHz. Only then would the stream be sub-sampled to 44.1kHz.
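A rough sketch of this two-stage chain in Python follows. It is written under my own assumptions and is not a tuned implementation: both half-band filters are Hann-windowed sincs with a half-width of 32 taps, the step from 96kHz to 88.2kHz is a linear interpolation, and all the function names are hypothetical.

```python
import math

def halfband_taps(half_width=32):
    """Windowed sinc cutting off at a quarter of the sample-rate, unity DC gain."""
    taps = []
    for n in range(-half_width, half_width + 1):
        x = math.pi * n / 2.0
        sinc = 1.0 if n == 0 else math.sin(x) / x
        window = 0.5 + 0.5 * math.cos(math.pi * n / (half_width + 1))
        taps.append(sinc * window)
    total = sum(taps)
    return [t / total for t in taps]

def fir_filter(samples, taps):
    """Centred FIR convolution; samples beyond the edges count as silence."""
    half = len(taps) // 2
    out = []
    for i in range(len(samples)):
        acc = 0.0
        for k, h in enumerate(taps):
            j = i + k - half
            if 0 <= j < len(samples):
                acc += h * samples[j]
        out.append(acc)
    return out

def upsample_2x(samples, taps):
    """Zero-stuff to twice the rate, then half-band filter."""
    stuffed = []
    for s in samples:
        stuffed.extend((2.0 * s, 0.0))   # factor 2 restores the level lost to the zeros
    return fir_filter(stuffed, taps)

def linear_resample(samples, rate_in, rate_out):
    n_out = int((len(samples) - 1) * rate_out / rate_in) + 1
    out = []
    for k in range(n_out):
        pos = k * rate_in / rate_out
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
    return out

def convert_48k_to_44k1(samples):
    taps = halfband_taps()
    at_96k = upsample_2x(samples, taps)               # cutoff 24kHz at 96kHz
    at_88k2 = linear_resample(at_96k, 96000, 88200)   # interpolate to 88.2kHz
    filtered = fir_filter(at_88k2, taps)              # cutoff 22.05kHz at 88.2kHz
    return filtered[::2]                              # sub-sample to 44.1kHz
```

A short filter like this one has a wide transition band, so components just above 22.05kHz are only partly suppressed. A real design would use a longer or better-optimized kernel.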
That way, frequency-components originally in the stream above 22.05kHz, will remain so by the time they reach the second low-pass filter, which will be able to filter them out.
The problem with doing all that would be the relatively high amount of CPU usage, for which reason it really does not work well in real-time, on a phone. Here, the most that the hardware facilitates is the output of samples at 88.2kHz, followed by whatever low-pass filtering is normally applied.