## Sound Fonts: How something that I blogged was still not 100% accurate.

Sometimes, explaining a subject 100% accurately would seem to require writing an almost endless amount of text, so that, with short blog postings, I’m limited to always posting a mere approximation of the subject. The following posting would be a good example:

(Link to an earlier posting.)

Its title clearly states that there are exactly two types of interpolation for use in resampling audio. After some thought, I realized that a third type of interpolation might exist, and that it might be especially useful for Sound Fonts.

According to my posting, the situation can exist in which the spacing of (interpolated) output samples has an irrational ratio to that of (Sound Font) input samples. Plausibly, the approach would then be to derive a polynomial’s actual coefficients from the input sample-values (which would be the result of one matrix multiplication), and to compute the value of the resulting polynomial at (x = t), (0.0 <= t < 1.0).
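The polynomial approach can be sketched roughly as follows. This is only an illustration under my own assumptions: it uses a cubic Lagrange polynomial through four input samples located at x = -1, 0, 1, 2, which is not necessarily the polynomial the earlier posting had in mind, and the names `LAGRANGE_CUBIC`, `coefficients` and `evaluate` are hypothetical.

```python
# Assumed matrix: rows give the coefficients c0..c3 of
# p(t) = c0 + c1*t + c2*t^2 + c3*t^3; columns weight the input
# samples located at x = -1, 0, 1, 2 (cubic Lagrange interpolation).
LAGRANGE_CUBIC = [
    [0.0, 1.0, 0.0, 0.0],
    [-1 / 3, -1 / 2, 1.0, -1 / 6],
    [1 / 2, -1.0, 1 / 2, 0.0],
    [-1 / 6, 1 / 2, -1 / 2, 1 / 6],
]


def coefficients(y):
    """One matrix multiplication: four sample values -> four coefficients."""
    return [sum(m * s for m, s in zip(row, y)) for row in LAGRANGE_CUBIC]


def evaluate(coeffs, t):
    """Evaluate the polynomial at 0.0 <= t < 1.0 (Horner's scheme)."""
    c0, c1, c2, c3 = coeffs
    return c0 + t * (c1 + t * (c2 + t * c3))


# Samples of f(x) = x^3 - 2x + 1 at x = -1, 0, 1, 2
coeffs = coefficients([2.0, 1.0, 0.0, 5.0])
value = evaluate(coeffs, 0.5)  # close to f(0.5) = 0.125, up to rounding
```

Because t is an ordinary floating-point number, nothing here depends on the ratio between the two sample spacings being rational.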

But there is an alternative, which is, that the input samples could be up-sampled, arriving at a set of sub-samples with fixed positions, and then, every output sample could arise as a mere linear interpolation, between two sub-samples.
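A minimal sketch of this alternative follows. The up-sampling step here reuses a cubic Lagrange polynomial purely as a placeholder; that choice, the edge clamping, and the names `cubic`, `upsample` and `read_linear` are my assumptions, and a real player might well use a better interpolating filter for the one-time step.

```python
import math


def cubic(y_m1, y0, y1, y2, t):
    """Cubic Lagrange through four equally spaced samples, for 0 <= t < 1."""
    c1 = -y_m1 / 3 - y0 / 2 + y1 - y2 / 6
    c2 = y_m1 / 2 - y0 + y1 / 2
    c3 = -y_m1 / 6 + y0 / 2 - y1 / 2 + y2 / 6
    return y0 + t * (c1 + t * (c2 + t * c3))


def upsample(samples, factor):
    """One-time step: `factor` sub-samples per input sample, at fixed positions."""
    n = len(samples)

    def at(i):
        # Clamp indices at the edges (an arbitrary choice for this sketch).
        return samples[min(max(i, 0), n - 1)]

    sub = []
    for i in range(n - 1):
        for k in range(factor):
            sub.append(cubic(at(i - 1), at(i), at(i + 1), at(i + 2), k / factor))
    sub.append(samples[-1])
    return sub


def read_linear(sub, factor, position):
    """Per-output-sample step: linear interpolation between two sub-samples.

    `position` is measured in input samples and may be any real number
    from 0 to len(samples) - 1.
    """
    x = position * factor
    i = int(x)
    if i >= len(sub) - 1:
        return sub[-1]
    frac = x - i
    return sub[i] + frac * (sub[i + 1] - sub[i])


sub = upsample([0, 1, 2, 3, 4, 5], 4)
value = read_linear(sub, 4, math.sqrt(5))  # an irrational read position
```

The expensive polynomial work happens once per Sound Font, while each output sample only costs one lookup pair and one linear blend.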

It would seem then, that a third way to interpolate is feasible, even when the spacing of output samples is irrational with respect to that of input samples.

Also, with Sound Fonts, the possibility presents itself, that the Sound Font itself could have been recorded professionally, at a sample rate considerably higher than 44.1kHz, such as maybe at 96kHz, just so that, if the Sound Font Player did rely on linear interpolation, doing so would not mess up the sound as much, as if the Sound Font itself had been recorded at 44.1kHz.

Further, specifically with Sound Font Players, the added problem presents itself, that the virtual instrument could get bent upward in pitch, even though its recording already had frequencies approaching the Nyquist Frequency, so that those frequencies could end up being pushed ‘higher than the output Nyquist Frequency’, thereby resulting in aliasing – i.e., getting reflected back down to lower frequencies – even though each output sample could have been interpolated super-finely by itself.
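To put a number on this reflection, here is a small sketch of where a frequency lands once it has been pushed past the output Nyquist Frequency. The function name `aliased_frequency` is hypothetical, and the example figures are only illustrative.

```python
def aliased_frequency(frequency_hz, sample_rate_hz):
    """Frequency actually heard when `frequency_hz` is sampled at `sample_rate_hz`."""
    f = frequency_hz % sample_rate_hz
    # Anything above the Nyquist Frequency is reflected back down.
    if f > sample_rate_hz / 2:
        return sample_rate_hz - f
    return f


# A 20 kHz partial, bent up by one octave, played at 44.1 kHz:
heard = aliased_frequency(40000, 44100)  # 4100 Hz
```

A partial that was barely audible at 20kHz thus ends up squarely in the range where hearing is most sensitive.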

These are the main reasons why, in my experience, playing a sampled sound at a very high, bent pitch actually just results in ‘screeching’.

Yet, the Sound Font Player could again be coded cleverly, so that it would also over-sample its output. I.e., if expected to play the virtual instrument at a sample rate of 44.1kHz, it could actually compute interpolated samples closer together than that, corresponding to 88.2kHz, and then the Sound Font Player could compute each ‘real output sample’ as the average between two ‘virtual, sub-sampled output samples’. This would effectively insert a low-pass filter, which would flatten the screeching that would result from frequencies higher than 22kHz being reflected below 22kHz, and eventually, all the way back down to 0kHz. And admittedly, the type of (very simple) low-pass filter such an arrangement would imply, would be The Haar Wavelet again.
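The arrangement just described could look something like the sketch below. It assumes plain linear interpolation as the reader and leaves bounds checking to the caller; `haar_decimate` and `render` are hypothetical names.

```python
def haar_decimate(virtual):
    """Average adjacent, non-overlapping pairs of 2x-oversampled samples."""
    return [(virtual[i] + virtual[i + 1]) / 2
            for i in range(0, len(virtual) - 1, 2)]


def render(source, step, count):
    """Produce `count` real output samples.

    `step` is how far the read position advances, in source samples, per
    real output sample. The caller must make sure the last read position
    stays below len(source) - 1.
    """
    def read(pos):
        i = int(pos)
        frac = pos - i
        return source[i] + frac * (source[i + 1] - source[i])

    # Virtual samples are spaced half a step apart (twice the output rate).
    virtual = [read(n * step / 2) for n in range(2 * count)]
    return haar_decimate(virtual)


out = render([0, 1, 2, 3, 4, 5, 6, 7, 8], 1.5, 3)
```

A component sitting exactly at the oversampled Nyquist Frequency alternates in sign from one virtual sample to the next, so that averaging pairs cancels it completely; components just below that are only attenuated, which is why this is a very simple filter.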

If you asked me what the best was, which a Soundblaster sound card from 1998 would have been able to do, I’d say, ‘Just compute each audio sample as a linear interpolation between two, Sound Font samples.’ Doing so would have required an added lookup into an address in (shared) RAM, a subtraction, a multiplication, and an addition. In fact, basing this speculation on my estimation of how much circuit-complexity such an early Soundblaster card just couldn’t have had, I’d say that those cards would have needed to apply integer arithmetic, with a limited number of fractional bits – maybe 8 – to state which Sound Font sample-position a given audio sample was being ‘read from’. It would have been up to the driver to approximate the integer fed to the hardware. And then, if that sound card was poorly designed, its logic might have stated, ‘Just truncate the Sound-Font sample-position being read from, to a whole sample.’
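As a purely speculative sketch of that integer arithmetic, with 8 fractional bits as guessed above: none of this reflects how any actual Soundblaster hardware or driver worked, and all the names are hypothetical.

```python
FRAC_BITS = 8            # assumed number of fractional bits
ONE = 1 << FRAC_BITS     # 256 represents one whole sample


def step_for_ratio(ratio):
    """What a driver might do: approximate a pitch ratio as an integer step."""
    return round(ratio * ONE)


def read_fixed(samples, pos_fixed):
    """Linear interpolation: one extra lookup, a subtraction,
    a multiplication and an addition."""
    i = pos_fixed >> FRAC_BITS
    frac = pos_fixed & (ONE - 1)
    a = samples[i]
    b = samples[i + 1]   # the added lookup
    return a + (((b - a) * frac) >> FRAC_BITS)


def read_truncated(samples, pos_fixed):
    """The poorly designed variant: drop the fractional bits entirely."""
    return samples[pos_fixed >> FRAC_BITS]


samples = [0, 256, 512]
interpolated = read_fixed(samples, 256 + 64)     # position 1.25
truncated = read_truncated(samples, 256 + 64)    # position 1.25, read as 1
semitone_step = step_for_ratio(2 ** (1 / 12))    # one semitone up
```

With only 8 fractional bits, the step for one semitone up can only be approximated, which is the kind of rounding the driver would have had to accept.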

In contrast, when Audio software is being programmed today, one of the first things the developer will insist on, is to apply floating-point numbers wherever possible…

Also, if a hypothetical, superior Sound Font Player did have as logic, ‘If the sample rate of the loaded Sound Font (< 80kHz), up-sample it 2x; if that sample rate is actually (< 40kHz), up-sample it 4x…’, just to simplify the logic to the point of making it plausible, this up-sampling would only take place once, when the Sound Font is actually being loaded into RAM. By contrast, the oversampling of the output of the virtual instrument, as well as the low-pass filter, would need to be applied in real-time… ‘If the output sample rate is (>= 80kHz), replace adjacent Haar Wavelets with overlapping Haar Wavelets.’
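The load-time rule quoted above is simple enough to write down directly. The thresholds are the ones from the text; the function name `upsample_factor` is hypothetical.

```python
def upsample_factor(sample_rate_hz):
    """Choose the one-time up-sampling factor when a Sound Font is loaded."""
    if sample_rate_hz < 40_000:
        return 4
    if sample_rate_hz < 80_000:
        return 2
    return 1  # already recorded at a high rate; leave as-is


factor_cd = upsample_factor(44_100)      # 2
factor_low = upsample_factor(22_050)     # 4
factor_pro = upsample_factor(96_000)     # 1
```

Since this decision is made only once per loaded Sound Font, its cost does not matter in the way that the real-time output filtering does.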

Food for thought.

Sincerely,
Dirk

## Comparing two Bose headphones, both of which use active technology.

In this posting I’m going to do something I rarely do, which is, something like a product review. I have purchased the following two headphones within the past few months:

The first set of headphones has an analog 3.5mm stereo input cable, which has a dual-purpose Mic / Headphone Jack, and comes either compatible with Samsung, or with Apple phones, while the second uses Bluetooth to connect to either brand of phone. I should add that the phone I use with either set of headphones is a Samsung Galaxy S9, which supports Bluetooth 5.

The first set of headphones requires a single, AAA alkaline battery to work properly. And this not only fuels its active noise cancelling, but also an equalizer chip that has become standard with many similar middle-price-range headphones. The second has a built-in rechargeable Lithium-Ion Battery, which is rumoured to be good for 10-15 hours of play-time, which I have not yet tested. Like the first, the second has an equalizer chip, but no active noise cancellation.

I think that right off the bat I should point out, that I don’t approve of this use of an equalizer chip, effectively, to compensate for the sound oddities of the internal voice-coils. I think that more properly, the voice-coils should be designed to deliver the best frequency response possible, by themselves. But the reality in the year 2019 is, that many headphones come with an internal equalizer chip instead.

What I’ve found is that the first set of headphones, while having excellent noise cancellation, has two main drawbacks:

• The jack into which the analog cable fits, is poorly designed, and can cause bad connections,
• The single, AAA battery can only deliver a voltage of 1.5V, and if the actual voltage is any lower, either because a Ni-MH battery was used in place of an alkaline cell, or, because the battery is just plain low, the low-voltage equalizer chip will no longer work fully, resulting in sound that reveals the deficiencies in the voice-coil.

The second set of headphones overcomes both these limitations, and I fully expect that its equalizer chips will have uniform behaviour, that my ears will be able to adjust to in the long term, even when I use them for hours or days. Also, I’d tend to say that the way the equalizer arrangement worked in the first set of headphones, was not complete in fulfilling its job, even when the battery was fully charged. Therefore, if I only had the money to buy one of the headphones, I’d choose the second set, which I just received today.

But, having said that, I should also add that I have two 12,000BTU air conditioners running in the Summer months, which really require the noise-cancellation of the first set of headphones, that the second set does not provide.

Also, I have an observation of why the EQ chip in the second set of headphones may work better than the similarly purposed chip in the first set…

(Updated 9/28/2019, 19h05 … )

## Is it valid that audio equipment from the 1970s sounds better than modern equipment?

That depends on which piece of audio equipment from the 1970s, is being compared with which piece of equipment from today.

If the equipment consists of a top-quality turntable from the late 1970s, compared to the most basic MP3-player from today, and if we assume for the moment that the type of sound file which is being played on the Portable Audio Player, is in fact an MP3 File recorded at a bit-rate of 128kbps, then the answer would be Yes. Top-quality turntables from the late 1970s were able to outperform that.

OTOH, if the audio equipment from today is a Digital Audio Player, that boasts 24-bit sound, that only happens to be able to play MP3 Files, but that is in fact playing a FLAC File, then it becomes very difficult for even the better audio equipment from the 1970s to match that.

Top-Quality Audio Equipment from the late 1970s, would have cost over $1000 for one component, without taking into account, how many dollars that would have been equivalent to today. The type of Digital Audio Player I described cost me C$ 140.- plus shipping, plus handling, in 2018.

Also, there is a major distinction, between any sort of equipment which is only meant to reproduce an Electronic signal, and equipment which is Electromechanical in nature, including speakers, headphones, phonographs… ‘The old Electromechanical technology’ was very good, except for the basic limitation, that they could not design good bass-reflex speakers, which require computers to design well. With no bass-reflex speakers, the older generations tended to listen to stereo on bigger, expensive speakers. But their sound was good, with even bass.

## There exists HD Radio.

In Canada and the USA, a relatively recent practice in FM radio has been, to piggy-back a digital audio stream, onto the carriers of some existing, analog radio stations. This is referred to as “HD Radio”. A receiver as good as the broadcasting standard should cost slightly more than \$200. This additional content isn’t audible to people who have standard, analog receivers, but can be decoded by people who have the capable receivers. I like to try evaluating how well certain ‘Codecs’ work, which is a term short for “Compressor-Decompressor”. Obviously, the digital audio has been compressed, so that it will take up a narrower range of radio-frequencies than it offers audio-frequencies. In certain cases, either a poor choice, or an outdated choice of a Codec in itself, can leave the sound-quality injured.

There was an earlier blog posting, in which I described the European Standard for ‘DAB’ this way. That uses ‘MPEG-1, Layer 2’ compression (:1). The main difference between ‘DAB’ and ‘HD Radio’ is the fact that, with ‘DAB’ or ‘DAB+’, a separate band of VHF frequencies is being used, while ‘HD Radio’ uses existing radio stations and therefore the existing band of frequencies.

The Codec used in HD Radio is proprietary, and is owned by a company named ‘iBiquity’. Some providers may reject the format, over an unwillingness to enter a contractual relationship with one commercial undertaking. But what is written is, that the Codec used here resembles AAC. One of the things which I will not do, is to provide my opinion about a lossy audio Codec, without ever having listened to it. Apple and iTunes have been working with AAC for many years, but I’ve owned neither an iPhone, nor an OS X computer.

What I’ve done in recent days was to buy an HD Radio-capable receiver, and this provides me with my first hands-on experience with this family of Codecs. Obviously, when trying to assess the levels of quality for FM radio, I use my headphones and not the speakers in my echoic computer-room. But, it can sometimes be more relaxing to play the radio over the speakers, despite the loss of quality that takes place, whenever I do so. (:2)

What I find is that the quality of HD Radio is better than that of analog, FM radio, but still not as good as that of lossless, 44.1kHz audio (such as, with actual Audio CDs). Yet, because we know that this Codec is lossy, that last part is to be expected.

(Updated 8/01/2019, 19h00 … )