HBP, thanks for your answer, but do you see any link between what I'm hearing and the 2nd and 3rd harmonic distortion in the author's tests?
2nd, 3rd and sometimes many higher-order harmonics...
Investigate the Harmonic Series. Compared to a tone, the 2nd harmonic is an octave higher, the 4th harmonic is 2 octaves higher, the 6th harmonic is 2 octaves and a fifth higher, and the 8th harmonic is 3 octaves higher. All of these are fairly consonant with the original tone and explain why lower-order even harmonics are considered benign.
For the odd harmonics, the 3rd harmonic is an octave and a fifth, the 5th harmonic is 2 octaves and a major 3rd, the 7th harmonic is 2 octaves and an out-of-tune flat-7th, and the 9th harmonic is 3 octaves and a 2nd (or 2 octaves & a 9th, however you prefer to look at it). Odd-order harmonics (and all higher-order harmonics) are generally considered to add "bite" to a sound, and in large amounts could sound "grating".
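If you want to check those intervals yourself, here is a rough Python sketch. The function name `harmonic_interval` is made up for this post; the only real assumption is the usual convention of 1200 cents per octave, where an equal-tempered semitone is 100 cents.

```python
import math


def harmonic_interval(n):
    """Return (octaves, cents) of the n-th harmonic above the fundamental.

    'cents' is the leftover interval above the last whole octave.
    """
    octaves = n.bit_length() - 1             # whole octaves: floor(log2(n))
    cents = 1200 * math.log2(n / 2**octaves)  # remainder within that octave
    return octaves, cents


for n in range(2, 10):
    octaves, cents = harmonic_interval(n)
    print(f"harmonic {n}: {octaves} octave(s) + {cents:.1f} cents")
```

For reference, an equal-tempered fifth is 700 cents, a major 3rd is 400, and a flat-7th is 1000, so the printed remainder for the 7th harmonic shows how far "out of tune" it sits.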
As for your results: all the caps you tested except the Musicap, Auricap and OD 715 are polyester dielectric; those three are polypropylene dielectric, and the expectation is that they'd perform better (sound cleaner in this case).
As for what the measurements might show: we'd expect any cap that sounded "warm" to have more even-harmonic distortion and/or distortion products skewed towards the lower-order harmonics. We'd expect caps that sounded "bright" to have either more odd-harmonic distortion (than even), more higher-order harmonics, or even a cleaner sound without excessive low-order distortion. However, we can't know whether any of these guesses are correct without duplicating the article author's test rig and adding a bias voltage that simulates the preceding tube's plate voltage. The author didn't test with bias voltages greater than 100V because he was using a bunch of batteries as his source of DC voltage (to eliminate the voltage source as a variable).
He also talked a few times about intermodulation distortion. How is that different, in terms of what I'm hearing, from 2nd/3rd harmonic distortion?
Harmonic distortion is musically related to the original single tone: each product is a whole-number multiple of the tone's frequency, so it is always higher in frequency.
Intermodulation is when two tones are passed through a non-linear impedance (often a distorting active stage, like a tube); the output of the non-linear medium is made up of the two original tones, plus 2 new frequencies: the mathematical sum and difference of the two frequencies. So 100Hz and 1kHz would have IM products of 900Hz and 1.1kHz, alongside the original tones of 100Hz and 1kHz. The new tones are almost guaranteed to not be musically related to the original tones. Also, IM can be a source of tones below the original notes (loosely, "subharmonics"), since one of the products is the difference between the tones.
Not only do the 2 original tones create 2 new tones, but any distortion of the original tone also interacts with the tones passed through the nonlinear impedance. Things become a mess real fast.
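To see how fast the mess grows, here is a small Python sketch that lists where IM products can land. It assumes an idealized nonlinearity whose products fall at m*f1 ± n*f2, with "order" meaning m + n; the function name `im_products` is just something I made up for illustration, and it says nothing about how loud each product is.

```python
def im_products(f1, f2, max_order=3):
    """Map each IM product frequency (Hz) to the lowest order (m + n) producing it."""
    products = {}
    for m in range(1, max_order):
        for n in range(1, max_order - m + 1):
            # Each (m, n) pair gives a sum product and a difference product.
            for freq in (m * f1 + n * f2, abs(m * f1 - n * f2)):
                if freq > 0 and (freq not in products or m + n < products[freq]):
                    products[freq] = m + n
    return products


for freq, order in sorted(im_products(100, 1000, max_order=3).items()):
    print(f"{freq} Hz (order {order})")
```

With `max_order=2` you only get the plain sum and difference; raising it adds the products where a harmonic of one tone mixes with the other tone.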
You may not have noticed, but the author used 1kHz for the bulk of distortion testing, then switched to 100Hz and 1kHz for IM testing, but still displayed only frequencies above 1kHz for his distortion plots. When there was THD of the single 1kHz tone, you saw peaks at 2kHz (2nd harmonic), 3kHz (3rd harmonic), 4kHz, 5kHz, 6kHz, etc. When he did IM testing, IM distortion showed up mostly as "pickets" 100Hz to either side of a harmonic distortion frequency. So there was a little peak at 1.9kHz and 2.1kHz (IM centered around the 2nd harmonic, with 100Hz sum/difference frequencies), 2.9kHz and 3.1kHz (IM centered around the 3rd harmonic), etc.
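You can reproduce the "picket" pattern numerically. The sketch below is a toy model, not the author's rig: it feeds 100Hz plus 1kHz through a made-up polynomial transfer curve (the coefficients `A2` and `A3` are hypothetical, picked only so the effect is visible) and measures a few individual frequencies with a single-bin DFT. The sample rate and length are chosen so every frequency of interest falls on an exact bin.

```python
import cmath
import math

SAMPLE_RATE = 48_000   # Hz
NUM_SAMPLES = 4_800    # 0.1 s, so bins are exactly 10 Hz apart

# Hypothetical transfer-curve coefficients, for illustration only.
A2 = 0.10   # second-order (x^2) term
A3 = 0.05   # third-order (x^3) term


def two_tone(f1, f2, level=0.5):
    """Sum of two equal-level sine waves."""
    return [
        level * math.sin(2 * math.pi * f1 * i / SAMPLE_RATE)
        + level * math.sin(2 * math.pi * f2 * i / SAMPLE_RATE)
        for i in range(NUM_SAMPLES)
    ]


def distort(signal):
    """Memoryless polynomial nonlinearity: y = x + A2*x^2 + A3*x^3."""
    return [x + A2 * x**2 + A3 * x**3 for x in signal]


def amplitude_at(signal, freq_hz):
    """Peak amplitude of one frequency component, via a single-bin DFT."""
    acc = sum(
        s * cmath.exp(-2j * math.pi * freq_hz * i / SAMPLE_RATE)
        for i, s in enumerate(signal)
    )
    return 2 * abs(acc) / len(signal)


output = distort(two_tone(100, 1000))
for freq in (1900, 2000, 2100, 2500):
    print(f"{freq} Hz: {amplitude_at(output, freq):.6f}")
```

The 2kHz line is the 2nd harmonic of the 1kHz tone, the 1.9kHz and 2.1kHz lines are the pickets on either side of it, and 2.5kHz (where nothing should land) comes out as essentially zero. This toy curve stops at the cubic term, so it won't show pickets around the 3rd harmonic; those need a fourth-order term.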
So IM distortion is generally considered to be the greater evil, because all sorts of non-musical trash winds up in your sound.
We can guess that "clean sound" could be a lack of distortion, or that "edgy/biting sound" could be odd-order or IM distortion, but we really don't know without doing some testing. And note the author was trying to use statistically relevant sample sizes in his tests (like running all tests, under all conditions, on 20-30 caps of a single brand/type to verify the results weren't due to a bad cap or bad batch).