Let’s say you have a sound sample of a violin playing a G and one of the same violin playing a G an octave higher on the same string, and you would like to simulate a bend between the two. Many music programs will take the first sample and increase its pitch to the higher G, but there are a couple of problems with that: (1) what if you don’t know the pitches? (2) the high note sounds awful and distorted (since you didn’t use the high sample at all). So how do you do it right?
Well, I don’t know the answer. I have been passively interested in this question for at least a year. Something in my math modeling class today re-ignited the idea. Essentially, I’m looking for interpolation of Fourier spectra. Given two spectra (which you can essentially think of simply as functions) f(ω) and g(ω), I’m looking for an interpolator h(ω, α) such that h(ω, 0) = f(ω) and h(ω, 1) = g(ω).
Sharp readers will notice that the linear interpolation function will do just that: h(ω, α) = (1-α) f(ω) + α g(ω). But that is not good enough: if you do that on Fourier spectra, what you essentially get is a crossfade effect. Picture the spectra: one function has a peak at 391 Hz, the other has a peak at 783 Hz. Half-way through the interpolation, you’ll have two peaks of equal height, one at each of those frequencies. Instead what we want is a peak at 587 Hz (i.e. the peak should “shift” across rather than shrinking and growing somewhere else).
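To make the crossfade problem concrete, here is a minimal sketch in Python with NumPy. The Gaussian bumps standing in for the two spectra, the 1 Hz grid, and the names `gaussian_peak`, `lerp_spectra` and `local_maxima` are all assumptions made for illustration, not part of any real audio tool.

```python
import numpy as np

def gaussian_peak(freqs, center_hz, width_hz=10.0):
    """Toy spectrum (an assumption): one Gaussian bump, scaled to integrate to 1."""
    bump = np.exp(-0.5 * ((freqs - center_hz) / width_hz) ** 2)
    return bump / (bump.sum() * (freqs[1] - freqs[0]))

def lerp_spectra(f, g, alpha):
    """Pointwise linear interpolation h = (1 - alpha) f + alpha g."""
    return (1.0 - alpha) * f + alpha * g

def local_maxima(freqs, spectrum, floor=1e-6):
    """Frequencies where the spectrum has a strict local maximum above `floor`."""
    s = spectrum
    is_peak = (s[1:-1] > s[:-2]) & (s[1:-1] > s[2:]) & (s[1:-1] > floor)
    return freqs[1:-1][is_peak]

freqs = np.arange(0.0, 1200.0, 1.0)   # 1 Hz grid, so index == frequency in Hz
f = gaussian_peak(freqs, 391.0)       # low G
g = gaussian_peak(freqs, 783.0)       # high G
h = lerp_spectra(f, g, 0.5)           # half-way through the "bend"

print(local_maxima(freqs, h))  # peaks at 391 and 783 Hz, nothing near 587
print(h[587])                  # essentially zero
```

Both original peaks survive at half height and nothing shows up in between, which is exactly the crossfade.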
I will explain my method procedurally before giving the calculus definition. Say we want to know the amplitude of our interpolation at 587 Hz, half-way through the bend (α = 1/2). Put your left finger at 587 Hz on f and your right at 587 Hz on g, and multiply their amplitudes together (so if they are both high, you will get a big number). Then move your left finger to the left and your right finger to the right at equal rates, continually multiplying their amplitudes, and do the same with the fingers moving the opposite way. Then add up all the products you got; that sum is the value of the function at 587 Hz. Basically it says “if there is a peak to the left in f and to the right in g at equal distances, then there should be a peak here”, but in a continuous way. For other values of α the fingers move at unequal rates, in the ratio α to 1-α.
I’ll denote this interpolation I(f,g,α), where α is the interpolation parameter. The mathematical definition is:
I(f,g,α)(ω) = ∫ f(ω - αt) g(ω + (1-α)t) dt
where that integral extends from -∞ to ∞ (because the typesetting looks horrible if I try to do sub-superscript infinities).
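The integral can be approximated on a grid with a plain Riemann sum. What follows is only a sketch under some assumptions: the spectra are sampled on a uniform frequency grid, they are treated as zero outside it, off-grid values come from linear interpolation (`np.interp`), and the Gaussian test spectra and the names `gaussian_peak` and `shift_interp` are made up for illustration.

```python
import numpy as np

def gaussian_peak(freqs, center_hz, width_hz=10.0):
    """Toy spectrum (an assumption): one Gaussian bump, scaled to integrate to 1."""
    bump = np.exp(-0.5 * ((freqs - center_hz) / width_hz) ** 2)
    return bump / (bump.sum() * (freqs[1] - freqs[0]))

def shift_interp(freqs, f, g, alpha):
    """Riemann-sum approximation of
    I(f, g, alpha)(w) = integral of f(w - alpha*t) * g(w + (1 - alpha)*t) dt,
    with f and g sampled on the uniform grid `freqs` and zero outside it."""
    step = freqs[1] - freqs[0]
    span = freqs[-1] - freqs[0]
    t = np.arange(-span, span + step, step)   # wide enough to reach every grid point
    w = freqs[:, None]                        # column of output frequencies
    # Evaluate f and g off-grid by linear interpolation (zero beyond the grid).
    f_vals = np.interp(w - alpha * t, freqs, f, left=0.0, right=0.0)
    g_vals = np.interp(w + (1.0 - alpha) * t, freqs, g, left=0.0, right=0.0)
    return (f_vals * g_vals).sum(axis=1) * step

freqs = np.arange(300.0, 901.0, 1.0)
f = gaussian_peak(freqs, 391.0)
g = gaussian_peak(freqs, 783.0)

for alpha in (0.0, 0.5, 1.0):
    h = shift_interp(freqs, f, g, alpha)
    # The single peak should sit near 391, 587 and 783 Hz respectively.
    print(alpha, freqs[np.argmax(h)])
```

The broadcasted arrays have one row per output frequency and one column per value of t, so memory grows with the square of the grid size; a finer or wider grid would want a loop over t instead.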
I have a conjecture that this transform is well-behaved, in a sense:
Conjecture: if ∫f(ω)dω = ∫g(ω)dω = 1, then ∫I(f,g,α)(ω)dω = 1 for every α ∈ [0,1].
That is, if both input functions are normalized, then the output will be normalized, too.
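A quick numerical experiment can at least test the conjecture on examples. This sketch assumes two Gaussian bumps of different widths as the normalized inputs, a 1 Hz grid, and a Riemann-sum version of the integral; the names `gaussian_peak` and `shift_interp` are hypothetical helpers, not anything standard.

```python
import numpy as np

def gaussian_peak(freqs, center_hz, width_hz):
    """Toy spectrum (an assumption): one Gaussian bump, scaled to integrate to 1."""
    bump = np.exp(-0.5 * ((freqs - center_hz) / width_hz) ** 2)
    return bump / (bump.sum() * (freqs[1] - freqs[0]))

def shift_interp(freqs, f, g, alpha):
    """Riemann sum for the integral of f(w - alpha*t) * g(w + (1 - alpha)*t) dt."""
    step = freqs[1] - freqs[0]
    span = freqs[-1] - freqs[0]
    t = np.arange(-span, span + step, step)
    w = freqs[:, None]
    f_vals = np.interp(w - alpha * t, freqs, f, left=0.0, right=0.0)
    g_vals = np.interp(w + (1.0 - alpha) * t, freqs, g, left=0.0, right=0.0)
    return (f_vals * g_vals).sum(axis=1) * step

freqs = np.arange(300.0, 901.0, 1.0)
step = freqs[1] - freqs[0]
f = gaussian_peak(freqs, 391.0, width_hz=10.0)   # narrow peak
g = gaussian_peak(freqs, 783.0, width_hz=25.0)   # wider peak

# Total area under the interpolated spectrum for a few values of alpha.
areas = [shift_interp(freqs, f, g, a).sum() * step for a in np.linspace(0.0, 1.0, 5)]
print(areas)   # each entry should be very close to 1 if the conjecture holds here
```

That is numerical evidence on one pair of inputs, not a proof. One route that looks promising: the substitution u = ω - αt, v = ω + (1-α)t has Jacobian 1, which would split the double integral into ∫f(u)du · ∫g(v)dv.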
So what we get out of this is a whole series of Fourier spectra representing the internals of the violin bend. In order to reconstruct this bend sound, however, I need to know how to take a big series of Fourier spectra and merge them into a single sound. A naive approach would be to take each of their inverse Fourier transforms, and then sample across those, concatenating the result together. But I think that would suck, because each time you moved from one sample to the next, you’d hear a click because it would be discontinuous. There’s got to be a better way.
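The click is easy to see numerically. In this sketch each frame's inverse transform is replaced by a big simplifying assumption: a zero-phase cosine at that frame's peak frequency. The sample rate, the chunk length `HOP`, and the nine evenly spaced peak frequencies are also made up for illustration.

```python
import numpy as np

SAMPLE_RATE = 8000   # Hz (assumed)
HOP = 256            # samples taken from each frame (assumed)

# Peak frequency of each interpolated spectrum, gliding from the low G to the high G.
peak_freqs = np.linspace(391.0, 783.0, 9)

# Stand-in for "inverse transform each spectrum": every chunk is a cosine
# that starts at phase zero, independent of its neighbours.
n = np.arange(HOP)
chunks = [np.cos(2.0 * np.pi * fr * n / SAMPLE_RATE) for fr in peak_freqs]
signal = np.concatenate(chunks)

# Sample-to-sample steps, split into those that cross a seam and those that do not.
steps = np.abs(np.diff(signal))
seam_idx = np.arange(1, len(chunks)) * HOP - 1   # diff index straddling each seam
seam_jumps = steps[seam_idx]
inner_steps = np.delete(steps, seam_idx)

print("largest step inside a chunk:", inner_steps.max())
print("largest jump at a seam:     ", seam_jumps.max())
```

The seam jumps come out much larger than anything inside a chunk, because every chunk restarts at phase zero no matter where the previous one ended. Whatever the better way is, it presumably has to carry phase across frames somehow.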