Partial tracking is the process of detecting individual sinusoidal components in a signal and obtaining their amplitude, frequency and phase trajectories. For monophonic signals, this is a straightforward procedure:
- 1: STFT
- 2: Fundamental Frequency Estimation
- 3: Peak Detection
- 4: Peak Continuation
The following plots show the amplitude, frequency and phase trajectories for a violin sound.
Partial Amplitude Trajectories¶
Partial Frequency Trajectories¶
Partial Phase Trajectories¶
Partial Peak Estimation¶
For every analysis frame $X_n$ we first estimate the fundamental frequency $f_0$. It makes sense to zero-pad the frames before the FFT and peak picking, since the finer frequency grid allows for a more accurate detection of the partial peaks.
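The two steps above can be sketched as follows. This is a minimal sketch, assuming an autocorrelation-based $f_0$ estimator (the text does not prescribe one) and a Hann window; the function names and the search range `f_min`/`f_max` are illustrative assumptions, not part of the original material.

```python
import numpy as np

def estimate_f0_autocorr(frame, fs, f_min=50.0, f_max=2000.0):
    """Estimate f0 of one frame from the strongest autocorrelation lag.

    Assumption: autocorrelation is used as the f0 estimator; f_min and
    f_max are hypothetical bounds for the lag search.
    """
    frame = frame - np.mean(frame)
    # One-sided autocorrelation (lags 0 .. len(frame) - 1).
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_lo = max(1, int(fs / f_max))
    lag_hi = min(int(fs / f_min), len(acf) - 1)
    lag_max = lag_lo + int(np.argmax(acf[lag_lo:lag_hi + 1]))
    return fs / lag_max

def zero_padded_spectrum(frame, n_fft):
    """Magnitude spectrum of a Hann-windowed frame, zero-padded to n_fft."""
    window = np.hanning(len(frame))
    # rfft pads the windowed frame with zeros up to n_fft samples.
    return np.abs(np.fft.rfft(frame * window, n=n_fft))
```

Zero-padding does not add information to the frame; it only samples the same spectrum on a denser grid, which is why the peak positions can be located more finely.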
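Afterwards, we can read the amplitude $a_i$ of the $i$-th partial from the magnitude spectrum $|H|$ at the corresponding multiple of the fundamental frequency $f_0 = f_s / \mathrm{Lag}_{max}$: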
$$ a_i = \left| H \left[ i \frac{f_s}{\mathrm{Lag}_{max}} \right] \right| $$
We need to define a search area around the calculated position of each peak to compensate for errors in the $f_0$ estimation and signal deviations.
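A search area of this kind could look like the following sketch. It assumes a magnitude spectrum `mag` computed with an FFT of length `n_fft`; the function name and the `search_ratio` parameter (half-width of the search area as a fraction of $f_0$) are hypothetical choices for illustration.

```python
import numpy as np

def pick_harmonic_peaks(mag, f0, fs, n_fft, n_partials, search_ratio=0.25):
    """Find one peak per partial near each multiple of f0.

    Returns a list of (frequency in Hz, magnitude) tuples. search_ratio
    is an assumed tuning parameter, not taken from the text.
    """
    bin_hz = fs / n_fft
    half_width = max(1, int(round(search_ratio * f0 / bin_hz)))
    peaks = []
    for i in range(1, n_partials + 1):
        center = int(round(i * f0 / bin_hz))   # expected position i * f0
        lo = max(center - half_width, 0)
        hi = min(center + half_width, len(mag) - 1)
        if lo > hi:                            # partial lies above Nyquist
            break
        k = lo + int(np.argmax(mag[lo:hi + 1]))
        peaks.append((k * bin_hz, float(mag[k])))
    return peaks
```

The search area must stay narrower than half the partial spacing, otherwise the strongest neighbor partial may be picked instead of the intended one.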
In a strictly harmonic model, we do not have to estimate the partial frequencies. We can still achieve decent synthesis results with this restriction.
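As an illustration of this restriction, the sketch below resynthesizes one frame with all partial frequencies fixed at integer multiples of $f_0$. The function name and the zero default phases are assumptions made for the example.

```python
import numpy as np

def synthesize_harmonic(f0, amps, fs, n_samples, phases=None):
    """Additive synthesis with partial i fixed at frequency i * f0."""
    amps = np.asarray(amps, dtype=float)
    if phases is None:
        phases = np.zeros(len(amps))           # assumed default: zero phase
    phases = np.asarray(phases, dtype=float)
    t = np.arange(n_samples) / fs
    index = np.arange(1, len(amps) + 1)        # partial numbers 1, 2, ...
    # One row per partial, then sum over partials.
    partials = amps[:, None] * np.cos(
        2 * np.pi * f0 * index[:, None] * t[None, :] + phases[:, None]
    )
    return partials.sum(axis=0)
```

Only $f_0$ and the amplitudes $a_i$ are needed per frame here, which is what makes the strictly harmonic model cheaper to analyze.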
The phase of a partial cannot simply be read from the peak in most cases. It can best be estimated by minimizing the error between the original signal and partials with the estimated amplitude and frequency at phases between $-\pi$ and $\pi$.
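One simple way to carry out this minimization is an exhaustive search over candidate phases. The sketch below assumes a squared-error criterion and a uniform grid of `n_candidates` phases; both are illustrative choices, as the text does not specify the error measure or the search strategy.

```python
import numpy as np

def estimate_phase(frame, fs, freq, amp, n_candidates=360):
    """Grid search for the phase in [-pi, pi) with the smallest squared error."""
    t = np.arange(len(frame)) / fs
    candidates = np.linspace(-np.pi, np.pi, n_candidates, endpoint=False)
    errors = [
        np.sum((frame - amp * np.cos(2 * np.pi * freq * t + phi)) ** 2)
        for phi in candidates
    ]
    return float(candidates[int(np.argmin(errors))])
```

The resolution of the estimate is limited by the grid spacing, $2\pi$ divided by `n_candidates`; a finer grid or a local refinement around the best candidate reduces this error.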
Quadratic Interpolation¶
Without further processing, the detection of local maxima in a spectrum is limited to the DFT support points (bins). Even with zero-padding, further steps can improve the frequency estimate of a peak. The following example shows this for a 25 Hz sinusoid at a sampling rate of 100 Hz.
Quadratic or parabolic interpolation can be used to estimate the true peak of the sinusoid, using the detected maximum $\beta$ and the values $\alpha$ and $\gamma$ of its lower and upper neighbor bins:
$$ p = 0.5 (\alpha-\gamma)/(\alpha-2\beta+\gamma) $$
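$$ a^* = \beta - \frac{1}{4}(\alpha-\gamma)\,p $$
Here $p \in [-1/2, 1/2]$ is the offset of the interpolated peak from the detected bin, measured in bins, and $a^*$ is the interpolated peak height.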
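The two formulas can be tried out on the 25 Hz example with a short sketch. The frame length of 63 samples, the Hann window and the use of the log-magnitude spectrum are assumptions made here so that the sinusoid falls between two bins; the function name `parabolic_peak` is likewise illustrative.

```python
import numpy as np

def parabolic_peak(spec, k):
    """Refine a peak at bin k from the values at bins k-1, k, k+1.

    Returns (fractional bin position, interpolated peak value).
    """
    alpha, beta, gamma = spec[k - 1], spec[k], spec[k + 1]
    p = 0.5 * (alpha - gamma) / (alpha - 2.0 * beta + gamma)
    a_star = beta - 0.25 * (alpha - gamma) * p
    return k + p, a_star

fs = 100.0                      # sampling rate in Hz
n = 63                          # assumed frame length: 25 Hz is not on a bin
t = np.arange(n) / fs
x = np.cos(2 * np.pi * 25.0 * t)
window = np.hanning(n)

# Assumption: interpolate on the log-magnitude spectrum.
log_mag = np.log(np.abs(np.fft.rfft(x * window)) + 1e-12)

# Detected maximum, excluding the edge bins so both neighbors exist.
k = int(np.argmax(log_mag[1:-1])) + 1
k_interp, log_a = parabolic_peak(log_mag, k)

f_raw = k * fs / n              # frequency of the detected bin
f_interp = k_interp * fs / n    # interpolated frequency
# Undo the log and the window gain to get the sinusoid amplitude.
amp = 2.0 * np.exp(log_a) / np.sum(window)
```

Comparing `f_raw` and `f_interp` with the true 25 Hz shows how much of the bin quantization error the interpolation removes. The result is an approximation, since the main lobe of the window is only roughly parabolic.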