F0SIFT

This atom uses an autocorrelation method to extract the fundamental frequency (f0) from the speech signal X. The method is a modified version of the "simplified inverse filter tracking" (SIFT) algorithm published by J.D. Markel and A.H. Gray (1976), "Linear Prediction of Speech"; Springer, p. 206.

The speech signal is low-pass filtered and downsampled. The downsampling reduces the number of speech formants in the signal to two. Because the extraction algorithm requires a minimum signal bandwidth of 2*FMAX, the downsampling factor is selected as follows:

signal bandwidth	b = max(2*FMAX, 2000Hz)
downsampling factor	d = int(SR / (2*b))
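
The two formulas above can be checked with a few lines of Python. This is only an illustrative sketch, not the atom's implementation: the function name downsampling_factor and the parameter min_bandwidth are hypothetical, and int() is assumed to truncate toward zero as in Python.

```python
def downsampling_factor(sr, fmax, min_bandwidth=2000.0):
    """Return (b, d) computed from the two formulas in the text.

    sr            sampling rate in Hz (input SR)
    fmax          maximum f0 in Hz (input FMAX)
    min_bandwidth lower limit for the signal bandwidth (2000 Hz in the text)
    """
    b = max(2.0 * fmax, min_bandwidth)  # signal bandwidth in Hz
    d = int(sr / (2.0 * b))             # truncated downsampling factor
    return b, d


# Settings of the original S.I.F.T. configuration (SR=10000, FMAX=250)
b, d = downsampling_factor(sr=10000, fmax=250)
print(b, d)  # 2000.0 2
```

Note that the formula yields d = 0 when SR is smaller than 2*b. How the atom handles that case is not stated here, so the sketch does not guard against it.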

The inverse filter coefficients are computed using the LPC method. The inverse filter (order 4) is applied to the downsampled speech signal to remove the formant structure.
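
The inverse filtering step can be sketched as follows. The text only says "the LPC method", so the variant used here (autocorrelation method with the Levinson-Durbin recursion) is an assumption, and the function names autocorr, lpc_inverse_filter and apply_inverse_filter are hypothetical.

```python
def autocorr(x, max_lag):
    """Raw autocorrelation sums r[0..max_lag] of the sequence x."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


def lpc_inverse_filter(x, order=4):
    """Return inverse filter coefficients [1, a1, ..., a_order].

    Assumption: autocorrelation method solved by Levinson-Durbin.
    """
    r = autocorr(x, order)
    a = [1.0] + [0.0] * order
    err = r[0]                      # prediction error energy
    for i in range(1, order + 1):
        if err <= 0.0:
            break                   # silent frame: keep remaining coefficients at 0
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err              # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def apply_inverse_filter(x, a):
    """FIR filtering e[n] = sum_k a[k] * x[n-k], with x[n] = 0 for n < 0."""
    return [sum(a[k] * x[n - k] for k in range(len(a)) if n - k >= 0)
            for n in range(len(x))]


# Usage on a synthetic two-pole resonance (impulse response of an all-pole filter)
x = [1.0, 1.3]
for _ in range(2, 200):
    x.append(1.3 * x[-1] - 0.6 * x[-2])
a = lpc_inverse_filter(x, order=4)
residual = apply_inverse_filter(x, a)  # resonance removed, excitation remains
```

For real speech the residual keeps the periodic excitation while the formant structure is flattened, which is what makes the autocorrelation peak easier to locate.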

The pitch period (= 1/f0) is measured in the autocorrelation function of the filtered signal. The pitch period is taken from the location of the highest autocorrelation peak in the range 1/FMAX..1/FMIN. To get a better frequency resolution, the peak location is corrected by parabolic interpolation.
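
A minimal sketch of the peak search and the parabolic correction is given below. It assumes that r holds autocorrelation values indexed by lag in samples and that fs is the sampling rate after downsampling (SR/d); the function name pitch_from_autocorr and the choice to return 0.0 when no lag is available are hypothetical.

```python
import math


def pitch_from_autocorr(r, fs, fmin, fmax):
    """Estimate f0 in Hz from autocorrelation values r[lag]."""
    lo = max(1, int(math.ceil(fs / fmax)))   # shortest period (1/FMAX) in samples
    hi = min(len(r) - 2, int(fs / fmin))     # longest period (1/FMIN) in samples
    if hi < lo:
        return 0.0                           # no usable lag range
    peak = max(range(lo, hi + 1), key=lambda k: r[k])
    y0, y1, y2 = r[peak - 1], r[peak], r[peak + 1]
    denom = y0 - 2.0 * y1 + y2
    offset = 0.0
    # Interpolate only around a true local maximum (parabola opens downward)
    if y1 >= y0 and y1 >= y2 and denom < 0.0:
        offset = 0.5 * (y0 - y2) / denom     # vertex of the fitted parabola
    return fs / (peak + offset)


# Usage with a synthetic, slowly decaying autocorrelation for f0 = 123 Hz
fs = 5000.0
r = [math.exp(-k / 1000.0) * math.cos(2.0 * math.pi * 123.0 * k / fs)
     for k in range(128)]
f0 = pitch_from_autocorr(r, fs, fmin=50.0, fmax=250.0)
```

Without the interpolation the estimate is limited to the values fs/lag for integer lags, which becomes coarse at the low sampling rate used after downsampling.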

A tracking and correction procedure is used to correct the pitch value (e.g. octave jumps) and to remove incorrect voiced/unvoiced decisions. This procedure uses the last 3 frames.

The tracking and correction procedure can be enabled or disabled via the input MODE (0|NO = disabled, 1|YES = enabled). If tracking is enabled, the output values are delayed by two frames and the last 2 stored values are always set to zero (because the tracking needs 3 frames). If the inputs TABLE and COL are connected, all values stored in the output F0 are also stored in column COL of the table, starting at entry 0.

Notes:

To simulate the original S.I.F.T. algorithm described in "Linear Prediction of Speech", the following parameter settings must be used:

sampling rate	SR=10000 (10kHz)
frequency range	FMIN=50, FMAX=250 (50..250Hz)
tracking enabled	MODE=1

Inputs:

X	signal vector (speech)
SR	sampling rate in Hz
FMIN	minimum f0 in Hz
FMAX	maximum f0 in Hz
MODE	tracking mode (0|NO = disabled, 1|YES = enabled)
TABLE	name of the output shell table (must be an extended table; see NEW TABLE)
COL	index of the table column (must be of type NUMBER)
