IRRECKEL
IRRECKEL - irrelevance filter
Usage:
IRRECKEL X SR OFS L U T NB RHT
Inputs:
X | FFT power spectrum | no default |
SR | sampling rate in Hz | no default |
OFS | offset to relevance threshold in dB | 0 dB |
L | lower slope of spreading function in dB/Bark | 27 dB/Bark |
U | upper slope of spreading function in dB/Bark | 24 dB/Bark |
T | spreading function damping factor | 0.3 |
NB | approximate length of Bark scale; depends on length of X (automatic correction if too small) | |
RHT | determines how much a frequency masks itself; the central bin of the masking filter is set to this value, which should be between 0 and 1 | |
Outputs:
Y | filtered/masked power spectrum |
M | power spectrum of irrelevance threshold |
MC | actual masking coefficient (relative number of masked spectrum components) |
MA | average masking coefficient |
Function:
This atom computes the "relevance threshold" and applies the "relevance filter". The method was developed by G. Eckel and published in 1989 in his master's thesis. The relevance threshold is a spectral masking function TR(f): all spectral components of a sound with amplitudes below TR(f) are inaudible and can therefore be removed from the sound without changing one's perception of it.
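The filtering step described above can be illustrated with a minimal Python sketch. It is not the atom's implementation: the function name relevance_filter is hypothetical, and two details are assumptions, namely that "removing" a component means setting its power to zero and that a component exactly at the threshold is kept. The second return value follows the description of the output MC (relative number of masked components).

```python
def relevance_filter(power, threshold):
    """Zero every spectral component whose power lies below the threshold.

    power, threshold: sequences of equal length (power spectrum and TR(f)).
    Returns (filtered spectrum, fraction of components that were removed).
    """
    if len(power) != len(threshold):
        raise ValueError("power and threshold must have the same length")
    # Assumption: components exactly at the threshold are kept.
    filtered = [p if p >= t else 0.0 for p, t in zip(power, threshold)]
    masked = sum(1 for p, t in zip(power, threshold) if p < t)
    coefficient = masked / len(power) if len(power) > 0 else 0.0
    return filtered, coefficient


# Two of the four components lie below a flat threshold of 1.0.
y, mc = relevance_filter([4.0, 0.5, 2.0, 0.1], [1.0, 1.0, 1.0, 1.0])
```

In this example y is [4.0, 0.0, 2.0, 0.0] and mc is 0.5.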
Notes:
For the computation of the threshold, a simplified model of "simultaneous masking" is used. The masking algorithm is controlled by the following parameters:
- The masking pattern (spreading function) is defined by the parameters L and U (lower and upper slope) and T (damping factor). Modifying these parameters changes the auditory filter function.
- The offset level OFS is added to the relevance threshold before the filter/masking is applied. This parameter can be used to shift the threshold along the magnitude scale. If a value greater than zero is assigned to OFS, an "overmasked" version of the input signal is generated. This means that components with amplitudes above the relevance threshold are removed as well, and the perception of the signal is therefore changed.
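The roles of the slopes and the offset can be sketched in Python under simplifying assumptions. The sketch uses a plain triangular spreading function on the Bark scale and the Zwicker-Terhardt approximation for converting Hz to Bark; neither is confirmed here to be what the atom uses. The damping factor T and the self-masking value RHT are not modelled, because this page does not specify how they enter the computation, and the combination of several maskers is left out. All function names are hypothetical.

```python
import math


def hz_to_bark(freq_hz):
    """Zwicker-Terhardt approximation of the Bark scale (an assumption here)."""
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))


def spreading_db(dz, lower=27.0, upper=24.0):
    """Attenuation in dB at a distance of dz Bark from the masker.

    dz < 0: probe below the masker, level falls with the lower slope L.
    dz >= 0: probe above the masker, level falls with the upper slope U.
    """
    return lower * dz if dz < 0 else -upper * dz


def threshold_db(masker_level_db, masker_bark, probe_bark, ofs_db=0.0):
    """Threshold contributed by a single masker, shifted by the offset OFS."""
    dz = probe_bark - masker_bark
    return masker_level_db + spreading_db(dz) + ofs_db


# A 60 dB masker at 8 Bark, probed one Bark above and one Bark below.
above = threshold_db(60.0, 8.0, 9.0)           # 60 - 24 = 36 dB
below = threshold_db(60.0, 8.0, 7.0)           # 60 - 27 = 33 dB
shifted = threshold_db(60.0, 8.0, 9.0, 6.0)    # 36 + 6 = 42 dB
```

With a positive offset the threshold rises by that many dB, so more components fall below it, which corresponds to the "overmasked" case described above.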
Since its first implementation in 1989, many listening experiments have been performed to validate this model of the psychoacoustic irrelevance threshold. The following model parameters have repeatedly proven to produce reliable results on different types of signals (speech, music, etc.), which can be extrapolated to a variety of sampling rates:
signal sampling rate: | | 16000 Hz |
length of Bark spectrum: | | 512 samples |
offset level: | | 0 dB |
phase vocoder: | window and FFT length: | 256 samples |
 | decimation/interpolation length: | 32 samples |
spreading function: | lower slope: | 27 dB/Bark |
 | upper slope: | 24 dB/Bark |
 | damping factor: | 0.3 |
To reproduce G. Eckel's original results, these parameters must be used. For a sampling rate other than 16 kHz, the phase vocoder parameters and the length of the Bark scale transformation must be adjusted accordingly.
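One plausible way to adjust the phase vocoder parameters is to keep their durations constant: at 16 kHz, 256 samples correspond to 16 ms and 32 samples to 2 ms. The Python sketch below does this arithmetic. Proportional scaling is an assumption, not a rule stated on this page, and the function name is hypothetical. The length of the Bark scale is deliberately left out, because the Bark scale is not linear in frequency and no scaling rule for it is given here.

```python
def scale_vocoder_parameters(sampling_rate_hz):
    """Scale the 16 kHz reference window and hop lengths to another rate.

    Assumption: the window duration (16 ms) and the hop duration (2 ms)
    stay constant, so both lengths scale in proportion to the rate.
    """
    reference_rate_hz = 16000
    factor = sampling_rate_hz / reference_rate_hz
    return {
        "window_length": round(256 * factor),  # window and FFT length
        "hop_length": round(32 * factor),      # decimation/interpolation length
    }


params_32k = scale_vocoder_parameters(32000)
params_8k = scale_vocoder_parameters(8000)
```

For 32 kHz this gives a window length of 512 and a hop length of 64; for 8 kHz it gives 128 and 16. If the FFT requires a power-of-two length, the scaled window length may need to be rounded further.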
Figure: simplified block diagram of the algorithm for computing the psychoacoustic irrelevance threshold and simultaneous masking (see Eckel G. 1989).