ASEG1


ASEG1 - automatic segmentation

Usage:

ASEG1 DT TABP TABC TABS TABM TABO P0 P1 P2 P3

Inputs:
DT time step
TABP name of parameter definition table. Note: all tables are extended shell tables
TABC name of channel definition table
TABS name of segment definition table
TABM name of channel-to-segment mapping table
TABO name of output table
P0..P3 segmentation parameter inputs (numbers or vectors)
Outputs:

none, the output data (segment names and addresses) are stored in (appended to) the output table

Function:

The segmentation parameters P0-P3 can be parameters of any type (e.g. RMS, band-RMS, F0), but they must be extracted time-synchronously from the signal. It is assumed that the parameters processed in evaluation cycle t (t = 0, 1, ...) are extracted from the signal time interval [t*DT, (t+1)*DT]. Each parameter value (each number and each vector element) must be defined by an entry of the table TABP. E.g. if P0 is a vector of three band-RMS values and P1 is the F0 value, the table TABP must consist of four entries (entries 0..2 for P0[0,1,2] and entry 3 for P1).

Table 1: TABP: parameter definitions (2 fields, 1 entry / parameter track)
{|
|-
|Field
|Content / Description
|-
|0
|defines the parameter track type (0 or 1):
0: the parameter track is a continuous function (e.g. RMS); a value is computed for each frame
1: the parameter track is not a continuous function (e.g. F0, formants); for some frames no parameter value can be computed (missing values)
|-
|1
|missing value indicator (any number):
for non-continuous parameters this field defines the number that is used to indicate a missing parameter value in a frame (e.g. for F0 tracks the missing value indicator is set to 0)
|}
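The entry order of TABP can be illustrated with a short Python sketch. This is not STX code; the names `tabp`, `flatten_frame` and `is_missing` are hypothetical, and the value stored in field 1 for continuous tracks is an assumption (the field is simply not consulted here).

```python
# Illustration only (Python, not STX): one TABP entry per scalar track,
# in the order the tracks appear in P0..P3.
CONTINUOUS, NOT_CONTINUOUS = 0, 1

# P0 = vector of three band-RMS values, P1 = F0 with missing value indicator 0
tabp = [
    (CONTINUOUS, 0.0),      # entry 0: P0[0] (field 1 not used here)
    (CONTINUOUS, 0.0),      # entry 1: P0[1]
    (CONTINUOUS, 0.0),      # entry 2: P0[2]
    (NOT_CONTINUOUS, 0.0),  # entry 3: P1, the value 0 marks a missing F0
]

def flatten_frame(*params):
    """Flatten the numbers and vectors of one frame into a list of track values."""
    values = []
    for p in params:
        if isinstance(p, (list, tuple)):
            values.extend(p)
        else:
            values.append(p)
    return values

def is_missing(track_index, value):
    """True if `value` is the missing value indicator of a non-continuous track."""
    kind, missing = tabp[track_index]
    return kind == NOT_CONTINUOUS and value == missing

# One frame: three band-RMS values and an F0 value that could not be computed
frame = flatten_frame([0.2, 0.5, 0.1], 0.0)
```

Each element of `frame` lines up with the TABP entry of the same index, which is the correspondence the channel definitions rely on.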

The first segmentation step is implemented by a simple feature detection. For this purpose each parameter is assigned to one or more segmentation channel(s). A channel is used to detect a signal/segment feature based on the values of the assigned parameter. Two detection modes are defined:

p(t) value of assigned parameter in the frame t
s(t) status of the channel in the frame t

Interval method:

s(t) = 'on' if minimum <= p(t) <= maximum
s(t) = 'off' otherwise

Threshold method:

s(t) = 'on' if s(t-1) = 'off' and p(t) >= on-threshold
s(t) = 'off' if s(t-1) = 'on' and p(t) <= off-threshold
s(t) = s(t-1) otherwise

Table 2: TABC: channel definitions (5 fields, 1 entry / channel, multiple channels / parameter are possible)
{|
|-
|Field
|Content / Description
|-
|0
|index of the assigned parameter (0 to number_of_parameters-1)
|-
|1
|selects the type of condition (0 or 1)
0: detect/filter an interval of parameter values
1: split parameter values using a threshold
|-
|2
|lower interval boundary (minimum) or on-threshold
|-
|3
|upper interval boundary (maximum) or off-threshold
|-
|4
|selects the processing used for frames with missing values (0, 1 or 2)
0: set channel to 'off'
1: set channel to 'on'
2: do not change channel (stay)
|}
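The two detection modes and the missing-value handling can be sketched in Python as follows. This is an illustration, not the ASEG1 implementation: the function names are hypothetical, a channel is written as a tuple of the five TABC fields, and the channel status before the first frame is assumed to be 'off'.

```python
def update_channel(prev_on, value, channel, missing=None):
    """Return the new channel status (True = 'on') for one frame.

    channel = (param_index, mode, min_or_on, max_or_off, missing_mode),
    mirroring the five TABC fields. `missing` is the missing value
    indicator of the assigned track, or None for a continuous track.
    """
    _, mode, a, b, missing_mode = channel
    if missing is not None and value == missing:
        if missing_mode == 0:
            return False          # set channel to 'off'
        if missing_mode == 1:
            return True           # set channel to 'on'
        return prev_on            # 2: do not change channel (stay)
    if mode == 0:                 # interval method
        return a <= value <= b
    # threshold method: separate on- and off-thresholds
    if not prev_on and value >= a:
        return True
    if prev_on and value <= b:
        return False
    return prev_on

def run_channel(values, channel, missing=None):
    """Apply update_channel to a whole track; initial status assumed 'off'."""
    status, on = [], False
    for v in values:
        on = update_channel(on, v, channel, missing)
        status.append(on)
    return status

# Threshold channel on an RMS track: on-threshold 0.5, off-threshold 0.1
rms_status = run_channel([0.1, 0.6, 0.4, 0.2, 0.05], (0, 1, 0.5, 0.1, 0))

# Interval channel on an F0 track (100..300), missing frames keep the status
f0_status = run_channel([0, 150, 0, 400], (3, 0, 100, 300, 2), missing=0)
```

In the threshold example the channel stays 'on' for the values 0.4 and 0.2 because they lie above the off-threshold, which is the hysteresis that the third rule of the threshold method describes.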

In the second segmentation step the channels (features) are combined. A mapping (mask) is used to define the feature configuration of a segment.

m[i,j] mapping (mask) for segment i to channel j
m[i,j] = 0 channel j must be 'off'
m[i,j] = 1 channel j must be 'on'
m[i,j] = 2 channel j is ignored

If a channel configuration <c[0](t)..c[n-1](t)> (n = number of channels) matches the conditions defined by the mapping <m[i,0]..m[i,n-1]>, the segment status s[i](t) is set to 'on', otherwise it is set to 'off'. If multiple mappings are defined for a segment, the results of the matches are logically ORed (it is sufficient that one mapping matches).
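The matching rule can be written down in a few lines of Python. This is a sketch under the definitions above; `matches` and `segment_status` are hypothetical names, a mask is a tuple of the values 0, 1 and 2, and channel states are booleans (True = 'on').

```python
def matches(mask, channels):
    """True if every channel satisfies its mask value (0 = off, 1 = on, 2 = ignored)."""
    return all(m == 2 or bool(m) == c for m, c in zip(mask, channels))

def segment_status(masks, channels):
    """Segment is 'on' if at least one of its mappings matches (logical OR)."""
    return any(matches(mask, channels) for mask in masks)

# Three channels in one frame: channel 0 on, channel 1 off, channel 2 on
channels = (True, False, True)
```

With this configuration the mask `(1, 0, 2)` matches, because channel 2 is ignored, while `(0, 2, 2)` does not, because channel 0 is 'on'.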

Table 3: TABS: segment definitions (5 fields, 1 entry / segment)
{|
|-
|Field
|Content / Description
|-
|0
|minimum segment 'on' duration (>0)
|-
|1
|maximum segment 'off' (pause) duration inside the segment (>=0)
|-
|2
|pre-offset = time subtracted from the detected beginning (>=0)
|-
|3
|post-offset = time added to the detected end (>=0)
|-
|4
|number of channel/segment mappings used for the segment (>0)
|}

The mappings for all segments are stored in the mapping table TABM. This table has only one field and contains one entry for each mapping of a segment. The entries are stored in segment order. E.g. if the 1st segment uses 3 mappings and the 2nd segment uses 1 mapping, the entries 0, 1 and 2 of TABM are the mappings for the 1st segment and the entry 3 is the mapping for the 2nd segment.
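How the mapping counts from TABS (field 4) partition the entries of TABM can be sketched in Python. The function name `split_mappings` and the consistency check are assumptions added for illustration; the mapping entries are represented by placeholder strings.

```python
def split_mappings(mapping_counts, tabm):
    """Group TABM entries by segment using the per-segment mapping counts."""
    groups, start = [], 0
    for count in mapping_counts:
        groups.append(tabm[start:start + count])
        start += count
    if start != len(tabm):
        raise ValueError("TABM entry count does not match the TABS mapping counts")
    return groups

# 1st segment uses 3 mappings, 2nd segment uses 1 mapping
groups = split_mappings([3, 1], ["m0", "m1", "m2", "m3"])
```

Here `groups[0]` holds the entries 0 to 2 of TABM and `groups[1]` holds the entry 3, as in the example in the text.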

In the third step some decisions/processing based on the durations and offsets defined for a segment are performed:

If the stream of 'off' frames (pause) between two streams of 'on' frames (signal) is shorter than the defined maximum segment 'off' duration inside a segment, the two 'on' streams are connected.

If the duration of the 'on'-stream is shorter than the minimum segment 'on' duration, the 'on'-stream is ignored (not saved).

Before the segment is saved, the begin and end times are corrected: begin-time = detected-begin-time - pre-offset and end-time = detected-end-time + post-offset.
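The three rules of this step can be combined into one Python sketch. It is not the ASEG1 implementation and rests on several assumptions that the text does not settle: frame t covers [t*DT, (t+1)*DT], all durations and offsets use the same unit as DT, the minimum duration is checked after short pauses have been bridged, and the corrected begin time is not clamped. The name `build_segments` is hypothetical.

```python
def build_segments(status, dt, min_on, max_off, pre, post):
    """Turn a per-frame segment status into a list of (begin, end) times."""
    # 1. collect runs of 'on' frames as [first_frame, end_frame_exclusive]
    runs, start = [], None
    for t, on in enumerate(status):
        if on and start is None:
            start = t
        elif not on and start is not None:
            runs.append([start, t])
            start = None
    if start is not None:
        runs.append([start, len(status)])

    # 2. connect runs separated by a pause shorter than max_off
    merged = []
    for run in runs:
        if merged and (run[0] - merged[-1][1]) * dt < max_off:
            merged[-1][1] = run[1]
        else:
            merged.append(run)

    # 3. drop segments shorter than min_on, then apply the offsets
    segments = []
    for first, end in merged:
        if (end - first) * dt < min_on:
            continue
        segments.append((first * dt - pre, end * dt + post))
    return segments

status = [0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0]
segments = build_segments(status, dt=0.25, min_on=0.5, max_off=0.5,
                          pre=0.25, post=0.25)
```

In this example the one-frame pause at frame 3 is bridged, the isolated 'on' frame at index 10 is dropped as too short, and the remaining segment from 0.25 to 1.75 is widened by the offsets to (0.0, 2.0).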

The checked and corrected segment definitions are saved into the output table TABO. For each segment, the index of the segment definition entry (for identification) and the begin and end times are saved.

Table 4: TABO: output table (3 fields, 1 entry / segment)
{|
|-
|Field
|Content / Description
|-
|0
|identification = index of segment definition entry
|-
|1
|begin time
|-
|2
|end time
|}
