Programmer Guide/SPU Reference/ASEG1

From STX Wiki

Revision as of 17:31, 18 November 2010

ASEG1 - automatic segmentation



DT time step
TABP name of parameter definition table. Note: all tables are extended shell tables
TABC name of channel definition table
TABS name of segment definition table
TABM name of channel-to-segment mapping table
TABO name of output table
P0..P3 segmentation parameter inputs (numbers or vectors)

No outputs: the output data (segment names and addresses) are stored in (appended to) the output table TABO.


The segmentation parameters P0-P3 can be of any type (e.g. RMS, band-RMS, F0) but must be extracted time-synchronously from the signal. It is assumed that the parameters processed in evaluation cycle t (t = 0, 1, ...) are extracted from the signal time interval [t·DT, (t+1)·DT]. Each parameter value (each number and each vector element) must be defined by an entry of the table TABP. E.g. if P0 is a vector of three band-RMS values and P1 is the F0 value, the table TABP must consist of four entries (entries 0..2 for P0[0], P0[1], P0[2] and entry 3 for P1).
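The layout described above can be sketched in Python. This is only an illustration, not STx code: the function names `frame_interval` and `flatten_parameters` are hypothetical, and the numeric values are made up.

```python
def frame_interval(t, dt):
    """Signal interval [t*DT, (t+1)*DT] assumed for evaluation cycle t."""
    return (t * dt, (t + 1) * dt)


def flatten_parameters(params):
    """Flatten the inputs P0..P3 of one frame (numbers or vectors) into
    one value per parameter track, i.e. one value per TABP entry."""
    tracks = []
    for p in params:
        if isinstance(p, (list, tuple)):
            tracks.extend(p)   # a vector contributes one track per element
        else:
            tracks.append(p)   # a number contributes a single track
    return tracks


# P0 = three band-RMS values, P1 = F0 value (illustrative numbers)
frame = flatten_parameters([[0.12, 0.40, 0.05], 110.0])
print(len(frame))  # number of TABP entries needed for this configuration
```

With these inputs the frame has four parameter tracks, matching the four TABP entries of the example in the text.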

Table 1: TABP: parameter definitions (2 fields, 1 entry / parameter track)
{|
|-
|Field
|Content / Description
|-
|0
|defines the parameter track type (0 or 1):
0: the parameter track is a continuous function (e.g. RMS); for each frame a value is computed
1: the parameter track is not a continuous function (e.g. F0, Formants); for some frames no parameter value can be computed (missing values)
|-
|1
|missing value indicator (any number):
for not-continuous parameters this field defines the number that is used to indicate a missing parameter value in a frame (e.g. for F0 tracks the missing value indicator is set to 0)
|}

The first segmentation step is a simple feature detection. For this purpose, each parameter is assigned to one or more segmentation channels. A channel detects a signal/segment feature based on the values of its assigned parameter. Two detection modes are defined:

p(t) value of assigned parameter in the frame t
s(t) status of the channel in the frame t

Interval method:

s(t) = 'on' if minimum <= p(t) <= maximum
s(t) = 'off' otherwise

Threshold method:

s(t) = 'on' if s(t-1) = 'off' and p(t) >= on-threshold
s(t) = 'off' if s(t-1) = 'on' and p(t) <= off-threshold
s(t) = s(t-1) otherwise
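A minimal Python sketch of the two detection modes follows. The function names are hypothetical. The initial status s(-1) = 'off' is an assumption, since the text does not define it.

```python
def interval_status(p, minimum, maximum):
    """Interval method: 'on' (True) if minimum <= p(t) <= maximum."""
    return minimum <= p <= maximum


def threshold_status(p, prev_on, on_threshold, off_threshold):
    """Threshold method: the status changes only when a threshold is crossed."""
    if not prev_on and p >= on_threshold:
        return True        # off -> on
    if prev_on and p <= off_threshold:
        return False       # on -> off
    return prev_on         # s(t) = s(t-1) otherwise


def run_threshold(values, on_threshold, off_threshold, initial=False):
    """Apply the threshold method to a whole parameter track.
    initial=False (channel starts 'off') is an assumption."""
    status, out = initial, []
    for p in values:
        status = threshold_status(p, status, on_threshold, off_threshold)
        out.append(status)
    return out


track = [1, 5, 3, 2, 6]
print(run_threshold(track, on_threshold=4, off_threshold=2))
# -> [False, True, True, False, True]
```

The value 3 in the third frame lies between the two thresholds, so the channel keeps its previous status. This is what distinguishes the threshold method from the interval method.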

Table 2: TABC: channel definitions (5 fields, 1 entry / channel, multiple channels per parameter are possible)
{|
|-
|Field
|Content / Description
|-
|0
|index of the assigned parameter (0 to number_of_parameters-1)
|-
|1
|selects the type of condition (0 or 1)
0: detect/filter an interval of parameter values
1: split parameter values using a threshold
|-
|2
|lower interval boundary (minimum) or on-threshold
|-
|3
|upper interval boundary (maximum) or off-threshold
|-
|4
|selects the processing used for frames with missing values (0, 1 or 2)
0: set channel to 'off'
1: set channel to 'on'
2: do not change channel (stay)
|}
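To illustrate how a TABC entry could drive a channel, including the missing-value handling of field 4, here is a hedged Python sketch. The `Channel` tuple and the function names are hypothetical and do not reflect the actual table storage. The channel is assumed to start in the 'off' state.

```python
from collections import namedtuple

# One TABC entry; the field order follows Table 2 (fields 0..4).
Channel = namedtuple("Channel", "param mode low high missing")


def channel_step(ch, value, prev_on, continuous, missing_value):
    """Status of one channel for one frame.
    continuous / missing_value come from the TABP entry of the parameter."""
    if not continuous and value == missing_value:
        if ch.missing == 0:
            return False       # set channel to 'off'
        if ch.missing == 1:
            return True        # set channel to 'on'
        return prev_on         # 2: do not change channel (stay)
    if ch.mode == 0:           # interval: field 2 = minimum, field 3 = maximum
        return ch.low <= value <= ch.high
    # threshold: field 2 = on-threshold, field 3 = off-threshold
    if not prev_on and value >= ch.low:
        return True
    if prev_on and value <= ch.high:
        return False
    return prev_on


def run_channel(ch, values, continuous, missing_value):
    status, out = False, []    # assumption: channel starts 'off'
    for v in values:
        status = channel_step(ch, v, status, continuous, missing_value)
        out.append(status)
    return out


# F0 track with missing value indicator 0; interval 80..300, policy 2 (stay)
f0 = [0, 120, 0, 400, 0]
voiced = Channel(param=0, mode=0, low=80, high=300, missing=2)
print(run_channel(voiced, f0, continuous=False, missing_value=0))
# -> [False, True, True, False, False]
```

With policy 2, the missing frame after the value 120 keeps the channel 'on'. With policy 0, the same frame would switch it 'off'.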

In the second segmentation step the channels (features) are combined. A mapping (mask) is used to define the feature configuration of a segment.

m_{i,j} mapping (mask) for segment i to channel j
m_{i,j} = 0 channel j must be 'off'
m_{i,j} = 1 channel j must be 'on'
m_{i,j} = 2 channel j is ignored

If the channel configuration <c_0(t), ..., c_{n-1}(t)> (n = number of channels) matches the conditions defined by the mapping <m_{i,0}, ..., m_{i,n-1}>, the segment status s_i(t) is set to 'on'; otherwise it is set to 'off'. If multiple mappings are defined for a segment, the results of the matches are logically ORed (a single matching mapping is sufficient).
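The matching rule can be sketched as follows. Representing a mapping as a plain sequence of n mask values (0, 1 or 2) is an assumption made for illustration, because the text does not specify how a mask is encoded in the table. The function names are hypothetical.

```python
def mapping_matches(mask, channels):
    """True if the channel states satisfy one mapping (mask).
    mask[j]: 0 = channel j must be 'off', 1 = must be 'on', 2 = ignored."""
    return all(m == 2 or (m == 1) == on for m, on in zip(mask, channels))


def segment_status(masks, channels):
    """Segment is 'on' if at least one of its mappings matches (logical OR)."""
    return any(mapping_matches(mask, channels) for mask in masks)


channels = [True, False, True]           # channel states of one frame
print(segment_status([[0, 2, 2], [1, 0, 2]], channels))  # -> True
print(segment_status([[0, 2, 2]], channels))             # -> False
```

In the first call the second mask matches (channel 0 'on', channel 1 'off', channel 2 ignored), so the segment status is 'on' even though the first mask fails.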

Table 3: TABS: segment definitions (5 fields, 1 segment / entry)
{|
|-
|Field
|Content / Description
|-
|0
|minimum segment 'on' duration (>0)
|-
|1
|maximum segment 'off' (pause) duration inside the segment (>=0)
|-
|2
|pre-offset = time subtracted from the detected beginning (>=0)
|-
|3
|post-offset = time added to the detected end (>=0)
|-
|4
|number of channel/segment mappings used for the segment (>0)
|}

The mappings for all segments are stored in the mapping table TABM. This table has only one field, and one table entry for each mapping of a segment. E.g. if the 1st segment uses 3 mappings and the 2nd segment uses 1 mapping, entries 0, 1 and 2 of TABM are the mappings for the 1st segment and entry 3 is the mapping for the 2nd segment.
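The assignment of TABM entries to segments follows from the mapping counts in field 4 of TABS. A small Python sketch, with a hypothetical function name and placeholder strings instead of real mapping entries:

```python
def split_mappings(tabm, counts):
    """Group the TABM entries by segment.
    counts[i] is field 4 of TABS entry i (number of mappings of segment i)."""
    groups, start = [], 0
    for count in counts:
        groups.append(tabm[start:start + count])
        start += count
    if start != len(tabm):
        raise ValueError("TABM size does not match the mapping counts in TABS")
    return groups


# 1st segment uses 3 mappings, 2nd segment uses 1 mapping
tabm = ["map0", "map1", "map2", "map3"]   # placeholders for mapping entries
print(split_mappings(tabm, [3, 1]))
# -> [['map0', 'map1', 'map2'], ['map3']]
```

The consistency check at the end is an addition of this sketch. The text does not say how a mismatch between TABS and TABM is handled.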

In the third step, the durations and offsets defined for a segment are applied:

If the stream of 'off' frames (pause) between two streams of 'on' frames (signal) is shorter than the defined maximum segment 'off' duration inside a segment, the two 'on' streams are connected.

If the duration of the 'on' stream is shorter than the minimum segment 'on' duration, the 'on' stream is ignored (not saved).

Before the segment is saved, the begin and end times are corrected: begin-time = detected-begin-time - pre-offset and end-time = detected-end-time + post-offset.
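The three rules of this step can be sketched in Python as shown below. Several details are assumptions, because the text does not state them: durations and offsets use the same time unit as DT, the minimum duration is checked after pauses have been bridged, a bridged segment includes the pause, and the corrected begin time is not clamped. The function name is hypothetical.

```python
def find_segments(status, dt, min_on, max_off, pre_offset, post_offset):
    """Turn a per-frame segment status track into (begin, end) times."""
    # Collect runs of 'on' frames; frame t covers [t*dt, (t+1)*dt].
    runs, start = [], None
    for t, on in enumerate(status):
        if on and start is None:
            start = t
        elif not on and start is not None:
            runs.append((start * dt, t * dt))
            start = None
    if start is not None:
        runs.append((start * dt, len(status) * dt))

    # Connect runs separated by a pause shorter than max_off.
    merged = []
    for begin, end in runs:
        if merged and begin - merged[-1][1] < max_off:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((begin, end))

    # Drop runs shorter than min_on, then apply pre- and post-offset.
    return [(b - pre_offset, e + post_offset)
            for b, e in merged if e - b >= min_on]


status = [0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0]
print(find_segments(status, dt=1.0, min_on=2, max_off=2,
                    pre_offset=0.5, post_offset=0.5))
# -> [(0.5, 7.5)]
```

In this example the one-frame pause at frame 3 is bridged, so frames 1 to 6 form one segment. The isolated 'on' frame 10 is shorter than the minimum 'on' duration and is discarded.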

The checked and corrected segment definitions are saved into the output table TABO. For each segment, the index of the segment definition entry (for identification) and the begin and end times are saved.

Table 4: TABO: output table (3 fields, 1 entry / segment)
{|
|-
|Field
|Content / Description
|-
|0
|identification = index of segment definition entry
|-
|1
|begin time
|-
|2
|end time
|}
