Mathematics and Signal Processing in Acoustics - Current Projects

Scientific and Technological Cooperation between Austria and Serbia (SRB 01/2018)

Duration of the project: 01.07.2018 - 30.06.2020

Project partners:

Acoustics Research Institute, ÖAW (Austria)

University of Vienna (Austria)

University of Novi Sad (Republic of Serbia)

Link to the project website: http://nuhag.eu/anacres

General Information

Funded by the Vienna Science and Technology Fund (WWTF) within the "Mathematics and …2016" Call (MA16-053)

Principal Investigator: Georg Tauböck

Co-Principal Investigator: Peter Balazs

Project Team: Günther Koliander, José Luis Romero  

Duration: 01.07.2017 – 01.07.2021

Abstract

Signal processing is a key technology that forms the backbone of important developments like MP3, digital television, mobile communications, and wireless networking and is thus of exceptional relevance to the economy and society in general. The overall goal of the proposed project is to derive highly efficient signal processing algorithms and to tailor them to dedicated applications in acoustics. We will develop methods that are able to exploit structural properties in infinite-dimensional signal spaces, since typically ad hoc restrictions to finite dimensions do not sufficiently preserve physically available structure. The approach adopted in this project is based on a combination of the powerful mathematical methodologies of frame theory (FT), compressive sensing (CS), and information theory (IT). In particular, we aim at extending finite-dimensional CS methods to infinite dimensions, while fully maintaining their structure-exploiting power, even if only a finite number of variables are processed. We will pursue three acoustic applications that will strongly benefit from the devised signal processing techniques, namely audio signal restoration, localization of sound sources, and underwater acoustic communications. The project is set up as an interdisciplinary endeavor in order to leverage the interrelations between mathematical foundations, CS, FT, IT, time-frequency representations, wave propagation, transceiver design, the human auditory system, and performance evaluation.

Keywords

compressive sensing, frame theory, information theory, signal processing, super resolution, phase retrieval, audio, acoustics

Video

Link

Scientific and Technological Cooperation with Macedonia 2016-18
Project duration: 01.07.2016 – 30.06.2018

The main aim of the project is to combine the research areas of Frame Theory and Generalized Asymptotic Analysis.

Project partner institutions:
Acoustics Research Institute (ARI), Austrian Academy of Sciences, Vienna, Austria
Ss. Cyril and Methodius University, Skopje, The Former Yugoslav Republic of Macedonia

Project members:
Diana T. Stoeva (Project coordinator Austria), Peter Balazs, Nicki Holighaus, Zdenek Prusa
Katerina Hadzi-Velkova Saneva (Project coordinator FYROM), Sanja Atanasova, Pavel Dimovski, Zoran Hadzi-Velkov, Bojan Prangoski, Biljana Stanoevska-Angelova, Daniel Velinov, Jasmina Veta Buralieva


Project Workshops and Activities:

1) Nov. 24-26, 2016, Ss. Cyril and Methodius University, Skopje

Project Kickoff-workshop

Program of the workshop

2) Nov. 15-19, 2017, ARI, Vienna

Research on project-related topics

3) April 14-19, 2018, ARI, Vienna

Research on project-related topics

and

ARI-Guest-Talk given at ARI on the 17th of April, 2018: Prof. Zoran Hadzi-Velkov, "The Emergence of Wireless Powered Communication Networks"

4) May 25-30, Ss. Cyril and Methodius University, Skopje

Research on project-related topics

and

Workshop "Women in mathematics in the Balkan region" (May 28 - May 29, Ss. Cyril and Methodius University, Skopje)

5) June 14-18, Ss. Cyril and Methodius University, Skopje

Research on project-related topics

and

Summer course "An Introduction to Frame Theory and the Large Time/Frequency Analysis Toolbox" (June 14-15), Lecturers: Diana Stoeva and Zdenek Prusa (from ARI)

6) Mini-Symposium "Frame Theory and Asymptotic Analysis" organized at the European Women in Mathematics General Meeting 2018, Karl-Franzens-Universität Graz, Austria, 3-7 September 2018.

Link to Conference website

7) November 17-20, 2018, ARI, Vienna

Work on project-related topics

Multilateral Scientific and Technological Cooperation in the Danube Region 2017-2018
Austria, Czech Republic, Republic of Serbia, and Slovak Republic
Project duration: 01.01.2017 - 31.12.2018

Project website: nuhag.eu/tifmofus

S&T cooperation project 'Amadée' Austria-France 2013-14, "Frame Theory for Sound Processing and Acoustic Holophony", FR 16/2013

Project Partner: The Institut de recherche et coordination acoustique/musique (IRCAM)

French-Austrian bilateral research project funded by the French National Agency of Research (ANR) and the Austrian Science Fund (FWF, project no. I 1362-N30). The project involves two academic partners, namely the Laboratory of Mechanics and Acoustics (LMA - CNRS UPR 7051, France) and the Acoustics Research Institute. At the ARI, two research groups are involved in the project: the Mathematics and Signal Processing in Acoustics and the Psychoacoustics and Experimental Audiology groups.

Principal investigators: Thibaud Necciari (ARI), Piotr Majdak (ARI) and Olivier Derrien (LMA).

Running period: 2014-2017 (project started on March 1, 2014).

Abstract:

One of the greatest challenges in signal processing is to develop efficient signal representations. An efficient representation extracts relevant information and describes it with a minimal amount of data. In the specific context of sound processing, and especially in audio coding, where the goal is to minimize the size of binary data required for storage or transmission, it is desirable that the representation takes into account human auditory perception and allows reconstruction with a controlled amount of perceived distortion. Over the last decades, many psychoacoustical studies investigated auditory masking, an important property of auditory perception. Masking refers to the degradation of the detection threshold of a sound in the presence of another sound. The results were used to develop models of either spectral or temporal masking. Attempts were made to simply combine these models to account for time-frequency (t-f) masking effects in perceptual audio codecs. We recently conducted psychoacoustical studies on t-f masking, which revealed the inaccuracy of such simple models. These new data on t-f masking represent a crucial basis to account for masking effects in t-f representations of sounds. Although t-f representations are standard tools in audio processing, the development of a t-f representation of audio signals that is mathematically founded, perception-based, perfectly invertible, and possibly with a minimum amount of redundancy, remains a challenge. POTION thus addresses the following questions:

  1. To what extent is it possible to obtain a perception-based (i.e., as close as possible to “what we see is what we hear”), perfectly invertible, and possibly minimally redundant t-f representation of sound signals? Such a representation is essential for modeling complex masking interactions in the t-f domain and is expected to improve our understanding of auditory processing of real-world sounds. Moreover, it is of fundamental interest for many audio applications involving sound analysis-synthesis.
  2. Is it possible to improve current perceptual audio codecs by considering a joint t-f approach? To reduce the size of digital audio files, perceptual audio codecs like MP3 decompose sounds into variable-length time segments, apply a frequency transform, and use masking models to control the sub-quantization of transform coefficients within each segment. Thus, current codecs follow mainly a spectral approach, although temporal masking effects are taken into account in some implementations. By combining an efficient perception-based t-f transform with a joint t-f masking model in an audio codec, we expect to achieve significant performance improvements.

Working program:

POTION is structured in three main tasks:

  1. Perception-based t-f representation of audio signals with perfect reconstruction: A linear and perfectly invertible t-f representation will be created by exploiting the recently developed non-stationary Gabor theory as a mathematical background. The transform will be designed so that t-f resolution mimics the t-f analysis properties by the auditory system and possibly no redundancy is introduced to maximize the coding efficiency.
  2. Development and implementation of a t-f masking model: Based on psychoacoustical data on t-f masking collected by the partners in previous projects and on literature data, a new, complex model of t-f masking will be developed and implemented in the computationally efficient representation built in task 1. Additional psychoacoustical data required for the development of the model, involving frequency, level, and duration effects in masking for either single or multiple maskers will be collected. The resulting signal processing algorithm should represent and re-synthesize only the perceptually relevant components of the signal. It will be calibrated and validated by conducting listening tests with synthetic and real-world sounds.
  3. Optimization of perceptual audio codecs: This task represents the main application of POTION. It will consist in combining the new efficient representation built in task 1 with the new t-f masking model built in task 2 for implementation in a perceptual audio codec.

More information on the project can be found on the POTION web page.

Publications:

  • Chardon, G., Necciari, Th., Balazs, P. (2014): Perceptual matching pursuit with Gabor dictionaries and time-frequency masking, in: Proceedings of the 39th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2014). Florence, Italy, 3126-3130. (proceedings) ICASSP 2014: Perceptual matching pursuit results

Related topics investigated at the ARI:

Objective:

Numerous implementations and algorithms for time-frequency analysis can be found in the literature or on the internet. Most of them are either poorly documented or no longer maintained. P. Soendergaard started to develop the Linear Time Frequency Toolbox for MATLAB. The goal of this project is to find typical uses of this toolbox in acoustic applications and to incorporate successful, not-yet-implemented algorithms into STx.

Method:

The linear time-frequency toolbox is a small open-source MATLAB toolbox with functions for working with Gabor frames for finite sequences. It includes a 1D discrete Gabor transform (sampled STFT) with its inverse, works with both full-length and short windows, and computes the canonical dual and canonical tight windows.
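The operations listed above can be illustrated with a minimal sketch in plain Python. It is not the toolbox's code or API: the function names `dgt`, `idgt`, and `canonical_windows` are hypothetical, and the window formulas assume the so-called painless case (a real window whose support is no longer than the number of channels `M`, with the hop `a` and `M` dividing the signal length). Under that assumption the frame operator is diagonal, so the canonical dual and tight windows reduce to a pointwise division.

```python
import cmath
import math

def dgt(f, g, a, M):
    """Discrete Gabor transform of f: c[n][m] for time shift n*a and channel m."""
    L = len(f)
    return [[sum(f[l] * g[(l - n * a) % L] * cmath.exp(-2j * math.pi * m * l / M)
                 for l in range(L))
             for m in range(M)]
            for n in range(L // a)]

def idgt(c, gd, a, M, L):
    """Synthesis with window gd: sum over n, m of c[n][m] * shifted, modulated gd."""
    return [sum(c[n][m] * gd[(l - n * a) % L] * cmath.exp(2j * math.pi * m * l / M)
                for n in range(len(c)) for m in range(M))
            for l in range(L)]

def canonical_windows(g, a, M):
    """Canonical dual and tight windows, valid in the painless case only
    (real window, support length <= M, a and M dividing len(g))."""
    L = len(g)
    # Diagonal of the frame operator: d[l] = M * sum_n g[l - n*a]^2
    d = [M * sum(g[(l - n * a) % L] ** 2 for n in range(L // a)) for l in range(L)]
    dual = [g[l] / d[l] for l in range(L)]
    tight = [g[l] / math.sqrt(d[l]) for l in range(L)]
    return dual, tight

L, a, M = 24, 4, 8
# Hann window of length M, zero-padded to the signal length L
g = [0.5 - 0.5 * math.cos(2 * math.pi * k / M) if k < M else 0.0 for k in range(L)]
f = [math.sin(0.4 * l) + 0.3 * math.cos(1.7 * l) for l in range(L)]

gd, gt = canonical_windows(g, a, M)
coeffs = dgt(f, g, a, M)
# Analysis with g and synthesis with the dual window
err_dual = max(abs(x - y) for x, y in zip(f, idgt(coeffs, gd, a, M, L)))
# Analysis and synthesis with the same tight window
err_tight = max(abs(x - y) for x, y in zip(f, idgt(dgt(f, gt, a, M), gt, a, M, L)))
print(err_dual, err_tight)
```

Both reconstruction errors should be at the level of floating-point rounding. For windows longer than `M` the frame operator is no longer diagonal, and this shortcut does not apply.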

Application:

These algorithms are used for acoustic applications such as formant analysis, data compression, or de-noising. The implementations are compared to the ones in STx and will be integrated into this software package if they improve its performance.

Partners:

  • H. G. Feichtinger et al., NuHAG, Faculty of Mathematics, University of Vienna
  • B. Torrésani, Groupe de Traitement du Signal, Laboratoire d'Analyse, Topologie et Probabilités, LATP/CMI, Université de Provence, Marseille
  • P. Soendergaard, Department of Mathematics, Technical University of Denmark

Objective:

During the current projects on efficiently calculating a resynthesis window and on an iterative scheme for a finite element method algorithm for vibrations in soils and liquids, it became apparent that block matrices are a powerful tool for finding numerically efficient algorithms.

Method:

In this project, the focus is on investigating the numerical properties of block matrices. How can the block structure be used to calculate or approximate the inverse of a matrix or its norm? How can it be used to speed up iterative schemes?
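One standard way in which block structure can speed up an iterative scheme is block-Jacobi preconditioning: only the small diagonal blocks are inverted, and their inverses approximate the inverse of the full matrix. The sketch below is an illustration under that assumption, not the method developed in the project. The function names and the 4x4 test matrix are hypothetical, and the iteration is only expected to converge when the off-diagonal blocks are small compared to the diagonal ones.

```python
def inv2(B):
    """Closed-form inverse of a 2x2 block [[p, q], [r, s]]."""
    (p, q), (r, s) = B
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

def block_jacobi(A, b, iters=50):
    """Iterate x <- x + D^{-1} (b - A x), where D holds the 2x2 diagonal blocks of A."""
    n = len(b)
    # Invert each 2x2 diagonal block once, up front
    inv_blocks = [inv2([[A[i][i], A[i][i + 1]], [A[i + 1][i], A[i + 1][i + 1]]])
                  for i in range(0, n, 2)]
    x = [0.0] * n
    for _ in range(iters):
        # Residual of the current iterate
        r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Apply the block-diagonal approximate inverse to the residual
        for k, Binv in enumerate(inv_blocks):
            i = 2 * k
            x[i] += Binv[0][0] * r[i] + Binv[0][1] * r[i + 1]
            x[i + 1] += Binv[1][0] * r[i] + Binv[1][1] * r[i + 1]
    return x

# Hypothetical test matrix: strong 2x2 diagonal blocks, weak coupling between them
A = [[4.0, 1.0, 0.1, 0.0],
     [1.0, 3.0, 0.0, 0.1],
     [0.1, 0.0, 5.0, 2.0],
     [0.0, 0.1, 2.0, 4.0]]
b = [1.0, 2.0, 3.0, 4.0]

x = block_jacobi(A, b)
residual = max(abs(b[i] - sum(A[i][j] * x[j] for j in range(4))) for i in range(4))
print(x, residual)
```

Each step costs one matrix-vector product plus a few 2x2 multiplications, and the convergence rate depends on how weakly the blocks are coupled.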

Application:

The results will be used for the two projects mentioned below:

  • double preconditioning for Gabor frames
  • vibrations in random layers

Objective:

In signal processing, synthesis is as important as analysis, especially when data are modified. For the short-time Fourier transform (STFT), synthesis is often done using a simple overlap-add (OLA), i.e., by summing the outputs of the filters. In addition, the output is often re-weighted with the analysis window, as is done in the phase vocoder. It is often presumed that with standard windows this gives satisfactory results.

Gabor frame theory provides a well-known construction of synthesis windows that guarantees perfect reconstruction. However, this method is rarely used in signal processing algorithms.

Method:

In this project, we will systematically investigate if and for which parameters the respective OLA synthesis with the original window gives good reconstruction. We will compare it to the reconstruction with the dual window, introducing and motivating it as perfect reconstruction overlap add (PROLA). We will show that this method is always preferable to others and that it can be calculated very efficiently.
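The comparison can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's implementation: all function names are hypothetical, the window is a Hann window no longer than the transform length (the painless case), and the signal is treated as periodic. Because an unmodified STFT frame passes unchanged through the DFT and its inverse, the sketch works directly on the windowed frames. The OLA result is rescaled by its mean gain so that only the shape error remains.

```python
import math

def overlap_sum(g, w, hop):
    """Hop-periodic gain sum_n g[l - n*hop] * w[l - n*hop], one value per l % hop."""
    return [sum(g[i] * w[i] for i in range(r, len(g), hop)) for r in range(hop)]

def dual_window(g, hop):
    """Dual window in the painless case: g divided by the overlap sum of g^2."""
    s = overlap_sum(g, g, hop)
    return [g[i] / s[i % hop] for i in range(len(g))]

def resynthesize(f, g, w, hop):
    """Cut f into circular frames weighted by g, weight each by w, and overlap-add."""
    L = len(f)
    y = [0.0] * L
    for start in range(0, L, hop):
        for k in range(len(g)):
            l = (start + k) % L
            y[l] += f[l] * g[k] * w[k]
    return y

def max_error(f, y):
    return max(abs(x - v) for x, v in zip(f, y))

n, L = 16, 48
g = [0.5 - 0.5 * math.cos(2 * math.pi * k / n) for k in range(n)]  # Hann window
f = [math.sin(0.3 * l) + 0.5 * math.cos(1.1 * l) for l in range(L)]

errors = {}
for hop in (4, 8):
    # OLA re-weighted with the analysis window, rescaled by its mean gain
    gain = sum(overlap_sum(g, g, hop)) / hop
    y_ola = [v / gain for v in resynthesize(f, g, g, hop)]
    # Synthesis with the dual window
    y_dual = resynthesize(f, g, dual_window(g, hop), hop)
    errors[hop] = (max_error(f, y_ola), max_error(f, y_dual))
    print(hop, errors[hop])
```

With a hop of a quarter of the window length, the squared Hann windows sum to a constant, so both syntheses reconstruct the signal up to rounding. With a hop of half the window length, only the dual-window synthesis does, while the OLA result carries a periodic amplitude ripple.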

Application:

This is currently being implemented in STx. There the phase vocoder will have the option to guarantee perfect reconstruction, either with dual or tight windows.

Partners:

Department of Mathematics, University of Wisconsin-Eau Claire