This paper contributes to the long-standing problem of representing a signal in the coordinates of time and frequency. As a starting point, we abandon Gabor's complex extension and re-evaluate the fundamental principles of time-frequency analysis. We provide a multicomponent signal model that enables a rigorous definition of instantaneous frequency on a per-component basis. Within our framework, all uncertainty about the latent signal is shifted to its quadrature. In this approach, uncertainty is not a fundamental limitation of the analysis, but rather a manifestation of the observer's limited view. With appropriate assumptions on the signal model, the instantaneous amplitude and instantaneous frequency can be obtained exactly, and hence an exact representation of the signal in the coordinates of time and frequency can be achieved. However, uncertainty now arises in making the correct assumptions, i.e.~in how to correctly choose the quadrature of the components.
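The role of the quadrature can be illustrated with a small numerical sketch. It is not taken from the paper: the test signal, the sampling rate `fs`, and the helper `ia_if` are assumptions chosen for illustration. Two complex signals share the same real part (the observation) but use different quadratures, and they yield different instantaneous amplitude (IA) and instantaneous frequency (IF).

```python
import numpy as np

fs = 1000.0                          # sampling rate in Hz (assumed)
t = np.arange(0, 1.0, 1.0 / fs)      # one second of samples

def ia_if(z, fs):
    """Return IA and IF (in Hz) of a complex-valued component z[n]."""
    ia = np.abs(z)                               # instantaneous amplitude
    phase = np.unwrap(np.angle(z))               # continuous phase in radians
    inst_freq = np.gradient(phase) * fs / (2 * np.pi)
    return ia, inst_freq

# Observed real signal: a slowly varying amplitude on a 100 Hz carrier.
a = 1.0 + 0.5 * np.cos(2 * np.pi * 5 * t)
theta = 2 * np.pi * 100 * t
x = a * np.cos(theta)

# Assumption A: the quadrature carries the same amplitude as the observation.
z_a = x + 1j * a * np.sin(theta)

# Assumption B: a different (hypothetical) quadrature with unit amplitude.
z_b = x + 1j * np.sin(theta)

ia_a, if_a = ia_if(z_a, fs)
ia_b, if_b = ia_if(z_b, fs)

# Both complex signals explain the same observation ...
same_observation = np.allclose(z_a.real, z_b.real)
# ... but only assumption A gives a constant 100 Hz IF and IA equal to a(t).
```

Under assumption A the IF is constant and the IA equals `a`, while under assumption B both fluctuate, even though the observed real signal is identical. In this toy setting, all ambiguity lies in the choice of the quadrature.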
The following work is copyrighted by the IEEE.
S. Sandoval, P. L. De Leon, and J. M. Liss, “Hilbert spectral analysis of vowels using intrinsic mode functions,” 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp. 569-575, Dec. 2015.
The official version can be obtained at DOI: 10.1109/ASRU.2015.7404846.
In recent work, we presented mathematical theory and algorithms for time-frequency analysis of non-stationary signals. In that work, we generalized the definition of the Hilbert spectrum by using a superposition of complex AM--FM components parameterized by the Instantaneous Amplitude (IA) and Instantaneous Frequency (IF). Using our Hilbert Spectral Analysis (HSA) approach, the IA and IF estimates can be far more accurate at revealing underlying signal structure than prior approaches to time-frequency analysis. In this paper, we apply HSA to speech and compare the results with both narrowband and wideband spectrograms. We demonstrate how the AM--FM components, assumed to be intrinsic mode functions, align well with the energy concentrations of the spectrograms, and we highlight fine structure present in the Hilbert spectrum. As an example, we show never-before-seen intra-glottal pulse phenomena that are not readily apparent in other analyses. Such fine-scale analyses may have application in speech-based medical diagnosis and automatic speech recognition (ASR) for pathological speakers.
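The idea of a superposition of complex AM--FM components, each parameterized by its IA and IF, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the components here are synthesized from known IA and IF rather than estimated from speech as intrinsic mode functions, and the names `am_fm_component`, `ia_if`, and `spectrum`, as well as all numeric values, are hypothetical.

```python
import numpy as np

fs = 8000.0                          # sampling rate in Hz (assumed)
t = np.arange(0, 0.5, 1.0 / fs)      # half a second of samples

def am_fm_component(ia, inst_freq_hz, fs):
    """Build a complex AM-FM component from its IA and IF (in Hz)."""
    # The phase is the running integral of the IF, approximated by a sum.
    phase = 2 * np.pi * np.cumsum(inst_freq_hz) / fs
    return ia * np.exp(1j * phase)

def ia_if(component, fs):
    """Recover IA and IF (in Hz) from a complex component."""
    ia = np.abs(component)
    phase = np.unwrap(np.angle(component))
    return ia, np.gradient(phase) * fs / (2 * np.pi)

# Two illustrative components with slowly varying IA and IF.
ia_1 = 1.0 + 0.3 * np.cos(2 * np.pi * 4 * t)
if_1 = 150.0 + 20.0 * np.sin(2 * np.pi * 3 * t)
ia_2 = 0.5 * np.ones_like(t)
if_2 = 700.0 - 200.0 * t

components = [am_fm_component(ia_1, if_1, fs),
              am_fm_component(ia_2, if_2, fs)]

# The real signal is the real part of the superposition.
x = np.real(sum(components))

# One (time, IA, IF) triple per component: each is a curve in the
# time-frequency plane whose height is the instantaneous amplitude.
spectrum = [(t, *ia_if(c, fs)) for c in components]
```

Plotting each triple as a line in the time-frequency plane, colored by IA, gives a picture comparable in spirit to a Hilbert spectrum. In practice the components would have to be estimated from the observed signal, which is the difficult part and is not shown here.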