Audio-Visual Speech Processing
Speech production and perception are inherently bimodal. Recently there has been increased interest in using the visual modality in combination with the conventional acoustic modality to improve speech processing. This field of study is known as audio-visual speech processing (AVSP). Traditional acoustic-based speech processing systems have attained a high level of performance in recent years, but that performance depends heavily on a match between training and test conditions. Under mismatched conditions (e.g. acoustic noise), the performance of acoustic speech processing applications can degrade markedly. The visual speech modality is unaffected by most degradations of the acoustic modality. This independence, along with the bimodal nature of speech, allows the visual speech modality to complement the acoustic speech modality. It is hoped that integrating the two modalities will aid in the creation of more robust and effective speech processing applications in the future.
Our research effort is concentrated on speech and speaker recognition. In particular, we have been actively researching feature extraction and classifier design for the visual speech modality. The problem of audio-visual integration is also an active component of our effort.
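
As a rough illustration of what audio-visual integration can involve, the sketch below shows weighted late (decision-level) fusion, in which separate audio and visual classifiers each score every candidate class and the scores are combined afterwards. This is a generic technique chosen for illustration and is not necessarily the strategy used in the papers listed on this page. The function names, the `audio_weight` parameter, and the example scores are all hypothetical.

```python
def fuse_scores(audio_loglik, visual_loglik, audio_weight):
    """Combine per-class log-likelihoods from two modalities.

    Weighted-sum late fusion (an assumed, generic scheme):
        fused = w * audio + (1 - w) * visual
    where w = audio_weight reflects how much the audio stream is trusted.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    if audio_loglik.keys() != visual_loglik.keys():
        raise ValueError("both modalities must score the same classes")
    return {
        label: audio_weight * audio_loglik[label]
        + (1.0 - audio_weight) * visual_loglik[label]
        for label in audio_loglik
    }


def classify(audio_loglik, visual_loglik, audio_weight):
    """Return the class label with the highest fused score."""
    fused = fuse_scores(audio_loglik, visual_loglik, audio_weight)
    return max(fused, key=fused.get)


# Hypothetical scores for two enrolled speakers (made-up numbers).
audio = {"alice": -12.0, "bob": -10.0}    # audio classifier favours bob
visual = {"alice": -8.0, "bob": -11.5}    # visual classifier favours alice

clean_decision = classify(audio, visual, audio_weight=0.8)  # trust audio
noisy_decision = classify(audio, visual, audio_weight=0.2)  # trust video
```

Lowering `audio_weight` when the acoustic channel is known to be noisy shifts the decision towards the visual classifier, which is the complementary behaviour described above. How such a weight should be chosen is a design question this sketch does not address.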

(Integration strategies in AVSP)
Related papers:
- S. Lucey, T. Chen, S. Sridharan, and V. Chandran, "Integration strategies for audio-visual speech processing: Applied to text dependent speaker recognition," IEEE Trans. on Multimedia, 2004. [similar technical report]
- S. Lucey, "An evaluation of visual speech features for the tasks of speech and speaker recognition," presented at International Conference on Audio- and Video-Based Person Authentication (AVBPA), pp. 260-267, Guildford, U.K., 2003. [similar technical report]
- S. Lucey and T. Chen, "Improved audio-visual speaker recognition via the use of a hybrid combination strategy," presented at International Conference on Audio- and Video-Based Person Authentication (AVBPA), pp. 929-936, Guildford, U.K., 2003. [similar technical report]
- S. Lucey, "Audio-visual speech processing," Ph.D. thesis, School of Electrical & Electronic Systems Engineering, Queensland University of Technology, Brisbane, 2002, 243 pp. [thesis]
(Page is still under construction)