Speaker identification from shouted speech: Analysis and compensation

Kinnunen, Tomi; Saeidi, Rahim; Pohjalainen, Jouni; Alku, Paavo

Files

Hanilçi_vd_2013.pdf (563.22 KB)

Date

2013

Authors

Hanilçi, Cemal

Ertaş, Figen

Authors

Kinnunen, Tomi

Saeidi, Rahim

Pohjalainen, Jouni

Alku, Paavo

Type

Proceedings Paper

Publisher

IEEE

Abstract

Text-independent speaker identification is studied using neutral and shouted speech in Finnish to analyze the effect of vocal mode mismatch between training and test utterances. Standard mel-frequency cepstral coefficient (MFCC) features with a Gaussian mixture model (GMM) recognizer are used for speaker identification. The results indicate that speaker identification accuracy drops from perfect (100 %) to 8.71 % under vocal mode mismatch. Because of this dramatic degradation in recognition accuracy, we propose a joint-density GMM mapping technique to compensate the MFCC features. This mapping is trained on a disjoint emotional speech corpus to create a completely speaker- and speech-mode-independent emotion-neutralizing mapping. With this compensation, the identification accuracy under mismatch increases from 8.71 % to 32.00 % without substantially degrading performance in the matched train-test conditions.
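
As an illustration only, the joint-density GMM mapping named in the abstract can be sketched as follows. The abstract does not give the configuration used in the paper, so this sketch assumes diagonal covariances, and the function names (`log_gauss_diag`, `map_frame`) and dictionary keys are hypothetical. Given a GMM fitted on joined source and target feature vectors, each source frame x is replaced by the conditional mean E[y | x], a posterior-weighted sum of per-component linear regressions. Training of the GMM (typically by EM on aligned frame pairs) is not shown.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def map_frame(x, components):
    """Map one source feature vector x towards the target domain.

    Each component is a dict with hypothetical keys (per-dimension lists,
    except the scalar weight): weight, mean_x, mean_y, var_x, cov_yx.
    Returns E[y | x] under the joint GMM (diagonal-covariance assumption).
    """
    # Component posteriors P(k | x), computed from the source marginal.
    log_post = [
        math.log(c["weight"]) + log_gauss_diag(x, c["mean_x"], c["var_x"])
        for c in components
    ]
    top = max(log_post)  # subtract the maximum for numerical stability
    post = [math.exp(lp - top) for lp in log_post]
    total = sum(post)
    post = [p / total for p in post]

    # Posterior-weighted sum of per-component linear regressions.
    y = [0.0] * len(x)
    for p, c in zip(post, components):
        for d in range(len(x)):
            slope = c["cov_yx"][d] / c["var_x"][d]
            y[d] += p * (c["mean_y"][d] + slope * (x[d] - c["mean_x"][d]))
    return y


# Toy parameters, invented for illustration (not taken from the paper).
toy_components = [
    {"weight": 0.5, "mean_x": [0.0, 0.0], "mean_y": [1.0, 1.0],
     "var_x": [1.0, 1.0], "cov_yx": [0.5, 0.5]},
    {"weight": 0.5, "mean_x": [10.0, 10.0], "mean_y": [-1.0, -1.0],
     "var_x": [1.0, 1.0], "cov_yx": [0.0, 0.0]},
]

mapped = map_frame([0.0, 0.0], toy_components)
```

In a full system the mapped frames, rather than the raw MFCCs, would be scored against the per-speaker GMMs. A full-covariance variant would replace the per-dimension slope with a matrix product of the cross-covariance and the inverse source covariance.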

Description

This work was presented as a paper at the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), held in Vancouver, Canada, on 26-31 May 2013.

Keywords

Acoustics, Engineering, Speaker identification, Shouted speech, Loudspeakers, Mapping, Signal processing, Speech, Emotional speech, Gaussian mixture model, Identification accuracy, Mapping techniques, Mel-frequency cepstral coefficients, Recognition accuracy, Text-independent speaker identification, Speech recognition

Citation

Hanilçi, C. et al. (2013). “Speaker identification from shouted speech: Analysis and compensation”. International Conference on Acoustics Speech and Signal Processing ICASSP, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, 8027-8031.

URI

https://doi.org/10.1109/ICASSP.2013.6639228
http://hdl.handle.net/11452/32501

Collections

İndeksli Yayınlar / Indexed Publications
