Publication:
Speaker identification from shouted speech: Analysis and compensation

dc.contributor.author: Kinnunen, Tomi
dc.contributor.author: Saeidi, Rahim
dc.contributor.author: Pohjalainen, Jouni
dc.contributor.author: Alku, Paavo
dc.contributor.buuauthor: Hanilçi, Cemal
dc.contributor.buuauthor: Ertaş, Figen
dc.contributor.department: Mühendislik Fakültesi
dc.contributor.department: Elektrik Elektronik Mühendisliği Bölümü
dc.contributor.researcherid: AAH-4188-2021
dc.contributor.researcherid: S-4967-2016
dc.contributor.scopusid: 35781455400
dc.contributor.scopusid: 24724154500
dc.date.accessioned: 2023-05-03T10:43:45Z
dc.date.available: 2023-05-03T10:43:45Z
dc.date.issued: 2013
dc.description: This work was presented as a conference paper at the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), held in Vancouver, Canada, May 26-31, 2013.
dc.description.abstract: Text-independent speaker identification is studied using neutral and shouted speech in Finnish to analyze the effect of vocal mode mismatch between training and test utterances. Standard mel-frequency cepstral coefficient (MFCC) features with a Gaussian mixture model (GMM) recognizer are used for speaker identification. The results indicate that speaker identification accuracy drops from perfect (100%) to 8.71% under vocal mode mismatch. Because of this dramatic degradation in recognition accuracy, we propose a joint-density GMM mapping technique to compensate the MFCC features. This mapping is trained on a disjoint emotional speech corpus to create a completely speaker- and speech-mode-independent emotion-neutralizing mapping. The compensation raises the 8.71% identification accuracy to 32.00% without much degradation under matched train-test conditions.
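The MFCC-plus-GMM identification pipeline named in the abstract can be sketched as follows. This is an illustrative toy example, not the authors' implementation: it uses scikit-learn's `GaussianMixture` and synthetic Gaussian features standing in for real MFCC vectors, and the speaker names and `fake_utterance` helper are invented for the demo.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Stand-in "MFCC" features: in a real system these would be 13-dimensional
# cepstral vectors extracted frame-by-frame from audio.
def fake_utterance(mean, n_frames=200, dim=13):
    return rng.normal(loc=mean, scale=1.0, size=(n_frames, dim))

# Enrollment: train one GMM per speaker on that speaker's feature frames.
# (Speaker labels and per-speaker feature means are hypothetical.)
speakers = {"spk1": 0.0, "spk2": 3.0}
models = {}
for spk, mean in speakers.items():
    gmm = GaussianMixture(n_components=4, covariance_type="diag", random_state=0)
    gmm.fit(fake_utterance(mean))
    models[spk] = gmm

# Identification: pick the speaker whose GMM assigns the test utterance
# the highest average per-frame log-likelihood.
def identify(features):
    return max(models, key=lambda spk: models[spk].score(features))

print(identify(fake_utterance(3.0)))  # prints "spk2"
```

Under vocal-mode mismatch (shouted test speech against neutrally trained GMMs), the test features drift away from every enrolled model, which is the degradation the paper's joint-density GMM mapping compensates for by transforming the features before scoring.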
dc.description.sponsorship: Institute of Electrical and Electronics Engineers
dc.description.sponsorship: IEEE Signal Processing Society
dc.identifier.citation: Hanilçi, C. et al. (2013). "Speaker identification from shouted speech: Analysis and compensation". 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 8027-8031.
dc.identifier.endpage: 8031
dc.identifier.issn: 1520-6149
dc.identifier.scopus: 2-s2.0-84890452416
dc.identifier.startpage: 8027
dc.identifier.uri: https://doi.org/10.1109/ICASSP.2013.6639228
dc.identifier.uri: http://hdl.handle.net/11452/32501
dc.identifier.wos: 000329611508038
dc.indexed.wos: CPCIS
dc.language.iso: en
dc.publisher: IEEE
dc.relation.collaboration: International
dc.relation.journal: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
dc.relation.publicationcategory: Conference Item - International
dc.rights: info:eu-repo/semantics/openAccess
dc.subject: Acoustics
dc.subject: Engineering
dc.subject: Speaker identification
dc.subject: Shouted speech
dc.subject: Loudspeakers
dc.subject: Mapping
dc.subject: Signal processing
dc.subject: Speech
dc.subject: Emotional speech
dc.subject: Gaussian mixture model
dc.subject: Identification accuracy
dc.subject: Mapping techniques
dc.subject: Mel-frequency cepstral coefficients
dc.subject: Recognition accuracy
dc.subject: Text-independent speaker identification
dc.subject: Speech recognition
dc.subject.scopus: Whispers; Speech Recognition; Public Speaking
dc.subject.wos: Acoustics
dc.subject.wos: Engineering, electrical & electronic
dc.title: Speaker identification from shouted speech: Analysis and compensation
dc.type: Proceedings Paper
dspace.entity.type: Publication
local.contributor.department: Mühendislik Fakültesi/Elektrik Elektronik Mühendisliği Bölümü
local.indexed.at: Scopus
local.indexed.at: WOS

Files

Original bundle

Name:
Hanilçi_vd_2013.pdf
Size:
563.22 KB
Format:
Adobe Portable Document Format
Description:

License bundle

Name:
license.txt
Size:
1.71 KB
Format:
Item-specific license agreed upon at submission
Description: