Publication:
Using group delay functions from all-pole models for speaker recognition

dc.contributor.authorRajan, Padmanabhan
dc.contributor.authorKinnunen, Tomi H.
dc.contributor.authorPohjalainen, Jouni
dc.contributor.authorAlku, Paavo
dc.contributor.authorBimbot, F.
dc.contributor.authorCerisara, C.
dc.contributor.authorFougeron, C.
dc.contributor.authorGravier, G.
dc.contributor.authorLamel, L.
dc.contributor.authorPellegrino, F.
dc.contributor.authorPerrier, P.
dc.contributor.buuauthorHanilçi, Cemal
dc.contributor.departmentFaculty of Engineering
dc.contributor.departmentDepartment of Electrical and Electronics Engineering
dc.contributor.researcheridS-4967-2016tr_TR
dc.contributor.scopusid35781455400tr_TR
dc.date.accessioned2022-12-30T11:58:03Z
dc.date.available2022-12-30T11:58:03Z
dc.date.issued2013
dc.descriptionThis study was presented as a paper at the 14th Annual Conference of the International Speech Communication Association [Interspeech 2013], held in Lyon [France] on 25-29 August 2013.tr_TR
dc.description.abstractPopular features for speech processing, such as mel-frequency cepstral coefficients (MFCCs), are derived from the short-term magnitude spectrum, whereas the phase spectrum remains unused. While the common argument for using only the magnitude spectrum is that the human ear is phase-deaf, phase-based features have remained less explored due to the additional signal processing difficulties they introduce. A useful representation of the phase is the group delay function, but its robust computation remains difficult. This paper advocates the use of group delay functions derived from parametric all-pole models instead of their direct computation from the discrete Fourier transform. Using a subset of the vocal effort data in the NIST 2010 speaker recognition evaluation (SRE) corpus, we show that group delay features derived via parametric all-pole models improve recognition accuracy, especially under high vocal effort. Additionally, the group delay features provide comparable or improved accuracy over conventional magnitude-based MFCC features. Thus, the use of group delay functions derived from all-pole models provides an effective way to utilize information from the phase spectrum of speech signals.en_US
dc.description.sponsorshipAcademy of Finland (253120)en_US
dc.description.sponsorshipInt Speech Commun Associationen_US
dc.description.sponsorshipAmazonen_US
dc.description.sponsorshipMicrosoften_US
dc.description.sponsorshipGoogleen_US
dc.description.sponsorshipTcL SYTRALen_US
dc.description.sponsorshipEuropean Language Resources Associationen_US
dc.description.sponsorshipQuaeroen_US
dc.description.sponsorshipImaginoveen_US
dc.description.sponsorshipVOCAPIA Researchen_US
dc.description.sponsorshipAcapelaen_US
dc.description.sponsorshipSpeech Oceanen_US
dc.description.sponsorshipALDEBARANen_US
dc.description.sponsorshipOrangeen_US
dc.description.sponsorshipVecsysen_US
dc.description.sponsorshipIBM Researchen_US
dc.description.sponsorshipRaytheon BBN Technologyen_US
dc.description.sponsorshipVoxygenen_US
dc.identifier.citationRajan, P. et al. (2013). "Using group delay functions from all-pole models for speaker recognition". 14th Annual Conference of the International Speech Communication Association (Interspeech 2013), 1-5, 2488-2492.en_US
dc.identifier.endpage2492tr_TR
dc.identifier.issn2308-457X
dc.identifier.scopus2-s2.0-84906257507tr_TR
dc.identifier.startpage2488tr_TR
dc.identifier.urihttp://faculty.iitmandi.ac.in/~padman/papers/padman_gdAllPole_interspeech2013.pdf
dc.identifier.urihttp://hdl.handle.net/11452/30193
dc.identifier.volume1-5tr_TR
dc.identifier.wos000395050001036
dc.indexed.scopusScopusen_US
dc.indexed.wosCPCISen_US
dc.language.isoenen_US
dc.publisherIsc-Int Speech Communication Associationen_US
dc.relation.journal14th Annual Conference of the International Speech Communication Association (Interspeech 2013)en_US
dc.relation.publicationcategoryConference Item - Internationaltr_TR
dc.rightsinfo:eu-repo/semantics/openAccessen_US
dc.subjectComputer scienceen_US
dc.subjectEngineeringen_US
dc.subjectSpeaker verificationen_US
dc.subjectGroup delay functionsen_US
dc.subjectHigh vocal efforten_US
dc.subjectAdditive noiseen_US
dc.subjectVerificationen_US
dc.subjectDiscrete Fourier transformsen_US
dc.subjectGroup delayen_US
dc.subjectPolesen_US
dc.subjectSignal processingen_US
dc.subjectSpeech processingen_US
dc.subjectDirect computationsen_US
dc.subjectMel-frequency cepstral coefficientsen_US
dc.subjectRecognition accuracyen_US
dc.subjectSpeaker recognitionen_US
dc.subjectSpeaker recognition evaluationsen_US
dc.subjectVocal effortsen_US
dc.subjectSpeech recognitionen_US
dc.subject.scopusSpeaker Verification; Speech Enhancement; Attacken_US
dc.subject.wosComputer science, artificial intelligenceen_US
dc.subject.wosEngineering, electrical & electronicen_US
dc.titleUsing group delay functions from all-pole models for speaker recognitionen_US
dc.typeProceedings Paper
dspace.entity.typePublication
local.contributor.departmentFaculty of Engineering/Department of Electrical and Electronics Engineeringtr_TR
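
The abstract contrasts group delay computed directly from the discrete Fourier transform with group delay derived from a parametric all-pole model. A minimal Python sketch of the all-pole route follows. It assumes plain linear prediction by the autocorrelation method, which may differ from the all-pole models used in the paper, and it stops at the group delay function itself without any further feature post-processing. The function names `lpc_autocorrelation` and `allpole_group_delay`, the Hamming window, and the frequency grid are illustrative choices, not taken from this record.

```python
import numpy as np


def lpc_autocorrelation(frame, order):
    """Fit A(z) = 1 + a_1 z^-1 + ... + a_p z^-p to one frame.

    Uses the autocorrelation method (an assumption of this sketch):
    solve sum_k a_k r(|i - k|) = -r(i) for i = 1..p.
    """
    windowed = frame * np.hamming(len(frame))
    n = len(windowed)
    # Autocorrelation lags 0..order (zero lag sits at index n - 1).
    r = np.correlate(windowed, windowed, mode="full")[n - 1:n + order]
    # Toeplitz autocorrelation matrix of size order x order.
    R = np.array([[r[abs(i - j)] for j in range(order)]
                  for i in range(order)])
    coeffs = np.linalg.solve(R, -r[1:order + 1])
    return np.concatenate(([1.0], coeffs))


def allpole_group_delay(a, n_freqs=257):
    """Group delay (in samples) of H(z) = 1 / A(z) on [0, pi].

    With A(w) = sum_k a_k exp(-j w k) and B(w) = sum_k k a_k exp(-j w k),
    the group delay of A is Re(B / A), so that of 1 / A is -Re(B / A).
    """
    a = np.asarray(a, dtype=float)
    k = np.arange(len(a))
    w = np.linspace(0.0, np.pi, n_freqs)
    E = np.exp(-1j * np.outer(w, k))  # one row per frequency
    A = E @ a
    B = E @ (k * a)
    return w, -np.real(B / A)
```

In use, a speech frame would be passed to `lpc_autocorrelation` with a chosen model order, and the resulting coefficients to `allpole_group_delay`. Because the autocorrelation method yields a minimum-phase A(z), the denominator does not vanish on the unit circle, which avoids the spikes that make the direct DFT-based computation fragile.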

Files

Original bundle

Name:
Hanilci_vd_2013.pdf
Size:
123.35 KB
Format:
Adobe Portable Document Format

License bundle

Name:
license.txt
Size:
1.71 KB
Format:
Item-specific license agreed upon to submission