Publication:
Boosting audio replay attack detection using silence-based blind channel impulse response estimation

dc.contributor.author: Bekiryazıcı, Şule
dc.contributor.author: Hanilçi, Cemal
dc.contributor.author: Özcan, Neyir
dc.contributor.buuauthor: BEKİRYAZICI, ŞULE
dc.contributor.buuauthor: ÖZCAN SEMERCİ, NEYİR
dc.contributor.department: Faculty of Engineering
dc.contributor.department: Department of Electrical and Electronics Engineering
dc.contributor.scopusid: 57221818392
dc.contributor.scopusid: 7003726676
dc.date.accessioned: 2025-11-28T08:01:50Z
dc.date.issued: 2026-01-01
dc.description.abstract: Replay attacks pose a major threat to automatic speaker verification (ASV) systems, considerably degrading performance. Since replayed utterances are captured and reproduced using external microphones and speakers, they inherently reflect these acoustic influences. Such acoustic distortions serve as valuable cues for differentiating between genuine and spoofed speech, provided they can be effectively extracted and modeled. In this context, blind channel impulse response estimation has been shown to be an effective approach in replay attack detection, as it enables the characterization of the acoustic path through which the signal has propagated without requiring explicit knowledge of the original source or environment. Furthermore, prior studies have highlighted the importance of silence segments in this task, noting that these regions, being free of speech content, primarily capture the characteristics of the transmission channel. As such, silence segments offer a unique and robust opportunity for extracting channel-related features that are less influenced by speaker variability and phonetic content, thereby improving the discriminability between bona fide and replayed signals. In this paper, we argue that channel impulse response estimates derived from silence parts contain more discriminative information than those obtained from the entire signal or voiced parts. To exploit this insight, we propose to use the log-magnitude channel frequency response estimated from the silence parts for replay attack detection. Experiments on the ASVspoof 2019 and 2021 datasets show that utilizing silence-based channel response features reduces the EER from 4.21% to 3.17% and from 29.16% to 24.43%, respectively, compared to using the entire signal.
dc.identifier.doi: 10.1007/978-3-032-07956-5_24
dc.identifier.endpage: 344
dc.identifier.isbn: 9783032079558
dc.identifier.issn: 0302-9743
dc.identifier.scopus: 2-s2.0-105020240587
dc.identifier.startpage: 333
dc.identifier.uri: https://hdl.handle.net/11452/56868
dc.identifier.volume: 16187 LNCS
dc.indexed.scopus: Scopus
dc.language.iso: en
dc.publisher: Springer
dc.relation.journal: Lecture Notes in Computer Science
dc.relation.tubitak: 123E384
dc.rights: info:eu-repo/semantics/closedAccess
dc.subject: ResNet
dc.subject: Replay attack detection
dc.subject: ASVspoof 2021
dc.subject: ASVspoof 2019
dc.subject.scopus: Countermeasures Against Speech Spoofing Attacks
dc.title: Boosting audio replay attack detection using silence-based blind channel impulse response estimation
dc.type: Conference Paper
dspace.entity.type: Publication
local.contributor.department: Faculty of Engineering/Department of Electrical and Electronics Engineering
local.indexed.at: Scopus
relation.isAuthorOfPublication: 70e6a885-bf09-46ed-80a2-a21f17b94e40
relation.isAuthorOfPublication: 10af6085-3f72-4edc-84b9-01c6b1d121f7
relation.isAuthorOfPublication.latestForDiscovery: 70e6a885-bf09-46ed-80a2-a21f17b94e40
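
Illustrative note: the abstract describes computing a log-magnitude channel frequency response from the silence parts of an utterance. The record does not specify the blind channel estimator or the silence detector used in the paper, so the following Python sketch is only a rough stand-in under stated assumptions: silence is approximated by the lowest-energy frames, and the channel response is approximated by averaging log-magnitude spectra over those frames. The function names, frame sizes, and the energy quantile are hypothetical choices made for illustration.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Slice a 1-D signal into overlapping frames of length frame_len."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def silence_log_magnitude(x, frame_len=512, hop=256, quantile=0.2, eps=1e-10):
    """Average log-magnitude spectrum over the lowest-energy frames.

    Assumption: frames whose energy is at or below the given quantile are
    treated as silence. This is a simple energy heuristic, not the silence
    detection or channel estimation method used in the paper.
    """
    frames = frame_signal(np.asarray(x, dtype=float), frame_len, hop)
    energy = np.sum(frames ** 2, axis=1)
    threshold = np.quantile(energy, quantile)
    silent = frames[energy <= threshold]

    # Window each silent frame and take the magnitude of its one-sided FFT.
    window = np.hanning(frame_len)
    spectra = np.abs(np.fft.rfft(silent * window, axis=1))

    # Averaging log spectra across frames gives a crude estimate of the
    # stationary (channel-like) component of the recording.
    return np.mean(np.log(spectra + eps), axis=0)
```

With the default frame length of 512 samples, the function returns a vector of 257 values (one per frequency bin) that could serve as input features to a classifier. Setting `quantile=1.0` uses every frame, which gives a whole-signal baseline for comparison.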
