Publication:
Effective early termination techniques for text similarity join operator

dc.contributor.authorUlusoy, Özgür
dc.contributor.authorYolum, Pınar
dc.contributor.authorGüngör, T.
dc.contributor.authorGürgen, Fikret
dc.contributor.authorÖzturan, Can
dc.contributor.buuauthorÖzalp, Selma Ayşe
dc.contributor.departmentMühendislik Fakültesi
dc.contributor.departmentEndüstri Mühendisliği Bölümü
dc.contributor.orcid0000-0001-9201-6349
dc.contributor.researcheridG-1584-2018
dc.contributor.researcheridI-9828-2018
dc.contributor.scopusid6603978393
dc.date.accessioned2022-03-21T06:13:03Z
dc.date.available2022-03-21T06:13:03Z
dc.date.issued2005
dc.descriptionThis work was presented as a paper at the 20th International Symposium on Computer and Information Sciences, held in Istanbul, Turkey, on 26-28 October 2005.
dc.description.abstractThe text similarity join operator joins two relations if their join attributes are textually similar to each other. It has a variety of application domains, including integration and querying of data from heterogeneous resources, data cleansing, and data mining. Although the text similarity join operator is widely used, its processing is expensive due to the huge number of similarity computations performed. In this paper, we incorporate several shortcut evaluation techniques from the Information Retrieval domain, namely the Harman, quit, continue, and maximal similarity filter heuristics, into previously proposed text similarity join algorithms to reduce the number of similarity computations needed during the join operation. We experimentally evaluate the original and the heuristic-based similarity join algorithms using real data obtained from the DBLP Bibliography database, and observe performance improvements with the continue and maximal similarity filter heuristics.
dc.description.sponsorshipInst Elec & Elect Engineers, Turkey Sect
dc.description.sponsorshipBoğaziçi Üniversitesi
dc.identifier.citationÖzalp, S. A. and Ulusoy, Ö. (2005). "Effective early termination techniques for text similarity join operator". ed. P. Yolum et al. Computer and Information Sciences (ISCIS 2005) - Lecture Notes in Computer Science, 3733, 791-801.
dc.identifier.endpage801
dc.identifier.isbn3-540-29414-7
dc.identifier.issn0302-9743
dc.identifier.issn1611-3349
dc.identifier.scopus2-s2.0-33646503003
dc.identifier.startpage791
dc.identifier.urihttps://doi.org/10.1007/11569596_81
dc.identifier.urihttps://link.springer.com/chapter/10.1007/11569596_81
dc.identifier.urihttp://hdl.handle.net/11452/25198
dc.identifier.volume3733
dc.identifier.wos000234179600079
dc.indexed.wosSCIE
dc.indexed.wosCPCIS
dc.language.isoen
dc.publisherSpringer
dc.relation.collaborationDomestic
dc.relation.journalComputer and Information Sciences (ISCIS 2005) - Lecture Notes in Computer Science
dc.relation.publicationcategoryArticle - International Refereed Journal
dc.relation.tubitak100U024
dc.rightsinfo:eu-repo/semantics/openAccess
dc.subjectComputer science
dc.subjectMetadata
dc.subjectBibliographic retrieval systems
dc.subjectComputation theory
dc.subjectComputer operating procedures
dc.subjectData mining
dc.subjectData reduction
dc.subjectInformation retrieval
dc.subjectIntegration
dc.subjectQuery languages
dc.subjectApplication domains
dc.subjectData querying
dc.subjectFilter heuristics
dc.subjectText similarity
dc.subjectText processing
dc.subject.scopusInverted Index; Query Processing; Caching
dc.subject.wosComputer science, information systems
dc.subject.wosComputer science, theory & methods
dc.titleEffective early termination techniques for text similarity join operator
dc.typeArticle
dc.wos.quartileQ4
dspace.entity.typePublication
local.contributor.departmentMühendislik Fakültesi/Endüstri Mühendisliği Bölümü
local.indexed.atScopus
local.indexed.atWOS

Files

Original bundle

Name:
Özalp_Ulusoy_2005.pdf
Size:
55.93 KB
Format:
Adobe Portable Document Format

License bundle

Name:
license.txt
Size:
1.71 KB
Format:
Item-specific license agreed upon to submission