Publication:
Unsupervised learning from multi-dimensional data: A fast clustering algorithm utilizing canopies and statistical information

dc.contributor.buuauthor: Özcan, Giyasettin
dc.contributor.department: Faculty of Engineering
dc.contributor.department: Department of Computer Engineering
dc.contributor.orcid: 0000-0002-1166-5919
dc.contributor.researcherid: Z-1130-2018
dc.contributor.scopusid: 15770103700
dc.date.accessioned: 2024-01-23T05:42:38Z
dc.date.available: 2024-01-23T05:42:38Z
dc.date.issued: 2018-05
dc.description.abstract: In this study, we consider the problem of unsupervised learning from multi-dimensional datasets. In particular, we consider k-means clustering, which requires long execution times on multi-dimensional datasets. To speed up clustering while preserving accuracy, we introduce a new algorithm that we term Canopy+. The algorithm utilizes canopies and statistical techniques, and its efficient initialization and normalization methodologies contribute to the improvement. Furthermore, we consider early termination of the clustering computation, provided that an intermediate result of the computation is accurate enough. We compared our algorithm with four popular clustering algorithms. The results indicate that our algorithm speeds up the clustering computation by at least 2X. We also analyzed the contribution of early termination. The results show that a further 2X improvement can be obtained while incurring a 0.1% error rate. We also observe that our Canopy+ algorithm benefits from early termination, which introduces an extra 1.2X performance improvement.
dc.description.sponsorship: Dumlupınar Üniversitesi - BAP-2012-34
dc.identifier.citation: Özcan, G. (2018). "Unsupervised learning from multi-dimensional data: A fast clustering algorithm utilizing canopies and statistical information". International Journal of Information Technology and Decision Making, 17(3), 841-856.
dc.identifier.doi: 10.1142/S0219622018500141
dc.identifier.eissn: 1793-6845
dc.identifier.endpage: 856
dc.identifier.issn: 0219-6220
dc.identifier.issue: 3
dc.identifier.scopus: 2-s2.0-85045117701
dc.identifier.startpage: 841
dc.identifier.uri: https://www.worldscientific.com/doi/abs/10.1142/S0219622018500141
dc.identifier.uri: https://hdl.handle.net/11452/39231
dc.identifier.volume: 17
dc.identifier.wos: 000446157500006
dc.indexed.wos: SCIE
dc.language.iso: en
dc.publisher: World Scientific Publ Co Pte Ltd
dc.relation.journal: International Journal of Information Technology and Decision Making
dc.relation.publicationcategory: Article - International Refereed Journal
dc.rights: info:eu-repo/semantics/closedAccess
dc.subject: Computer science
dc.subject: Operations research & management science
dc.subject: Data mining
dc.subject: Multi-dimensional datasets
dc.subject: K-means clustering
dc.subject: Canopies
dc.subject: Normalization
dc.subject: Early termination
dc.subject.scopus: Similarity Search; Join; Data Cleaning
dc.subject.wos: Computer science, artificial intelligence
dc.subject.wos: Computer science, information systems
dc.subject.wos: Computer science, interdisciplinary applications
dc.subject.wos: Operations research & management science
dc.title: Unsupervised learning from multi-dimensional data: A fast clustering algorithm utilizing canopies and statistical information
dc.type: Article
dc.wos.quartile: Q2
dspace.entity.type: Publication
local.contributor.department: Faculty of Engineering/Department of Computer Engineering
local.indexed.at: PubMed
local.indexed.at: Scopus
