Publication: Unsupervised learning from multi-dimensional data: A fast clustering algorithm utilizing canopies and statistical information
| dc.contributor.buuauthor | Özcan, Giyasettin | |
| dc.contributor.department | Faculty of Engineering | |
| dc.contributor.department | Department of Computer Engineering | |
| dc.contributor.orcid | 0000-0002-1166-5919 | |
| dc.contributor.researcherid | Z-1130-2018 | |
| dc.contributor.scopusid | 15770103700 | |
| dc.date.accessioned | 2024-01-23T05:42:38Z | |
| dc.date.available | 2024-01-23T05:42:38Z | |
| dc.date.issued | 2018-05 | |
| dc.description.abstract | In this study, we consider the problem of unsupervised learning from multi-dimensional datasets. In particular, we consider k-means clustering, which requires long execution times on multi-dimensional datasets. To speed up clustering while preserving accuracy, we introduce a new algorithm that we term Canopy+. The algorithm utilizes canopies and statistical techniques. Its efficient initialization and normalization methodologies also contribute to the improvement. Furthermore, we consider early termination of the clustering computation, provided that an intermediate result of the computation is accurate enough. We compared our algorithm with four popular clustering algorithms. The results indicate that our algorithm speeds up the clustering computation by at least 2X. We also analyzed the contribution of early termination. The results show that a further 2X improvement can be obtained while incurring a 0.1% error rate. We also observe that our Canopy+ algorithm benefits from early termination, which yields an additional 1.2X performance improvement. | |
| dc.description.sponsorship | Dumlupınar University - BAP-2012-34 | |
| dc.identifier.citation | Özcan, G. (2018). "Unsupervised learning from multi-dimensional data: A fast clustering algorithm utilizing canopies and statistical information". International Journal of Information Technology and Decision Making, 17(3), 841-856. | |
| dc.identifier.doi | 10.1142/S0219622018500141 | |
| dc.identifier.eissn | 1793-6845 | |
| dc.identifier.endpage | 856 | |
| dc.identifier.issn | 0219-6220 | |
| dc.identifier.issue | 3 | |
| dc.identifier.scopus | 2-s2.0-85045117701 | |
| dc.identifier.startpage | 841 | |
| dc.identifier.uri | https://www.worldscientific.com/doi/abs/10.1142/S0219622018500141 | |
| dc.identifier.uri | https://hdl.handle.net/11452/39231 | |
| dc.identifier.volume | 17 | |
| dc.identifier.wos | 000446157500006 | |
| dc.indexed.wos | SCIE | |
| dc.language.iso | en | |
| dc.publisher | World Scientific Publ Co Pte Ltd | |
| dc.relation.journal | International Journal of Information Technology and Decision Making | |
| dc.relation.publicationcategory | Article - International Refereed Journal | |
| dc.rights | info:eu-repo/semantics/closedAccess | |
| dc.subject | Computer science | |
| dc.subject | Operations research & management science | |
| dc.subject | Data mining | |
| dc.subject | Multi-dimensional datasets | |
| dc.subject | K-means clustering | |
| dc.subject | Canopies | |
| dc.subject | Normalization | |
| dc.subject | Early termination | |
| dc.subject.scopus | Similarity Search; Join; Data Cleaning | |
| dc.subject.wos | Computer science, artificial intelligence | |
| dc.subject.wos | Computer science, information systems | |
| dc.subject.wos | Computer science, interdisciplinary applications | |
| dc.subject.wos | Operations research & management science | |
| dc.title | Unsupervised learning from multi-dimensional data: A fast clustering algorithm utilizing canopies and statistical information | |
| dc.type | Article | |
| dc.wos.quartile | Q2 | |
| dspace.entity.type | Publication | |
| local.contributor.department | Faculty of Engineering/Department of Computer Engineering | |
| local.indexed.at | PubMed | |
| local.indexed.at | Scopus |
