Publication:
Analysis of large data logs: an application of Poisson sampling on excite web queries

dc.contributor.buuauthorÖzmutlu, H. Cenk
dc.contributor.buuauthorSpink, A.
dc.contributor.buuauthorÖzmutlu, Seda
dc.contributor.departmentMühendislik Fakültesi
dc.contributor.researcheridAAH-4480-2021
dc.contributor.researcheridABH-5209-2020
dc.date.accessioned2021-07-06T08:57:29Z
dc.date.available2021-07-06T08:57:29Z
dc.date.issued2002-07
dc.description.abstractSearch engines are the gateway for users to retrieve information from the Web. There is a crucial need for tools that allow effective analysis of search engine queries to provide a greater understanding of Web users' information seeking behavior. The objective of the study is to develop an effective strategy for the selection of samples from large-scale data sets. Millions of queries are submitted to Web search engines daily and new sampling techniques are required to bring these databases to a manageable size, while preserving the statistically representative characteristics of the entire data set. This paper reports results from a study using data logs from the Excite Web search engine. We use Poisson sampling to develop a sampling strategy, and show how sample sets selected by Poisson sampling statistically effectively represent the characteristics of the entire dataset. In addition, this paper discusses the use of Poisson sampling in continuous monitoring of stochastic processes, such as Web site dynamics.
dc.identifier.citationÖzmutlu, H. C. et al. (2002). "Analysis of large data logs: an application of Poisson sampling on excite web queries". Information Processing & Management, 38(4), 473-490.
dc.identifier.endpage490
dc.identifier.issn0306-4573
dc.identifier.issue4
dc.identifier.scopus2-s2.0-0036643012
dc.identifier.startpage473
dc.identifier.urihttps://doi.org/10.1016/S0306-4573(01)00043-7
dc.identifier.urihttps://www.sciencedirect.com/science/article/pii/S0306457301000437
dc.identifier.urihttp://hdl.handle.net/11452/21118
dc.identifier.volume38
dc.identifier.wos000175479100002
dc.indexed.scopusScopus
dc.indexed.wosSCIE
dc.indexed.wosSSCI
dc.language.isoen
dc.publisherPergamon-Elsevier Science
dc.relation.journalInformation Processing & Management
dc.relation.publicationcategoryArticle - International Refereed Journal
dc.rightsinfo:eu-repo/semantics/closedAccess
dc.subjectComputer science
dc.subjectInformation science & library science
dc.subjectPoisson sampling
dc.subjectUsers
dc.subjectLarge-scale in depth data analysis
dc.subjectWeb user modeling
dc.subjectSearch engine queries
dc.subjectData mining
dc.subject.wosComputer science
dc.subject.wosInformation systems
dc.subject.wosInformation science & library science
dc.titleAnalysis of large data logs: an application of Poisson sampling on excite web queries
dc.typeArticle
dc.wos.quartileQ1
dspace.entity.typePublication
local.contributor.departmentMühendislik Fakültesi
local.indexed.atWOS
local.indexed.atScopus
