A clickstream-based web page significance ranking metric for web crawlers
Main Authors: | , |
---|---|
Format: | Conference or Workshop Item |
Published: | 2011 |
Online Access: | http://eprints.utm.my/id/eprint/45469/ http://dx.doi.org/10.1109/MySEC.2011.6140674 |
Summary: | The unpredictable, rapid growth of the World Wide Web and its non-static nature cause considerable obstacles for Web crawlers, including incorrect and irrelevant answers in search result sets and scaling issues. Hence, more promising solutions are in demand to provide more accurate search outcomes. Implementing existing Web page importance metrics, whether link-based or context-based, within a parallel crawler cannot fully address the coverage of authorized fresh Web content or the accuracy concerns, so employing these metrics is not the final approach within search engine architecture. This paper proposes an analysis of clickstream data to discover the popularity of Web pages in the crawl frontier, introducing the metric itself and presenting experimental results on ranking the UTM Web pages with the proposed metric. |
---|