Efficient Data Cleaning Algorithm, Innovative and Rapid Unique User Identification Algorithm Using Modified Hashing and Binary Search Techniques for Web Usage Mining

Ranjena Sriram and Dr. S. Sheeja


Abstract:

This study proposes new data cleaning and unique user identification algorithms for Web Usage Mining, which discovers and analyses users' access patterns by mining the log files or log databases, and the associated data, of a particular website. Pre-processing cleans the data, and the user identification process identifies unique users. Since the number of users interacting with websites around the world is increasing day by day, the data generated and the information gathered could help organizations improve their business according to their customers' needs and behavior. The hash function proposed in previous work to identify unique users lacked, to some extent, the speed and accuracy needed to search for an IP address and compare it with the IP addresses already present in the web server log. To address this issue, this work fine-tunes the existing hash function by including additional factors. The modified hashing function used in the User Identification Algorithm is evaluated against existing algorithms to demonstrate its accuracy and efficiency. The modified Unique User Identification Algorithm is evaluated on various datasets from Murdoch University; Emirates College of Management and Information Technology, United Arab Emirates; and Nehru College of Arts and Science. Comparative analyses against other related works and algorithms are also carried out to demonstrate the efficiency of the proposed work. Significant results are obtained, which support the validity of this work.
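
The abstract does not specify the modified hash function or the exact cleaning rules, so the following is only a generic sketch of the two stages it names, not the authors' algorithm. It assumes log entries are dictionaries with hypothetical `ip`, `url`, and `status` fields, treats embedded resources and unsuccessful requests as noise, and stands in a plain IPv4-to-integer mapping for the hash, with a binary search over sorted keys to test whether an IP has been seen before.

```python
import bisect

# Hypothetical suffixes treated as noise during cleaning (an assumption,
# not taken from the paper).
NOISE_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js")


def clean(entries):
    """Keep successful page requests; drop embedded resources and errors."""
    return [
        e for e in entries
        if 200 <= e["status"] < 300
        and not e["url"].lower().endswith(NOISE_SUFFIXES)
    ]


def ip_key(ip):
    """Map a dotted IPv4 address to an integer key (stand-in for the hash)."""
    a, b, c, d = (int(part) for part in ip.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


def unique_users(entries):
    """Return distinct IPs in first-seen order via binary search on sorted keys."""
    sorted_keys = []  # integer keys of IPs seen so far, kept sorted
    users = []        # distinct IPs in order of first appearance
    for e in entries:
        key = ip_key(e["ip"])
        pos = bisect.bisect_left(sorted_keys, key)
        if pos == len(sorted_keys) or sorted_keys[pos] != key:
            sorted_keys.insert(pos, key)
            users.append(e["ip"])
    return users
```

Calling `unique_users(clean(log))` on a list of such entries yields one IP per distinct visitor that survives cleaning. Identifying users by IP alone is a simplification; the paper may combine further fields.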

Issue: 01-Special Issue

Year: 2017

Pages: 29-42
