
A Fast Detection of Duplicates Using Progressive Methods

B. Bhagya Lakshmi

Abstract


Databases hold large amounts of data, and because many different people use that data, data-quality problems can arise. One of the major problems is identifying ‘duplicates’, that is, multiple representations of the same object in different forms. Today, duplicate-detection methods must process huge datasets in ever shorter amounts of time while maintaining the quality of the dataset, which is becoming increasingly difficult. Existing systems use duplicate-detection methods such as the Sorted Neighborhood Method (SNM) and blocking methods to increase the efficiency of finding duplicate records. This paper uses two new progressive duplicate-detection algorithms to increase the efficiency of finding duplicate records and to eliminate the identified duplicates when only limited time is available for the duplicate-detection process. These algorithms increase the overall gain of the process by delivering complete results faster. The paper compares the two progressive algorithms and presents the results.
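
For readers unfamiliar with the idea, the following is a minimal sketch of a sorted-neighborhood comparison run in a progressive order. It is not the algorithm evaluated in the paper; the function names `similarity` and `progressive_snm`, the parameters `max_window` and `threshold`, and the sample records are all assumptions made for illustration. Records are sorted by a key, and neighbours at rank distance 1 are compared before neighbours at distance 2, and so on, so that the most promising pairs are reported first.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (illustrative choice)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def progressive_snm(records, key=lambda r: r, max_window=3, threshold=0.85):
    """Yield index pairs of likely duplicates, closest sorted neighbours first.

    Hypothetical sketch: `max_window` and `threshold` are assumed parameters,
    not values taken from the paper.
    """
    # Sort record indices by the sorting key, as in the Sorted Neighborhood Method.
    order = sorted(range(len(records)), key=lambda i: key(records[i]))
    # Progressive part: grow the rank distance step by step, so that
    # adjacent records are compared before more distant ones.
    for distance in range(1, max_window):
        for pos in range(len(order) - distance):
            i, j = order[pos], order[pos + distance]
            if similarity(records[i], records[j]) >= threshold:
                yield (min(i, j), max(i, j))


if __name__ == "__main__":
    sample = ["John Smith", "Jon Smith", "Alice Brown", "Alice  Brown", "Bob Stone"]
    for pair in progressive_snm(sample):
        print(pair, sample[pair[0]], "|", sample[pair[1]])
```

Because `progressive_snm` is a generator, a caller with a limited time budget can stop consuming it at any point and still keep the pairs found so far, which is the property the progressive approach aims for.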

