Strategies for Large-Scale Entity Resolution Based on Inverted Index Data Partitioning
Abstract
Inverted indexing is a commonly used technique for improving the performance of entity resolution algorithms by reducing the number of pairwise comparisons necessary to arrive at acceptable results. This chapter describes how inverted indexing can also be used as a data partitioning strategy to perform entity resolution on large datasets in a distributed processing environment. It also discusses the importance of index-to-rule alignment, pre-resolution index closure, and post-resolution link closure, along with workflows for record-based and attribute-based identity capture and update in a distributed processing environment.
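As a rough illustration of the idea in the abstract, the sketch below builds an inverted index over a few toy records, derives the candidate pairs that share an index key, and then groups records into partitions by taking the transitive closure over shared keys. Everything here is an assumption made for illustration and is not taken from the chapter: the record fields, the `index_keys` function, the helper names, and in particular the reading of "pre-resolution index closure" as a transitive closure over index keys.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records; the field names are illustrative only.
records = {
    1: {"name": "Ann Lee", "phone": "555-0100"},
    2: {"name": "Anne Lee", "phone": "555-0100"},
    3: {"name": "Anne Lee", "phone": "555-0199"},
    4: {"name": "Bob Ray", "phone": "555-0142"},
}

def index_keys(record):
    """Assumed index-key generator: one key per attribute value."""
    return ["name:" + record["name"].lower(), "phone:" + record["phone"]]

def build_inverted_index(records, key_fn):
    """Map each index key to the set of record ids that produce it."""
    index = defaultdict(set)
    for rec_id, rec in records.items():
        for key in key_fn(rec):
            index[key].add(rec_id)
    return index

def candidate_pairs(index):
    """Only records sharing at least one key are compared pairwise."""
    pairs = set()
    for ids in index.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs

def index_closure(index, all_ids):
    """Group records connected through any chain of shared keys (union-find).

    Each resulting group could be sent to a separate worker, since no
    candidate pair spans two groups.
    """
    parent = {i: i for i in all_ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for ids in index.values():
        ordered = sorted(ids)
        for other in ordered[1:]:
            root_a, root_b = find(ordered[0]), find(other)
            if root_a != root_b:
                parent[root_b] = root_a

    groups = defaultdict(set)
    for i in all_ids:
        groups[find(i)].add(i)
    return sorted(sorted(g) for g in groups.values())

index = build_inverted_index(records, index_keys)
pairs = candidate_pairs(index)
partitions = index_closure(index, records)
print(pairs)       # 2 candidate pairs instead of all 6 possible pairs
print(partitions)  # [[1, 2, 3], [4]]
```

In this toy case, records 1 and 3 share no key directly but land in the same partition because each shares a key with record 2. Closing the index before resolution keeps such chains on one worker.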