Automated Duplicate Image Detection and Removal Tool with Multiprocessing

Project details

Developed a Python application that detects and removes duplicate images in a dataset. It uses four perceptual hashing techniques from the `imagehash` library (pHash, dHash, wHash, and average hash) and applies multiprocessing to speed up hash computation.

– Automated loading and hashing of images using the `PIL` and `imagehash` libraries.

– Utilized multiprocessing to parallelize hash calculation across CPU cores.

– Detection of duplicate images based on configurable similarity thresholds.

– Removal of identified duplicates from the dataset using OS-level file operations.
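
The steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names (`hash_image`, `hamming`, `find_duplicates`, `dedupe`), the default threshold of 5 differing bits, the choice of pHash over the other three hashes, and the file-extension filter are all assumptions made for the example. Hashes are passed between processes as hex strings, and similarity is measured as the Hamming distance between them.

```python
import os
from multiprocessing import Pool


def hash_image(path):
    """Return (path, hex pHash) for one image, or (path, None) if unreadable."""
    # Imported inside the worker so the comparison helpers below remain usable
    # even where Pillow / imagehash are not installed.
    from PIL import Image
    import imagehash

    try:
        with Image.open(path) as img:
            return path, str(imagehash.phash(img))
    except OSError:
        # Covers missing files and files Pillow cannot identify as images.
        return path, None


def hamming(hex_a, hex_b):
    """Number of differing bits between two equal-length hex hash strings."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


def find_duplicates(hashes, threshold=5):
    """Given [(path, hex_hash)], return paths that are within `threshold`
    bits of an earlier image that was kept."""
    kept, duplicates = [], []
    for path, h in hashes:
        if h is None:
            continue  # skip unreadable files rather than deleting them
        if any(hamming(h, k) <= threshold for _, k in kept):
            duplicates.append(path)
        else:
            kept.append((path, h))
    return duplicates


def dedupe(folder, threshold=5, delete=False):
    """Hash every image in `folder` in parallel and report (optionally delete)
    the duplicates. The extension filter is an assumption for this sketch."""
    paths = sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith((".jpg", ".jpeg", ".png"))
    )
    with Pool() as pool:  # one worker per CPU core by default
        hashes = pool.map(hash_image, paths)
    duplicates = find_duplicates(hashes, threshold)
    if delete:
        for path in duplicates:
            os.remove(path)
    return duplicates
```

A call such as `dedupe("images", threshold=5)` returns the candidate duplicates without touching any files; passing `delete=True` removes them. On platforms that spawn worker processes, the call should sit under an `if __name__ == "__main__":` guard. The pairwise comparison here is quadratic in the number of kept images, which is acceptable for a sketch but may need an index structure for large datasets.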

Project card

Freelancer: Abdallah A.