Hi there,
I remember reading in the email we received about the upgrade to AEM 2.5.0.2 that there is a new feature that detects duplicate books based on MD5 (or maybe another algorithm).
Could you please explain how it works?
Does this new feature detect duplicates on MD5 when the books are imported into the library? Can I automatically delete those newly added books which have the same MD5 as an existing book?
Thanks for the answer.
b.r./ Igor, Croatia



> Does this new feature detect duplicates on MD5 when the books are imported into the library? Can I automatically delete those newly added books which have the same MD5 as an existing book?
Yes, that's right. In the Scan tool you can check the box to define the file MD5, and then, using the drop-down list in the footer, you can remove all the files that are detected as duplicates by MD5.
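For anyone wondering what "duplicate by MD5" means in general, here is a minimal Python sketch. It is not AEM's code: the function name `file_md5` and the chunk size are assumptions made up for illustration. The point is only that two files with identical bytes produce the same digest, whatever their file names are.

```python
import hashlib

def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path` (hypothetical helper)."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large book files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing `file_md5(new_file)` against the digest stored for an existing book would then be a plain string comparison.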
"...and then, using the drop-down list in the footer, you can remove all the files that are detected as duplicates by MD5"
Could you please answer these two questions:
1) Are those duplicates detected for the same ISBN number? I mean, does the detection of an MD5 duplicate mean that the duplicate has the same ISBN as the file in the SCAN window which is being imported? If so, does that mean AEM has to calculate MD5 for ALL existing files in the database?
2) "drop-down list in the footer"
I guess this is in the SCAN window, right?
I also guess that it removes only NEW files, i.e. those which are being imported, NOT the existing ones in the library, right?
thanks
1. In order to detect duplicates by MD5, you need to calculate MD5 for the books that are already in the database. You can do this using the File Parser tool. Note that the book ISBN stored in the database can differ from the ISBN in the book that you import.
2. Yes, it's about the Scan window. It removes only NEW files from the Scan window list. It doesn't remove any books from the database or any files from your computer.
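To illustrate the behaviour described in the two points above, here is a rough Python sketch. It is only a model under stated assumptions, not how AEM is implemented: the function `split_scan_list` and both of its parameters are hypothetical. It assumes MD5 digests have already been calculated for the library books, compares the new files against them by digest alone (ISBN plays no part), and only filters the in-memory scan list.

```python
def split_scan_list(scan_md5, library_md5):
    """Split new files into (kept, duplicates) by MD5 digest.

    scan_md5:    dict mapping each NEW file path to its MD5 hex digest
    library_md5: iterable of digests already calculated for library books
    Both names are hypothetical; nothing is deleted from disk or the library.
    """
    known = set(library_md5)
    kept, duplicates = [], []
    for path, digest in scan_md5.items():
        # A new file counts as a duplicate only if its digest is already known.
        (duplicates if digest in known else kept).append(path)
    return kept, duplicates


kept, duplicates = split_scan_list(
    {"new/a.epub": "aaa", "new/b.epub": "bbb"},  # placeholder digests
    ["bbb", "ccc"],
)
```

Here `duplicates` holds the entries that would be dropped from the scan list, while the existing library entries are never touched.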