Checking for file duplicates with MD5sum
Posted: Mon Oct 06, 2008 3:01 pm
Recently I had a drive fail on me. As a last-ditch effort I used a dd copy with ddrescue to get whatever I could off the drive between failures.
I also had the contents of this disk copied to DVDs before it went bad. Some DVDs were lost, broken, or scratched.
I now have a mixture of files, most of which are identical copies.
Objective:
(1) Remove duplicate files
Tools I wish to use:
() Shell based commands or scripts
() md5sum to create a hash of all the files in a directory, exported with filename and md5sum
() Comparator to check for multiple entries of a hash
If a duplicate entry exists, move one entry to a to_be_deleted list.
So, first step: how do I have md5sum make a hash for each file?
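For reference, here is a rough sketch of how the plan above might look in shell, assuming GNU md5sum, find, sort, and awk are available. The function names hash_tree and list_duplicates are made up for illustration, and the sketch assumes file names without newlines or backslashes (md5sum escapes those and changes its line format).

```shell
#!/bin/sh
# hash_tree DIR (hypothetical helper): print one "md5sum  path" line
# for every regular file under DIR, recursing into subdirectories.
hash_tree() {
    find "$1" -type f -exec md5sum {} +
}

# list_duplicates (hypothetical helper): read md5sum output on stdin and
# print the path of every file whose hash has already been seen, i.e.
# every copy except the first one in sorted order.
# md5sum lines are 32 hex digits plus 2 separator characters, so the
# path starts at column 35.
list_duplicates() {
    sort | awk 'seen[$1]++ { print substr($0, 35) }'
}
```

It could be used along the lines of `hash_tree /path/to/recovered > hashes.md5` followed by `list_duplicates < hashes.md5 > to_be_deleted.txt`, where both the directory and the file names are placeholders. Nothing is deleted by the sketch itself; it only builds the list. Since some files may be damaged copies, it is probably worth confirming a pair with `cmp` before removing anything.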