Find and remove duplicate files

Removing duplicate files is a very common requirement. If you have a large collection of files, finding and removing the duplicates by hand is a tedious task. This is where GNU/Linux makes the difference: it provides the ‘fdupes’ utility, which finds duplicate files by comparing their checksums.
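To make the checksum idea concrete, here is a minimal sketch that uses md5sum, sort and uniq from GNU coreutils. It only illustrates the principle and is not how ‘fdupes’ itself is implemented. The scratch directory and the file names a.txt, b.txt and c.txt are made up for the example, and the -w and -D options of uniq are GNU extensions.

```shell
# Scratch directory with two identical files and one different file
# (all names here are hypothetical, chosen only for this illustration)
tmpdir=$(mktemp -d)
echo "same content" > "$tmpdir/a.txt"
echo "same content" > "$tmpdir/b.txt"
echo "other content" > "$tmpdir/c.txt"

# Hash every file, sort so equal hashes are adjacent, then keep only the
# lines whose first 32 characters (the MD5 hex digest) occur more than once
dupes=$(find "$tmpdir" -type f -exec md5sum {} + | sort | uniq -w32 -D)
echo "$dupes"

# Remove the scratch directory again
rm -rf "$tmpdir"
```

Only a.txt and b.txt should be listed, since they share a checksum; c.txt has different content and is left out.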

Let us install the ‘fdupes’ command using the apt-get package manager.

$ sudo apt-get install fdupes




Now, let’s suppose that we have a large collection of files and we want to remove all duplicate files. We can use the ‘fdupes’ command with the ‘-d’ option, which displays the list of duplicate files and prompts users to select which files they want to preserve. Let us check it out with an example.

First, create a few duplicate files:

$ for i in {1..10}; do echo "Opensource for You." > f${i}.txt; done




Now find and remove them.

$ fdupes -d .

[1] ./f2.txt
[2] ./f3.txt
[3] ./f5.txt
[4] ./f9.txt
[5] ./f6.txt
[6] ./f8.txt
[7] ./f4.txt
[8] ./f1.txt
[9] ./f10.txt
[10] ./f7.txt
Set 1 of 1, preserve files [1-10,all]: 1 # Enter 1 to preserve only the first file in the list
[+] ./f2.txt
[-] ./f3.txt
[-] ./f5.txt
[-] ./f9.txt
[-] ./f6.txt
[-] ./f8.txt
[-] ./f4.txt
[-] ./f1.txt
[-] ./f10.txt
[-] ./f7.txt

In the output, the ‘+’ sign before a file name means that the file has been preserved, and the ‘-’ sign means that the file has been deleted.
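If you do not want to answer the prompt for every set, ‘fdupes’ also has a -N (--noprompt) option which, combined with -d, keeps the first file of each set and deletes the rest without asking. The following is a small sketch under that assumption. The scratch directory is created only for this example, and the deletion step is skipped if ‘fdupes’ is not installed. Because nothing is confirmed interactively, try it on a throwaway directory first.

```shell
# Scratch directory with five identical files (hypothetical example data)
workdir=$(mktemp -d)
for i in 1 2 3 4 5; do
    echo "Opensource for You." > "$workdir/f${i}.txt"
done

# Delete duplicates without prompting; the first file of each set is kept
if command -v fdupes >/dev/null 2>&1; then
    fdupes -dN "$workdir"
else
    echo "fdupes is not installed; skipping the deletion step" >&2
fi

# Count the files that are left in the directory
remaining=$(ls "$workdir" | wc -l)
echo "Files remaining: $remaining"

# Remove the scratch directory again
rm -rf "$workdir"
```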
