Puppy Linux Discussion Forum
Puppy HOME page : puppylinux.com
"THE" alternative forum : puppylinux.info
 
All times are UTC - 4
how to remove duplicate files?
Moderators: Flash, Ian, JohnMurga
simon67

Joined: 18 Feb 2014
Posts: 1

Posted: Tue 18 Feb 2014, 08:34    Post subject: how to remove duplicate files?

Hi guys,
Can anyone suggest the best way to find and remove duplicate files?
One of my friends suggested DuplicateFilesDeleter. Is he right?
L18L

Joined: 19 Jun 2010
Posts: 2578
Location: www.eussenheim.de/

Posted: Tue 18 Feb 2014, 09:15    Post subject: Re: how to remove duplicate files?

simon67 wrote:
Hi guys,
Can anyone suggest the best way to find and remove duplicate files?
One of my friends suggested DuplicateFilesDeleter. Is he right?

I don't know about DuplicateFilesDeleter.

The best tool for finding files is ..... find

Example:
Code:
find / -name README


If you wish to remove them:
Code:
find / -type f -name README -exec rm -i {} \;
(after renaming the one you want to keep; -exec copes with spaces in file names, and rm -i asks before each deletion)
Ted Dog


Joined: 13 Sep 2005
Posts: 2470
Location: Heart of Texas

Posted: Tue 18 Feb 2014, 11:38

There are two old pets that do this available on the forum; both need an update. One only needs a link for a library and worked for me. I hooked up as many hard drives as I could and let it cook for a day; it found thousands of dups, about 80G.
Ted Dog


Joined: 13 Sep 2005
Posts: 2470
Location: Heart of Texas

Posted: Tue 18 Feb 2014, 12:12

dupfgui.pet needs a link from libtiff.so.4 to whichever version you have. In slacko 5.6.5 it's linked to libtiff.so.3.9.7 in /usr/lib/
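
A minimal sketch of the link described above, assuming the installed library really is /usr/lib/libtiff.so.3.9.7 as on slacko 5.6.5. The helper name link_soname is made up for illustration, and pointing a .so.4 name at a 3.x library is only reported to work for this one program; it is not a general fix.

```shell
# link_soname is a hypothetical helper, not part of Puppy or dupfgui.
# Usage: link_soname LIBDIR EXISTING_LIB WANTED_NAME
link_soname() {
    libdir=$1
    existing=$2
    wanted=$3
    # Refuse to guess: the real library must already be there.
    if [ ! -e "$libdir/$existing" ]; then
        echo "missing: $libdir/$existing" >&2
        return 1
    fi
    # Leave an existing file or link alone instead of overwriting it.
    if [ -e "$libdir/$wanted" ] || [ -L "$libdir/$wanted" ]; then
        echo "already present: $libdir/$wanted" >&2
        return 0
    fi
    # Relative link, so it stays valid if the directory is mounted elsewhere.
    ln -s "$existing" "$libdir/$wanted"
}

# On slacko 5.6.5 (as root), following the post above:
#   link_soname /usr/lib libtiff.so.3.9.7 libtiff.so.4
```

If your Puppy ships a different libtiff version, substitute the file name you actually find in /usr/lib.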


It gives options to delete all but one with a click. (I disabled the "are you sure" message in settings; otherwise it's two clicks.)

Last edited by Ted Dog on Tue 18 Feb 2014, 13:13; edited 1 time in total
jamesbond

Joined: 26 Feb 2007
Posts: 2230
Location: The Blue Marble

Posted: Tue 18 Feb 2014, 12:24

Assuming your underlying filesystem doesn't already do de-dup at block-level, you can do a file-level duplicate detection like this:

Code:
find / -type f -print0 | xargs -0 md5sum | sort -k1 | awk '{ if ($1==prevmd5) { if (prevfile) print prevfile; print $0; prevfile=""} else { prevmd5=$1; prevfile=$0 }}'


where "/" is the directory to start searching for duplicate files. "/" works if you want to search all of your mounted disks (like Ted Dog did); or you can use "/mnt/sda1" to limit the search to /mnt/sda1.

Or you can even split the process: save everything up to "md5sum" to a file; later on you can cat all of these md5sum files together, sort them and run the awk script on them.
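
The split described above might look roughly like this. It is only a sketch: hash_tree and report_dups are made-up helper names, /mnt/sdb1 and the /tmp paths in the usage comment are assumptions, and the awk filter is the same idea as the one-liner above, rewritten to force a string comparison.

```shell
# Stage 1: checksum one tree into its own list (run once per disk).
# Keep the list file outside the tree being scanned, or it will be hashed too.
hash_tree() {
    # -print0 / -0 keep file names with spaces intact;
    # -r stops xargs from running md5sum when nothing was found.
    find "$1" -type f -print0 | xargs -0 -r md5sum > "$2"
}

# Stage 2: merge any number of saved lists and print every line
# whose checksum appears more than once.
report_dups() {
    cat "$@" | sort -k1,1 | awk '
        ($1 "") == (prev "") { if (held != "") print held; print; held = ""; next }
        { prev = $1; held = $0 }'
}

# Possible usage, one disk at a time:
#   hash_tree /mnt/sda1 /tmp/sda1.md5
#   hash_tree /mnt/sdb1 /tmp/sdb1.md5
#   report_dups /tmp/sda1.md5 /tmp/sdb1.md5
```

Doing it in stages means each disk is read only once, and the sort/awk step can be rerun as often as you like without touching the disks again.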

Beware, this will read *every file on your disk* (on the root path you specified) and perform a rather heavy md5sum computation on all of them; so it will:
a) grind your disk
b) saturate your I/O
c) tax your CPU
all at the same time (though a and b are probably the worst of them all). a) is especially bad for your harddisk health.

Things get more complicated if you want to look for duplicates within zip files, within tarballs, within rar files, attachments within your mails ... ;)

My recommendation is - instead of looking for duplicate files, just get a bigger external harddisk (or backup to bluray like Ted Dog does) and don't worry about duplicates.

If you do worry about duplicates (e.g. every file you keep is in the gigabytes range), then you may as well invest the time in using a de-dup capable filesystem in the first place (e.g. zfs/btrfs).

_________________
Fatdog64, Slacko and Puppeee user. Puppy user since 2.13.
Contributed Fatdog64 packages thread
Ted Dog


Joined: 13 Sep 2005
Posts: 2470
Location: Heart of Texas

Posted: Tue 18 Feb 2014, 12:43

The program I posted does not read every file. It sorts by size first, then compares bytes until they stop matching. After a few megabytes of matching bytes it md5sums the remainder. It works better with lots of memory.
The reason I needed this is that I got out of multisession for a long spell. I'm a data packrat. I needed to consolidate space to move stuff around and free up a hard drive.
Sadly it does not match files with the same names that are different, like those hundreds of save files Puppy Linux litters about. :oops:
Big warning about scanning Windows partitions... the bloat and different names for the same files are just wow... :shock:
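
The size-first idea can be approximated in shell. This is a simplified sketch, not the program's actual code: it skips the byte-by-byte comparison and simply md5sums only those files whose size is shared with at least one other file. size_then_hash is a made-up name, and `stat -c` is assumed to be available (GNU coreutils or busybox).

```shell
# size_then_hash is a hypothetical helper. File names containing
# newlines are not handled.
size_then_hash() {
    # 1. Print "SIZE PATH" for every regular file under $1.
    find "$1" -type f -exec stat -c '%s %n' {} + |
    # 2. Keep only paths whose size occurs more than once.
    awk '{
            size = $1
            sub(/^[0-9]+ /, "")      # drop the size, keep the path (spaces and all)
            count[size]++
            paths[size] = paths[size] $0 "\n"
         }
         END { for (s in count) if (count[s] > 1) printf "%s", paths[s] }' |
    # 3. Checksum just those candidates and print the repeated checksums.
    tr '\n' '\0' | xargs -0 -r md5sum | sort -k1,1 |
    awk '($1 "") == (prev "") { if (held != "") print held; print; held = ""; next }
         { prev = $1; held = $0 }'
}

# Possible usage:
#   size_then_hash /mnt/sda1
```

Files with a unique size are never opened at all, which is where most of the saving over hashing everything comes from.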
Ted Dog


Joined: 13 Sep 2005
Posts: 2470
Location: Heart of Texas

Posted: Tue 18 Feb 2014, 14:49

http://www.murga-linux.com/puppy//viewtopic.php?p=209361#209361

Found it for you. :D :D
ariel


Joined: 03 Jul 2009
Posts: 97

Posted: Tue 18 Feb 2014, 15:37

Hi, you can also use fdupes, a command-line program. Place the program in /usr/local/bin, then type fdupes --help for more information.
Attachment: fdupes.tar.gz (10.03 KB, downloaded 138 times)
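
For anyone trying fdupes, a small sketch of typical use. list_dups is a made-up wrapper and /mnt/sda1 is only an example path; -r (recurse into subdirectories) and -d (prompt for which files to keep, delete the rest) are standard fdupes options, but check fdupes --help on your build.

```shell
# list_dups is a hypothetical wrapper around fdupes.
list_dups() {
    # Bail out cleanly if fdupes has not been installed yet.
    if ! command -v fdupes >/dev/null 2>&1; then
        echo "fdupes not found in PATH" >&2
        return 127
    fi
    # -r: descend into subdirectories; duplicates are printed in groups
    # separated by blank lines. Nothing is deleted.
    fdupes -r "$1"
}

# Possible usage:
#   list_dups /mnt/sda1
# To delete interactively, keeping the copies you pick:
#   fdupes -r -d /mnt/sda1
```

Listing first and deleting in a second pass gives you a chance to look over the groups before anything is removed.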
elnelson

Joined: 24 Feb 2014
Posts: 4
Location: Argentina

Posted: Fri 02 May 2014, 19:04

fslint

You can install fslint from the Puppy Package Manager (PPM).