At work we have an expensive medical scanner, which dumps its images into a hierarchy of directories. I'd like to copy the files into a single directory, preserving the metadata as much as possible* and if necessary renaming duplicate files.
* Yes, it's a Windows-only scanner and the software is terrible, quelle surprise. I've managed to access the images directory/folder over the Samba network.
Copy files from complex directory structure to one directory...
Forum rules
Topics in this forum are automatically closed 6 months after creation.
Re: Copy files from complex directory structure to one directory...
You can just use the file manager for this.
In the root directory where you have the files, search by file extension, for example
*.jpg
You will get the list of files inside each folder and subfolder. Select them all in the search result window, copy, and paste them to a new directory. If there are duplicated names, tick the checkbox to do the same for all the conflicts and then click the "Duplicate" button. The duplicated files will get (copy), (second copy), (3rd copy) and so on added to the filenames. You can then use something else, like a file rename program or a script, to change the names of the files if needed.
Linux Mint Una Cinnamon 20.3 Kernel: 5.15.x | Quad Core I7 4.2Ghz | 24GB Ram | 1TB NVMe | Intel Graphics
Re: Copy files from complex directory structure to one directory...
Thanks for that. I was really looking for a non-interactive method, a command line or shell script.
Re: Copy files from complex directory structure to one directory...
I think you'll have to script it if you want to do it on the command line.
If files in the source were guaranteed to have a unique name, you could copy all files from 'source' directory and its subdirectories into 'target' directory with:
Code: Select all
find source -type f -exec cp --archive --no-clobber '{}' target \;
Or if you will be repeatedly doing this copy action and all files stay on source, so you only want to copy files not yet in target or that have been updated since the last copy, you could do it like this to avoid needlessly copying the same files again (but this does clobber files with the same name):
Code: Select all
find source -type f -exec rsync --no-R --no-implied-dirs '{}' target \;
The rsync needs more options to preserve attributes; there are a bunch of options for things you can preserve.
More complex, but you could write a small script to loop over the find results, check whether the filename already exists at target, and if so change the target name and then copy the file. I don't know of a tool that does this for you, but I wouldn't be surprised if it exists. Maybe somebody else knows of a command line tool to do what you want.
Or you could use another command beforehand to ensure all filenames are unique, and if there are any non-unique names, fix that manually before running the cp or rsync command. Maybe like so; this will print filenames (without directory) that aren't unique -- if all filenames are unique it will print nothing:
Code: Select all
find source -type f -printf '%f\n' | sort | uniq -c | grep -v '^[[:blank:]]*1[[:blank:]]'
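The "loop over the find results and rename on a clash" idea above can be sketched as a small shell script. This is only an illustration: the demo_source/demo_target directories and the sample files are invented here so the sketch is self-contained.

```shell
#!/bin/sh
# Copy every file found under demo_source into demo_target,
# appending a counter (.1, .2, ...) when the filename clashes.
set -e
mkdir -p demo_source/a demo_source/b demo_target
echo one > demo_source/a/scan.txt
echo two > demo_source/b/scan.txt   # same filename, different data

find demo_source -type f | while IFS= read -r f; do
    base=$(basename "$f")
    dest="demo_target/$base"
    n=1
    # If the name is already taken at target, try name.1, name.2, ...
    while [ -e "$dest" ]; do
        dest="demo_target/$base.$n"
        n=$((n + 1))
    done
    cp --archive "$f" "$dest"
done
```

Both files survive the copy: one keeps the name scan.txt, the other becomes scan.txt.1.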
Re: Copy files from complex directory structure to one directory...
Do you intend to run the "copy" piece of code regularly or just once? Because if you intend to do it regularly, you may quickly get tons of duplicate files, the reason being that all files will be copied to your destination directory each time and be renamed (that is, of course, if I'm correctly understanding your problem).
Pastcal wrote: ⤴ Sat Mar 09, 2024 2:56 pm
At work we have an expensive medical scanner, which dumps its images into a hierarchy of directories. I'd like to copy the files into a single directory, preserving the metadata as much as possible* and if necessary renaming duplicate files.
* Yes, it's a Windows-only scanner and the software is terrible, quelle surprise. I've managed to access the images directory/folder over the Samba network.
Something beginning with:
Code: Select all
find SourceDir -type f -newerct <some timestamp>
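One common way to make such an incremental copy repeatable is a stamp file with find's -newer test, rather than a literal timestamp for -newerct. A sketch, with invented demo directory names and a hypothetical .last_copy stamp file:

```shell
#!/bin/sh
# Incremental copy driven by a stamp file: only files modified since
# the previous run get copied, then the stamp is refreshed.
set -e
mkdir -p demo_src demo_dst
echo scan1 > demo_src/img1.txt

copy_new() {
    if [ -e .last_copy ]; then
        # Subsequent runs: only files newer than the stamp
        find demo_src -type f -newer .last_copy -exec cp --archive --no-clobber '{}' demo_dst \;
    else
        # First run: everything
        find demo_src -type f -exec cp --archive --no-clobber '{}' demo_dst \;
    fi
    touch .last_copy
}

copy_new                      # first run copies img1.txt
sleep 1                       # ensure the next file is newer than the stamp
echo scan2 > demo_src/img2.txt
copy_new                      # second run copies only img2.txt
```

Note -newer compares modification times; -newerct (as above) compares the inode change time against a timestamp string, which avoids the stamp file at the cost of hard-coding a date.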
Re: Copy files from complex directory structure to one directory...
This might be handy to tell if any duplicate filenames exist. It works for me:
Taken from this website:
https://stackoverflow.com/questions/163 ... -filenames
Code: Select all
find . -type f | awk -F"/" '{a[$NF]++}END{for(i in a)if(a[i]>1)print i,a[i]}'
Re: Copy files from complex directory structure to one directory...
>Do you intend to run regularly the "copy" piece of code or just once ?
I'm hoping it's just a one off. Time will tell...
Re: Copy files from complex directory structure to one directory...
Noted. Another question to be sure: in your OP (original post), you speak of duplicate files. Some helpers speak of duplicate filenames, which may not be the same thing: duplicate files are files containing the same data even if they don't have the same name. Duplicate filenames are just the opposite: they may contain different data but bear the same name. What are your duplicates: files or filenames?
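If the duplicates turn out to be identical file contents rather than names, grouping files by checksum can reveal them. A sketch using GNU md5sum and uniq; the demo_files directory and its sample files are invented here to keep the example self-contained:

```shell
#!/bin/sh
# List files with identical *contents* (true duplicate files) by
# sorting md5sum output and grouping lines that share the same
# 32-character checksum prefix.
set -e
mkdir -p demo_files/a demo_files/b
echo "same bytes" > demo_files/a/scan1.img
echo "same bytes" > demo_files/b/scan2.img   # duplicate content
echo "different"  > demo_files/b/scan3.img   # unique content

find demo_files -type f -exec md5sum '{}' + \
    | sort \
    | uniq -w32 --all-repeated=separate
```

Only scan1.img and scan2.img appear in the output; files with unique contents are filtered out.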
If it's just filenames and if any kind of duplicate renaming suits you, you may use Xenopeek's suggestion:
Code: Select all
find source -type f -exec cp --archive --no-clobber '{}' target \;
replacing the --no-clobber option with the --backup=t option for cp:
Code: Select all
find source -type f -exec cp --archive --backup=t '{}' target \;
Here is what the --backup=t or --backup=numbered option does (quoting the cp man page):
Code: Select all
numbered, t
       make numbered backups
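To see the effect, here is a throwaway demonstration (the file and directory names are invented): when the destination name already exists, cp --backup=t renames the old file with a ~N~ suffix before writing the new one.

```shell
#!/bin/sh
# Demonstrate cp --backup=t: the existing target is kept as a
# numbered backup (a.txt.~1~) instead of being clobbered.
set -e
mkdir -p demo_tgt
echo first  > a.txt
cp --backup=t a.txt demo_tgt/
echo second > a.txt
cp --backup=t a.txt demo_tgt/
ls demo_tgt          # a.txt  a.txt.~1~
```

demo_tgt/a.txt now holds the newer contents and demo_tgt/a.txt.~1~ the older ones, so nothing is lost on a name clash.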