Copy files from complex directory structure to one directory...

Post by Pastcal »

At work we have an expensive medical scanner, which dumps its images into a hierarchy of directories. I'd like to copy the files into a single directory, preserving the metadata as much as possible* and, if necessary, renaming duplicate files.

* Yes, it's a Windows-only scanner and the software is terrible, quelle surprise. I've managed to access the images directory/folder over the network via Samba.

Post by axrusar »

You can just use the file manager for this.
In the root directory where you have the files, search by file extension, for example *.jpg
You will get the list of files inside each folder and subfolder. Select them all in the search result window, then copy and paste them into a new directory. If there are duplicated names, tick the checkbox to do the same for all the conflicts and then click the "Duplicate" button. The duplicated files will get (copy), (second copy), (3rd copy) and so on added to their filenames. You can then use something else, like a file rename program or a script, to change the names of the files if needed.

Post by Pastcal »

Thanks for that. I was really looking for a non-interactive method, a command line or shell script.

Post by xenopeek »

I think you'll have to script it if you want to do it on the command line.

If files in the source were guaranteed to have a unique name you could copy all files from 'source' directory and its subdirectories into 'target' directory with:

Code:

find source -type f -exec cp --archive --no-clobber '{}' target \;
Or, if you will be doing this copy repeatedly and all files stay on the source, you only want to copy files that are not yet in target or that have been updated since the last copy. You could do it like this to avoid needlessly copying the same files again (but this does clobber files with the same name):

Code:

find source -type f -exec rsync --no-R --no-implied-dirs '{}' target \;
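The rsync command needs more options to preserve attributes; there are a bunch of options for the things you can preserve. In particular, without --times (or --archive, which implies it) the copies get the time of the transfer as their modification time, so rsync treats every file as changed on the next run.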

More complex, but you could write a small script to loop over the find results, check whether the filename already exists at target and, if so, change the target name and then copy the file. I don't know of a tool that does this for you, but I wouldn't be surprised if it exists. Maybe somebody else knows of a command line tool to do what you want.
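
A minimal sketch of such a loop, under a few assumptions: flatten_copy is a made-up helper name (not an existing tool), the clashing file gets _1, _2, ... inserted before its extension, and cp is GNU cp so that -a (short for --archive) is available. Which of two clashing files keeps the plain name depends on the order in which find happens to return them.

```shell
# Hypothetical helper: copy every regular file below $1 into the single
# directory $2, picking a new name instead of overwriting on a clash.
flatten_copy() {
    mkdir -p -- "$2"
    # find hands the paths to a small inline script in batches, so
    # filenames with spaces or other odd characters stay intact.
    find "$1" -type f -exec sh -c '
        dst=$1; shift
        for file do
            name=${file##*/}                 # filename without directory
            case $name in
                ?*.*) stem=${name%.*}; ext=.${name##*.} ;;
                *)    stem=$name; ext= ;;    # no extension to keep
            esac
            dest=$dst/$name
            n=1
            # Try Pic_1.jpg, Pic_2.jpg, ... until the name is free.
            while [ -e "$dest" ]; do
                dest=$dst/${stem}_$n$ext
                n=$((n + 1))
            done
            cp -a -- "$file" "$dest"         # -a keeps times and modes
        done
    ' sh "$2" {} +
}
```

It would be called as `flatten_copy source target`. Keep target outside of source, and note that running it a second time copies everything again under new names, so it only suits a one-off copy.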

Or you could use another command beforehand to ensure all filenames are unique, and if there are any non-unique names fix that manually before running the cp or rsync command. Maybe like so; this will print filenames (without directory) that aren't unique -- if all filenames are unique it will print nothing:

Code:

find source -type f -printf '%f\n' | sort | uniq -c | grep -v '^[[:blank:]]*1[[:blank:]]'
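
If you want to fix clashes by hand it helps to see where the clashing files live. A rough sketch that prints each non-unique filename followed by the full paths that carry it; list_name_clashes is a made-up name, and it assumes no filename contains a newline:

```shell
# Hypothetical helper: for every filename that occurs more than once
# below $1, print the name and then the paths that share it.
list_name_clashes() {
    find "$1" -type f | sort | awk -F/ '
        # $NF is the last path component, i.e. the bare filename.
        { count[$NF]++; paths[$NF] = paths[$NF] "\n  " $0 }
        END {
            for (name in count)
                if (count[name] > 1)
                    print name ":" paths[name]
        }
    '
}
```

Called as `list_name_clashes source`, it prints nothing when all filenames are unique.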

Post by Shiva »

Pastcal wrote: Sat Mar 09, 2024 2:56 pm
> At work we have an expensive medical scanner, which dumps its images into a hierarchy of directories. I'd like to copy the files into a single directory, preserving the metadata as much as possible* and, if necessary, renaming duplicate files.
>
> * Yes, it's a Windows-only scanner and the software is terrible, quelle surprise. I've managed to access the images directory/folder over the network via Samba.

Do you intend to run the "copy" piece of code regularly, or just once? Because if you intend to do it regularly, you may quickly get tons of duplicate files: all files will be copied to your destination directory each time and be renamed (that is, of course, if I'm understanding your problem correctly).

Something beginning with:

Code:

find SourceDir -type f -newerct '<some timestamp>'
could select files created or changed since the last time you backed up your files (-newerct compares each file's inode change time with the given timestamp).
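
One way to come up with that timestamp is to store the time of each run in a small stamp file and feed it back to find on the next run. A sketch only: list_new_files and the stamp file are made-up names, and it assumes GNU find and a timestamp in local time.

```shell
# Hypothetical helper: print the files below $1 that were created or
# changed since the time recorded in stamp file $2, then record the
# current time in that file for the next run.
list_new_files() {
    src=$1 stamp=$2
    # Take the time before scanning, so that files arriving during the
    # scan are picked up again next time rather than missed.
    now=$(date '+%Y-%m-%d %H:%M:%S')
    if [ -f "$stamp" ]; then
        find "$src" -type f -newerct "$(cat "$stamp")"
    else
        find "$src" -type f        # first run: everything counts as new
    fi
    printf '%s\n' "$now" > "$stamp"
}
```

Because the stored time only has one-second resolution, a file that changes in the same second as a run may be listed twice, but it should not be skipped.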

Post by mikeflan »

This might be handy to tell if any duplicate filenames exist. It works for me:

Code:

find . -type f | awk -F"/" '{a[$NF]++}END{for(i in a)if(a[i]>1)print i,a[i]}'
Taken from this website:
https://stackoverflow.com/questions/163 ... -filenames

Post by Pastcal »

> Do you intend to run the "copy" piece of code regularly, or just once?
I'm hoping it's just a one-off. Time will tell...

Post by Shiva »

Pastcal wrote: Thu Mar 14, 2024 12:21 pm
> > Do you intend to run the "copy" piece of code regularly, or just once?
> I'm hoping it's just a one-off. Time will tell...

Noted. Another question, to be sure: in your OP (original post) you speak of duplicate files. Some helpers speak of duplicate filenames, which may not be the same thing: duplicate files are files containing the same data even if they don't have the same name. Duplicate filenames are just the opposite: they may contain different data but bear the same name. Which are your duplicates: files or filenames?
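
To check for the first kind (same data, whatever the name), comparing checksums is one option. A rough sketch, assuming sha256sum from coreutils is installed; list_duplicate_contents is a made-up name:

```shell
# Hypothetical helper: print "checksum  path" for every file below $1
# whose content also occurs in at least one other file.
list_duplicate_contents() {
    # After sorting, files with identical content sit on adjacent lines.
    find "$1" -type f -exec sha256sum {} + | sort | awk '
        $1 == prev { if (!shown) print prevline; print; shown = 1 }
        $1 != prev { shown = 0 }
        { prev = $1; prevline = $0 }
    '
}
```

Called as `list_duplicate_contents source`, it prints nothing if every file has unique content.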

If it's just filenames and any kind of duplicate renaming suits you, you may use xenopeek's suggestion:

Code:

find source -type f -exec cp --archive --no-clobber '{}' target \;
and replace the --no-clobber option with the --backup=t option for cp:

Code:

find source -type f -exec cp --archive --backup=t '{}' target \;
What the --backup=t or --backup=numbered option does (quoting the cp man page):

Code:

numbered, t
    make numbered backups
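So if, let's say, several files named Pic.jpg get copied, the most recently copied one will be Pic.jpg and the earlier ones are kept as numbered backups: the first becomes Pic.jpg.~1~, the second Pic.jpg.~2~, and so on (cp renames the file already in target before writing the new one). If you need a more customized renaming, it must be scripted.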
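
As an example of such scripting, here is a sketch that tidies up after the cp --backup=t run by moving the backup number in front of the extension, so that Pic.jpg.~1~ becomes Pic_1.jpg. The helper name rename_numbered_backups and the _N naming scheme are just assumptions for illustration.

```shell
# Hypothetical helper: in directory $1, rename numbered backups such as
# Pic.jpg.~1~ to Pic_1.jpg so that the file extension stays last.
rename_numbered_backups() {
    for backup in "$1"/*.~[0-9]*~; do
        [ -e "$backup" ] || continue        # the glob matched nothing
        base=${backup%.~*~}                 # path without the .~N~ suffix
        num=${backup##*.~}; num=${num%\~}   # the backup number N
        name=${base##*/}
        case $name in
            ?*.*) new=${base%.*}_$num.${name##*.} ;;
            *)    new=${base}_$num ;;       # no extension to keep
        esac
        # Leave the backup alone if the new name is already taken.
        [ -e "$new" ] || mv -- "$backup" "$new"
    done
}
```

It would be run once on the flat directory, as in `rename_numbered_backups target`.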