[SOLVED] A counting problem

Martin1001
Level 4
Posts: 406
Joined: Sat Mar 28, 2020 7:19 am
Location: Plymouth, UK

[SOLVED] A counting problem

Post by Martin1001 »

I have a folder containing 1,996 items, the items being a mixture of folders and files.

The file extensions are : jpeg jpg JPEG tiff BMP PNG png WAV doc ODT PDF mp3 mp4 MPEG-4 MPEG-2 AVI

I'd like to know how many of each type of file there are. How might I do that?
Last edited by LockBot on Wed Dec 28, 2022 7:16 am, edited 2 times in total.
Reason: Topic automatically closed 6 months after creation. New replies are no longer allowed.
Linux Mint 21.2 Cinnamon. 15.6 GiB. 1001.3 GB. Lenovo Thinkpad.
JoeFootball
Level 13
Posts: 4673
Joined: Tue Nov 24, 2009 1:52 pm
Location: /home/usa/mn/minneapolis/joe

Re: A counting problem

Post by JoeFootball »

Martin1001 wrote:How might I do that?
An internet search reveals several possible solutions. For example ...

Code: Select all

ls -q -U | awk -F . '{print $NF}' | sort | uniq -c | awk '{print $2,$1}'
rene
Level 20
Posts: 12212
Joined: Sun Mar 27, 2016 6:58 pm

Re: A counting problem

Post by rene »

Or, assuming you want the total from current directory and its subdirectories, very basic e.g.

Code: Select all

#!/bin/bash
for ext in jpeg jpg JPEG tiff BMP PNG png WAV doc ODT PDF mp3 mp4 MPEG-4 MPEG-2 AVI; do 
	echo -ne "$ext\t: "
	find -type f -name "*.$ext" | wc -l
done

Re: A counting problem

Post by Martin1001 »

The best suggestion I've found so far, which would give exactly what I want, is
https://www.2daygeek.com/how-to-count-f ... -in-linux/
# find . -type f | sed -n 's/..*\.//p' | sort | uniq -c
1 avi
1 docx
1 iso
10 jpg
17 mkv
2 mp4
30 pdf
71 png
1 sh
37 svg
5 torrent
1 txt
but, if I copy & paste the code, I get
martin@martin-Satellite-Pro-R50-B:~/XXX$ # find . -type f | sed -n 's/..*\.//p' | sort | uniq -c
martin@martin-Satellite-Pro-R50-B:~/XXX$
The code is beyond my understanding. Any idea why I'm not getting any output?
ralplpcr
Level 6
Posts: 1096
Joined: Tue Jul 28, 2015 10:11 am

Re: A counting problem

Post by ralplpcr »

Are you currently in the directory you wish to check? I suspect you're running into a permissions issue, since by default that code will start in whatever directory you've currently set. Try changing your directory to your desired folder first.

That code works perfectly on my ~/Pictures directory:

Code: Select all

ralplpcr@CarbonMint:~/Pictures$  find . -type f | sed -n 's/..*\.//p' | sort | uniq -c
      1 3gp
      2 db
      3 gif
      1 jpeg
   1868 jpg
    974 JPG
     21 mov
     31 MOV
     16 mp4
      3 pdf
     27 png
      1 sh
      1 txt
      3 zip
As far as how that code works...that could take a while to explain! ;)

Re: A counting problem

Post by rene »

Martin1001 wrote: Wed Nov 24, 2021 2:28 pm

Code: Select all

martin@martin-Satellite-Pro-R50-B:~/XXX$ # find . -type f | sed -n 's/..*\.//p' | sort | uniq -c
martin@martin-Satellite-Pro-R50-B:~/XXX$
You pasted the prompt sign # alongside the actual command. At the start of a command line, # is the comment marker, so you merely "executed" a comment.

Warning: the command only works when you can guarantee that every file under that directory actually has an extension (which may be fine in the context but which is not generally the case on Linux).

[EDIT] Oof, so let me be precise: files without an extension are simply ignored, but files without an extension that sit under a directory whose name contains a period (such as here, for example, under ~/.thunderbird and ~/.mozilla) disturb things, because the sed then treats everything after that directory's period as the "extension". As said, possibly fine, but as far as I can quickly see not trivial to fix while keeping that same method.
Last edited by rene on Wed Nov 24, 2021 3:16 pm, edited 2 times in total.
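Both points above can be checked without touching real files. A minimal sketch, with made-up paths in the shape that find prints (the paths are assumptions for illustration only):

```shell
#!/bin/bash
# 1) A command line that starts with # is a comment: nothing runs, nothing prints.
commented=$(bash -c '# echo hello')
echo "output of the commented line: '$commented'"

# 2) Made-up paths in the shape that `find . -type f` prints.
paths='./photos/cat.jpg
./notes
./.thunderbird/profile/prefs'

# Same sed as in the thread: delete everything up to the last period and
# print only the lines where that substitution actually happened.
result=$(printf '%s\n' "$paths" | sed -n 's/..*\.//p')
printf '%s\n' "$result"
```

./notes is dropped silently, while the file under the dotted directory shows up as a bogus "extension" of thunderbird/profile/prefs.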

Re: A counting problem

Post by ralplpcr »

rene wrote: Wed Nov 24, 2021 3:02 pm You pasted the prompt sign # ---
Good catch! I glanced over that, and mistakenly assumed the OP was in su / root mode. Oops! :oops:

Re: A counting problem

Post by Martin1001 »

Rene, yes, without the # it works. Many thanks. It gives all the information I needed.

Regarding the files without an extension---and I was aware of text files in the folder that lack one---I'd had a look on the web to see if there was a command that would count just those, or preferably list them, but couldn't find one. And also a command to count any hidden files. Any suggestions?

Re: A counting problem

Post by rene »

Certainly you can split things up like that, but allow me to try to tweak that sed to see if I can get you a nicer one. When I quickly tried yesterday I failed, and it will be a few hours before I have time for it: regex is the type of "programming language" one basically relearns each time it's needed. The method you use is relatively elegant, so let's try...

Re: A counting problem

Post by rene »

OK, this should work for the no extension case I believe; slight cop-out in the sense of using multiple sed expressions...

Code: Select all

find -type f | sed -n -e 's/.*\///' -e '/.*\./!s/$/.<empty>/' -e 's/.*\.//p' | sort | uniq -c
The first expression edits out anything but the actual filename, the second adds .<empty> to those filenames that do not contain a period, and the third edits out anything but the extension. Feel free, of course, to replace <empty> with anything you like. Example from my home directory, with an additional sort -r | head -10 to display my 10 most common extensions:

Code: Select all

$ find -type f | sed -n -e 's/.*\///' -e '/.*\./!s/$/.<empty>/' -e 's/.*\.//p' | sort | uniq -c | sort -r | head -10
  27576 <empty>
   4707 png
   3633 final
   2597 gz
   2174 html
   1237 jpg
   1002 md5
    885 mo
    881 symbols
    836 page
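To see what each of the three sed expressions contributes, they can be run one at a time. A rough sketch on made-up paths (the paths and the stage1/stage2/stage3 variable names are assumptions, not from the original command):

```shell
#!/bin/bash
# Made-up paths in the shape that `find -type f` prints.
paths='./photos/cat.jpg
./notes
./.thunderbird/profile/prefs
./archive.tar.gz'

# Stage 1: strip the directory part, leaving the bare file name.
stage1=$(printf '%s\n' "$paths" | sed -e 's/.*\///')
# Stage 2: append ".<empty>" to names that contain no period.
stage2=$(printf '%s\n' "$stage1" | sed -e '/.*\./!s/$/.<empty>/')
# Stage 3: keep only what follows the last period.
stage3=$(printf '%s\n' "$stage2" | sed -n -e 's/.*\.//p')

printf 'after stage 1:\n%s\n\n' "$stage1"
printf 'after stage 2:\n%s\n\n' "$stage2"
printf 'after stage 3:\n%s\n' "$stage3"
```

Because stage 1 removes the directories first, the dotted ~/.thunderbird directory no longer interferes. A dotfile such as .bashrc would still be counted under "bashrc", since its name does contain a period.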
"A command to count hidden files" is slightly ill-defined. Being hidden is not something fundamental in Linux but a mere convention: tools such as ls do not show files or directories whose name starts with a period. find, for example, couldn't care less and lists them already. Also, is a "hidden file" only a file whose own name starts with a period, or also one that has a leading period on any of its path components, i.e., that resides in a "hidden" directory?

As such, I have ignored that part for now.
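For what it's worth, the two readings of "hidden" give different counts. A small sketch on a throwaway directory (the tree and the variable names are made up for illustration):

```shell
#!/bin/bash
# Throwaway sample tree; every name in it is made up.
demo=$(mktemp -d)
mkdir -p "$demo/.config/app" "$demo/docs"
touch "$demo/.bashrc" "$demo/.config/app/settings.ini" "$demo/docs/readme.txt"

# Reading 1: the file's own name starts with a period.
own_name=$(cd "$demo" && find . -type f -name '.*' | wc -l)
# Reading 2: any component of the path starts with a period.
any_component=$(cd "$demo" && find . -type f -path '*/.*' | wc -l)

echo "own name starts with a period  : $own_name"
echo "any path component is 'hidden' : $any_component"

rm -rf "$demo"
```

Here settings.ini is not hidden by its own name, but it lives under .config, so only the second reading counts it.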
Aztaroth
Level 5
Posts: 764
Joined: Mon Jan 11, 2021 1:48 am

Re: A counting problem

Post by Aztaroth »

Martin1001 wrote: Thu Nov 25, 2021 6:12 am Regarding the files without an extension---and I was aware of text files in the folder that lack one---I'd had a look on the web to see if there was a command that would count just those, or preferably list them, but couldn't find one. And also a command to count any hidden files. Any suggestions?
I'll give it a try. If I'm corrected, it would mean I'll learn something.

For counting :
a command that would count just the files with no extensions :

Code: Select all

find . -type f ! -name "*.*"| wc -l
(! -name "*.*" excludes files whose name contains a period, so only files without an extension are counted. Corrected, as were all the commands below, where I had mixed up -iname and ! -name; credit to rene.)

command to count any hidden files :

Code: Select all

find . -type f -name ".*" | wc -l
(-name ".*" only counts files beginning with a . = hidden files)

To list them, just delete the wc pipe :

Code: Select all

find . -type f ! -name "*.*"
find . -type f -name ".*" 
and optionally redirect to a file :

Code: Select all

find . -type f ! -name "*.*" > MyFilesWithNoExt.txt
find . -type f -name ".*"  > MyHiddenFiles.txt
and for those that have an extension, add -name "*.*" to your original command :

Code: Select all

find . -type f -name "*.*"| sed -n 's/..*\.//p' | sort | uniq -c
Last edited by Aztaroth on Sat Nov 27, 2021 3:34 am, edited 1 time in total.
dual boot LMDE4 (mostly) + LM19.3 Cinnamon (sometimes)

Re: A counting problem

Post by Martin1001 »

Rene, perfect. I now have a match of 1920 files between
martin@martin-Satellite-Pro-R50-B:~/ARCHIVE/ARCHIVE - Family$ tree
[...]
75 directories, 1920 files
martin@martin-Satellite-Pro-R50-B:~/ARCHIVE/ARCHIVE - Family$
and
martin@martin-Satellite-Pro-R50-B:~/ARCHIVE/ARCHIVE - Family$ find -type f | sed -n -e 's/.*\///' -e '/.*\./!s/$/.<empty>/' -e 's/.*\.//p' | sort | uniq -c
46 AVI
1 bmp
50 doc
31 docx
15 <empty>
1 gif
1 html
350 jpeg
375 jpg
614 JPG
54 mp3
44 mp4
16 MP4
46 MTS
3 odt
37 pdf
50 png
20 tif
1 TIF
18 txt
147 WAV
martin@martin-Satellite-Pro-R50-B:~/ARCHIVE/ARCHIVE - Family$
Aztaroth, it seems that at present your first example counts 'all but' (1920-15):
a command that would count just the files with no extensions :

martin@martin-Satellite-Pro-R50-B:~/ARCHIVE/ARCHIVE - Family$ find . -type f -iname "*.*"| wc -l
1905
martin@martin-Satellite-Pro-R50-B:~/ARCHIVE/ARCHIVE - Family$

Re: A counting problem

Post by rene »

Very slight tweak to that second sed expression; doesn't matter, functionally same, just a tiny bit cleaner.

Code: Select all

find -type f | sed -n -e 's/.*\///' -e '/\./!s/$/.<empty>/' -e 's/.*\.//p' | sort | uniq -c
To automate that total count just for the heck of it:

Code: Select all

find -type f | sed -n -e 's/.*\///' -e '/\./!s/$/.<empty>/' -e 's/.*\.//p' | sort | uniq -c | awk '{ n += $1; print } END { printf "%7d TOTAL\n", n }'
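The awk stage can be tried on its own with a handful of names instead of a real directory. A minimal sketch, assuming made-up bare file names as input (so the first sed expression, which strips directories, is left out):

```shell
#!/bin/bash
# Made-up bare file names, one per line.
names='a.jpg
b.jpg
c.png
notes'

# Same pipeline tail as above: tag names without a period, reduce each name
# to its extension, count per extension, then let awk add up column 1.
report=$(printf '%s\n' "$names" \
  | sed -n -e '/\./!s/$/.<empty>/' -e 's/.*\.//p' \
  | LC_ALL=C sort | uniq -c \
  | awk '{ n += $1; print } END { printf "%7d TOTAL\n", n }')
printf '%s\n' "$report"

# Pull the grand total back out of the report.
total=$(printf '%s\n' "$report" | awk '$2 == "TOTAL" { print $1 }')
```

awk passes every count line through unchanged and only appends the TOTAL line at the end, so the per-extension listing stays intact.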

Re: A counting problem

Post by Martin1001 »

Rene, neat, thank you, I'll log that for future use.

Re: A counting problem

Post by Aztaroth »

Martin1001 wrote: Thu Nov 25, 2021 10:42 am Aztaroth, it seems that at present your first example counts 'all but' (1920-15):
a command that would count just the files with no extensions :

martin@martin-Satellite-Pro-R50-B:~/ARCHIVE/ARCHIVE - Family$ find . -type f -iname "*.*"| wc -l
1905
martin@martin-Satellite-Pro-R50-B:~/ARCHIVE/ARCHIVE - Family$
You're right. I mixed up -iname (wrong) and ! -name (correct).
Last edited by Aztaroth on Sat Nov 27, 2021 3:36 am, edited 1 time in total.

Re: [SOLVED] A counting problem

Post by Martin1001 »

I'm clearly missing some detail, for both of the following give 1905, while in fact there are just 15 (out of 1920) files which have no extension. What modification is required to get count = 15? And, as a separate command, to list those 15 files by name, preferably with their locations.
martin@martin-Satellite-Pro-R50-B:~/ARCHIVE/ARCHIVE - Family$ find . -type f -iname "*.*"| wc -l
1905
martin@martin-Satellite-Pro-R50-B:~/ARCHIVE/ARCHIVE - Family$ find . -type f -name "*.*"| wc -l
1905
martin@martin-Satellite-Pro-R50-B:~/ARCHIVE/ARCHIVE - Family$

Re: [SOLVED] A counting problem

Post by rene »

Yes; -iname is just a case insensitive version of -name. Aztaroth rather meant ! -name. To show them instead of counting just leave out the wc or combine showing and counting with e.g. that same-ish awk method as above:

Code: Select all

find -type f ! -name "*.*" | awk '{ n += 1; print } END { print n }'
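The difference between the three tests is easy to see on a tiny tree. A sketch on a throwaway directory (file names and variable names are made up for illustration):

```shell
#!/bin/bash
# Throwaway sample tree: two files with an extension, two without.
demo=$(mktemp -d)
mkdir -p "$demo/letters"
touch "$demo/photo.JPG" "$demo/scan.tif" "$demo/README" "$demo/letters/draft"

# -name and -iname differ only in case sensitivity; "*.*" has no letters,
# so both match exactly the same files.
with_name=$(cd "$demo" && find . -type f -name "*.*" | wc -l)
with_iname=$(cd "$demo" && find . -type f -iname "*.*" | wc -l)
# The leading ! negates the test: files whose name contains no period.
without_ext=$(cd "$demo" && find . -type f ! -name "*.*" | wc -l)
listing=$(cd "$demo" && find . -type f ! -name "*.*" | LC_ALL=C sort)

echo "-name   \"*.*\": $with_name"
echo "-iname  \"*.*\": $with_iname"
echo "! -name \"*.*\": $without_ext"
printf '%s\n' "$listing"

rm -rf "$demo"
```

That matches what Martin saw: -name and -iname give the same "has an extension" count, and only the negated form yields the remainder.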

Re: [SOLVED] A counting problem

Post by Martin1001 »

rene, many thanks again, that's perfect and gives me all the information I wanted.

Re: [SOLVED] A counting problem

Post by Aztaroth »

rene wrote: Fri Nov 26, 2021 5:00 am Yes; -iname is just a case insensitive version of -name. Aztaroth rather meant ! -name. To show them instead of counting just leave out the wc or combine showing and counting with e.g. that same-ish awk method as above:

Code: Select all

find -type f ! -name "*.*" | awk '{ n += 1; print } END { print n }'
Yes, my mistake. Thanks for the correction.
As stated in my first message, I learned something. Thanks twice.

Re: [SOLVED] A counting problem

Post by Aztaroth »

Begging for forgiveness :) , and subject to rene's review, you may find this interesting :

Code: Select all

find . -type f -exec file -N -i -- {} + | sed -n 's!: image/[^:]*$!!p' | wc -l
This command counts all your images whatever extension they have.

As said before, replacing the counting by a redirection will list all your pics in a file.

Code: Select all

find . -type f -exec file -N -i -- {} + | sed -n 's!: image/[^:]*$!!p' > MyPics.txt
Be careful when using it with video instead of image: it can take pretty long. Within 10 seconds I knew I had about 3500 pics on my disk, but it took half an hour to learn that I had 520 videos. I don't know whether there's a better algorithm, probably because I'm not competent enough to decipher this one. One day I may get nosy, but at present I just take it as it is.

PS : I hope this time I did a better check.
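One possible variation, offered as a sketch rather than a tested replacement: instead of filtering for a single type, tally every top-level MIME type from the same output in one go. The lines below are made up in the "name: type/subtype; charset=..." shape that file -N -i prints, so no real files are inspected here:

```shell
#!/bin/bash
# Made-up sample lines mimicking `file -N -i` output.
sample='./a.jpg: image/jpeg; charset=binary
./scan: image/png; charset=binary
./clip.mp4: video/mp4; charset=binary
./notes: text/plain; charset=us-ascii'

# Take the text after the last ": ", cut it at the first "/", and count
# how often each top-level type (image, video, text, ...) occurs.
tally=$(printf '%s\n' "$sample" \
  | awk -F': ' '{ split($NF, t, "/"); count[t[1]]++ }
                END { for (k in count) print count[k], k }' \
  | LC_ALL=C sort -k2)
printf '%s\n' "$tally"
```

On a real tree, the same awk could presumably be fed by find . -type f -exec file -N -i -- {} + so that images and videos are counted in a single pass. I have not timed that against the separate runs above.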