Problem
I want to traverse a folder hierarchy on a Linux machine and list all of the distinct file extensions it contains.
What is the best way to do this from a shell?
Asked by GloryFish
Solution #1
Try this (it’s not the best method, but it works):
find . -type f | perl -ne 'print $1 if m/\.([^.\/]+)$/' | sort -u
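It works like this: find lists every regular file under the current directory, perl prints the extension of each file that has one, and sort -u reduces the result to a sorted list of unique extensions.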
Answered by Ivan Nevostruev
Solution #2
There’s no need to pipe to sort; awk can do the deduplication itself (the output is not sorted):
find . -type f | awk -F. '!a[$NF]++{print $NF}'
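One caveat with splitting the whole path on dots: a file without an extension, or one inside a directory whose name contains a dot, yields part of the path instead of an extension (./conf.d/settings gives d/settings). A possible variant, offered as a sketch rather than as the answerer's method, splits on / first and only then looks at the last dot-separated piece of the file name. The function name list_extensions is made up for illustration, and file names containing newlines are still not handled.

```shell
# list_extensions [DIR]: print each distinct extension once, in first-seen order.
list_extensions() {
  find "${1:-.}" -type f -name '*.*' |
    awk -F/ '{
      n = split($NF, parts, ".")      # $NF is the file name without its directories
      ext = parts[n]                  # text after the last dot
      if (ext != "" && !seen[ext]++)  # skip names ending in a dot, print each extension once
        print ext
    }'
}
```

Called as list_extensions ~/Pictures, or with no argument for the current directory; pipe the result to sort if a sorted list is wanted.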
Answered by SiegeX
Solution #3
Recursive version:
find . -type f | sed -e 's/.*\.//' | sed -e 's/.*\///' | sort -u
If you want totals (the number of times each extension was seen), use:
find . -type f | sed -e 's/.*\.//' | sed -e 's/.*\///' | sort | uniq -c | sort -rn
Non-recursive (single folder):
for f in *.*; do printf "%s\n" "${f##*.}"; done | sort -u
This is based on a forum post, so credit should go to its author.
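One thing to watch for in the non-recursive loop above: if nothing in the directory matches *.*, the shell leaves the pattern unexpanded and the loop prints a literal *; directories whose names contain a dot are also counted. A hedged sketch of a guarded variant follows; the function name extensions_in_dir is hypothetical.

```shell
# extensions_in_dir [DIR]: distinct extensions of regular files directly inside DIR.
extensions_in_dir() {
  for f in "${1:-.}"/*.*; do
    [ -f "$f" ] || continue      # skips directories and the unexpanded pattern
    printf '%s\n' "${f##*.}"     # keep only the text after the last dot
  done | sort -u
}
```

As in the original loop, hidden files are not matched by *.* and are therefore ignored.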
Answered by ChristopheD
Solution #4
My awk-free, sed-free, Perl-free, Python-free, POSIX-compliant alternative:
find . -type f | rev | cut -d. -f1 | rev | tr '[:upper:]' '[:lower:]' | sort | uniq -c | sort -rn
The trick is to reverse each line so that the extension comes first, cut off the first dot-delimited field, and reverse it back. The tr step also lowercases the extensions, so JPG and jpg are counted together.
Example output:
3689 jpg
1036 png
610 mp4
90 webm
90 mkv
57 mov
12 avi
10 txt
3 zip
2 ogv
1 xcf
1 trashinfo
1 sh
1 m4v
1 jpeg
1 ini
1 gqv
1 gcs
1 dv
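The reverse, cut, reverse steps are easier to follow on a single path. The sketch below uses a made-up path and a hypothetical helper name, ext_of; it assumes rev is installed (on Linux it typically ships with util-linux).

```shell
path='./Photos/2019/IMG_0001.JPG'   # hypothetical example path

printf '%s\n' "$path" | rev                      # GPJ.1000_GMI/9102/sotohP/.
printf '%s\n' "$path" | rev | cut -d. -f1        # GPJ
printf '%s\n' "$path" | rev | cut -d. -f1 | rev  # JPG

# The same steps wrapped in a helper that also lowercases the result.
ext_of() {
  printf '%s\n' "$1" | rev | cut -d. -f1 | rev | tr '[:upper:]' '[:lower:]'
}
ext_of "$path"                                   # jpg
```

A file without an extension is not filtered out: ./Makefile comes back as /makefile, so such entries show up in the counts as well.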
Answered by Ondra Žižka
Solution #5
PowerShell:
dir -recurse | select-object extension -unique
Thanks to http://kevin-berridge.blogspot.com/2007/11/windows-powershell.html
Answered by Simon R
Post is based on https://stackoverflow.com/questions/1842254/how-can-i-find-all-of-the-distinct-file-extensions-in-a-folder-hierarchy