Listing all files with a set of possible extensions


Posted by Diego Assencio on 2014.02.15 under Linux (Shell)

Suppose you wish to recursively list all songs in a directory. You know your songs are stored in the mp3, flac and ogg formats. How can you get the desired list?

The solution is simple and provided by the find command. Just open a terminal and run:

find <dir> -type f \( -iname "*.mp3" -o -iname "*.flac" -o -iname "*.ogg" \)

The directory <dir> specifies where the search starts. With GNU find, omitting it searches the current working directory; some other implementations require an explicit path. You can also explicitly name the current working directory as the search target, as shown below:

find . -type f \( -iname "*.mp3" -o -iname "*.flac" -o -iname "*.ogg" \)

Now let's break down the command above and see what is happening:

-type f matches only regular files (directories and other file types such as sockets, pipes, devices etc. are ignored)
-iname <pattern> matches files whose names match the given pattern, ignoring case distinctions (uppercase is not distinguished from lowercase)
... -o ... or operator: matches if either the test before it or the test after it matches
\( ... \) groups the -iname tests so that -type f applies to all of them; without the grouping, -type f would apply only to the first pattern, because the implicit "and" between adjacent tests binds more tightly than -o (the parentheses are escaped so that the shell passes them to find unchanged)
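
How -o combines with the implicit "and" between adjacent tests is easy to check on a throwaway directory. The sketch below is only an illustration: the file and directory names are made up, and it assumes mktemp and a find that supports -iname are available. It compares the same tests with and without escaped parentheses around the -iname alternatives.

```shell
# Build a throwaway tree (hypothetical names): one song, one text file,
# and one directory whose name happens to end in .ogg.
demo=$(mktemp -d)
touch "$demo/song.MP3" "$demo/notes.txt"
mkdir "$demo/album.ogg"

# No grouping: -type f is and-ed only with the first -iname test,
# so the directory album.ogg is listed as well.
ungrouped=$(find "$demo" -type f -iname "*.mp3" -o -iname "*.ogg" | sort)

# Escaped parentheses: -type f now applies to the whole alternation.
grouped=$(find "$demo" -type f \( -iname "*.mp3" -o -iname "*.ogg" \) | sort)

printf 'ungrouped:\n%s\n\ngrouped:\n%s\n' "$ungrouped" "$grouped"

# Clean up the throwaway tree.
rm -rf "$demo"
```

The ungrouped run lists both song.MP3 and the album.ogg directory, while the grouped run lists only song.MP3.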

Alternative #1

It is also possible to use grep to do the pattern matching work:

find . -type f | grep -i -e "\.mp3$" -e "\.flac$" -e "\.ogg$"

This will recursively list all regular files in the current working directory and send the file list to grep. The grep command prints all lines from a specified file (or from standard input if no file is given) which match the given pattern(s). As for the flags:

-e <pattern> matches the given pattern (regular expression); you can specify multiple search patterns (as done above)
-i ignores case distinctions when searching for strings which match the given pattern (uppercase is not distinguished from lowercase)

Regular expressions such as "\.mp3$" match the string ".mp3" at the end of a line (find produces a list with one filename per line).
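
To see the anchoring at work without touching the file system, grep can be fed a hand-written list in place of find's output. This is a small sketch; the paths are hypothetical and chosen so that some of them contain an extension somewhere other than at the end of the line.

```shell
# Hypothetical file names, one per line, standing in for find's output.
matches=$(printf '%s\n' \
    './a/track.mp3' \
    './b/LIVE.FLAC' \
    './c/mp3-notes.txt' \
    './d/voice.ogg.bak' \
    './e/tune.ogg' |
    grep -i -e '\.mp3$' -e '\.flac$' -e '\.ogg$')

# Only names that END in one of the extensions survive (case-insensitively).
printf '%s\n' "$matches"
```

Only ./a/track.mp3, ./b/LIVE.FLAC and ./e/tune.ogg are printed: mp3-notes.txt lacks the literal dot before "mp3", and voice.ogg.bak does not end in ".ogg".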

Alternative #2

The command below is equivalent to the alternative command just given:

grep -i -e "\.mp3$" -e "\.flac$" -e "\.ogg$" <(find . -type f)

This second alternative is interesting because it relies on a neat shell feature. To understand how it works, first recall that grep can take a file as input (-e is optional when only one search pattern is given):

grep "string" file.txt

will print every line of file.txt that contains "string". In Bash, the construct <(command) is called process substitution: the command runs in a "subshell" (the shell clones itself and the command is interpreted by this cloned shell), and its output is connected to a pipe which Bash exposes under a file name, typically a /dev/fd entry or a temporary named pipe, depending on the system. The segment '<(find ...)' of the command shown therefore means that find is executed in a subshell and that grep receives, as its file argument, the name of the pipe carrying find's output.
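
A quick way to convince yourself that <(...) really turns into a file name is to echo it. The sketch below assumes bash is installed; each snippet is passed to bash -c only so that it also works when typed into a shell that lacks this syntax, and the sample names (one.ogg and so on) are made up.

```shell
# Process substitution is not plain sh syntax, so each snippet is
# handed to bash explicitly.

# What the receiving command sees: a path, not the data itself.
# On Linux this is typically a /dev/fd entry; the exact name may differ.
path=$(bash -c 'echo <(true)')
printf 'substituted path: %s\n' "$path"

# grep opens that path like any other file argument.
matches=$(bash -c 'grep -i -e "\.ogg$" <(printf "%s\n" one.ogg two.txt THREE.OGG)')
printf '%s\n' "$matches"
```

The second command prints one.ogg and THREE.OGG, just as it would if the three names had been stored in a regular file.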
