Regular Expression Matching
While standard globbing (-name "*.log") covers most use cases, sometimes you need the precision of regular expressions to match complex naming patterns.
GNU find supports regular expressions through the -regex test and its case-insensitive counterpart, -iregex.
Crucial Difference: Full Path Matching
Unlike -name, which only checks the basename (the final component of the file path), -regex is tested against the ENTIRE path that find builds, beginning with the starting point you gave on the command line.
If you run find ./src, a file app.js directly inside that directory is tested as ./src/app.js. The match is anchored at both ends, so your regex must cover the whole ./src/app.js string, not just app.js.
WRONG:
# Matches nothing: the pattern has to cover the whole path "./src/app.js", not just a substring of it
find ./src -regex "app\.js$"
CORRECT:
# Prefix with .* to match the preceding directory path
find ./src -regex ".*app\.js$"
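The difference between basename matching and whole-path matching is easiest to see on a throwaway directory tree. The following is a minimal sketch, assuming GNU find; the temporary directory and the file names inside it are hypothetical and exist only for the demonstration.

```shell
# Build a scratch tree so nothing outside it is touched.
demo=$(mktemp -d)
mkdir -p "$demo/src/lib"
touch "$demo/src/app.js" "$demo/src/lib/app.js" "$demo/src/app.json"

# -name compares only the basename, so the bare file name is enough.
by_name=$(cd "$demo" && find ./src -name "app.js" | LC_ALL=C sort)

# -regex must match the whole path, so the bare pattern finds nothing.
bare=$(cd "$demo" && find ./src -regex "app\.js")

# ".*/" absorbs the leading directories; the rest must equal the basename.
by_regex=$(cd "$demo" && find ./src -regex ".*/app\.js" | LC_ALL=C sort)

printf 'by -name:\n%s\n' "$by_name"
printf 'bare regex: [%s]\n' "$bare"
printf 'prefixed regex:\n%s\n' "$by_regex"

rm -rf "$demo"
```

Writing the prefix as ".*/" rather than ".*" is a small design choice: it keeps a name such as myapp.js from matching, because the pattern's file name must begin right after a slash.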
Selecting Regex Dialects (-regextype)
By default, GNU find uses Emacs-style regular expressions. In that dialect, grouping and alternation are written \( \) and \|, which often confuses users who expect POSIX extended or Perl-compatible (PCRE) syntax.
You can change the dialect using the -regextype option. It is positional: it only affects the -regex and -iregex tests that appear after it on the command line, so place it before them.
Available types include emacs, posix-awk, posix-basic, posix-egrep, and posix-extended. posix-extended, whose syntax is close to that of egrep, is generally the most comfortable for modern developers.
# Use posix-extended regex
find ./data -regextype posix-extended -regex ".*\.(csv|tsv|json)$"
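To make the dialect difference concrete, the sketch below runs the same alternation in both syntaxes. It assumes GNU find; the scratch directory and its four file names are hypothetical.

```shell
# Scratch directory with three data files and one unrelated file.
demo=$(mktemp -d)
touch "$demo/a.csv" "$demo/b.tsv" "$demo/c.json" "$demo/d.txt"

# Default (Emacs-style) dialect: groups and alternation take backslashes.
emacs_style=$(cd "$demo" && find . -regex ".*\.\(csv\|tsv\|json\)" | LC_ALL=C sort)

# posix-extended: the same pattern with bare ( | ).
extended=$(cd "$demo" && find . -regextype posix-extended \
  -regex ".*\.(csv|tsv|json)" | LC_ALL=C sort)

# Bare ( | ) in the default dialect are literal characters, so nothing matches.
literal=$(cd "$demo" && find . -regex ".*\.(csv|tsv|json)")

printf 'emacs-style:\n%s\n' "$emacs_style"
printf 'posix-extended:\n%s\n' "$extended"
printf 'unescaped in default dialect: [%s]\n' "$literal"

rm -rf "$demo"
```

If you are unsure which dialects your build knows, running find with -regextype help should make GNU find print the list of valid type names.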
Practical Regex Examples
1. Matching Multiple Extensions
You can chain globs with \( -name "*.js" -o -name "*.ts" \), but a single regex stays readable as the list of extensions grows.
find ./src -regextype posix-extended -regex ".*\.(js|ts|jsx|tsx)$"
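As a quick check that the two approaches select the same files, the sketch below runs both against a scratch tree. It assumes GNU find; the directory and file names are hypothetical.

```shell
# Scratch tree: four source files plus one file that should be skipped.
demo=$(mktemp -d)
mkdir -p "$demo/src"
touch "$demo/src/a.js" "$demo/src/b.ts" "$demo/src/c.jsx" \
      "$demo/src/d.tsx" "$demo/src/e.json"

# Globbing: one -name test per extension, grouped with escaped parentheses.
with_name=$(cd "$demo" && find ./src \
  \( -name "*.js" -o -name "*.ts" -o -name "*.jsx" -o -name "*.tsx" \) | LC_ALL=C sort)

# Regex: a single alternation covers all four extensions.
with_regex=$(cd "$demo" && find ./src -regextype posix-extended \
  -regex ".*\.(js|ts|jsx|tsx)" | LC_ALL=C sort)

printf 'globs:\n%s\n' "$with_name"
printf 'regex:\n%s\n' "$with_regex"

rm -rf "$demo"
```

Neither form picks up e.json: the glob "*.js" requires the name to end in .js, and the regex has to match through to the end of the path.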
2. Matching Version Numbers
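Find application logs whose filename carries a semantic version number (e.g., app-v1.2.3.log). The leading .*/ pins app-v to the start of the filename, so a name such as webapp-v1.2.3.log is not picked up by accident.
find /var/log -regextype posix-extended -regex ".*/app-v[0-9]+\.[0-9]+\.[0-9]+\.log$"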
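Because the pattern has to cover the whole path, the choice of prefix decides which neighbouring names slip through. The sketch below compares a bare ".*" prefix with a ".*/" prefix; it assumes GNU find, and the scratch directory and log names are hypothetical.

```shell
# Scratch directory with two valid names and three near misses.
demo=$(mktemp -d)
touch "$demo/app-v1.2.3.log" "$demo/app-v10.0.12.log" \
      "$demo/app-v1.2.log" "$demo/app-v1.2.3.log.gz" "$demo/webapp-v1.2.3.log"

# Loose prefix: ".*" may stop mid-name, so webapp-v1.2.3.log is accepted too.
loose=$(cd "$demo" && find . -regextype posix-extended \
  -regex ".*app-v[0-9]+\.[0-9]+\.[0-9]+\.log" | LC_ALL=C sort)

# Strict prefix: ".*/" forces "app-v" to begin the basename.
strict=$(cd "$demo" && find . -regextype posix-extended \
  -regex ".*/app-v[0-9]+\.[0-9]+\.[0-9]+\.log" | LC_ALL=C sort)

printf 'loose:\n%s\n' "$loose"
printf 'strict:\n%s\n' "$strict"

rm -rf "$demo"
```

Both patterns reject app-v1.2.log (only two numeric components) and app-v1.2.3.log.gz (the path does not end in .log), since -regex never matches a mere substring.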
3. Finding UUID Filenames
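Find files whose entire name is a lowercase UUID (e.g., 123e4567-e89b-12d3-a456-426614174000.txt). The leading .*/ ensures the UUID starts the filename rather than following some other prefix.
find /tmp -regextype posix-extended -regex ".*/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\.txt$"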
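UUIDs are sometimes written in uppercase, which is where -iregex becomes useful. The sketch below contrasts -regex and -iregex on the same pattern; it assumes GNU find, and the scratch directory, the uuid variable, and the file names are hypothetical.

```shell
# Scratch directory: one lowercase UUID, one uppercase UUID, two near misses.
demo=$(mktemp -d)
touch "$demo/123e4567-e89b-12d3-a456-426614174000.txt" \
      "$demo/9F8B7C6D-1A2B-3C4D-5E6F-0123456789AB.TXT" \
      "$demo/backup-123e4567-e89b-12d3-a456-426614174000.txt" \
      "$demo/123e4567-e89b-12d3-a456.txt"

# Keep the long pattern in a variable so both commands stay readable.
uuid='[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}'

# Case-sensitive: only the lowercase name matches.
lower_only=$(cd "$demo" && find . -regextype posix-extended \
  -regex ".*/${uuid}\.txt" | LC_ALL=C sort)

# Case-insensitive: -iregex also accepts the uppercase name and extension.
any_case=$(cd "$demo" && find . -regextype posix-extended \
  -iregex ".*/${uuid}\.txt" | LC_ALL=C sort)

printf 'regex:\n%s\n' "$lower_only"
printf 'iregex:\n%s\n' "$any_case"

rm -rf "$demo"
```

The backup-prefixed file and the truncated UUID are rejected in both runs, because the ".*/" prefix and the fixed repetition counts leave no room for extra or missing characters.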