Regex-like search in an "If something exists at path" condition, in a folder containing ~1000 image files

Currently I am using an If condition to check if something at path exists like this:

~SomeFolder/MyPhrase.png

I want to be able to search for any file that contains MyPhrase in its name.

Probably something like ~SomeFolder/*MyPhrase*.png

Eg I want it to look for "MyPhrase.png" as well as "variable prefix MyPhrase.png" or "MyPhrase variable suffix.png" (basically any file that contains "MyPhrase" in the file name)

Thanks in anticipation

cd "/Users/$USER/Path/To/Folder/"
ls *"$(cat)"*.png

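You can leave $USER in, or replace it with your username (the name of your home folder). The * wildcards have to stay outside the double quotes, because the shell does not expand them inside quotes. This will return a list, one file per line.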
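
For reference, here is a minimal sketch of the same idea that also checks whether anything matched. The temporary folder, the sample file names, and the variable names (dir, phrase, count) are made up for illustration; in a real macro the phrase would come from wherever you supply it.

```shell
#!/bin/sh
# Build a throwaway folder with a few sample files (hypothetical names).
dir=$(mktemp -d)
touch "$dir/MyPhrase.png" "$dir/variable prefix MyPhrase.png" \
      "$dir/MyPhrase variable suffix.png" "$dir/Unrelated.png"

phrase="MyPhrase"

# Quote the variable, but leave the * outside the quotes so the shell expands it.
set -- "$dir"/*"$phrase"*.png

# If nothing matches, the pattern is left unexpanded, so check that the
# first result really exists before counting.
if [ -e "$1" ]; then
    count=$#
    printf '%s\n' "$@"
else
    count=0
    echo "No file containing $phrase found"
fi
```

The count variable can then drive whatever the If condition needs to decide.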


Hey @forums2012,

In general I like to use the Unix find command for this sort of thing. It was designed to do exactly this sort of job.

I've broken up the parts of the command using the Bash continuation character “\” to place them one per line for easy viewing and identification.

find		== Unix command.
~/test...	== Target dir.
-type f		== Restricts to finding ‘files’.
-depth 1	== Prevents descent into any sub-directories.
-iname		== Case-insensitive name (can use wildcards).

** The main pitfall of Unix commands for this sort of thing on macOS is that they can descend into package files that are really directories like .app “files”. If you need to avoid packages then we have a very good AppleScriptObjC solution.

The way I've written the find command will return the full-path to each found file, although it can be restricted to returning only file names.
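
A rough sketch of how those parts might fit together, not the exact command from the macro. It assumes a throwaway test folder with made-up file names, and it uses -maxdepth 1 in place of -depth 1 because -maxdepth is accepted by both the macOS and GNU versions of find.

```shell
#!/bin/sh
# Sample folder (hypothetical) with a sub-folder that should not be searched.
dir=$(mktemp -d)
mkdir "$dir/Sub"
touch "$dir/MyPhrase.png" "$dir/prefix myphrase.PNG" "$dir/Other.png" \
      "$dir/Sub/MyPhrase nested.png"

# -maxdepth 1 : do not descend into sub-directories
# -type f     : regular files only
# -iname      : case-insensitive name match, wildcards allowed
found=$(find "$dir" \
    -maxdepth 1 \
    -type f \
    -iname "*myphrase*.png")

# One full path per line.
printf '%s\n' "$found"
```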

-Chris


Find Files in a Directory with Wildcards in the File-Name.kmmacros (4.6 KB)


Chris, I too really like the Bash find command.

I thought there was a way to exclude packages, but I was wrong.
After substantial research and testing, I posted this question at StackOverflow.com:
How Do I Exclude Package Files (Directories) from Bash Mac find?

Maybe someone will come up with a way to do this.

Hey JM,

There is a method that requires excluding file types via file suffixes.

It's not convenient, but it's doable.

Courtesy of Axel Luttgens back in 2015 on the Applescript Users List:

find -E ~/Documents -type d \( \( -iregex '.*\.(xcodeproj|deps|nib|app|pbproj|fs|ppp|workflow|xcode|xcdatamodel|pkg|bundle|lpdf|kext|dSYM|wdgt|settings|trace|Patch|split|rtfd|scptd)$' -and -prune \) -or -print \)

This will find subdirectories in a given directory and exclude the named packages.
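
To see the pruning logic on a small scale, here is a cut-down sketch that should also run with a find that lacks -E. It swaps the regex for two plain -name tests, and the folder tree and names are invented for the demonstration.

```shell
#!/bin/sh
# Hypothetical tree: ordinary folders plus two "packages".
root=$(mktemp -d)
mkdir -p "$root/Docs/Notes" "$root/Tool.app/Contents" \
         "$root/Docs/Thing.bundle/Resources"

# For each directory: if its name ends in a listed suffix, prune it
# (do not print it and do not descend into it); otherwise print it.
dirs=$(find "$root" -type d \
    \( \( -name '*.app' -o -name '*.bundle' \) -prune -o -print \))

printf '%s\n' "$dirs"
```

Only the root, Docs, and Docs/Notes are listed; the two packages and everything inside them are skipped.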

-Chris

Hey Folks,

Here's the original thread on the Applescript Users List:

Shell script to get a list of all subfolders inside a folder

-Chris

My solution that I posted on StackOverflow is quite similar:

find -E /Applications ! -iregex '.*\.(app(download)?|scptd|pkg|bundle|qlgenerator|c?action|dictionary|cannedSearch|photoslibrary)/.+'

EDIT: Actually, on testing, Axel's solution, as you did in fact say, excludes the package files completely. Mine includes the package file itself, but not any of its contents.
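
A small sketch of that second behaviour, using -path instead of -iregex so it does not depend on -E. The tree and names are hypothetical. Note that this form only filters the output: find still walks through the package contents, it just does not print them.

```shell
#!/bin/sh
# Hypothetical tree: one package and one ordinary folder.
root=$(mktemp -d)
mkdir -p "$root/Tool.app/Contents/MacOS" "$root/Plain"
touch "$root/Tool.app/Contents/Info.plist" "$root/Plain/readme.txt"

# ! -path '*.app/*' rejects anything located inside a .app directory.
# The .app directory itself does not match the pattern, so it is kept.
listing=$(find "$root" ! -path '*.app/*')

printf '%s\n' "$listing"
```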

Hey @CJK,

Well done!

I haven't actually looked at this solution in years, because there's a very good AppleScriptObjC method.

I'll monkey with both Axel's and your solutions and refresh my memory on these aspects of find.

-Chris

Agreed. There's also a perfectly good vanilla AppleScript solution when dealing with a regular number of files, like in an Applications directory. The ASObjC solution is obviously superior when dealing with large directory sub-trees.

Thanks, Chris. That works and runs fast. But what if I need files only and want to provide a filename spec? I tried this, but it didn't work:

find -E /Applications -type f -name "*.app" \( \( -iregex '.*\.(xcodeproj|deps|nib|app|pbproj|fs|ppp|workflow|xcode|xcdatamodel|pkg|bundle|lpdf|kext|dSYM|wdgt|settings|trace|Patch|split|rtfd|scptd)$' -and -prune \) -or -print \)

This will only allow find to return what it considers to be files, whereas it considers application bundles to be directories, so you'd need -type d (which will then, of course, omit files from its output).

You would also need to remove "app" from the regex list for it to return anything ending in ".app".
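
As an illustration of those two points, a sketch that lists application bundles without looking inside them might look like this. The folder layout is invented, and plain -name is used rather than the regex list.

```shell
#!/bin/sh
# Hypothetical layout: one app with a helper app nested inside it,
# one app in a sub-folder, and one ordinary folder.
root=$(mktemp -d)
mkdir -p "$root/Tool.app/Contents/Helper.app" \
         "$root/Utilities/Other.app" "$root/Plain"

# -type d        : bundles are directories as far as find is concerned
# -name '*.app'  : keep only names ending in .app
# -prune         : once a bundle is matched, do not look inside it
apps=$(find "$root" -type d -name '*.app' -prune -print)

printf '%s\n' "$apps"
```

Helper.app is not listed because find never descends into Tool.app.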

Awesome! Thanks a lot!
Really appreciate the help!

How would I add an "exclude" line to this variation of your script?

find \
$KMVAR_DND_Fingerprints \
-type f \
-depth -maxdepth 4 -mindepth 1  \
-iname "$KMVAR_PIN*Vector.pdf"
-excl "*Ti*"

I want to find files matching "$KMVAR_PIN*Vector.pdf", excluding files with "Ti" or "TT" before the "Vector.pdf" (-excl "*Ti*" was a guess that did not work).
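
For what it's worth, find has no -excl test as far as I know; exclusion is normally written by negating a test with !. A sketch of what that could look like, with a shell variable PIN standing in for $KMVAR_PIN and made-up file names in a temporary folder:

```shell
#!/bin/sh
dir=$(mktemp -d)
PIN="1234"   # hypothetical value standing in for $KMVAR_PIN
touch "$dir/1234 A Vector.pdf" "$dir/1234 Ti Vector.pdf" \
      "$dir/1234 TT Vector.pdf" "$dir/9999 A Vector.pdf"

# ! negates the test that follows it, so each "! -name" drops matching files.
# -name is case-sensitive, so "Ti" does not accidentally match "ti".
hits=$(find "$dir" \
    -maxdepth 4 -mindepth 1 \
    -type f \
    -iname "${PIN}*Vector.pdf" \
    ! -name "*Ti*Vector.pdf" \
    ! -name "*TT*Vector.pdf")

printf '%s\n' "$hits"
```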

Well, maybe find is not the right tool where packages and files are involved.

I want it to find files and packages that appear as files, but NOT search inside packages. Apparently that is a tall order. Maybe time to go to Chris @ccstone's ASObjC solution. :wink:

This discussion on find started out of a particular fondness for it, particularly in contrast to ls, which I used on the OP's original problem. And I quite agree: find is really the right tool to use for most file-getting jobs, particularly if you plan on using the output from it to pipe through into subsequent commands. You can do this with ls too, but it's not advised because, in theory, the output is less structured and not designed for other commands to read.
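
As a small illustration of handing find's results to another command, here is a sketch that copies the matching files elsewhere using -exec. The source and destination folders and the file names are hypothetical.

```shell
#!/bin/sh
src=$(mktemp -d)
dest=$(mktemp -d)
touch "$src/MyPhrase.png" "$src/variable prefix MyPhrase.png" "$src/Other.png"

# -exec passes each path to cp as a single argument, so spaces in file
# names are not split the way they can be when parsing the output of ls.
find "$src" -maxdepth 1 -type f -name '*MyPhrase*.png' -exec cp {} "$dest"/ \;

# Count what arrived in the destination folder.
copied=$(find "$dest" -type f | grep -c .)
echo "$copied"
```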

But, generalising this discussion, which you may wish to move to a separate thread, you can use either plain AppleScript or AppleScript-ObjC to get file and package lists without package contents.

You probably know how to do this in vanilla AppleScript, using either Finder or System Events. The problem with AppleScript is not being able to recurse through a directory tree very easily.

ASObjC can do this, and can do it very fast, and can do it without descending into packages:

use framework "Foundation"

property NSDirectoryEnumerationSkipsPackageDescendants : a reference to 2
property NSDirectoryEnumerationSkipsHiddenFiles : a reference to 4

set d to "/Applications" -- Directory to recurse

set FileManager to current application's NSFileManager's defaultManager()

FileManager's enumeratorAtURL:(current application's NSURL's URLWithString:d) ¬
	includingPropertiesForKeys:{} ¬
	options:(NSDirectoryEnumerationSkipsPackageDescendants + ¬
	NSDirectoryEnumerationSkipsHiddenFiles) ¬
	errorHandler:(missing value)

result's allObjects() as list

As you can see, there are specific options one can set to ignore hidden files and package contents. There are a few other options too, such as NSDirectoryEnumerationSkipsSubdirectoryDescendants for a shallow enumeration. This returns a list of AppleScript file objects.

If you just want to do a shallow search of the directory, then you can do this:

(FileManager's contentsOfDirectoryAtPath:d |error|:(missing value)) as list

which returns a list of filenames (with extensions) as strings.

Returning to the deep directory search results, the object initially returned (before requesting allObjects()) is an object enumerator that allows you to iterate through each object and separate the relevant ones from the rest. Sadly, there's no ASObjC one-liner equivalent of the AppleScript whose filter, so these object enumerators have to be iterated through manually to read each object's properties.

set f to {}

set fs to FileManager's enumeratorAtURL:(current application's NSURL's URLWithString:d) ¬
	includingPropertiesForKeys:{} ¬
	options:(NSDirectoryEnumerationSkipsPackageDescendants + ¬
	NSDirectoryEnumerationSkipsHiddenFiles) ¬
	errorHandler:(missing value)

tell fs to repeat
	set [isScriptable, fname] to [null, null]

	tell its nextObject()
		if it is missing value then exit repeat
		
		set [true, isScriptable] to ¬
			(its getResourceValue:(reference) ¬
				forKey:(current application's NSURLApplicationIsScriptableKey) ¬
				|error|:(missing value))
		
		if (isScriptable as anything) = true then set [true, fname] to ¬
			(its getResourceValue:(reference) ¬
				forKey:(current application's NSURLNameKey) ¬
				|error|:(missing value))
		
		set end of f to fname as anything
	end tell
end repeat

text of f

This returns a list of all the apps in the /Applications directory that are scriptable. Each property is read using getResourceValue, and there are properties that tell you whether a file is a regular file, a directory, a package file, an alias (symlink), etc.


Hey Folks,

I've packaged an AppleScriptObjC script in a macro to make it easy enough for most users to manage.

Find Files and Folders Using a RegEx Name Pattern

It's not as flexible as Unix's find command line utility, but it's far faster and supports RegEx, recursion, and folder-exclusion – nor will it descend into packages.

It's a trifle rough, so I'll probably rework it and add more features in time.

-Chris

@ccstone I have a question, since I'm not very familiar with shell scripting.
You mention in the comments that single quotes are only necessary if the path contains spaces. So, for example, could I put the single quotes only around the part of the path where the spaces start? Let me give you some examples:

~/MyFiles/Tests/'Keyboard Maestro Tests/RegEx in paths'
~/'My Files/Tests/Keyboard Maestro Tests/RegEx in paths'
~/MyFiles/Tests/KeyboardMaestroTests/'RegEx in paths'

Also, if I don't want to think about it, would it also work if I put the opening quote at the very start of the path, like this?
'~/My Files/Tests/Keyboard Maestro Tests/RegEx in paths' \

Thanks

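Hi, @iamdannywyatt. You need to keep the ~, and the / that follows it, outside the quotes; in bash, a quote placed immediately after the ~ prevents it from being expanded. For example:

~/'My Files/Tests/Keyboard Maestro Tests/RegEx in paths'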

Another alternative is to exclude the ~ and supply the full path.
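
A quick way to check this for yourself is a sketch like the one below. HOME is set to a made-up folder purely so the result is predictable, and the variable names are only for illustration.

```shell
#!/bin/sh
HOME="/Users/example"   # hypothetical home folder, set here only for the demo

# Tilde and slash outside the quotes: ~ expands to $HOME.
expanded=~/'My Files/Tests/Keyboard Maestro Tests/RegEx in paths'

# Tilde inside the quotes: it stays a literal ~ character.
literal='~/My Files/Tests/Keyboard Maestro Tests/RegEx in paths'

# Full path via $HOME inside double quotes: no tilde needed.
full="$HOME/My Files/Tests/Keyboard Maestro Tests/RegEx in paths"

printf '%s\n' "$expanded" "$literal" "$full"
```

The first and third lines print the same full path, while the second still starts with a literal ~.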


And really, you should use double quotes, not single quotes, except in certain particular circumstances. There is plenty to read on the topic in many places.

Basically, if/when you get to the point where you're working with shell variables ($myvar), they'll need to be in double quotes rather than single quotes, because variables inside single quotes don't get expanded:

prompt $> myvar="yellow"
prompt $> echo "$myvar"
yellow
prompt $> echo '$myvar'
$myvar
prompt $> 

So you might as well start with double quotes, rather than having to undo/unlearn in the future :).

-rob.


So even if I leave the ~ outside the quotes, does it make a difference where I put the first one, using the example I shared? Or should I always put it right after the ~?