Regex-like search in action: "If at path something exists?" in a folder containing ~1000 image files

My solution that I posted on StackOverflow is quite similar:

find -E /Applications ! -iregex '.*\.(app(download)?|scptd|pkg|bundle|qlgenerator|c?action|dictionary|cannedSearch|photoslibrary)/.+'

EDIT: Actually, on testing, Axel's solution, as you did in fact say, excludes the package files completely. Mine includes the package file itself, but not any of its contents.
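For anyone whose find lacks the -E flag, roughly the same "package yes, contents no" behaviour can be sketched with -path instead of a regex. This is only an illustration: the scratch directory and file names are made up, and it handles just the .app extension rather than the full list above. Note that find still walks into the package here; it simply doesn't report what it finds there.

```shell
# Build a throwaway tree: one package with contents, one plain file.
root=$(mktemp -d)
mkdir -p "$root/Demo.app/Contents"
touch "$root/Demo.app/Contents/Info.plist" "$root/notes.txt"

# Drop every path that lies inside a *.app package. The package itself
# is still listed, because its own path has nothing after ".app".
listing=$(cd "$root" && find . -mindepth 1 ! -path '*.app/*' | sort)
echo "$listing"

rm -rf "$root"
```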

Hey @CJK,

Well done!

I haven't actually looked at this solution in years, because there's a very good AppleScriptObjC method.

I'll monkey with both Axel's and your solutions and refresh my memory on these aspects of find.

-Chris

Agreed. There's also a perfectly good vanilla AppleScript solution when dealing with a regular number of files, like in an Applications directory. The ASObjC solution is obviously superior when dealing with large directory sub-trees.

Thanks, Chris. That works and runs fast. But what if I need files only and want to provide a filename spec? I tried this, but it didn't work:

find -E /Applications -type f -name "*.app" \( \( -iregex '.*\.(xcodeproj|deps|nib|app|pbproj|fs|ppp|workflow|xcode|xcdatamodel|pkg|bundle|lpdf|kext|dSYM|wdgt|settings|trace|Patch|split|rtfd|scptd)$' -and -prune \) -or -print \)

This will only allow find to return what it considers to be files, whereas it considers application bundles to be directories, so you'd need -type d (which will then, of course, omit files from its output).

You would also need to remove "app" from the regex list for it to return anything ending in ".app".

Awesome! Thanks a lot!
Really appreciate the help!

How would I add an "exclude" line to this variation of your script?

find \
$KMVAR_DND_Fingerprints \
-type f \
-depth -maxdepth 4 -mindepth 1  \
-iname "$KMVAR_PIN*Vector.pdf"
-excl "*Ti*"

I want to find "$KMVAR_PIN*Vector.pdf", excluding files with "Ti" or "TT" before the "Vector.pdf" (-excl "*Ti*" was a guess that did not work).
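One way this could be approached: find has no -excl primary, but any test can be negated with "!". The sketch below is an assumption about what is wanted, not a tested answer for the real folder. The variable values and file names are invented stand-ins, "before the Vector.pdf" is taken to mean anywhere earlier in the name, and the -depth flag from the original is left out.

```shell
# Hypothetical stand-ins for the Keyboard Maestro variables.
KMVAR_DND_Fingerprints=$(mktemp -d)
KMVAR_PIN="1234"
touch "$KMVAR_DND_Fingerprints/1234 A Vector.pdf" \
      "$KMVAR_DND_Fingerprints/1234 Ti Vector.pdf" \
      "$KMVAR_DND_Fingerprints/1234 TT Vector.pdf"

# Negate extra name tests with "!" to exclude files.
# -name is case-sensitive, so "Ti" here does not also match "ti".
matches=$(find "$KMVAR_DND_Fingerprints" -mindepth 1 -maxdepth 4 -type f \
  -iname "${KMVAR_PIN}*Vector.pdf" \
  ! -name "*Ti*Vector.pdf" \
  ! -name "*TT*Vector.pdf")
echo "$matches"

rm -rf "$KMVAR_DND_Fingerprints"
```

Only the "1234 A Vector.pdf" path should survive the two exclusions.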

Well, maybe find is not the right tool where packages and files are involved.

I want it to find files and packages that appear as files, but NOT search inside packages. Apparently that is a tall order. Maybe time to go to Chris @ccstone's ASObjC solution. :wink:
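For what it's worth, -prune can get close to this. The following is a minimal sketch under stated assumptions: find knows nothing about packages, so they have to be recognised by extension (only .app and .bundle here), and the scratch tree is invented for the demonstration.

```shell
root=$(mktemp -d)
mkdir -p "$root/Demo.app/Contents" "$root/Sub"
touch "$root/Demo.app/Contents/Info.plist" "$root/Sub/readme.txt"

# Anything named *.app or *.bundle: do not descend (-prune), but list it.
# Everything else: list only regular files.
found=$(cd "$root" && find . \( -name '*.app' -o -name '*.bundle' \) -prune -print \
  -o -type f -print | sort)
echo "$found"

rm -rf "$root"
```

Because the package is pruned rather than filtered afterwards, find never reads its contents, which matters on large trees.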

This discussion on find started out of a particular fondness for it, particularly in contrast to ls, which I used on the OP's original problem. And I quite agree: find is really the right tool to use for most file-getting jobs, particularly if you plan on using the output from it to pipe through into subsequent commands. You can do this with ls too, but it's not advised because, in theory, the output is less structured and not designed for other commands to read.
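As a rough illustration of handing find's output to another command, the sketch below counts PNG files by passing NUL-separated names to xargs. The directory and file names are made up, and basename merely stands in for whatever command would really process each file.

```shell
dir=$(mktemp -d)
touch "$dir/one.png" "$dir/two words.png" "$dir/skip.txt"

# -print0 and xargs -0 separate names with NUL bytes, so a space in a
# file name cannot split that name into two arguments.
count=$(find "$dir" -type f -name '*.png' -print0 \
  | xargs -0 -n 1 basename | wc -l | tr -d ' ')
echo "$count"

rm -rf "$dir"
```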

But, generalising this discussion, which you may wish to move to a separate thread, you can use either plain AppleScript or AppleScript-ObjC to get file and package lists without package contents.

You probably know how to do this in vanilla AppleScript, using either Finder or System Events. The problem with AppleScript is not being able to recurse through a directory tree very easily.

ASObjC can do this, and can do it very fast, and can do it without descending into packages:

use framework "Foundation"

property NSDirectoryEnumerationSkipsPackageDescendants : a reference to 2
property NSDirectoryEnumerationSkipsHiddenFiles : a reference to 4

set d to "/Applications" -- Directory to recurse

set FileManager to current application's NSFileManager's defaultManager()

FileManager's enumeratorAtURL:(current application's NSURL's fileURLWithPath:d) ¬
	includingPropertiesForKeys:{} ¬
	options:(NSDirectoryEnumerationSkipsPackageDescendants + ¬
	NSDirectoryEnumerationSkipsHiddenFiles) ¬
	errorHandler:(missing value)

result's allObjects() as list

As you can see, there are specific options one can set to ignore hidden files and package contents. There are a million other options too. This returns a list of AppleScript file objects.

If you just want to do a shallow search of the directory, then you can do this:

(FileManager's contentsOfDirectoryAtPath:d |error|:(missing value)) as list

which returns a list of filenames (with extensions) as strings.

Returning to the deep directory search results, the object initially returned (before requesting allObjects()) is an object enumerator that allows you to iterate through each object and identify those of relevance versus those that are not. Sadly, there's no ASObjC one-liner equivalent of the AppleScript whose filter, so these object enumerators have to be iterated through manually to read an object's properties.

set f to {}

set fs to FileManager's enumeratorAtURL:(current application's NSURL's fileURLWithPath:d) ¬
	includingPropertiesForKeys:{} ¬
	options:(NSDirectoryEnumerationSkipsPackageDescendants + ¬
	NSDirectoryEnumerationSkipsHiddenFiles) ¬
	errorHandler:(missing value)

tell fs to repeat
	set [isScriptable, fname] to [null, null]

	tell its nextObject()
		if it is missing value then exit repeat
		
		set [true, isScriptable] to ¬
			(its getResourceValue:(reference) ¬
				forKey:(current application's NSURLApplicationIsScriptableKey) ¬
				|error|:(missing value))
		
		if (isScriptable as anything) = true then set [true, fname] to ¬
			(its getResourceValue:(reference) ¬
				forKey:(current application's NSURLNameKey) ¬
				|error|:(missing value))
		
		set end of f to fname as anything
	end tell
end repeat

text of f

This returns a list of all the apps in the /Applications directory that are scriptable. Each property is read using getResourceValue, and there are properties that tell you whether a file is a regular file, a directory, a package file, an alias or symlink, etc.


Hey Folks,

I've packaged an AppleScriptObjC script in a macro to make it easy enough for most users to manage.

Find Files and Folders Using a RegEx Name Pattern

It's not as flexible as Unix's find command line utility, but it's far faster and supports RegEx, recursion, and folder-exclusion – nor will it descend into packages.

It's a trifle rough, so I'll probably rework it and add more features in time.

-Chris

@ccstone I have a question, since I'm not very familiar with shell scripting.
You mention in the comments that single quotes are only necessary if the path contains spaces. So, for example, could I open the single quote only where the part containing spaces starts? Let me give you some examples:

~/MyFiles/Tests/'Keyboard Maestro Tests/RegEx in paths'
~/'My Files/Tests/Keyboard Maestro Tests/RegEx in paths'
~/MyFiles/Tests/KeyboardMaestroTests/'RegEx in paths'

Also, if I don't want to think about it, would it also work if I add it before the path like this?
'~/My Files/Tests/Keyboard Maestro Tests/RegEx in paths' \

Thanks

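Hi, @iamdannywyatt. You need to keep the ~ outside the quotes, along with the slash that follows it, because bash only expands an unquoted ~ that runs up to an unquoted slash. For example:

~/'My Files/Tests/Keyboard Maestro Tests/RegEx in paths'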

Another alternative is to exclude the ~ and supply the full path.
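A quick way to check how a given shell treats these forms is to assign them to variables and print the results. The sketch below assumes a POSIX-style shell such as bash, and reuses the example paths from the question purely as strings; the folders do not need to exist.

```shell
# The quotes may open anywhere, as long as every space ends up inside
# them and the leading ~/ stays outside.
a=~/MyFiles/Tests/'Keyboard Maestro Tests/RegEx in paths'
b=~/'MyFiles/Tests/Keyboard Maestro Tests/RegEx in paths'

# A quoted tilde is just a literal "~" character.
c='~/MyFiles/Tests/Keyboard Maestro Tests/RegEx in paths'

echo "$a"
echo "$b"
echo "$c"
```

The first two lines printed should be identical and start with the home folder, while the third should still begin with a literal ~.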


And really, you should use double quotes, not single quotes, except in certain particular circumstances. You can read all you want on the topic in many places; here's one:

Basically, if/when you get to the point where you're working with shell variables ($myvar), they'll have to be in double quotes rather than single quotes, because single quotes stop them from being expanded:

prompt $> myvar="yellow"
prompt $> echo "$myvar"
yellow
prompt $> echo '$myvar'
$myvar
prompt $> 

So you might as well start with double quotes, rather than having to undo/unlearn in the future :).

-rob.


So even if I leave the ~ outside the quotes, does it make a difference where I put the first one, using the example I shared? Or should I always put it right after the ~?

Thanks for sharing that tip

I would get away from using the tilde—it's a shorthand that requires the shell know the path to your home folder. Instead, just use the full path, which for a macOS user, will always be (unless they've relocated their Home folder, which is quite rare):

/Users/yourshortusername

Then you can put double quotes around the whole thing.

-rob.


The reason I always avoid full paths is because if I buy a new computer (or just reinstall the macOS) and for some reason I don't want to keep the same name for my User folder, all my macros will still work.

So if I keep the tilde, should the single quote come right after it?
Or for example if I use it like this, will it still work, since the spaces are not in the beginning?
~/folder1/folder2/'folder 3/file.png'

It's not that I will be doing it, because it makes everything more confusing. I'm just wondering if that would also work.
Whenever I need to use something like that, I will definitely do ~'

No, as long as the spaces are enclosed and the ~/ itself stays unquoted. But it's probably easiest to open the quote immediately after the ~/.

Another option is to not use quotes; instead escape the spaces with backslashes (\). For example:

~/My\ Files/Tests/Keyboard\ Maestro\ Tests/RegEx\ in\ paths

FYI, one thing you can do is drag a folder from the Finder into a Terminal window. If the path has any spaces they will be automatically escaped. You can even start a command in Terminal (e.g., pwd followed by a space) then drag the folder into the Terminal window.

As long as the shell knows the path to home, that will work. And whether using single or double quotes, leave the tilde outside the quotes, as it won't be expanded inside either type.

And really, use double quotes :).

-rob.
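One possible middle ground between the tilde and a hard-coded /Users path is the HOME variable, which does expand inside double quotes. This is a sketch assuming a POSIX-style shell with HOME set, as it normally is for a logged-in macOS user; the path is the example from earlier in the thread and need not exist.

```shell
# "$HOME" expands inside double quotes, unlike ~, so the whole path
# can be quoted and still follow the current user's home folder.
p1="$HOME/My Files/Tests/Keyboard Maestro Tests/RegEx in paths"

# Same path with the tilde left unquoted and the spaces escaped.
p2=~/My\ Files/Tests/Keyboard\ Maestro\ Tests/RegEx\ in\ paths

[ "$p1" = "$p2" ] && echo "same path"
```

Both forms keep working if the home folder is renamed, since neither spells out the user name.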

Exactly. I was just curious if that was "a thing" :wink: Thanks for clarifying

This is always a very confusing option for me, especially when you have long paths and lots of spaces, which is pretty common with my folders and files.

Yes, when I use Terminal, that's what I do. I sometimes forget about that and I'm just working on a script using TextEdit or Notes, just because it's easier to click somewhere and edit it right away, and then the path is not properly "formatted". I wish the Terminal was a bit more flexible when it comes to editing stuff. Like a normal text editor where we can click somewhere and type, or select some text and delete it, etc.

I don't use it often anyway, but I think it would be way more pleasant to work with. Or at least, make it an option so those who like it the way it is, could keep it that way.
