MACRO: Quickly Remove Duplicates From a List in a Variable

griffman · December 28, 2021, 10:16pm

There have been some posts on this (not many) over the years, but most were beyond my skill level, involving JXA scripting, or they relied on another third-party app (BBEdit) to do the sorting. I was looking for an all-KM solution, but it had to be fast and (relatively) easy to implement. The following may be of use if you have long lists and need them de-duped.

Every Mac user has a great de-duping tool built in on the Unix side of their OS: the combination of sort and uniq. The sort command does what it says, and uniq (with the -d flag) prints one copy of each line that is repeated on adjacent lines of whatever you send it, which is why the input has to be sorted first. Put them together...

sort myfile.txt | uniq -d

And the output is a list of duplicates in the (unsorted) myfile.txt file. But even better, you can use it with standard input, so you can send it the contents of a variable. I used that to make a demo project showing how it can work, with both a short (20) and long (240+) list of words. Just enable/disable the short/long list to test either one:
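As a rough illustration of the standard-input form, here is a minimal sketch that feeds a list held in a shell variable through the same pipeline. The sample words and the variable names (list, dupes) are made up for the example. Inside an Execute a Shell Script action the list would instead come from a Keyboard Maestro variable, which as far as I know KM exposes to the script as an environment variable with a KMVAR_ prefix.

```shell
# Hypothetical sample list, one entry per line (stands in for the KM variable).
list=$(printf '%s\n' pear apple fig apple pear kiwi)

# sort puts identical lines next to each other;
# uniq -d then prints one copy of each line that occurs more than once.
dupes=$(printf '%s\n' "$list" | sort | uniq -d)

printf '%s\n' "$dupes"
```

With this sample input, only apple and pear are repeated, so those are the two lines printed.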

Dedupe Demo.kmmacros (12.0 KB)

The list is sent through the above command, which returns the list of duplicates. I massage that list into a regular expression "or" format, where (A|B|C|D) matches any row that is either A, B, C, or D, and then do a multi-row find/replace to remove all the dupes, not even leaving one in place. But because I have the list of dupes, I just add it back at the end, sort, and that's that.
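The macro does these steps with Keyboard Maestro actions, but the same idea can be sketched entirely in the shell. This is only an approximation of the approach described above, not the macro's actual contents. It assumes the list contains at least one duplicate and that no entry contains regular expression metacharacters. The sample words and variable names are invented for the example.

```shell
# Hypothetical sample list, one entry per line.
list=$(printf '%s\n' pear apple fig apple pear kiwi)

# Step 1: one copy of every entry that occurs more than once.
dupes=$(printf '%s\n' "$list" | sort | uniq -d)

# Step 2: join the duplicates into an "or" pattern such as apple|pear.
pattern=$(printf '%s\n' "$dupes" | paste -sd '|' -)

# Step 3: drop every row that matches a duplicate in full
# (-E extended regex, -v invert the match, -x match whole lines only).
singles=$(printf '%s\n' "$list" | grep -Evx "($pattern)")

# Step 4: add one copy of each duplicate back and sort the result.
deduped=$(printf '%s\n%s\n' "$singles" "$dupes" | sort)

printf '%s\n' "$deduped"
```

The point of doing it this way rather than simply discarding repeats is that the dupes variable is still available afterwards, so the duplicated entries can be reported to the user.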

It's quite speedy—fast enough for any lists I'll ever de-dupe; it took about a tenth of a second to do the big list. Hope others find this useful—I'm very happy with its performance; it's about six times faster than the old hodgepodge solution I'd come up with before.

A footnote on the shell script bit, which looks a bit odd with the two which lines. Although these commands are built into macOS, a user may have replaced them with another version on a different path, which would cause the script to fail. The alias variables prevent that from happening.

-rob.

ccstone · December 29, 2021, 3:18am

Hey Rob,

While this is possible it's very unlikely.

People tend to add things with MacPorts, Homebrew, or manually – but only a very uninformed person would move a built-in Unix executable.

Keyboard Maestro does not respect changes made to your Terminal environment – you have to manually change the path in Execute a Shell Script actions – or create an ENV_PATH KM variable and/or other environment variables as needed.

Path in Shell Scripts

A shebang line is not required in Shell Script actions, unless you want or need to use a shell other than the default.

sort has a built-in unique switch -u, so uniq is not required, unless you need its extra features.

So – this task is really a simple one-liner.
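For anyone reading without the macro file, a one-liner along these lines would look something like the sketch below. The attached macro's exact command is not shown in this thread, so treat the sample list and variable names as assumptions.

```shell
# Hypothetical sample list, one entry per line.
list=$(printf '%s\n' pear apple fig apple pear kiwi)

# sort -u sorts the lines and keeps a single copy of each distinct line.
unique=$(printf '%s\n' "$list" | sort -u)

printf '%s\n' "$unique"
```

Note that this only produces the de-duplicated list; it does not tell you which entries were duplicated.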

-Chris

Sort - Unique v1.00.kmmacros (6.8 KB)

griffman · December 29, 2021, 5:05am

Thanks for the clarification on the shell stuff; I wasn't sure how KM did it.

As for the script, the reason I wrote it the way I did is that I need to pull out the duplicates so that I can show them. It's for users who create duplicate shortcuts, and they need to know which ones they are so they can fix them.

As far as I know, you can't get the duplicates out with sort, right? It just makes them go away.

regards;
-rob.

ccstone · December 29, 2021, 5:10am

Quite right.

I missed that you were using the -d switch in uniq...
