Need help making a macro that compares similar files and deletes the less desirable of the two

Hello All,

I am brand new to Keyboard Maestro and I have already found so many good uses for the various macros listed throughout the wiki. Thank you all so much!

This is my question (I am going to try to explain my need in as much detail as possible):

I have been creating audiobooks with a program called Audiobook Builder (ABB). Each book starts as a folder full of .mp3 audio files with a name like “Dan Lewis - Now I Know The Revealing Stories Behind the Worlds Most Interesting Facts (Unabridged)”. I run it through ABB, and the result is a single file with the extension .abbuilder and a name like “Now I Know The Revealing Stories Behind the Worlds Most Interesting Facts”.

The original files start in a folder called “Audiobooks RAW”. Once they are processed by ABB, they are automatically moved to “Audiobooks Finished”.

I would like to build a macro that runs at a scheduled time each day, scans the “Audiobooks Finished” folder for file names, and compares them to the files in “Audiobooks RAW”. When the macro finds a match (i.e., the file names share two or more uncommon words), it should move the original, unmodified “Audiobooks RAW” files to the trash and keep the “Audiobooks Finished” file with the .abbuilder extension.

Most of this is doable, but the concept of "share two or more uncommon words" would be very hard to implement. A human can easily see that two things are related, but that is a hard problem for a computer to handle. Even for a human, it's not easy to decide which "two uncommon words" to choose. Take your title:

Now I Know The Revealing Stories Behind the Worlds Most Interesting Facts

Which two words would you choose? I would think it probable that, for any two words from that list, there is another book with both of them in its title. "Revealing" is about the most uncommon; the rest are fairly common.

There are techniques for calculating the difference between two strings (e.g., the edit distance: the minimum number of characters you need to add or delete to make the two strings match), but they are non-trivial.
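To give a feel for what such a technique involves, here is a minimal Python sketch of the add/delete distance described above. The function name `indel_distance` is made up for this illustration, and this is only one of several possible string-difference measures, not something built into Keyboard Maestro.

```python
def indel_distance(a: str, b: str) -> int:
    """Minimum number of single-character additions and deletions
    needed to turn string a into string b."""
    # prev[j] is the distance between the already-processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                # Matching characters cost nothing.
                curr.append(prev[j - 1])
            else:
                # Either delete ca from a, or add cb to a.
                curr.append(1 + min(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


raw = ("Dan Lewis - Now I Know The Revealing Stories Behind "
       "the Worlds Most Interesting Facts (Unabridged)")
finished = ("Now I Know The Revealing Stories Behind "
            "the Worlds Most Interesting Facts")
print(indel_distance(raw, finished))
```

Even with a distance in hand, you would still have to pick a threshold for "close enough", and that choice is where false matches tend to creep in.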

If the RAW files always include the text of the final name (except the extension), then that would be easy enough to do.
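As a rough sketch of that simpler case, the Python below pairs each RAW name with a finished name whose text it contains. The function name `find_processed` and the case-insensitive comparison are assumptions for illustration. It only reports matches and does not move anything to the trash.

```python
def find_processed(raw_names, finished_names, extension=".abbuilder"):
    """Map each RAW name to the finished file whose name (minus the
    extension) appears inside it. Names with no match are left out."""
    # Strip the extension from each finished file; skip other files
    # and anything that would leave an empty name.
    stems = [
        name[: -len(extension)]
        for name in finished_names
        if name.lower().endswith(extension) and len(name) > len(extension)
    ]
    matches = {}
    for raw in raw_names:
        for stem in stems:
            if stem.casefold() in raw.casefold():
                matches[raw] = stem + extension
                break  # first match is enough for this sketch
    return matches


raw_names = [
    "Dan Lewis - Now I Know The Revealing Stories Behind "
    "the Worlds Most Interesting Facts (Unabridged)",
    "Some Author - A Different Book (Unabridged)",
]
finished_names = [
    "Now I Know The Revealing Stories Behind "
    "the Worlds Most Interesting Facts.abbuilder",
]
for raw, done in find_processed(raw_names, finished_names).items():
    print(f"{raw!r} is covered by {done!r}")
```

In practice the two lists would come from the contents of “Audiobooks RAW” and “Audiobooks Finished”, and a script like this could presumably be run from a Keyboard Maestro "Execute a Shell Script" action. It would be wise to review the reported matches before trashing anything.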