A few years back, a friend and I were discussing the amount of cruft in many URLs. As an example, here's a URL from a recent Monoprice email newsletter. Note: I changed a few of the characters so these URLs will not work; they're for demonstration purposes only.
http://enews.emails.monoprice.com/q/LLMxJBmP1xo0XEpHXEE5PmLbqHbNxpVsuheZcOJcm9iZ0Bnc3lmZnN3ZWIuY29tw4gZyWuu1KtfpSL58gax3zGue2fhqQ
If you load that in your browser, here's what you'll see in the URL bar:
https://www.monoprice.com/product?p_id=11297&trk_msg=8OLJJ72Q87O4PCIC9HC71NL7VS&trk_contact=T3KPNYKTNJTD6NU53HNJCDD7T4&trk_sid=FGHVB9JP6L3MVLERLES7DLEGEG&trk_link=9HBDB7KBCAEKT949P8P8JQG5J4&cl=res&utm_source=email&utm_medium=email&utm_term=View+product+recommended+for+you&utm_campaign=210902_thursday
But everything after the "p_id=11297" parameter is simply tracking information; this is the actual URL:
https://www.monoprice.com/product?p_id=11297
You may not wish to share all that tracking information with companies every time you click one of their URLs. Enter the URL Decrufter…
The Decrufter 8.7 Macros.kmmacros (412 KB)
The Decrufter works whenever you copy a URL in one of the apps it's active in. It first checks the domain to see if it's one you want to decruft, and if so, cleans it up and optionally opens it in your browser (and puts it on the clipboard and saves it for future lookups).
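The domain check described above could look something like this minimal Python sketch. It is not the macro's own code: the `DECRUFT_DOMAINS` set and the `should_decruft` function are hypothetical names, and the two domains listed are just ones mentioned in this post.

```python
from urllib.parse import urlsplit

# Hypothetical list of domains to decruft; the real macro keeps its own list.
DECRUFT_DOMAINS = {"monoprice.com", "aliexpress.com"}


def should_decruft(url: str) -> bool:
    """Return True if the URL's host is, or is a subdomain of, a listed domain."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in DECRUFT_DOMAINS)
```

Matching on the end of the host name means an email-tracking subdomain such as enews.emails.monoprice.com counts as a Monoprice URL, while an unrelated domain that merely ends in the same letters does not.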
The process of cleaning the URL is tricky, as the source URL (the enews.emails.monoprice... one above) doesn't even contain the final domain or tracking information. The first thing The Decrufter does is use curl to find the actual final destination, which is the one shown in the second URL above, with all the tracking information attached. Then, finally, The Decrufter uses a series of regex filters to try to clean up those URLs; here's what the Monoprice filter looks like:
[Screenshot: the Monoprice regex filter]
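As a rough Python sketch of those two steps (not the macro's own code, and not its actual Monoprice filter): the curl flags below are one common way to ask curl for the final URL after redirects, and the regex is an illustrative guess at a filter that keeps only the product ID. The names `resolve_final_url`, `MONOPRICE_FILTER`, and `clean_monoprice` are made up for this example.

```python
import re
import subprocess


def resolve_final_url(url: str, timeout: int = 15) -> str:
    """Follow redirects with curl and return the final URL.

    Needs curl installed and network access. Uses HEAD requests (-I),
    which some servers may reject; treat this as a starting point only.
    """
    result = subprocess.run(
        ["curl", "-sIL", "-o", "/dev/null", "-w", "%{url_effective}",
         "--max-time", str(timeout), url],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


# Illustrative pattern only, not the macro's actual Monoprice filter:
# keep the product page and its p_id parameter, drop whatever follows.
MONOPRICE_FILTER = re.compile(
    r"^(https://www\.monoprice\.com/product\?p_id=\d+).*$"
)


def clean_monoprice(url: str) -> str:
    """Strip everything after p_id; URLs that don't match are returned unchanged."""
    return MONOPRICE_FILTER.sub(r"\1", url)
```

In use, the cleaned URL would be something like `clean_monoprice(resolve_final_url(source_url))`, where `source_url` is the link copied from the email.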
After cleaning, The Decrufter puts the clean URL on the clipboard, optionally opens it in a browser, and saves it to a database history file—if you ever try to decruft the same URL again, it'll load instantly from the database.
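The history lookup can be pictured with a small Python sketch. This assumes an SQLite database with a single `history` table keyed on the source URL; the post doesn't say which database engine or schema the macro really uses, so the table, column, and function names here are all hypothetical.

```python
import sqlite3
from typing import Optional


def open_history(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the history database; the schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS history ("
        "source_url TEXT PRIMARY KEY, clean_url TEXT NOT NULL)"
    )
    return conn


def lookup(conn: sqlite3.Connection, source_url: str) -> Optional[str]:
    """Return the previously saved clean URL, or None if this URL is new."""
    row = conn.execute(
        "SELECT clean_url FROM history WHERE source_url = ?", (source_url,)
    ).fetchone()
    return row[0] if row else None


def remember(conn: sqlite3.Connection, source_url: str, clean_url: str) -> None:
    """Save (or overwrite) the clean URL for a given source URL."""
    with conn:  # commits on success
        conn.execute(
            "INSERT OR REPLACE INTO history (source_url, clean_url) VALUES (?, ?)",
            (source_url, clean_url),
        )
```

Checking `lookup` before doing anything else is what would let a repeat URL skip the slow curl step entirely.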
Please read the instructions in the very first macro in the group, usefully named ━━ The Decrufter 8.6 ━━, for more details on using the macro. If you have questions, please ask!
Latest Release
8.7 (May 27 2024): This update fixes a few things to help the macro work better and quicker. I moved some routines around, added a couple indices to the database, and loosened the restrictions on curl results, as the macro was sometimes canceling when it didn't have to do so.
Older Releases
8.6 (Dec 9 2023): This is a huge update that removes the use of a large global variable for tracking decrufted URLs. Instead, there's a new database, which is faster, safer, and much easier to use. (I removed about 10 regex search/replace functions thanks to the database.) There's also a new curl "progress bar" to let you know that the macro is still waiting on curl for a response. I also fixed a lot of other little things to make it run faster and fail more nicely.
8.5 (Oct 9 2023): A minor update that changed some variable names to match my convention for other public macros I've written, and that updates the flying.com decrufting routine due to a new URL.
8.4 (May 5 2023): This version has the Facebook typo fix, a simple YouTube decrufter, and some improvements in logic in a few routines.
8.3 (Apr 11 2023): Though I fixed the custom URL feature in 8.2, that fix broke everything else. Whoops. Should be all better now.
8.2 (Apr 11 2023): This version fixed an issue with custom domains for both filtering and non-curling, added aliexpress as a decrufting domain, and updated the in-macro help comments.
I'd love your feedback, and if you have domains that you'd like to see filtered, feel free to contact me with the copied URL (the source URL from the email, etc.—not the final URL!) and I'll try to get them in the macro.
-rob.
Tags: @clean, @sanitize, @url, @privacy, @email, @strip, @tracking, @shorten