Regular expression - what explains the difference between KM output and regex101

String being reviewed: 2018-01-07–15-09-53-Name of_file-:football:.jpg
Regular expression used in KM: [\p{L}\s._0-9-]+
Output variable after regex application in KM: 2018-01-07–15-23-34-Name of_file-

Expected output: 2018-01-07–15-09-53-Name of_file-.jpg
And this is what happens at https://regex101.com/r/CkdY9t/1: the :football: is removed and ".jpg" is part of the output.

I have not been able to figure out why KM regex application leaves out the .jpg.
It is only supposed to remove the :football:

Why am I doing this: I am trying to remove any characters that cannot be part of a file name on a Mac.

Help appreciated. Thank you!

There are various Regex dialects in the wild; note, for example, the 'flavor' side-panel at the regex101 site.

Choosing the JavaScript option would be the best match for Regexes used in 'Execute a JavaScript for Automation'.

The Regex standard used by KM actions is defined at: http://userguide.icu-project.org/strings/regexp

Thank you for your reply!

I am using the "Get Substring of Variable" action in KM and therefore chose PCRE on regex101, based on this Wiki page for Regex in KM:

"Keyboard Maestro uses ICU Regular Expressions (aka RegEx or RegExp) which is very similar to PCRE (Perl Compatible Regular Expressions), and you can read their documentation by choosing ICU Regular Expression Reference from the Help menu in Keyboard Maestro."

I am now going to try the JavaScript option at regex101.

As you know, Emoji are tricky multi-byte Unicode things - you may need to search for some StackOverflow discussion of matching them, and perhaps use their code points rather than glyphs in your regular expressions.
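
To illustrate the "codes rather than glyphs" idea outside KM, here is a minimal Python sketch. It assumes the emoji in the file name is U+1F3C8 (the forum shows it as :football:), and the variable names are made up for the example. It only shows that one emoji is a single code point spread over several code units, and that a pattern can refer to it by its escape.

```python
import re

# Assumption: the emoji in the file name is U+1F3C8 (AMERICAN FOOTBALL).
FOOTBALL = "\U0001F3C8"

# Sample name with plain hyphens; the emoji sits just before the extension.
name = "2018-01-07-Name of_file-" + FOOTBALL + ".jpg"

# One code point, but several code units in the common encodings.
utf8_bytes = len(FOOTBALL.encode("utf-8"))
utf16_units = len(FOOTBALL.encode("utf-16-le")) // 2

# Refer to the emoji by its code point escape instead of pasting the glyph.
cleaned = re.sub(r"\U0001F3C8", "", name)

print(utf8_bytes, utf16_units, cleaned)
```

Listing individual emoji this way does not scale, so it is mostly useful for understanding why the glyph falls outside a class like `\p{L}`.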

So it seems that the KM macro only returns the first match, even though the wiki page says:

All searches are global in Keyboard Maestro.

In my application, the portion after any special character is missed. I simplified the regex.
The .jpg that follows the :football: is the second match, but it is missed.

Btw, I am a newbie :slight_smile:

To test regex - remove special characters.kmmacros (2.6 KB)

I meant to attach the visible macro. Not sure how others have done so in other messages, but will now add a screenshot of it.

Matches are indeed global, and you can work through them with a For Each action.

See for example:

football match.kmmacros (2.9 KB)
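
For readers without the macro, the same "first match versus all matches" behaviour can be sketched in Python. This is only an approximation of what the KM actions do: Python's built-in `re` has no `\p{L}`, so `\w` (letters, digits and underscore) stands in for `\p{L}`, `_` and `0-9`. The sample name uses plain hyphens, and the emoji is assumed to be U+1F3C8.

```python
import re

# Approximation: \w replaces \p{L}, "_" and 0-9, since Python's re lacks \p{L}.
ALLOWED_RUN = re.compile(r"[\w\s.\-]+")

# Assumption: the :football: emoji is U+1F3C8.
name = "2018-01-07-15-09-53-Name of_file-\U0001F3C8.jpg"

# A single search stops at the first run of allowed characters.
first = ALLOWED_RUN.search(name).group(0)

# Looping over every match (the For Each idea) and joining them
# recovers the part after the emoji as well.
all_runs = ALLOWED_RUN.findall(name)
rebuilt = "".join(all_runs)

print(first)
print(all_runs)
print(rebuilt)
```

The emoji is not matched by the class, so it splits the name into two runs, and only the first run comes back when a single match is requested.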

Thank you for creating the example for me.

I think I have it working now!

I think you might find it easier, and more effective, to search for any character that does NOT match your desired characters. This would be the RegEx:
[^\p{L}\s._0-9\-]

Then just do a simple Search and Replace, which will replace ALL matches.
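
As a rough illustration outside KM, the negated-class approach looks like this in Python. It is a sketch under assumptions: Python's built-in `re` does not support `\p{L}`, so `\w` is used as an approximation (it also admits non-ASCII digits), and `clean_name` is a hypothetical helper name.

```python
import re

# Anything that is NOT a word character, whitespace, dot or hyphen.
# Approximation of [^\p{L}\s._0-9\-] for Python's re module.
UNWANTED = re.compile(r"[^\w\s.\-]")


def clean_name(name: str) -> str:
    """Remove every unwanted character in one pass (hypothetical helper)."""
    return UNWANTED.sub("", name)


# Assumption: the :football: emoji is U+1F3C8.
print(clean_name("2018-01-07-15-09-53-Name of_file-\U0001F3C8.jpg"))
print(clean_name("a:b*c?.txt"))
```

Because the replacement applies to every match, no loop or concatenation step is needed.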
Here's an example macro:

Example Results

## Macro Library: Remove Unwanted Characters from String using RegEx


####DOWNLOAD:
<a class="attachment" href="/uploads/default/original/3X/0/a/0a31c5c468ac932fced90545b6d68f2907abee3a.kmmacros">Remove Unwanted Characters from String using RegEx.kmmacros</a> (2.9 KB)
**Note: This Macro was uploaded in a DISABLED state. You must enable before it can be triggered.**

---



<img src="/uploads/default/original/3X/5/0/508e0ff84417148a499d3892b266d5fd18879ebb.png" width="560" height="719">

Thanks @JMichaelTX!!

This one works well also - and avoids me having to do the loop and concatenate etc.

Oddly enough, this one works on https://regex101.com when I choose PCRE (PHP), but not with JavaScript.

I'm not sure I understand your question. PCRE is the correct option to choose to be most compatible with KM RegEx.

Got it. Will use PCRE. Thank you!