RegEx Beginning/End-of-Line Anchors Not Working?

I'm not clear on your test case. Please post the entire test case, including source string, RegEx you used, and KM Action that you used. Probably that's your test macro. :wink:

OK, but as I said, it's identical to the first post. I just changed the value of the variable because my Latin is rusty. Simulator, as we used to say.

With this macro the text is not wrapped in tags as it should be if the "entire string of text" is handled by ^(.*)$.

I won't repeat my earlier message, but the point is no substitution takes place.

Keyboard Maestro 8.0.3 “KM Regex Test” Macro

KM Regex Test.kmmacros (3.7 KB)

The problem is with your RegEx. By default the dot character . does NOT match end of line characters, so your RegEx is NOT matched.

You need this RegEx:
(?s)^(.*)$

The (?s) flag lets the dot match end-of-line characters as well.

With this, the match is made, and results are as expected:

See Regular Expressions - ICU User Guide

If set, a "." in a pattern will match a line terminator in the input text. By default, it will not. Note that a carriage-return / line-feed pair in text behave as a single line terminator, and will match a single "." in a RE pattern.
Line terminators are \u000a, \u000b, \u000c, \u000d, \u0085, \u2028, \u2029 and the sequence \u000d \u000a.
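
For anyone who wants to see the effect outside Keyboard Maestro, here is a minimal sketch using Python's `re` module rather than KM's ICU engine. The sample text and the `<b>` tags are made up for illustration, and Python's dot only excludes `\n` rather than ICU's full list of line terminators, but the `(?s)` flag behaves the same way in both:

```python
import re

# Hypothetical multi-line source text (not the text from the macro above).
text = "Line 1\nLine 2\nLine 3"

# Without (?s) the dot stops at the first newline, so ^(.*)$ cannot span
# the whole string and no substitution takes place.
plain = re.sub(r"^(.*)$", r"<b>\1</b>", text)

# With (?s) the dot also matches newlines, so the whole string is captured.
dotall = re.sub(r"(?s)^(.*)$", r"<b>\1</b>", text)

print(repr(plain))   # unchanged text
print(repr(dotall))  # whole text wrapped in the tags
```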

Yep. So the “entire string of text” is not actually handled by ^(.*)$. The “.” prevents that.

No, without the flag (?m), the ^ and $ still refer to the entire string, the beginning and end, respectively. What happens (matches) in between will determine whether or not the RegEx has made a match.

For example, the Regex ^Some Text at the beginning$ would also fail if the text "Some Text at the beginning" was NOT actually at the beginning of the source string. In fact, it would have to be the entire string to match.
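
A rough illustration of that point, again in Python's `re` as a stand-in for KM's engine (the two sample strings are invented for the example):

```python
import re

pattern = r"^Some Text at the beginning$"

# Hypothetical inputs: one is exactly the text, the other embeds it on line 2.
whole = "Some Text at the beginning"
embedded = "Intro line\nSome Text at the beginning\nAnother line"

# Without (?m), ^ and $ anchor to the whole string, so only `whole` matches.
matches_whole = re.search(pattern, whole) is not None
matches_embedded = re.search(pattern, embedded) is not None

# With (?m), ^ and $ also anchor to each line, so the embedded line matches.
matches_embedded_multiline = re.search("(?m)" + pattern, embedded) is not None

print(matches_whole, matches_embedded, matches_embedded_multiline)
```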

I have found a great way to test and learn RegEx is to use https://regex101.com/

Given input:

Line 1
Line 2
Line 3

The matches for ^(.*)$ depend on the s and m Flag Options, both of which are off by default.

  • The s (DOTALL) flag: If set, a "." in a pattern will match a line terminator in the input text. By default, it will not. Note that a carriage-return / line-feed pair in text behave as a single line terminator, and will match a single "." in a RE pattern.
  • The m (MULTILINE) flag: Control the behavior of "^" and "$" in a pattern. By default these will only match at the start and end, respectively, of the input text. If this flag is set, "^" and "$" will also match at the start and end of each line within the input text.

So Search and replace for a variable of ^(.*)$ and replace with "xyz" results in:

  • ^(.*)$ - fails to match. ^ matches at the start of Line 1, $ matches at the end of Line 3, . does not match end of line characters.
  • (?m)^(.*)$ - returns "xyz%Return%xyz%Return%xyz%Return%". ^ matches at the start of each line, $ matches at the end of each line, . does not match end of line characters.
  • (?s)^(.*)$ - returns "xyz". ^ matches at the start of Line 1, $ matches at the end of Line 3, . matches everything.
  • (?sm)^(.*)$ - returns "xyz". ^ matches at the start of Line 1, $ matches at the end of each line, . matches everything. Because .* is greedy, it will match until the end of the string.
  • (?sm)^(.*?)$ - returns "xyz%Return%xyz%Return%xyz%Return%". ^ matches at the start of each line, $ matches at the end of each line, . matches everything. Since .*? is not greedy, each match stops at the end of the current line, where $ first matches.
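
The same comparison can be run as a quick sketch in Python's `re` module. This is an assumption-laden stand-in for KM's ICU engine: the input here has no trailing return, newlines print as `\n` rather than %Return%, and the lazy variant is run with both s and m set, as its description says the dot matches everything.

```python
import re

# Three lines, no trailing newline (an assumption for this sketch).
text = "Line 1\nLine 2\nLine 3"

patterns = [
    r"^(.*)$",        # no flags: no match, text is left unchanged
    r"(?m)^(.*)$",    # multiline: one match per line
    r"(?s)^(.*)$",    # dotall: one match spanning everything
    r"(?sm)^(.*)$",   # both, greedy: still one match spanning everything
    r"(?sm)^(.*?)$",  # both, lazy: one match per line again
]

# Replace every match with "xyz", as in the search-and-replace above.
results = {p: re.sub(p, "xyz", text) for p in patterns}

for p, out in results.items():
    print(f"{p:16} -> {out!r}")
```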

Keyboard Maestro Actions.kmactions (1.6 KB)


Peter, thanks for confirming my posts.

Not to beat this to death but the issue that bothered me was the implication that the entire string would be handled by the delimiters, which I felt was confusing at best. And required qualification.

I appreciate, having written them since 1976, that regexps are tricky little things.

Keyboard Maestro’s syntax of putting options before the regexp is a new wrinkle for me anyway. So I appreciate the clarification of the syntax for flag options at least. But I wonder if it wouldn’t be better (one day) to be explicit about the options with, oh something like checkboxes.

OK, I’ve beat it to death. Sorry.

Not sure what you mean by "delimiters". The scope of ^, $, and . is standard, and long-standing across all RegEx engines I have seen or used.

Again, this is standard. From Specifying Modes Inside The Regular Expression

Sometimes, the tool or language does not provide the ability to specify matching options. The handy . . . Or, the regex flavor may support matching modes that aren't exposed as external flags.

In those situations, you can add the following mode modifiers to the start of the regex.

If you insert the modifier (?ism) in the middle of the regex then the modifier only applies to the part of the regex to the right of the modifier.

This is also clearly described in the article
Regular Expressions (KM Wiki)

Search Modifiers

The ICU calls these modifiers “flag options”.

The search modifier “Pattern to Use” shown below is placed at the very beginning of the Search/Find Regular Expression box.
For example:
(?m)^\s*\d+[\t]+
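
As a hedged illustration of what that leading (?m) changes, here is the same pattern run through Python's `re` (not ICU) against a made-up tab-separated list. Only the sample text is invented; the pattern is the one from the wiki example.

```python
import re

# Hypothetical numbered list: digits, then one or more tabs, then an item.
text = "1\tapple\n  22\t\tbanana\ncherry"

pattern = r"^\s*\d+[\t]+"

# Without the modifier, ^ only anchors at the very start of the text.
without_flag = re.findall(pattern, text)

# With (?m) placed at the very beginning, ^ anchors at every line start.
with_flag = re.findall("(?m)" + pattern, text)

print(without_flag)  # only the prefix of the first line
print(with_flag)     # the prefix of every numbered line
```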

I think KM's method of handling flags was one of the first things I learned about using RegEx with KM, since many of the RegEx I need/use require either or both (?mi) (multiline and case insensitive).

IAC, it is hopefully clear to all now how to use RegEx flags with KM. :smile:

You can also use the (?s:xxx) method for options, so something like:

(?m:^)((?s:.)*)(?m:$)

The flag applies only to the parts within the (non-capturing) brackets.

This allows for explicit control, and is also useful with the i (case-insensitive) flag.
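
A small sketch of the scoped form, using Python's `re` (which accepts the same `(?flags:...)` syntax from version 3.6 on) as a stand-in for KM's engine. The sample strings and the `(?i:hello)` pattern are assumptions made up for the example:

```python
import re

text = "Line 1\nLine 2\nLine 3"

# Each flag applies only inside its own non-capturing group.
scoped = r"(?m:^)((?s:.)*)(?m:$)"
captured = re.search(scoped, text).group(1)

# Scoped case-insensitivity: only "hello" ignores case, " World" does not.
case_scoped = r"(?i:hello) World"
hit = re.search(case_scoped, "HELLO World") is not None
miss = re.search(case_scoped, "hello WORLD") is not None

print(repr(captured))  # the whole text, because the scoped dot is greedy
print(hit, miss)
```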

By “delimiters” I mean metacharacters that delimit the actual text, which is what ^ and $ do. Sorry if I wasn’t clear.

Whether some particular regexp syntax is standard or documented or peculiar to a particular implementation isn’t what I was getting at. Sorry if I wasn’t clear about that either.

Keyboard Maestro makes an attempt with its graphical actions to make it easy for someone with a problem to craft a solution without years of experience of deep dives in documentation.

In fact, you can see this with the regexp popup menu that doesn’t require you to know about the case flag (i) because there is a “case sensitive” and “ignoring case” option.

But, as we’ve seen in this thread, there are other flags (like multiline) that can frustrate Keyboard Maestro users. Even the default of a global substitution has confused people here.

I’m not arguing against modes inside the expressions (although give me a moment) but suggesting it might be worth thinking about more explicit visual controls.

Like checkboxes for options like global substitution, ignoring case, multiline, etc. perhaps in the gear menu (although that’s a little hidden away).

I think that addition to the user interface would help people build regexps in Keyboard Maestro with less frustration.

That’s what the discussion on this thread suggested to me. A checkbox for multiline would have made the option obviously desirable and avoided the confusion of not knowing the default behavior.


After having thought about your suggestion for a bit, I have to agree.

So, instead of this:

We would have this:

with a popup menu something like the one from RegEx101.com:

Of course the above is just a functional mock-up, not a finished UI, but I hope it illustrates the point.

This would also make it more directly comparable with the screen/UI at RegEx101.com, a great place to test and develop RegEx. I think many other RegEx apps use the syntax of /<RegEx here>/<flags here>
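
For anyone pasting a RegEx101-style expression into KM, the translation from the slash syntax to a leading flag group could be sketched like this in Python. The helper name `slash_to_inline` is hypothetical, and dropping `g` is an assumption based on global matching having no inline equivalent:

```python
def slash_to_inline(expr: str) -> str:
    """Turn '/pattern/flags' into '(?flags)pattern' (hypothetical helper)."""
    if not expr.startswith("/"):
        raise ValueError("expected /pattern/flags")
    # Split on the LAST slash so slashes inside the pattern survive.
    body, sep, flags = expr[1:].rpartition("/")
    if not sep:
        raise ValueError("missing closing slash")
    # Keep only flags that have an inline form; 'g' (global) is dropped.
    inline = "".join(f for f in flags if f in "imsx")
    return f"(?{inline}){body}" if inline else body


converted = slash_to_inline("/^(.*)$/gms")
print(converted)
```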

So, what do you think @peternlewis, is this a reasonable, doable request?


That would get my vote if I had one <g>.

I like that the field gives a quick synopsis of what’s been set and that the popup gives a fuller explanation of the options, which would really help a lot of people. And the combination makes explicit what if anything (like global) the defaults are.

The problem with this is that regex tests are used all over Keyboard Maestro (probably a hundred different places). It would be a huge amount of extra UI clutter to include flags everywhere you can use regex, and it would be equally confusing to have the regex flags somewhere and not others.

OK, granted, it is a lot, but a hundred ??
I'm not seeing near that many with this KM Wiki search:
Search for "regular expression" [Keyboard Maestro Wiki]

Some of these are probably seldom-used things, like some of the conditions.
If you just did the main Actions to start with, and then do the others as you had time, I think that would still be helpful.

Sorry, Peter, but I don't buy the clutter claim.
There's no real clutter difference (to my eye) between these two:

To some degree, yes. But you already have one big difference:
Some show choices for case sensitivity, others don't.

I'm hoping with some clever ObjC class design you could sub-class for the various differences with just minor changes. But I'm obviously just guessing, since I don't have any insight into the KM design/code.

Just my 2¢. Clearly this is NOT an urgent issue/request. We've lived with KM like it is for several years, and can continue to do so for some time until you have time to make such a change.

Just had a thought: I bet collectively we (your users) could put our heads together and come up with a KM macro that lets the user build those RegEx Flag Options. How 'bout it guys, can we do it?


I concede your point and wouldn’t want to restrict usage of regexps anywhere.

But maybe we could distinguish between the two kinds of action: those that expect a regexp get the popup, and those that merely accept one don’t (remaining the same as now).

Power users won’t be frustrated and people learning regexp will be helped where it’s most needed.

Interesting idea! And one that seems quite doable. Here's a quick proof-of-concept to get us started:

Regex Flags.kmmacros (5.3 KB)

As far as I'm aware there's no way to turn the global flag off, and the x whitespace/comment flag didn't seem particularly useful given the way KM handles regex (though that could also just be because I've yet to find a use for it), so these should (hopefully!) be enough to cover the majority of KM user needs (especially since, again, this is just a proof of concept to get us started).


Great start, @gglick! :+1:

Just to test/reinforce my memory about showing one value, but storing a different value, in the Prompt for User Input, I took a different approach. For example:

  • Variable: Local__Multi_Line
  • Default Value: __No|m__Yes (?m)

The first value, before the "__", is the value to be stored.
The value after the "__" is the one that is displayed to the user.
Just for clarity, the vertical bar "|" used above has nothing to do with this. It just provides the choices of the popup list that will be shown.

I didn't know if using nothing for the stored value would work, but it turns out it does. So the __No, for example, returns the empty string "" if the user selects "No".
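
To make the convention concrete, here is a rough Python model of how such a Default Value string breaks down. It only mimics the behavior described above; `parse_choices` is a hypothetical helper, not anything KM exposes, and treating an item without "__" as both stored and shown is an assumption.

```python
def parse_choices(default_value: str) -> list[tuple[str, str]]:
    """Split 'stored__shown|stored__shown' into (stored, shown) pairs."""
    choices = []
    for item in default_value.split("|"):  # "|" separates the popup choices
        stored, sep, shown = item.partition("__")
        if not sep:
            shown = stored  # assumption: no "__" means one value for both
        choices.append((stored, shown))
    return choices


choices = parse_choices("__No|m__Yes (?m)")
print(choices)  # "No" stores the empty string, "Yes (?m)" stores "m"
```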

Some RegEx programmers may object to use of my title of "Dot_Newline" instead of the traditional "Single_Line". But I think it is clearer to the novice user, because seeing both "Multi_Line" and "Single_Line" might seem like a contradiction to some.

Obviously, the user of this macro can change it to be whatever he/she prefers.

I also use all Local variables because I wanted to start with a clean sheet of NO options every time. If you prefer a different default, you can just switch the order of the Default Value in the Prompt. For example, change:
FROM: __No|m__Yes (?m)
TO: m__Yes (?m)|__No


Here's my DRAFT Macro:

### Example Output

<img src="/uploads/default/original/2X/c/c8c3ef8463c83ded88afe5d096d737c5f561f4fe.png" width="539" height="339">

#### With ALL Options Selected:
<img src="/uploads/default/original/2X/0/0e4e0aec23903e48794ee9ba0aea799e51b4f398.png" width="539" height="339">

**Result Pasted**
`(?msi)`
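
The assembly step is simple enough to sketch in a few lines of Python. This is not the macro itself, just an illustration under the assumption that each prompt variable holds either its flag letter or the empty string; the function and parameter names are hypothetical.

```python
def build_flag_prefix(multi_line: str = "", dot_newline: str = "",
                      ignore_case: str = "") -> str:
    """Join the stored flag letters into a '(?...)' prefix."""
    flags = multi_line + dot_newline + ignore_case
    # With no options selected there is nothing to paste.
    return f"(?{flags})" if flags else ""


all_selected = build_flag_prefix("m", "s", "i")
none_selected = build_flag_prefix()
print(repr(all_selected), repr(none_selected))
```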

### MACRO: Prompt User for RegEx Flag Options and Paste

~~~ VER: 0.1    2017-10-19 ~~~

#### DOWNLOAD:
<a class="attachment" href="/uploads/default/original/2X/b/b19289980ce3ac4beb7a58b486590c6fa4b9c51a.kmmacros">Prompt User for RegEx Flag Options and Paste.kmmacros</a> (9.9 KB)
**Notes: This Macro was uploaded:**
  1. In a DISABLED state. You must enable before it can be triggered.
  2. With a Typed String Trigger.

---


<img src="/uploads/default/original/2X/a/adcf53ba4ad11216d839adb47cd708ceeb39d744.png" width="531" height="1509">

### To Those New to RegEx

Please give the above macros by @gglick and myself a try, and give us some feedback on how they work for you, and what changes/improvements you'd like to see to make them user-friendly for you.

I like that Help option. Here's a checkbox version:

Keyboard Maestro 8.0.3 “Regexp Options” Macro

Regexp Options.kmmacros (11 KB)


Another good choice. I like your name of "Ignore Case" better than "Insensitive". I'm going to change my macro to use that.