However, because it doesn't grab just the line, but the line + the newline character, the replacement will not be applied to the last line of the text. A problem which I wouldn't have if ^ and $ were working.
I've also seen examples here on the forum where ^ and $ are being used in KBM regex strings. So I wonder what could be the cause of them not working for me.
I'm on KBM 8.0.3 and Mac OS Sierra 10.12.6 (16G29).
I believe KM tends to default to single-line search, whereas TextMate (I'm guessing) defaults to multi-line, which would explain this behavior. Either way, explicitly searching with the multi-line flag (?m) in KM seems to resolve the issue:
To be clear, the default is not to "single line", but to the entire text string, which may have one or more lines. So, without the (?m) flag, ^ refers to the start of the string, and $ refers to the end of the string.
Control the behavior of "^" and "$" in a pattern. By default these will only match at the start and end, respectively, of the input text. If this flag is set, "^" and "$" will also match at the start and end of each line within the input text.
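For anyone who wants to see the difference outside KM, here is a minimal sketch using Python's `re` module as a stand-in. That is an assumption on my part: KM uses the ICU engine, and the two differ in some details, but the effect of (?m) on ^ and $ is the same for this case. The three-line sample text is made up.

```python
import re

# Hypothetical three-paragraph input; note there is no trailing newline.
text = "first line\nsecond line\nthird line"

# Without (?m): ^ and $ anchor to the whole string, and "." does not cross
# a newline, so the pattern cannot match a multi-line string at all.
no_flag = re.sub(r"^(.*)$", r"<p>\1</p>", text)

# With (?m): ^ and $ also match at the start and end of every line,
# including the last line, which has no newline after it.
multiline = re.sub(r"(?m)^(.*)$", r"<p>\1</p>", text)

print(no_flag == text)  # True: nothing was replaced
print(multiline)
```

Because the pattern captures only the line and not the newline, the last line gets wrapped like the others.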
Take a look at the first post where the variable contains multiple paragraphs. Shouldn’t the replacement have <p> as the first characters and </p> as the very last ones?
That would handle the entire string of the input text. But it doesn’t (just tested it for myself). The only way it handles the entire string is when you cut it back to a single paragraph (just one line).
I’m referring to paragraphs to indicate newlines and avoid confusion with any soft wrap in the variable definition.
I would have expected the entire string to have been surrounded by the paragraph tag in the first example if the entire string is processed.
I'm not clear on your test case. Please post the entire test case, including source string, RegEx you used, and KM Action that you used. Probably that's your test macro.
If set, a "." in a pattern will match a line terminator in the input text. By default, it will not. Note that a carriage-return / line-feed pair in text behave as a single line terminator, and will match a single "." in a RE pattern.
Line terminators are \u000a, \u000b, \u000c, \u000d, \u0085, \u2028, \u2029 and the sequence \u000d \u000a.
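To make that list concrete, here is a rough sketch that splits text on exactly those terminators. It uses Python's `re` module, which is an assumption for illustration only: Python's own "." treats only \n as a line terminator, so the list is spelled out by hand here rather than relying on the engine. The `split_lines` helper is hypothetical.

```python
import re

# The terminators from the list above. "\r\n" is tried first so that the
# pair is consumed as a single terminator rather than as two.
LINE_TERMINATOR = re.compile("\r\n|[\n\x0b\x0c\r\x85\u2028\u2029]")

def split_lines(text):
    """Split text on any of the listed line terminators (hypothetical helper)."""
    return LINE_TERMINATOR.split(text)

sample = "one\r\ntwo\u2028three\rfour\nfive"
print(split_lines(sample))
```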
No, without the flag (?m), the ^ and $ still refer to the entire string, the beginning and end, respectively. What happens (matches) in between will determine whether or not the RegEx has made a match.
For example, the Regex ^Some Text at the beginning$ would also fail if the text "Some Text at the beginning" was NOT actually at the beginning of the source string. In fact, it would have to be the entire string to match.
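A quick way to check this claim, sketched in Python's `re` module (an assumption, since KM uses ICU; the anchoring behavior shown is the same for these inputs). The sample strings and the `matches` helper are made up for illustration.

```python
import re

pattern = r"^Some Text at the beginning$"

exact = "Some Text at the beginning"
later_line = "Intro\nSome Text at the beginning"
with_more = "Some Text at the beginning\nand more"

def matches(regex, text):
    """Return True if regex matches anywhere in text (hypothetical helper)."""
    return re.search(regex, text) is not None

# Without (?m), only the string that consists of exactly that text matches.
print(matches(pattern, exact), matches(pattern, later_line), matches(pattern, with_more))

# With (?m), the text only has to occupy one whole line.
print(matches("(?m)" + pattern, later_line), matches("(?m)" + pattern, with_more))
```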
The matches for ^(.*)$ depend on the s and m Flag Options, both of which are off by default.
The s (DOTALL) flag: If set, a "." in a pattern will match a line terminator in the input text. By default, it will not. Note that a carriage-return / line-feed pair in text behave as a single line terminator, and will match a single "." in a RE pattern.
The m (MULTILINE) flag: Control the behavior of "^" and "$" in a pattern. By default these will only match at the start and end, respectively, of the input text. If this flag is set, "^" and "$" will also match at the start and end of each line within the input text.
So a search and replace on a variable, searching for ^(.*)$ and replacing with "xyz", results in:
^(.*)$ - fails to match. ^ matches at the start of Line 1, $ matches at the end of Line 3, . does not match end of line characters.
(?m)^(.*)$ - returns "xyz%Return%xyz%Return%xyz%Return%". ^ matches at the start of each line, $ matches at the end of each line, . does not match end of line characters.
(?s)^(.*)$ - returns "xyz". ^ matches at the start of Line 1, $ matches at the end of Line 3, . matches everything.
(?sm)^(.*)$ - returns "xyz". ^ matches at the start of Line 1, $ matches at end of each line, . matches everything. Because .* is greedy, it will match until the end of the string.
(?sm)^(.*?)$ - returns "xyz%Return%xyz%Return%xyz%Return%". ^ matches at the start of each line, $ matches at the end of each line, . matches everything. Since .*? is not greedy, it will match only until the end of the first line, where $ matches.
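The flag combinations above can be tried side by side with a small script. This is only a sketch in Python's `re` module, not KM's ICU engine. One assumption to flag loudly: the sample text here has no trailing newline, because Python also matches ^ at the very end of a string that ends in a newline, which would add an extra "xyz" that ICU does not produce.

```python
import re

# Hypothetical three-line input with "\n" standing in for %Return%.
# No trailing newline, to sidestep a Python-specific difference from ICU.
text = "Line 1\nLine 2\nLine 3"

patterns = [
    r"^(.*)$",        # no flags: cannot match a multi-line string
    r"(?m)^(.*)$",    # multiline: one match per line
    r"(?s)^(.*)$",    # dot-all: one match covering the whole string
    r"(?sm)^(.*)$",   # both, greedy: still one match covering everything
    r"(?sm)^(.*?)$",  # both, lazy: stops at the first $ it reaches
]

# Map each pattern to the result of replacing every match with "xyz".
results = {p: re.sub(p, "xyz", text) for p in patterns}

for p, out in results.items():
    print(f"{p:16} -> {out!r}")
```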
Not to beat this to death but the issue that bothered me was the implication that the entire string would be handled by the delimiters, which I felt was confusing at best. And required qualification.
I appreciate, having written them since 1976, that regexps are tricky little things.
Keyboard Maestro’s syntax of putting options before the regexp is a new wrinkle for me anyway. So I appreciate the clarification of the syntax for flag options at least. But I wonder if it wouldn’t be better (one day) to be explicit about the options with, oh something like checkboxes.
Sometimes, the tool or language does not provide the ability to specify matching options. The handy . . . Or, the regex flavor may support matching modes that aren't exposed as external flags.
In those situations, you can add the following mode modifiers to the start of the regex.
If you insert the modifier (?ism) in the middle of the regex then the modifier only applies to the part of the regex to the right of the modifier.
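A note for anyone testing this in Python rather than KM: recent Python versions reject a bare (?i) in the middle of a pattern, so the sketch below swaps in the scoped-group form (?i:...), available in Python 3.6 and later. That is a different syntax from the one described above, used here only to show the idea of a flag applying to part of a regex. The sample strings are made up.

```python
import re

# Only the "def" part is case-insensitive; "abc" must match exactly.
pattern = r"abc(?i:def)"

tail_varies = re.fullmatch(pattern, "abcDEF") is not None
head_varies = re.fullmatch(pattern, "ABCdef") is not None

print(tail_varies)  # the flagged part may differ in case
print(head_varies)  # the unflagged part may not
```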
The search modifier “Pattern to Use” shown below is placed at the very beginning of the Search/Find Regular Expression box.
For example: (?m)^\s*\d+[\t]+
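Here is a rough sketch of what that pattern does, using Python's `re` module as a stand-in for KM's engine. The use case (stripping leading numbers and tabs from each line) and the sample text are my assumptions for illustration.

```python
import re

# Hypothetical input: lines that start with a number followed by tab(s).
text = "1\tfoo\n  2\t\tbar\n3\tbaz"

pattern_body = r"^\s*\d+[\t]+"

# With the (?m) prefix, ^ matches at the start of every line.
with_flag = re.sub("(?m)" + pattern_body, "", text)

# Without it, ^ matches only at the start of the whole string.
without_flag = re.sub(pattern_body, "", text)

print(repr(with_flag))
print(repr(without_flag))
```

One thing to watch: \s also matches line breaks, so \s* after ^ can run across blank lines.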
I think KM's method of handling flags was one of the first things I learned about using RegEx with KM, since many of the RegEx I need/use require either or both (?mi) (multiline and case insensitive).
IAC, it is hopefully clear to all now how to use RegEx flags with KM.
By “delimiters” I mean metacharacters that delimit the actual text, which is what ^ and $ do. Sorry if I wasn’t clear.
Whether some particular regexp syntax is standard or documented or peculiar to a particular implementation isn’t what I was getting at. Sorry if I wasn’t clear about that either.
Keyboard Maestro makes an attempt with its graphical actions to make it easy for someone with a problem to craft a solution without years of experience of deep dives in documentation.
In fact, you can see this with the regexp popup menu that doesn’t require you to know about the case flag (i) because there is a “case sensitive” and “ignoring case” option.
But, as we’ve seen in this thread, there are other flags (like multiline) that can frustrate Keyboard Maestro users. Even the default of a global substitution has confused people here.
I’m not arguing against modes inside the expressions (although give me a moment) but suggesting it might be worth thinking about more explicit visual controls.
Something like checkboxes for options such as global substitution, ignoring case, multiline, etc., perhaps in the gear menu (although that’s a little hidden away).
I think that addition to the user interface would help people build regexps in Keyboard Maestro with less frustration.
That’s what the discussion on this thread suggested to me. A checkbox for multiline would have made the option obviously desirable and avoided the confusion of not knowing the default behavior.
After having thought about your suggestion for a bit, I have to agree.
So, instead of this:
We would have this:
with a popup menu something like the one from RegEx101.com:
Of course the above is just a functional mock-up, not a finished UI, but I hope it illustrates the point.
This would also make it more directly comparable with the screen/UI at RegEx101.com, a great place to test and develop RegEx. I think many other RegEx apps use the syntax of /<RegEx here>/<flags here>
So, what do you think @peternlewis, is this a reasonable, doable request?
I like that the field gives a quick synopsis of what’s been set and that the popup gives a fuller explanation of the options, which would really help a lot of people. And the combination makes explicit what if anything (like global) the defaults are.
The problem with this is that regex tests are used all over Keyboard Maestro (probably a hundred different places). It would be a huge amount of extra UI clutter to include flags everywhere you can use regex, and it would be equally confusing to have the regex flags somewhere and not others.
Some of these are probably seldom-used things, like some of the conditions.
If you just did the main Actions to start with, and then do the others as you had time, I think that would still be helpful.
Sorry, Peter, but I don't buy the clutter claim.
There's no real clutter difference (to my eye) between these two:
To some degree, yes. But you already have one big difference:
Some show choices for case sensitivity, others don't.
I'm hoping with some clever ObjC class design you could sub-class for the various differences with just minor changes. But I'm obviously just guessing, since I don't have any insight into the KM design/code.
Just my 2¢. Clearly this is NOT an urgent issue/request. We've lived with KM like it is for several years, and can continue to do so for some time until you have time to make such a change.
Just had a thought: I bet collectively we (your users) could put our heads together and come up with a KM macro that lets the user build that RegEx Flag Options. How 'bout it guys, can we do it?
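To get the ball rolling, the core logic is tiny. Here is a minimal sketch in Python (not a KM macro, and every name in it is hypothetical): it turns a set of on/off choices into the (?ism)-style prefix and puts it in front of a regex. A real macro would presumably gather the choices with a prompt and do the same string building.

```python
def flag_prefix(ignore_case=False, dot_matches_newline=False, multiline=False):
    """Build a prefix such as "(?im)" from on/off choices (hypothetical helper)."""
    letters = ""
    if ignore_case:
        letters += "i"
    if dot_matches_newline:
        letters += "s"
    if multiline:
        letters += "m"
    # No flags selected means no prefix at all.
    return f"(?{letters})" if letters else ""

def with_flags(regex, **options):
    """Prepend the flag prefix to a regex (hypothetical helper)."""
    return flag_prefix(**options) + regex

print(with_flags(r"^\s*\d+[\t]+", multiline=True))
print(flag_prefix(ignore_case=True, dot_matches_newline=True, multiline=True))
```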