@ccstone I'm intrigued that the look-behind assertion in
gets around the situation described by Peter:
because (?<=\S) doesn't really match anything. But I guess it does. So I was playing around with other ways to avoid the empty-string pattern. Here's a solution with a Find+Replace keystroke count of 17.
Find:
\A\s*|(.)\s*\z
Replace:
\1•
Haven't tried with Billie Holiday, Gene Krupa, or Duke Ellington, but works on your example. Also works with no trailing white space.
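For anyone who wants to try this outside Keyboard Maestro, here is a rough JavaScript sketch of the same Find/Replace. It rests on some assumptions: JavaScript has no \A or \z, so ^ and $ without the m flag stand in for them, "$1" is the JS spelling of \1, and the name bulletEnds is made up for the illustration. The JS engine is not the one KM uses, so treat the result as indicative only.

```javascript
// Hypothetical helper, not part of Keyboard Maestro.
// ^ and $ (no "m" flag) approximate \A and \z; "$1" plays the role of \1.
// When the first alternative matches, group 1 is unset and "$1" is empty.
const bulletEnds = s => s.replace(/^\s*|(.)\s*$/g, "$1•");

console.log(bulletEnds("  \n Louis Armstrong \n\n")); // "•Louis Armstrong•"
console.log(bulletEnds("Louis Armstrong"));           // "•Louis Armstrong•"
```

Because the second alternative has to consume one real character, it cannot match the empty string at the very end, which seems to be what keeps the extra bullet away.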
That's much more efficient than capturing the whole body text, and from my (albeit limited) understanding of regex engines more efficient than the lookbehind.
I think I'd change the captured metacharacter just to make it read easier (for me).
And in a real-world case, wouldn't you use Search/Replace to simply trim the whitespace from both ends (replace \A\s*|\s*\z with null) and then "manually" add the bullets to the ends of the resulting string? Seems that combining the two modifications into a single Search/Replace operation is great intellectual exercise but perhaps not great procedural design.
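A minimal JavaScript sketch of that two-step idea, assuming ^ and $ (no m flag) as stand-ins for \A and \z; trimEnds and demarcate are hypothetical names, not anything KM provides:

```javascript
// Step 1: trim only. Replacing with the empty string makes any extra
// empty match at the end of the text harmless.
const trimEnds = s => s.replace(/^\s*|\s*$/g, "");

// Step 2: add the visual markers as a separate, non-regex step.
const demarcate = s => "•" + trimEnds(s) + "•";

console.log(demarcate("  \n some body text \n\n")); // "•some body text•"
```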
@ccstone But (not trying to be difficult here), if what's being tested is uniformity of whitespace truncation in various implementations of Search/Replace (and what an interesting challenge!), all the more reason to use a separate mechanism (other than Search/Replace) to visually demarcate the result. No?
Actually, I wrote the regex I posted because search and replace seemed to directly address the specific problem: Search for a possible bit of leading white space, followed by ANYTHING (which can include internal whitespace), followed by a possible bit of trailing white space, then replace all of that with everything except the leading and trailing white space. In other words, \A\s*([\s\S]*?)\s*\z (by virtue of its capture group) means "choose absolutely everything, then discard only white space at either the very beginning or very end of the document, if any exists, and keep everything else".
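As a rough check of that reading, here is the same pattern in JavaScript, with ^ and $ (no m flag) assumed as substitutes for \A and \z and "$1" in place of \1. The function name keepBody is invented for the example.

```javascript
// One match spans the whole text; only the captured middle is kept.
// [\s\S]*? is lazy, so it stops as soon as the rest is pure white space.
const keepBody = s => s.replace(/^\s*([\s\S]*?)\s*$/, "$1");

console.log(JSON.stringify(keepBody("  a b \n c  \n"))); // "a b \n c"
```

Since the whole document is consumed by a single match, there is no second match left over at the end of the text.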
In case it is not clear to all, especially @peternlewis, the real issue is that the KM RegEx Replace Action ALWAYS does a GLOBAL search and replace. Every other RegEx tool I have ever used allows for Search/Replace of ONLY the first match.
Request
@peternlewis, please provide an option in the Search/Replace Action to apply it to ONLY the first match.
Another point well taken, @JMichaelTX, which I hadn't fully grokked until your comment. But in the context of @ccstone's original Search pattern \A\s*|\s*\z, I don't think your request would help. The pattern is of the form "X|Y", and it would only work as intended if the Y half were matched after the X half succeeded. That is, the pattern presumed a global search/replace. So in this particular case, the problem isn't that a global search/replace was being done, but that the global search for Y matches in 2 locations rather than 1, where there's some debate as to whether that second location is legit.
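To make the "2 locations" concrete, here is a small JavaScript sketch that lists every match of the pattern. It assumes ^ and $ (no m flag) in place of \A and \z, and listMatches is a made-up helper. JavaScript is a different engine from KM's, though it happens to show the same effect.

```javascript
// Hypothetical helper: every match as [index, matchedText].
const listMatches = (re, s) =>
    [...s.matchAll(re)].map(m => [m.index, m[0]]);

console.log(JSON.stringify(listMatches(/^\s*|\s*$/g, "  abc  ")));
// [[0,"  "],[5,"  "],[7,""]]

console.log("  abc  ".replace(/^\s*|\s*$/g, "•"));
// "•abc••"
```

The Y half matches once on the trailing white space and once more on the empty string at the very end, which is where the extra bullet comes from.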
I agree with your reply, @SLWorona. More generically, I was just pointing out that @JMichaelTX's complaint about KM's flavor of regex, while valid, is not really what we have been talking about in this thread.
I just did exactly what I said, so it is not incorrect.
If you do a global search and replace, as I did, to transform words exactly as I wanted, I get the same bogus behaviour.
I don't care if it has other modes or other ways of operating; the fact is it behaves in exactly the same bogus way when doing a global search and replace on the document.
Clicking Find, and the Replace & Find over and over again works properly, but clicking Replace All shows exactly the same bogus handling of empty matches as seen in the original post.
You are totally missing the point. If you do NOT do a global search/replace, it works fine everywhere.
So, the point is that the KM Search/Replace needs to have a "First Match" option.
I am not missing your point - I am ignoring your point as irrelevant to the discussion.
In the OPs case and in my case, the goal is a global search & replace.
It does not work if you don't do a global search & replace because it would only replace the first line (or in the OPs case, the start of the text).
You are arguing that a bogus behaviour that causes multiple replacements doesn't happen if you don't do multiple replacements, which is obvious but not helpful since the purpose in both cases is to do a global search & replace.
Even if I add a switch to the Search & Replace action to do only a single replacement, it would not have resolved the OPs problem, nor would it have resolved my problem, so this discussion is in no way an argument in favour of adding such a switch.
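A quick JavaScript illustration of why a single replacement would not cover the OP's case, assuming ^ and $ (no m flag) in place of \A and \z; firstMatchOnly is a hypothetical name and JS is only standing in for the KM action:

```javascript
// No "g" flag: only the first match (the leading white space) is replaced.
const firstMatchOnly = s => s.replace(/^\s*|\s*$/, "•");

console.log(JSON.stringify(firstMatchOnly("  abc  "))); // "•abc  "
```

The trailing white space is left untouched, so the two-sided pattern depends on a global replace.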
The issue is the replacement at the END of the string.
When you have GLOBAL turned on, it fails.
When GLOBAL is OFF, it works.
I know this because Chris @ccstone and I discussed and tested this privately.
It is unfortunate that the use case he presented at the top does not make that clear.
So, rather than belabor this any further, I will post a new topic with a new use case that clearly illustrates the issue.
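In the meantime, here is one possible reading of the end-of-string issue as a small JavaScript sketch. It is not the case tested privately, only an assumed stand-in: $ (no m flag) substitutes for \z, the function names are invented, and JS is not KM's engine.

```javascript
// Same pattern, with and without the global flag.
const replaceFirst = s => s.replace(/\s*$/, "•");  // first match only
const replaceAll = s => s.replace(/\s*$/g, "•");   // every match

console.log(replaceFirst("abc  ")); // "abc•"
console.log(replaceAll("abc  "));   // "abc••"
```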
BTW, Peter: You seem to be using the RegEx engine provided by AppKit ("TRE"). I used that in Find Any File initially, but ran into two issues:
It tends to crash, especially on binary data.
It does not support many advanced regex constructs, such as "(?!^ABC$)".
I resolved all of this by using PCRE2 instead. I had to build it and include it as a lib, but since then I have not had a single regex-related crash.
Well, I think there is a role for a regular expression here, but perhaps we only need a simple one?
[\r\n]+
JS Source:
(() => {
    "use strict";

    const
        txt = Application("Keyboard Maestro Engine")
        .getvariable("testDataStr");

    const main = () =>
        unlines(
            lines(txt).flatMap(x => {
                const trimmed = x.trim();

                return trimmed ? (
                    [trimmed]
                ) : [];
            })
        );

    // --------------------- GENERIC ---------------------

    // lines :: String -> [String]
    const lines = s =>
        // A list of strings derived from a single
        // string delimited by newline and or CR.
        0 < s.length ? (
            s.split(/[\r\n]+/u)
        ) : [];

    // unlines :: [String] -> String
    const unlines = xs =>
        // A single string formed by the intercalation
        // of a list of strings with the newline character.
        xs.join("\n");

    return main();
})();