Search and Replace in Variable — regex capture group $1 only returning first character

I am using "Search and Replace in Variable" with Regular Expression mode. My search pattern is:

(?m)^#([^@\n]+?) *@(\d+)

My replacement is:

[[$1]] (($2))

The variable contains text like:

#All @6
#Books b @39

Expected output:

[[All]] ((6))
[[Books b]] ((39))

Actual output:

[[A]] ((6))
[[B]] ((39))

$2 captures the number correctly. $1 only captures the first character. What am I missing?

This is in preparation for chunking text, which I am working on in another macro. I am preparing a Table of Contents to chunk books.

It may be that the only regular expression you need is \s+@

(Splitting text into substrings is the more natural home of regular expressions – the search and replace pattern strains them a bit, adding redundant complexity)
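To illustrate the splitting idea outside Keyboard Maestro, here is a minimal Python sketch. It assumes each line looks like the sample in the question (a leading #, a title, whitespace, then @ and a number). The function name split_and_bracket is made up for this example, and the attached macro may well do it differently.

```python
import re

def split_and_bracket(text: str) -> str:
    """Split each line on whitespace followed by '@' and re-wrap the two halves."""
    out = []
    for line in text.splitlines():
        # Split once on the run of whitespace that precedes '@'.
        parts = re.split(r"\s+@", line, maxsplit=1)
        if len(parts) != 2:
            out.append(line)  # leave lines without the delimiter untouched
            continue
        title, number = parts
        out.append(f"[[{title.removeprefix('#')}]] (({number}))")
    return "\n".join(out)

print(split_and_bracket("#All @6\n#Books b @39"))
```

The only regular expression involved is the delimiter; the brackets are added with ordinary string formatting.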


One approach would be:

Split and bracket.kmmacros (2.6 KB)

Check your regex carefully, and/or post the actual "Search and Replace" Action here. What you've written should work, but I suspect you've accidentally put a . before the *
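A quick way to check this suspicion is to run both variants over the sample text. The sketch below uses Python's re module as a stand-in, on the assumption that it treats these particular patterns the same way Keyboard Maestro's engine does; note that Python writes the replacement groups as \1 and \2 rather than $1 and $2. The names AS_POSTED, WITH_STRAY_DOT and apply are illustrative only.

```python
import re

SAMPLE = "#All @6\n#Books b @39"
REPLACEMENT = r"[[\1]] ((\2))"  # Python spelling of [[$1]] (($2))

AS_POSTED = r"(?m)^#([^@\n]+?) *@(\d+)"
WITH_STRAY_DOT = r"(?m)^#([^@\n]+?).*@(\d+)"  # a '.' typed in place of the space before '*'

def apply(pattern: str, text: str = SAMPLE) -> str:
    return re.sub(pattern, REPLACEMENT, text)

print(apply(AS_POSTED))       # [[All]] ((6)) / [[Books b]] ((39))
print(apply(WITH_STRAY_DOT))  # [[A]] ((6)) / [[B]] ((39))
```

With the stray dot, the lazy group is satisfied by a single character and .* swallows the rest of the title, which reproduces the "first character only" output from the question.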

Thank you @Nige_S

01a sanitation of TOC.kmmacros (5.7 KB)

Using your macro but isolating that Action, everything's fine.

Sometimes the KM Engine can lag behind changes in the Editor, so try:

  1. Quitting then opening the Editor, to force a plist re-write, then...
  2. Quitting and relaunching the Engine (in the Editor, File->Quit Engine then File->Launch Engine) to force the Engine to reload the fresh plist

Also, I'm guessing that your last two AppleScript Actions are either erroring or doing something unexpected. They're unnecessary anyway -- much quicker and easier to use the native "New Folder" and "Move a File or Folder" Actions -- but if you must use AS then read the Wiki's "Execute an AppleScript" page to see how you should be passing in KM variables.

Have a go at replacing those AS Actions with the KM ones, and if you get stuck then just ask here -- you'll get better results than going to an AI...

It also depends on what you have in the file.

I always suggest using an online regex tester.

Here are your regex and example data, so you can check how the replacement works:

RegEx Test with parameters

Open the file containing the text, copy and paste part of it into the test string, and check what the result is.

Or for zero regex (and zero script)

Keyboard Maestro Variable Arrays, with custom delimiters



Split and bracket with KM variable arrays – custom delimiter.kmmacros (4.6 KB)

My concern would be time -- instead of a couple milliseconds to process the text, regardless of length, the "For Each"/pseudo array method will take ~3ms per line of text. Not too bad for a list of book chapters (unless it's a very long book!), but something to be aware of.

The regex should work, and is actually over-complicated given that the input data is strictly formatted. This will work just as well:

(?m)^#(.*?) *@(\d*)

...unless you have chapter names like "The Boy Who Failed @12"...
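As a rough check of the simplified pattern, the following Python sketch runs it over the sample text and over a title that contains an @ itself. It assumes Python's re behaves like Keyboard Maestro's engine for this pattern; the trailing "@3" page number in the second input is a made-up value for illustration.

```python
import re

SIMPLER = r"(?m)^#(.*?) *@(\d*)"
REPLACEMENT = r"[[\1]] ((\2))"  # Python spelling of [[$1]] (($2))

def bracket(text: str) -> str:
    return re.sub(SIMPLER, REPLACEMENT, text)

print(bracket("#All @6\n#Books b @39"))
# [[All]] ((6))
# [[Books b]] ((39))

# Hypothetical chapter whose title contains "@12", followed by page 3.
print(bracket("#The Boy Who Failed @12 @3"))
# [[The Boy Who Failed]] ((12)) @3
```

The lazy .*? stops at the first @ on the line, so part of the title ends up being treated as the number.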

Ditto – threads like this regularly suggest hours lost in puzzled regex-wrangling :)

Agreed! Although in this case, and assuming the input we've been shown is correct, the regex isn't the problem.

( I somehow feel that that may not suffice to make it the right tool for the job )

First, there may be many other TOC formats (ellen presented a different one in another thread on this forum), so a regex is the better way to handle the different variants.

Second, we always test the regex on what was pasted here, not on what is inside the original file read by the macro. To be absolutely sure, the test should be done on an exact copy of the input file.

If you change your regex to (?m)^#([^@\n]+)\s*@(\d+)$ I think it will work. Note that it will also work without the final $.

No, that's subtly different. To see how, try both versions against:

#All  @6
#Books b @39

(Note that #All has two spaces after it.)

When I run it, I get All and 6. Is that not what is expected?

Look very carefully -- you'll find the original gives "All" but yours gives "All" followed by trailing whitespace.

That's why the original has +? -- that's a lazy match, allowing the * (or \s*) to greedily take all the spaces.
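The difference is easier to see when the captures are printed with their quotes. This is a minimal Python sketch, assuming Python's re matches Keyboard Maestro's behaviour for these two patterns; the names GREEDY, LAZY and TEXT are just labels for the example.

```python
import re

TEXT = "#All  @6\n#Books b @39"  # two spaces after "All"

GREEDY = r"(?m)^#([^@\n]+)\s*@(\d+)$"  # the suggested variant
LAZY = r"(?m)^#([^@\n]+?) *@(\d+)"     # the original pattern

# repr() makes any trailing spaces in the first capture visible.
print([repr(title) for title, _ in re.findall(GREEDY, TEXT)])
# ["'All  '", "'Books b '"]
print([repr(title) for title, _ in re.findall(LAZY, TEXT)])
# ["'All'", "'Books b'"]
```

With the greedy group, the character class has already consumed the spaces before \s* gets a chance to match them, so they stay inside the capture.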

Good catch. What about this: (?m)^#([^@\n]+?)\s*@(\d+)

I checked both use cases

Yes, it's a better pattern because it catches tabs as well.
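To show the tab case concretely, here is a small Python sketch, again assuming Python's re is a fair stand-in for Keyboard Maestro's engine here. The input with a tab before the @ is an invented example, not taken from the original file.

```python
import re

TAB_TEXT = "#All\t@6"  # hypothetical line with a tab instead of a space

SPACES_ONLY = r"(?m)^#([^@\n]+?) *@(\d+)"
ANY_WHITESPACE = r"(?m)^#([^@\n]+?)\s*@(\d+)"

print(re.findall(SPACES_ONLY, TAB_TEXT))     # [('All\t', '6')]
print(re.findall(ANY_WHITESPACE, TAB_TEXT))  # [('All', '6')]
```

Both patterns match, but with " *" the tab can only be absorbed by the capture group, so it ends up inside the title.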

But, as I showed above, the pattern isn't the problem if the inputs we've been given are correct. The "obvious" way to get incorrect results is

(?m)^#([^@\n])+?.*@(\d*)

...and it may be that OP tried something like that first, changed the regex, but the plist didn't update and/or the Engine didn't reload the new version.
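For reference, a short Python sketch of what that "obvious" wrong pattern does with the sample input. It assumes Python's re handles a quantified capture group the same way Keyboard Maestro's engine does (keeping only the last repetition), and it writes the replacement groups as \1 and \2 instead of $1 and $2.

```python
import re

SAMPLE = "#All @6\n#Books b @39"
MISPLACED = r"(?m)^#([^@\n])+?.*@(\d*)"  # quantifier outside the group, '.' before '*'

def bracket_misplaced(text: str = SAMPLE) -> str:
    return re.sub(MISPLACED, r"[[\1]] ((\2))", text)

print(bracket_misplaced())
# [[A]] ((6))
# [[B]] ((39))
```

The group itself matches exactly one character, and the lazy +? repeats it only once, so $1 ends up holding just the first letter, which is the output reported in the question.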