Automating REGEX

Looks to be, yeah. Idea is to get something between the two strings. Maybe I'm missing something as I'm only looking on my phone though.

Given I've used <key> and </key> as the two strings I would have expected this:

2023-04-12_23-05-01

If that is correct, I got to it by a small modification to your subroutine thus:

Download Macro(s): Return Everything BETWEEN Two Strings.kmmacros (23 KB)

Macro-Image

Macro-Notes
  • Macros are always disabled when imported into the Keyboard Maestro Editor.
    • The user must ensure the macro is enabled.
    • The user must also ensure the macro's parent macro-group is enabled.
System Information
  • macOS 10.14.6
  • Keyboard Maestro v10.2

where I've added the Set Variable action coloured red; this is needed because the Search action, if it doesn't find anything, does not set Local__SubOutput to nothing, but leaves it as is.
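For anyone reading without Keyboard Maestro to hand, here is a rough Python sketch of why that reset matters. It is not the macro itself: the function name, the `reset` switch and the per-line loop are my assumptions about how the subroutine works, with `sub_output` standing in for Local__SubOutput.

```python
import re

def between_per_line(text, start, end, reset=True):
    """Collect what lies between start and end, one result per input line."""
    pattern = re.compile(re.escape(start) + r"(.*?)" + re.escape(end))
    sub_output = ""
    results = []
    for line in text.splitlines():
        if reset:
            # Equivalent of the red Set Variable action: clear the old value
            # so a failed search cannot leave the previous match behind.
            sub_output = ""
        match = pattern.search(line)
        if match:
            sub_output = match.group(1)
        results.append(sub_output)
    return results
```

With `reset=False` a line that has no match repeats the previous line's result, which is the kind of doubling-up that can creep in; with the reset, that line simply comes back blank.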

Anyway, I'm beginning to think I don't understand what you're expecting to get out of these in terms of a before/after result so maybe it would help if you could explain that. :grinning:

On closer inspection (I was at the pub last night!), my version was doubling-up, wasn't it! Makes sense now that you've explained about having to reset Local__SubOutput. Nice one!

My preferred way to solve that issue is:

Screenshot

...as it gets rid of the blank lines.

As has been mentioned before, each one of these is going to do a very specific job, and some very specific jobs seem to be required more often than others. I suppose I'd like a quick and easy way to do some basic text processing without having to learn RegEx properly.

The ones I've started with (if they work) are things I often find myself wanting to do, so I thought I'd try to create some kind of presets. You mentioned you've tried something similar and have many more, so I think you get what I'm trying to do.

Which just reinforces my statement, i.e. provide examples of input and corresponding output that you are hoping for. I understand exactly what you’re after by creating “presets” - that’s not the issue here for me!

Fair point.

It really depends on the application, and I may end up with multiple options for just the Return Everything BETWEEN Two Strings macro alone:

  • I do/don't want the start and end strings included in the result
  • I do/don't want to include blank lines
  • I do/don't want the RegEx to work over multiple lines
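Those three switches could be sketched roughly like this in Python. This is only an illustration, not one of the macros: the function and parameter names are made up, "blank" is taken to mean a match with nothing but whitespace between the two strings, and Python's `re` stands in for the ICU flavour that Keyboard Maestro uses.

```python
import re

def between(text, start, end, include_delims=False, keep_blank=False, multiline=False):
    """Return every stretch of text found between start and end."""
    # DOTALL lets "." cross line breaks, so a match may span several lines.
    flags = re.DOTALL if multiline else 0
    pattern = re.compile(re.escape(start) + r"(.*?)" + re.escape(end), flags)
    results = []
    for match in pattern.finditer(text):
        inner = match.group(1)
        if not keep_blank and not inner.strip():
            continue  # drop matches that would produce a blank line
        # group(0) is the whole match including the start and end strings
        results.append(match.group(0) if include_delims else inner)
    return results
```

Each option is a single flag on one function, which is probably easier to maintain than a separate macro per combination.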

I can't think of a specific example, but grabbing a value from a big hunk of text or removing extraneous bumf to make something readable is something I've found extremely painstaking in the past, particularly because when I've trawled StackExchange for similar examples, they often don't work with the ICU flavour.

OK Neil - Here are my tests on each of your 4 subroutines. If the outputs I'm showing you are what you expect then that's great, but as I said without an explicit statement of what each subroutine is supposed to do, I can't say one way or the other (except for perhaps the last one...)

Everything before string

Only the first match works, the rest seem to be ignored - so I guess it's faulty:

2023-04-13_14-44-52

Lines containing replace

Doesn't seem to do the replacement:

2023-04-13_14-45-26

Everything after string

Generates an error and aborts. Here's the log file entry:

Return everything between

This is the version I tweaked and it has the blank lines. You didn't want the blank lines but I'm including it for completeness:

2023-04-13_14-47-37

Conclusion

I think encapsulating regular expressions in subroutines is a great idea if you have regexes that you use frequently; the subroutines essentially become black boxes whose inner workings you needn't worry about. However, to create the subroutines and the regexes in the first place, you need to specify the conditions of use and exactly what you want them to do, and I think that needs a bit more work for these specific subroutines.

I did say that I have been using subroutines involving regexes but I'm not sure mine do anything similar to yours. Here's an example though: I've had to provide users with a simple way to process HTML files and one of the things they regularly wish to do is extract the contents of a specific tag from the file (usually the title tag, but it could be some other tag). Remember, these are HTML files we're dealing with and not browser pages, so the use of in-browser JavaScript is not an option. Here's the subroutine:

Download Macro(s): [SUB] Get First Tag From HTML.kmmacros (2.1 KB)

Macro-Image

Keyboard Maestro Export

and this is how you'd call it:

image

This is a simple one, but it shows how you can simplify stuff hugely by "hiding" it in a subroutine. It is so simple though that I'm not sure how useful it is as an example for you!

The conditions of use are: remove/return/replace something for each line of a variable. That's very useful to me.

Part of the reason some of these weren't working was that I'm new to using Subroutines and made some basic errors. Another part of it is that, before adopting your clever subroutine idea, I was only trying to auto-create RegEx actions, so things like iteration weren't part of the remit. I think I might have got them all working nicely now, but I'm sure you'll let me know if I've messed up.

RegEx Macros.kmmacros (137.1 KB)

:rofl: - later...

I'm probably missing a trick, but doesn't that return the contents wrapped in > and <?

Also, interesting that the / in the Close variable is allowed as a literal, no need to escape it in the regex.

Nope - you spotted my (ahem) deliberate mistake.

Of course the call to the subroutine should look like this:

image

I know you had discussions about escaping strings, but it's not been a problem with this.

Bum -- I was hoping you had a way to handle attributes in the opening tag. But this seems to work when added as the first action in your sub:

image
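In case it helps anyone following along without the macros, the same idea looks something like this in Python. It is a sketch rather than a copy of the subroutine: the function name is invented, and allowing `[^>]*` after the tag name is just one way of coping with attributes in the opening tag.

```python
import re

def get_first_tag(html, tag):
    """Return the contents of the first <tag ...>...</tag> pair, or "" if absent."""
    pattern = re.compile(
        # \b stops "p" from matching "<pre"; [^>]* swallows any attributes
        r"<" + re.escape(tag) + r"\b[^>]*>(.*?)</" + re.escape(tag) + r"\s*>",
        re.DOTALL | re.IGNORECASE,
    )
    match = pattern.search(html)
    return match.group(1) if match else ""
```

As in the subroutine, the `/` of the closing tag needs no escaping, since the pattern is not wrapped in slash delimiters.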

I can download your macro package fine, but it won't install into KM. This was after I'd archived the previous set and then deleted them. I've never come across this before so can you export your macros again so I can try again? Cheers.

Scrub that - rebooting my Mac seems to have cleared the problem :man_shrugging:

Hi @noisneil,
Just had a quick look and I've noticed a problem. In each of your subroutines you now call the sub to escape special characters like this as an example:

image

So you're passing in the contents of the variable Local__String, which is fine, but you're saving the result to a variable token as opposed to a variable. Consequently, the subroutine would not affect the contents of Local__String.

The call should look like this:

image

(which is how I showed it in post number 33 above).

You didn't notice any errors when running your tests because Local__String never contains anything that needs escaping! So that was fortunate :sweat_smile:

Anyway, I made the corrections and ran the tests and it all looks fine with the test cases you provided.

However, as soon as you try a test with a special character things go wrong I'm afraid. I'll just examine one of your subroutines to show what's going wrong (if I can).

RegEx: REMOVE Everything AFTER String

I tried running this test using the string ?)

The line I was expecting to be affected was this one:
<string>(?<=a)(.*?)(?=c)</string>

but the output from the test showed no change.

The reason your sub doesn't work is this: after escaping the string you then go through the input line by line and for each line you test to see whether it contains the escaped string. Well of course it never will, because the escaped string is now \?\) which is not present anywhere on any line.

The solution is simple: call the sub to escape the string and save the result to a new variable (I've called it Local__StringX), then use that variable in your regex Search and Replace action later on. In the meantime, where you test each line to see if it contains the string, you can continue to use the unescaped version, which is held in the variable Local__String.
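The same two-variable arrangement can be sketched in Python, with `needle` playing the part of Local__String and `needle_x` the part of Local__StringX. The function name is hypothetical, and I'm assuming "remove everything after" means the string itself is kept and only what follows it is dropped.

```python
import re

def remove_after(text, needle):
    """On each line containing needle, delete everything that follows it."""
    needle_x = re.escape(needle)  # escaped copy, used only inside the regex
    pattern = re.compile("(" + needle_x + ").*")
    out = []
    for line in text.splitlines():
        # The plain "contains" test must use the unescaped string;
        # testing for needle_x here would never succeed.
        if needle in line:
            line = pattern.sub(r"\1", line, count=1)
        out.append(line)
    return "\n".join(out)
```

Run against the line above with the string `?)`, this keeps everything up to and including the first `?)` and removes the rest.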

I'm sure all this is a bit much to absorb so I'm posting the amended subroutine with the two places I've made a change coloured red. Here it is:

Download Macro(s): REMOVE Everything AFTER String.kmmacros (19 KB)

Macro-Image

I haven't looked at your other subroutines but I bet my right big toe there's something similarly erroneous going on in them and that the solution will be very similar.

So Neil, I honestly think it's your lack of experience in thinking like a computer programmer that's the hill you're climbing and not the fact that subroutines are new to you or that regexes are taxing. But then you're not a computer programmer so where's the shock in that? The mistake I've pointed out is one that I've made many times and, by the way, we are neither of us unique in this respect!

So go ahead - use my example to guide you when making the necessary changes to your other subroutines, but probably the most important thing to do is this: test; then test; then test again, using as many different examples as you can. The value of that is shown by the fact that this reply very nearly ended at the point above where I say "I made the corrections and ran the tests and it all looks fine."

Good luck!

I had actually noticed my mistake when returning the variable to the calling macro as Local__RGXOutput, but forgot by the time it came to setting up the Escape String sub of the sub. So yeah, this is down to being new to subroutines. Thanks for the heads-up!

Doh! Obvious when it's pointed out! I've adjusted them all accordingly.

Absolutely! I'm a musician who maintains a purposefully minimal studio setup, and I turned to KM to make a few things do a lot; namely a Stream Deck and a nObcontrol... Then I got the automation bug and here we are.

Here's another simple one I just came up with:

REMOVE Top Line(s).kmmacros (17 KB)

Macro screenshot

Remove Top Line(s).kmactions (2.6 KB)

Yeah - I've been on that slippery slope for some time now :grinning:

OK.

Run a test with 25 and then run the test with 26: now work out what's going wrong!

You mean removing 25/26 lines? They both contain the last line. Probably something to do with the fact that it's matching by carriage returns and line breaks, which come at the end of the line? :man_shrugging:t2:

If I knew the answer to this I wouldn't be the guy trying to automate RegEx. Anyway, it's fine by me, as I can't see myself ever wanting to explicitly remove every line in this way.

Bingo!

Ahhh....

^.*\R?

(That's not a question. :joy:)

Or is it "0 or 1" questions? :wink:
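For the record, here is a rough Python sketch of the difference that optional line break makes. It is not the macro: the function name and the `optional_break` switch are invented for illustration, and because Python's `re` has no `\R`, an explicit alternation stands in for it (and `[^\r\n]*` for the ICU `.*`).

```python
import re

# Python's re module does not support \R, so spell the line breaks out.
LINE_BREAK = r"(?:\r\n|\n|\r)"

def remove_top_lines(text, count, optional_break=True):
    """Remove up to count lines from the top of text."""
    # [^\r\n]* mimics ICU's ".*", which stops at any line terminator.
    one_line = r"[^\r\n]*" + LINE_BREAK + ("?" if optional_break else "")
    pattern = re.compile(r"\A(?:" + one_line + r"){0," + str(count) + r"}")
    return pattern.sub("", text, count=1)
```

With `optional_break=False` the final line can never be removed when the text has no trailing line break, so asking for all the lines leaves the same last line behind as asking for one fewer. Making the break optional lets the last line match too.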