Automating REGEX

Perfect! Thanks @Nige_S!

@tiffle I know it's taken me a while to come around to this, but I think you're on to something with the subroutine idea!

Here's what I've got so far:

RegEx Macros.kmmacros (84.5 KB)

Subroutine Caller Actions.zip (4.9 KB)

If you think you can improve on or add to these, I'd love to see what you come up with.

Hi Neil - examined this with interest and found the following:

  1. I loaded up your macros and created a testing macro using your caller actions zip and found that none of your subroutines work! Here's the testing macro:

Download Macro(s): Testing Regex Subroutines.kmmacros (29 KB)

Macro-Notes
  • Macros are always disabled when imported into the Keyboard Maestro Editor.
    • The user must ensure the macro is enabled.
    • The user must also ensure the macro's parent macro-group is enabled.
System Information
  • macOS 10.14.6
  • Keyboard Maestro v10.2

To perform the testing I just select the appropriate group and TRY it. I'd offer a solution but I don't have time right now. Maybe I'm doing something wrong or I don't understand what is supposed to happen?

  2. Whenever I see myself inserting the same KM actions over and over in several macros I think that it might be worth turning those actions into a subroutine. I see a bunch of actions that appear at least once in each of your subroutines; here they are:

image

So I've taken the liberty of turning them into a subroutine for you that looks like this:

image

Here's the downloadable version:

Download Macro(s): [SUB] Escape Regex String.kmmacros (17 KB)

and you can use it to replace the 5 occurrences of that bunch of actions. The advantages of doing that (and I'm sorry if I'm teaching granny to suck eggs) are that (a) once you've tested the subroutine you can rely on it behaving the same way wherever it's called; (b) if you need to change the subroutine in future (to add error checking or an extra "escaping" action, for example) you need do it in only one place rather than in all 5; and (c) it reduces the overall count of actions used.
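
For anyone who thinks better in code than in KM actions, here is a rough analogue of the "escape once, reuse everywhere" idea. This is a Python sketch, not the contents of the subroutine above: the function name `escape_regex_string` is hypothetical, and Python's `re` flavour is not identical to the ICU flavour Keyboard Maestro uses, so treat it as an illustration only.

```python
import re

def escape_regex_string(literal: str) -> str:
    """Return `literal` with regex metacharacters escaped so it matches itself."""
    return re.escape(literal)

# One shared helper is used to build a pattern from two user-supplied strings.
pattern = escape_regex_string("<key>") + r"(.*?)" + escape_regex_string("</key>")
match = re.search(pattern, "<key>Name</key>")

# Without escaping, "." means "any character"; with escaping it is a literal dot.
unescaped_hit = re.search("a.b", "axb") is not None
escaped_hit = re.search(escape_regex_string("a.b"), "axb") is not None
```

If another character ever needs special handling, only the helper changes, which is point (b) above.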

See this is where it all falls down on my RegEx newbism. The reason I'm interested in this is that I'm dumbfounded by RegEx, but unfortunately it also means I can't fully incorporate it into these macros with any degree of competence. :joy: That said, I do think this idea has potential.

For example, the Everything BEFORE String subroutine does 'work' (sort of) in that it does what I told it to do. It removes everything after a string, if the string is found on that line.

<string>IgnoreCaseRegEx</string>
...becomes...
<string>IgnoreCaseRegEx

Now of course, this is where my incompetence comes into play as I now realise this subroutine should be called "Remove Everything After a String On Each Line If That String Is Found" instead. :man_facepalming:t2:
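
To pin down what that renamed subroutine does, here is a small Python sketch of "remove everything after a string on each line, if the string is found". It is not the macro itself: the function name `keep_through_string` is made up, and I'm assuming the search string in the example above was `IgnoreCaseRegEx`.

```python
import re

def keep_through_string(text: str, marker: str) -> str:
    """On each line containing `marker`, drop everything after its first occurrence."""
    # Capture the (escaped) marker, then match the rest of the line after it.
    pattern = re.compile("(" + re.escape(marker) + ").*")
    lines = text.split("\n")
    # Lines without the marker are returned unchanged.
    return "\n".join(pattern.sub(r"\1", line, count=1) for line in lines)

sample = "<string>IgnoreCaseRegEx</string>\n<key>Other</key>"
result = keep_through_string(sample, "IgnoreCaseRegEx")
```

Here `result` keeps the second line as it was, because the marker is not on it.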

Replace LINES Containing String for some reason isn't receiving Local__Replace With. I'm very confused by this. It's just blank. To be fair, I've never really used Subroutines, so I may be missing something obvious.

RegEx: Everything AFTER String doesn't work and I'm not sure why.

RegEx: Return Everything BETWEEN Two Strings seems to work fine. :man_shrugging:t2:

NB: I did have to reconnect the callers to the subs when I imported your test macro, but presumably, they're all calling the appropriate things on your end...?

Very kind! I'll add that to the group! It's certainly more efficient, but I've never got into the habit of using subs when working on something I might share on the forum, as I prefer everything to be self-contained. I might change that mode of thinking...

Nice reply Neil. I’ll take a closer look later on unless someone else beats me to it!

BTW - don’t get discouraged! We all benefit from this stuff ‘cos we’re learning new things all the time! And that is never a bad thing!

Oh, it's not really incompetence. Don't be so hard on yourself.

It can be difficult to describe the precision of a regex in a few words that fit on a menu item. And there are often subtle variations that have to be accounted for.

For Text Toolbox, which implements common text conversions with regexes, I resorted to short menu "topics" essentially that expanded into prompts for options where necessary with some explanatory text.

Text manipulation is complex. Describing it (in words, alas) is difficult. Keep fighting the good fight!

I'm not sure if it's something to do with the example inputs or something else, but these were all working yesterday. Slightly frustrating.

Just a quick question, Neil - do the results you’re seeing come from my testing macro or yours?

Yours.

So you reckon this is correct for

2023-04-12_21-04-56

Looks to be, yeah. Idea is to get something between the two strings. Maybe I'm missing something as I'm only looking on my phone though.

Given I've used <key> and </key> as the two strings I would have expected this:

2023-04-12_23-05-01

If that is correct, I got to it by a small modification to your subroutine thus:

Download Macro(s): Return Everything BETWEEN Two Strings.kmmacros (23 KB)

where I've added the Set Variable action coloured red; this is needed because the Search action, if it doesn't find anything, does not set Local__SubOutput to nothing, but leaves it as is.
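
In case it helps to see the effect outside KM, here is a rough Python imitation of the problem. It is a sketch under assumptions: `search_into` is a hypothetical stand-in for a search step that leaves its output untouched when nothing matches, and `between_per_line` is a made-up name for the per-line loop.

```python
import re

def search_into(output: str, pattern: str, text: str) -> str:
    """Mimic a search step that leaves `output` as is when nothing matches."""
    m = re.search(pattern, text)
    return m.group(1) if m else output

def between_per_line(text: str, start: str, end: str, reset: bool = True) -> str:
    pattern = re.escape(start) + "(.*?)" + re.escape(end)
    results = []
    captured = ""
    for line in text.split("\n"):
        if reset:
            captured = ""  # the extra "set to nothing" step
        captured = search_into(captured, pattern, line)
        results.append(captured)
    return "\n".join(results)

sample = "<key>A</key>\n<string>x</string>\n<key>B</key>"
without_reset = between_per_line(sample, "<key>", "</key>", reset=False)
with_reset = between_per_line(sample, "<key>", "</key>", reset=True)
```

Without the reset, the previous line's result is carried over into the line that has no match; with it, that line comes back empty instead.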

Anyway, I'm beginning to think I don't understand what you're expecting to get out of these in terms of a before/after result so maybe it would help if you could explain that. :grinning:

On closer inspection (I was at the pub last night!), my version was doubling-up, wasn't it! Makes sense now that you've explained about having to reset Local__SubOutput. Nice one!

My preferred way to solve that issue is:

Screenshot

...as it gets rid of the blank lines.

As has been mentioned before, each one of these is going to do a very specific job, and some very specific jobs seem to be required more often than others. I suppose I'd like a quick and easy way to do some basic text processing without having to learn RegEx properly.

The ones I've started with (if they work) are things I often find myself wanting to do, so I thought I'd try to create some kind of presets. You mentioned you've tried something similar and have many more, so I think you get what I'm trying to do.

Which just reinforces my earlier request, i.e. provide examples of input and the corresponding output that you are hoping for. I understand exactly what you’re after by creating “presets” - that’s not the issue here for me!

Fair point.

It really depends on the application, and I may end up with multiple options for just the Return Everything BETWEEN Two Strings macro alone:

  • I do/don't want the start and end strings included in the result
  • I do/don't want to include blank lines
  • I do/don't want the RegEx to work over multiple lines
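
Those three choices could be sketched as switches on a single function. This is Python rather than a KM subroutine, the name `between` and its parameter names are hypothetical, and Python's regex flavour differs from ICU in places, so it only shows the shape of the options.

```python
import re

def between(text, start, end, *, include_delimiters=False,
            keep_blank_lines=False, multiline=False):
    """Return what lies between `start` and `end`, one result per output line."""
    # DOTALL lets "." cross line breaks, which is the "multiple lines" option.
    flags = re.DOTALL if multiline else 0
    pattern = re.compile(re.escape(start) + "(.*?)" + re.escape(end), flags)
    group = 0 if include_delimiters else 1  # group 0 is the whole match
    if multiline:
        return "\n".join(m.group(group) for m in pattern.finditer(text))
    results = []
    for line in text.split("\n"):
        m = pattern.search(line)
        if m:
            results.append(m.group(group))
        elif keep_blank_lines:
            results.append("")  # keep a blank for lines with no match
    return "\n".join(results)

sample = "<key>A</key>\nnoise\n<key>B</key>"
plain = between(sample, "<key>", "</key>")
wrapped = between(sample, "<key>", "</key>", include_delimiters=True)
blanks = between(sample, "<key>", "</key>", keep_blank_lines=True)
spanning = between("<key>A\nB</key>", "<key>", "</key>", multiline=True)
```

Whether these should be one subroutine with parameters or several separate presets is a design choice; the sketch only shows that the three options are independent of each other.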

I can't think of a specific example, but grabbing a value from a big hunk of text or removing extraneous bumf to make something readable is something I've found extremely painstaking in the past, particularly because when I've trawled StackExchange for similar examples, they often don't work with the ICU flavour.

OK Neil - Here are my tests on each of your 4 subroutines. If the outputs I'm showing you are what you expect then that's great, but as I said without an explicit statement of what each subroutine is supposed to do, I can't say one way or the other (except for perhaps the last one...)

Everything before string

Only the first match works, the rest seem to be ignored - so I guess it's faulty:

2023-04-13_14-44-52

Lines containing replace

Doesn't seem to do the replacement:

2023-04-13_14-45-26

Everything after string

Generates an error and aborts. Here's the log file entry:

Return everything between

This is the version I tweaked and it has the blank lines. You didn't want the blank lines but I'm including it for completeness:

2023-04-13_14-47-37

Conclusion

I think encapsulating regular expressions in subroutines is a great idea if you have regexes that you use frequently; the subroutines essentially become black boxes whose inner workings you needn't worry about. However, to create the subroutines and the regexes in the first place you need to specify the conditions of use and what it is exactly you want them to do and I think perhaps that needs a bit more work for these specific subroutines.

I did say that I have been using subroutines involving regexes but I'm not sure mine do anything similar to yours. Here's an example though: I've had to provide users with a simple way to process HTML files and one of the things they regularly wish to do is extract the contents of a specific tag from the file (usually the title tag, but it could be some other tag). Remember, these are HTML files we're dealing with and not browser pages, so the use of in-browser JavaScript is not an option. Here's the subroutine:

Download Macro(s): [SUB] Get First Tag From HTML.kmmacros (2.1 KB)

and this is how you'd call it:

image

This is a simple one, but it shows how you can simplify stuff hugely by "hiding" it in a subroutine. It is so simple though that I'm not sure how useful it is as an example for you!
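
For readers who can't open the macro, the general idea can be sketched in Python. This is not the subroutine's actual regex (that is only in the download); the function name `get_first_tag` and the pattern are my own assumptions, and a regex like this is only suitable for simple, well-formed tags.

```python
import re
from typing import Optional

def get_first_tag(html: str, tag: str) -> Optional[str]:
    """Return the contents of the first <tag>...</tag> pair, or None if absent."""
    pattern = re.compile(
        # "/" is not a regex metacharacter, so the closing tag needs no escaping.
        "<" + re.escape(tag) + r"\b[^>]*>(.*?)</" + re.escape(tag) + r"\s*>",
        re.DOTALL | re.IGNORECASE,
    )
    m = pattern.search(html)
    return m.group(1) if m else None

page = "<html><head><title>Hello</title></head><body><p class='x'>one</p><p>two</p></body></html>"
title = get_first_tag(page, "title")
first_paragraph = get_first_tag(page, "p")
missing = get_first_tag(page, "h1")
```

The caller only supplies the tag name and gets the contents back, which is the "black box" benefit described above.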

The conditions of use are: remove/return/replace something for each line of a variable. That's very useful to me.

Part of the reason some of these weren't working was that I'm new to using Subroutines and made some basic errors. Another part of it is that, before adopting your clever subroutine idea, I was only trying to auto-create RegEx actions, so things like iteration weren't part of the remit. I think I might have got them all working nicely now, but I'm sure you'll let me know if I've messed up.

RegEx Macros.kmmacros (137.1 KB)

:rofl: - later...

I'm probably missing a trick, but doesn't that return the contents wrapped in > and <?

Also, interesting that the / in the Close variable is allowed as a literal, no need to escape it in the regex.

Nope - you spotted my (ahem) deliberate mistake.

Of course the call to the subroutine should look like this:

image

I know you had discussions about escaping strings, but it's not been a problem with this.