A Beginner Starts to Climb "Mount RegEx"

.
In order to advance my use of Keyboard Maestro, I have decided to learn regular expressions (RegEx).
Like climbing a mountain, one must begin at the bottom.

To be helpful to other RegEx novices here, I will report progress as I start to climb, including any slips and falls along the way.
I invite others to comment here, too.
.

The motivation to learn:

.
Those comments, and other, similar encouragement on this forum, motivated me to get started.
.

Where, exactly, to begin?

After looking at several web sites, I have decided to start with http://regexr.com/
It seems beginner-friendly.
We shall see if I've selected the best route up this mountain, or not.

That site offers a video tutorial:

Time: 02:45.

Narrator speaks clearly and to the point.
I think he'll be a good guide on this climb.
But he speaks so quickly that I must watch the video again, probably 3-4-5 times, to absorb it.

As a beginner, I found the most useful part of that video to be where to find examples (time 1:46 to 2:05).
That will be my entry point to learning about RegEx; asking, “What can it do?”

I’ll report back here when I’ve found some answers to that question (in the context of using Keyboard Maestro).
.

I welcome comments from other beginners. And I especially encourage critique from experienced climbers, who still remember their first climb up "Mount RegEx".

Tutorial videos tend to be either too slow or too fast, which makes most of them a pure annoyance, IMO. And since regex is all about text, why not learn by reading? A very comprehensive site is Regular-Expressions.info. It’s well written, beginner-friendly, and has a tutorial and many examples.

For trying out regexes and playing around I would use regex101.com.

If you own BBEdit (or even just download the free trial), then you can use the BBEdit ➤ Help ➤ Grep Reference which covers a lot of stuff fairly well. BBEdit uses PCRE regex, which is slightly different, but the bulk of it is the same, and the concepts are the same regardless.

I have a different take than the others. I found that trying to read about it didn’t help. Nor did videos. I had to learn by doing.

So I decided to use regexes whenever I had to do searches, or search and replaces. Not every single time, of course, but often - many times a day. I discovered that many of the apps I use support regex searches.

The problem with learning regex, IMO, is that the rules are NOT easy to learn and understand. Very little, if anything, is obvious. So I believe that total immersion is the best bet.

As has been pointed out, regex101.com is an excellent place to try out regexes and learn what they do and don’t do.

A beginner climbs “Mount Regex” - Part #2.

The purpose of my posting here is to help other beginners gain a high-level understanding of regular expressions (Regex), by watching as I search for that myself.

The nitty-gritty details of operator syntax are available on many web sites and help documentation sources, so no need for me to repeat here.
I will curate the best of those, from a beginner perspective, as I find them, and I appreciate that others are mentioning their favorites here, too.
.

In Part 1 above, I posed the question, “What can Regex Do?”

That’s an essential question for a beginner, but I now realize that I should have asked a different question before that one:

“Where does Regex operate, and what are its fundamental elements?”

A comparison will explain why this new question should be asked first.
.

If I were trying to explain Keyboard Maestro to a beginner, I would say,

“KM works on the Macintosh operating system.

“Anything that a person can do on a Mac, KM can do, plus KM can memorize and repeat actions, and make decisions with IF statements.

“KM’s elements are actions in the Mac OS.
Examples are opening programs, closing windows, entering text, and clicking buttons.

“KM is bounded by the Mac operating system.
It can do anything within the OS, nothing outside the OS, nothing against the OS.
KM can not make bacon and eggs.”

.

So, how to explain Regex to a beginner, in a similar fashion?
That is the goal of this post.
I’ll offer a first draft answer below.

But first, some observations on the video that I linked above: http://regexr.com/

It was disappointing.
Useful, but disappointing.

I expected a short tutorial about Regex, suitable for a beginner, but the video is just an overview of that web site.
It says very little about Regex.

In spite of that, I watched it three times and found added value each time.
Here are my notes:

By pointing out features of the web site, the video brought to my attention several topics in Regex that seem essential.
As I climb the learning mountain, I will watch carefully for more about these topics:

  • How can the “number of matches” be used?
  • What are “expression flags” and when are they needed?
  • What are “substitutions”?

The video showed me it would be most efficient to write and test Regex first on a tutorial web site that offers helpful error messages.
Only after thorough testing will I copy it into a KM action.
I haven’t done that yet, but I will start by making my Regex mistakes that way, rather than in the KM editor.

The video mentioned a web site feature for personal “favorites”, like personal bookmarks here on the KM forum.
And, the video pointed out a searchable database of ready-made expressions.
Combining those two features could make this climb easier than I expected.
.

Now, returning to the goal of how to explain to a beginner (like me), “Where does Regex operate and what are its elements?”, here is my first draft answer:

Regex operates on characters in a computer (any computer, not just Macs).

Characters are letters A-Z, numbers 0-9, and all the special symbols that a computer can recognize.

Any color, any size, any language — Regex can see them all, including sets of characters, like words and sentences.

Regex elements are operators to find characters, count characters, and move characters around.
Elementary examples:

  • Change “abc” to “CBA”.
  • Format “18005551212” to be “1-800-555-1212”.
  • Find middle initial of name “John Q. Public” = “Q”.
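The three elementary examples above can be sketched with Python's built-in re module. This is only an illustration under my own assumptions: Python is not mentioned in this thread, and each pattern is just one of several ways to get the result, not necessarily what you would type into a Keyboard Maestro action.

```python
import re

# 1. Change "abc" to "CBA": capture each letter, then emit the
#    captures in reverse order and uppercase the result.
reversed_upper = re.sub(
    r"(a)(b)(c)",
    lambda m: (m.group(3) + m.group(2) + m.group(1)).upper(),
    "abc",
)

# 2. Format "18005551212" as "1-800-555-1212" with four capture groups.
phone = re.sub(r"^(\d)(\d{3})(\d{3})(\d{4})$", r"\1-\2-\3-\4", "18005551212")

# 3. Find the middle initial of "John Q. Public":
#    a single capital letter at a word start, followed by a period.
match = re.search(r"\b([A-Z])\.", "John Q. Public")
initial = match.group(1) if match else None

print(reversed_upper, phone, initial)
```

In each case the regex only describes which characters to find; the surrounding tool (here Python, elsewhere a KM action) decides what to do with the match.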

Regex doesn’t do anything with images, sounds, drawings or anything in the user interface like clicking a button or moving the cursor.

.
So the main lesson I’ve learned so far is that Regex is all about characters, and only about characters.

I understand now that when a KM action depends on numbers, words, or sentences, Regex adds higher precision to the KM action.
.

In my next report, I will describe what answers I’ve found to the question, “What can Regex do?”
.

If anything I’ve written above is not clear, please tell me and I will continue to revise to make it more clear.

I welcome comment and critique on any of this.

Experimenting with the regex search and replace in a good editor is the best way in. Atom is excellent and free.

First goal – aim to learn the three kinds of bracket:

  • ( ) to group parts of your search pattern, and then use numbered references to particular groups in the replacement pattern
  • [ ] to specify custom or prepackaged subsets of characters to match
  • { } to specify a particular number of instances or repetitions

(Assuming that you have already tried matching one instance of any character with a period, a single optional character with ?, and an arbitrary number of optional characters with * )
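For anyone who wants to see the three kinds of bracket side by side, here is a minimal sketch using Python's re module. The sample strings and variable names are made up for illustration, and replacement syntax (such as \1 versus $1) differs between editors and tools.

```python
import re

# ( ) group parts of the pattern; \1 and \2 in the replacement
# refer back to the first and second group.
swapped = re.sub(r"(\w+), (\w+)", r"\2 \1", "Public, John")

# [ ] match any single character from a set; here, one lowercase vowel.
vowels = re.findall(r"[aeiou]", "regex")

# { } repeat the preceding item a set number of times; here, exactly 4 digits.
years = re.findall(r"\b\d{4}\b", "From 1999 to 2024, about 25 years.")

print(swapped)  # John Public
print(vowels)   # ['e', 'e']
print(years)    # ['1999', '2024']
```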

.
A beginner climbs “Mount Regex” - Part #3.

Now time to ask, “What can Regex Do?”

To answer, will look first at simple examples, suitable for beginners.
Next report, higher up, will look at real-life examples.

In my attempt to explain Regex, I wrote, “operates on characters … letters A-Z, numbers 0-9, and all the special symbols that a computer can recognize.”
Below are examples of that — from http://regexr.com/.
.


1. Matching numbers.

Here, for the first time, we see a real Regex in the wild.

.

So Regex can pick out integers and decimals from other text.
Hmmm … “pick out” seems to be one of the things Regex does well.

Regex probably could pick out a date in a text paragraph, or a price on an invoice form scanned with OCR (Optical Character Recognition).

Probably could “clean up” different numbers of decimal places to be a uniform number of decimal places, too.
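To make the "pick out" and "clean up" ideas concrete, here is a rough sketch in Python. The pattern is my own guess at a simple number matcher, not the expression shown on regexr, and the sample text is invented.

```python
import re

text = "Order 42 contains 3 items at 19.99 each."

# Pick out integers and decimals: try "digits.digits" first,
# and fall back to plain digits.
numbers = re.findall(r"\d+\.\d+|\d+", text)

# Clean up decimals so they all show two decimal places.
prices = "Prices: 4.5, 12.128, 7.00"
uniform = re.sub(r"\d+\.\d+", lambda m: f"{float(m.group()):.2f}", prices)

print(numbers)  # ['42', '3', '19.99']
print(uniform)  # Prices: 4.50, 12.13, 7.00
```

Note that the rounding is done by Python, not by the regex; the regex only finds the characters that make up each number.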

Note the green box on the right, “3 matches”.
The concept of matching is clearly very important to understanding Regex.
Surely there will be a lot more to learn about it, later.
.


2. Finding specific words (or other groups of characters).

A simple “Find”:

.

This example shows the essential inputs and outputs in Regex:

  1. The expression itself.
  2. The character set to be examined (labeled “Text”). (input)
  3. The number of matches. (output)
    .
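The same three pieces (expression, text, number of matches) show up when a regex is run from code. A small sketch in Python, with an expression and text I made up for the purpose:

```python
import re

expression = r"\bfox\b"   # the expression itself (input)
text = "The quick brown fox met another fox near the foxhole."  # the text (input)

# \b marks a word boundary, so "foxhole" is not counted.
matches = re.findall(expression, text)
match_count = len(matches)   # the number of matches (output)

print(match_count)  # 2
```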

3. Formatting numbers.

.

This is a joy to see.

As an old programmer I’ve written many lines of code to clean up and properly format numbers: phone numbers, product numbers, inventory numbers, price numbers, invoice numbers, and on and on.
None of my old code was simple enough to fit on a single line, as the example above does.

Regex is looking much more powerful than I thought.
.


4. Counting.

.

I’m seeing that Regex enjoys doing counting and being counted.
Will require thinking about, not just characters, but numbers of characters, too.
.

5. Forward and backward.

.

That’s impressive.
I don’t see a need for it yet, but still, good to know that Regex can work left-to-right or right-to-left (to find those palindromes).
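One way a regex can find palindromes, sketched in Python, is with backreferences: the pattern captures the first letters and then requires the same letters again in mirror order. This is my own illustration and may not be the expression from the web site. As far as I understand, the engine here still scans left to right; the "backward" effect comes from the backreferences \2 and \1.

```python
import re

# Matches 4- or 5-letter palindromes: two captured letters,
# an optional middle letter, then the captures in reverse order.
pattern = r"\b(\w)(\w)\w?\2\1\b"

text = "level radar noon hello kayak"
palindromes = [m.group() for m in re.finditer(pattern, text)]

print(palindromes)  # ['level', 'radar', 'noon', 'kayak']
```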
.


6. Foreign languages.

The examples above are from the web site link.
But I couldn’t resist trying out one myself.
.

.

This shows that Regex can handle complex, foreign language characters.

And that means it could see symbol fonts, too:

¥ å Æ €

:angry: + :hamburger: = :grinning:

Could be uses for that.

And now I wonder, could Regex handle music notation? Equations? Surveyor's symbols?
What else?
.

My example above will look strange to most people reading this.
My interest in Regex is not only to use in KM, but also for foreign languages.
The language above is Thai; my second language.
(Thailand, not Taiwan.)
The text translates to, “Hello world,” or, more dramatically, “Hello, Earth!”

I’ve already tested KM with Thai language, and KM handles it smoothly.
So I’m pleased to see Regex working equally well.
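For anyone who wants to try something similar outside the web site, here is a minimal sketch in Python. The Thai string is a stand-in I chose (a common rendering of "Hello world"), since the original screenshot is not reproduced here, and the character range is the Unicode Thai block, U+0E00 to U+0E7F.

```python
import re

text = "Hello สวัสดีชาวโลก 123"

# Match runs of characters from the Unicode Thai block.
thai_runs = re.findall(r"[\u0E00-\u0E7F]+", text)

print(thai_runs)  # ['สวัสดีชาวโลก']
```

The Latin letters and digits are skipped because they fall outside the range in the square brackets.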
.

So, there are a few answers to the question, “What can Regex do?”

During the next stage of this climb up Mount Regex, I will look at more complex examples that likely have more practical uses.

I agree with @Tom. This would also be my suggestion.
I would start with Regular Expressions Quick Start

Actually, that's a great question, but a bit vague. :smile:
There is a general answer, and then specific answers.

General Answer, OR, What is the purpose of RegEx?

Before I go any further, let me say that I still consider myself a novice at RegEx. So, what I say below is subject to review and correction by other KM Forum members who are far more experienced than I.

At a high level I would say this:

  • A regular expression (sometimes abbreviated to "regex") is a way for a computer user or programmer to express how a computer program should look for a specified pattern in text. This is called "matching".
  • RegEx is a search pattern for finding, or "matching", one or more substrings in a string of text.
  • RegEx is a search pattern for matching "Capture Groups", which are sub-strings within the overall matched sub-string.

The matched substring, and the Capture Group(s), are returned to the user's program to use in some way. There can be many ways of course, but in general I'd say it breaks down into at least these classes (and maybe more):

  1. Parse (extract) the matched text for use elsewhere
  2. Replace the matched text with some other string.
  3. Validation of string for a particular purpose (like a date, or URL)
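A rough sketch of those three classes in Python follows. The log line and the function name are hypothetical, invented for illustration; they are not the actual format of the KM Engine log.

```python
import re

# Hypothetical log line, made up for this example.
log_line = "2016-08-01 12:30:45 Macro 'Open Mail' failed"

# 1. Parse: capture groups extract the date and the macro name.
m = re.search(r"^(\d{4}-\d{2}-\d{2}) .*Macro '([^']+)'", log_line)
date, macro = m.group(1), m.group(2)

# 2. Replace: swap the matched time for a placeholder.
redacted = re.sub(r"\d{2}:\d{2}:\d{2}", "HH:MM:SS", log_line)

# 3. Validate: check that a whole string has the shape of a date.
def looks_like_date(s):
    """Return True if s is shaped like YYYY-MM-DD (shape only, not calendar validity)."""
    return re.fullmatch(r"\d{4}-\d{2}-\d{2}", s) is not None

print(date, macro)
print(redacted)
print(looks_like_date("2016-08-01"), looks_like_date("Aug 1, 2016"))
```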

Here is a simple example of #1:
Using RegEx to Parse the KM Engine Error Log

Generally the tool you use, like Keyboard Maestro, JavaScript, BBEdit, TextWrangler, etc, will provide specific methods for obtaining a match, and then doing something with that match.

See Regular Expressions [Keyboard Maestro Wiki]

I have just the thing for that. In fact I have mentioned this before in other threads. Try using MySpeed by Enounce.
https://www.enounce.com/download-myspeed
You can use it to slow videos down. If you take the time to learn the keyboard shortcuts, it’s even handier.

MySpeed is great for videos that plod along slowly. You can speed them up really fast. I have found that my understanding increases when I speed some things up. On other occasions I find that slowing things down helps improve my understanding though for me I am more often speeding things up. Weird, but it really works.

I have a theory about this. I think we all have a ‘communication rate’. I think we tend to connect with people who communicate at the same rate or frequency. I think this is why some speakers put us to sleep and others cause us to feel lost.

This is a separate concept from “boring speakers” who bore because of a lackluster monotone or run-on sentences. Those kinds of speakers will tend to lose you whatever the speed is. Though I am inclined to believe that if you can speed a boring speaker up a bit, it’s easier to stay engaged. Does anyone have any thoughts on this?

Getting this material has changed in the intervening four years. Under the Help Menu, type "Grep" and select "Searching with Grep" in the listed Help Topics to access Grep reference material.

MySpeed isn't ready for Big Sur, and on Mojave and Catalina it requires disabling SIP, per https://macpaw.com/how-to/disable-enable-system-integrity-protection. (SIP prevents malware attacks from completing. Disabling it immediately makes macOS more vulnerable, so this is only for experienced users or developers; you normally shouldn't turn SIP off.)

While MySpeed is a nice idea, disabling SIP is a significant hack just to enable it. I don't know if that was the case four years ago.

I like 'communication rate' idea, thanks.

Rate is an important ingredient for sure, and tone is as well as you point out.

I'm sure there is a large cluster of factors that make a listener more or less engaged, including: subject matter; thoughts about one's relationship with the subject matter; the speaker's accent; the speaker's confidence in presenting and perceived confidence with the subject matter; body language; the listener's take on the speaker's overall credibility; prior experiences that shape one's listening as they relate to current circumstances; internal body states such as restedness and nutritional satiation; environmental factors like perceived safety with the location and others present; and general mental state, such as being distracted by something in the environment or by thoughts of the past or future.

Here's a reading tip: https://www.amazon.com/Practical-Usage-Regular-Expressions-introduction/dp/1985752921

And an article by the hand of Anthony Rudd: https://www.translatorscafe.com/cafe/Articles.asp?Mode=Print&ArtID=173

Regular Expressions The Complete Tutorial: https://www.princeton.edu/~mlovett/reference/Regular-Expressions.pdf

Get the PDF user manual.

You can find it under BBEdit Menubar ⇢ Help ⇢ User Manual

The Apple Help system is horrible, and the PDF user manual provides a much better user experience.

-Chris

Various video player apps like VLC (freeware) can adjust playback speed.

While it says User Manual as you indicate, 412 pages lands more as a Tome or at least Reference BOOK!

Thanks for the pointer, yes the help window's interface is less than helpful. You can't even copy from it or annotate it with your own notes.

Thank you :slightly_smiling_face:

Fortunately for you, PDFs are searchable.  :sunglasses:

-Chris

FWIW, the BBEdit User's Manual provides this:

Recommended Books and Resources

Mastering Regular Expressions, 3rd Edition
by Jeffrey E.F. Friedl. O’Reilly & Associates, 2006. ISBN 0-596-52812-4

Although it does not cover BBEdit’s grep features specifically, Mastering Regular Expressions is an outstanding resource for learning the “how-to” of writing useful grep patterns, and the second edition is even better than the original.

I have not read this book, but plan to get it, maybe. :wink:
It ain't cheap. On Amazon, Mastering Regular Expressions costs $30 to buy as a Kindle version, and ~$12 to rent for a month.

and

A grep pattern, also known as a regular expression
Grep is the name of a frequently used Unix command that searches using regular expressions, the same type of search pattern used by BBEdit. For this reason, you will often see regular expressions called “grep patterns,” as BBEdit does. They’re the same thing.

That's a very detailed and very technical resource. I don't recommend it as a reference for learning regular expressions. I've owned a copy for 15 years or more and was very excited when I got it – then I started to read it, and it made my head hurt...


grep is a command-line utility for searching plain-text data sets for lines that match a regular expression. Its name comes from the ed command g/re/p (globally search for a regular expression and print matching lines), which has the same effect. grep was originally developed for the Unix operating system, but later available for all Unix-like systems

** `Grep` on Wikipedia
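To show what "globally search for a regular expression and print matching lines" amounts to, here is a toy sketch in Python. It is only an illustration of the idea, not how the real grep tool is implemented, and the function name and sample lines are made up.

```python
import re

def grep(pattern, lines):
    """Return the lines that contain a match for the regular expression."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]

lines = ["alpha 1", "beta", "gamma 22"]
matching = grep(r"\d+", lines)

for line in matching:
    print(line)
```

With the sample lines above, only "alpha 1" and "gamma 22" are printed, because "beta" contains no digits.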


The acronym grep has become synonymous with regular expression(s) over the years. One could say it's become more of an idiom than an acronym these days – except when used to designate the grep command line tool.

-Chris

Hey Chris, didn't I just say all that, but more succinctly, using the quote from BBEdit??? LOL

Glad to hear that you agree. :wink: