Pulling Multiple Strings of Data From an Email Using Regex Into Separate Variables

Hello community.

Here with another project that has me stuck. I do not have a macro example to show because I don't know how this will work; all I can do is describe the scenario and share what it could look like.

I will be getting an email that will be used to trigger a macro. I will need some information from these emails. The email will look like this:


From: mike@mike.com
Subject: XYZ Termination - John Smith [152637]
(Message Content)

John Smith [152637] has been terminated for XYZ on 02/07/2023.
Please update the necessary fields.

Thanks.


What will change and what I need:

From the Subject line or Content:

  • XYZ - Set to local_entity
  • 152637 - The numbers will always appear between [ ] - Set to local_empID
  • 02/07/2023 - set to local_termDate

I have tried the regex you all have taught me over time but could not get it to work. I think mainly because what I used in the past was when the content was one line or one after the other in a column.

If there is something more I can offer or pieces of a macro that might help make things easier for you, please let me know. Thanks!

What is XYZ in real-world terms?

When you select a sample message in Mail and run this AppleScript from Apple's anemic Script Editor app, does it extract the entire text of the message?

tell application "Mail"
   set selectedMessageList to selection
   if selectedMessageList ≠ {} then
      set selectedMessage to item 1 of selectedMessageList
      tell selectedMessage
         its content
      end tell
   else
      error "Zero messages were selected!"
   end if
end tell

Hi Chris. The XYZ is an entity abbreviation. We have 2 entities. For security reasons, I will name the entities:
XYZ
ABC

Depending on the entity, I have the macro go down a different path or sub path.

I was using a script you and Nige put together for me a while back that always works (once I change the specific details such as KM #, variable, etc.), along with a rule in Apple Mail that matches the subject line.


using terms from application "Mail"
	on perform mail action with messages theMessages for rule theRule
		tell application "Mail"
			repeat with eachMessage in theMessages
				my doKeyboardMaestroMacro(eachMessage's content)
			end repeat
		end tell
	end perform mail action with messages
end using terms from

--------------------------------------------------------
--» HANDLERS
--------------------------------------------------------
on doKeyboardMaestroMacro(msgContent)
	tell application "AppleScript Utility"
		«event coredosc» "5E50DFA0-06E7-4D56-8B86-ACC8D67CFC20" given «class KMpa»:msgContent
	end tell
end doKeyboardMaestroMacro

Trying now.

The answer is yes. When I run that script in Script Editor, I get the entire content of the email.

You might have been getting stuck on the fact that [ and ] are special characters in RegEx and need to be "escaped" to match their literals. Something like this should do you:

...though you'll have to change Local_theText to whatever your variable holding the email's contents is called.

\[(\d+)\] has been terminated for (.*) on (\d{2}/\d{2}/\d{4})

The RegEx is longer than it needs to be, but it might be best to be overly cautious when dealing with canning notifications!
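
For anyone who wants to check that pattern outside Keyboard Maestro, here is a minimal Python sketch run against the sample message from the first post. Python's `re` module is not the ICU engine Keyboard Maestro uses, so treat this as a sanity check rather than a guarantee. The Python variable names are assumptions that simply mirror the local_ variables asked for.

```python
import re

# Sample message content, copied from the first post.
email_text = """John Smith [152637] has been terminated for XYZ on 02/07/2023.
Please update the necessary fields.

Thanks."""

# Same pattern as above: \[ and \] match literal brackets,
# and the three (...) groups capture the ID, the entity and the date.
pattern = r"\[(\d+)\] has been terminated for (.*) on (\d{2}/\d{2}/\d{4})"

match = re.search(pattern, email_text)
if match:
    # Capture groups come back in the order they appear in the pattern.
    local_empID, local_entity, local_termDate = match.groups()
else:
    local_empID = local_entity = local_termDate = None

print(local_empID, local_entity, local_termDate)  # 152637 XYZ 02/07/2023
```

In Keyboard Maestro the same three groups would be assigned to variables in the search action itself, so no scripting is needed there.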

Hi @Nige_S,

Thanks for jumping in here.

I have it looking like this at the moment and it's not working. Am I doing something wrong?

Keyboard Maestro Actions.kmactions (2.1 KB)

I broke them up into 3 different tasks.
The only one that works is [(\d+)] for entity

You've accidentally added some newlines at the end of the Regex, so it's trying (and failing) to find "date and newline and newline".
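
A quick way to see the effect, sketched in Python. The exact pattern inside the attached actions isn't shown in the thread, so the date pattern here is an assumption; the point is only what two stray newlines do to an otherwise working pattern.

```python
import re

# First two lines of the sample message from the first post.
email_text = (
    "John Smith [152637] has been terminated for XYZ on 02/07/2023.\n"
    "Please update the necessary fields.\n"
)

# Assumed date pattern, typed cleanly.
clean_pattern = r"on (\d{2}/\d{2}/\d{4})"
# The same pattern with two invisible newlines pasted on the end.
padded_pattern = clean_pattern + "\n\n"

clean_match = re.search(clean_pattern, email_text)
padded_match = re.search(padded_pattern, email_text)

# repr() makes the otherwise invisible trailing characters visible.
print(repr(padded_pattern))
print(clean_match.group(1))  # 02/07/2023
print(padded_match)          # None: the date is never followed by two newlines
```

In Keyboard Maestro the equivalent check is to click at the very end of the regex field and see whether the cursor lands on a line below the pattern.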

Sheesh. I appreciate all the help, technical and dumb errors.

There's nothing dumb about not being able to spot invisible characters! But it is something you learn to check for after getting caught out a bazillion times yourself...

Hey Mike,

You still don't have a solid understanding of what you're doing, and that's fine – that's how everyone starts out with regular expressions.

So what do you do to get things done? You simplify. Here's an example:

MikeS88 – Mail – Data Extraction Test v1.00.kmmacros (9.2 KB)

Break things down into tasks you do understand.

As you get more comfortable with those it becomes easier to stretch and reach for more understanding.

Note that in the first search action I've used the multi-line flag (?m), because I'm not assuming there will be only one line of data from the email.

From there I've broken each task into a separate action.
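
For readers without the macro to hand, here is a rough Python sketch of what the (?m) flag changes. The pattern and the extra greeting line are assumptions made up for illustration, not the contents of the macro, and Python's `re` is not ICU, although (?m) behaves the same way in this case.

```python
import re

# Sample message with an assumed greeting line added in front,
# so the line of interest is no longer the first line.
email_text = """Hello,

John Smith [152637] has been terminated for XYZ on 02/07/2023.
Please update the necessary fields.

Thanks."""

# Hypothetical pattern: grab the whole line that holds the data.
with_flag = r"(?m)^.*has been terminated.*$"
without_flag = r"^.*has been terminated.*$"

# With (?m), ^ and $ match at the start and end of every line.
data_line = re.search(with_flag, email_text)
# Without it, ^ only matches at the very start of the text, so this fails.
no_flag = re.search(without_flag, email_text)

print(data_line.group(0))
print(no_flag)  # None
```

Once the one line is isolated, each later search only has to deal with a single line of text, which is what keeps the separate actions simple.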

Pro Tip – always test your regular expressions in a good programming editor like BBEdit, but be aware that most of them don't use strict ICU regex the way macOS and Keyboard Maestro do.

BBEdit uses PCRE, for instance, and the differences can give newbies headaches.

I used to keep this critter around for testing, and it uses ICU regex: CotEditor

I can't try out the latest version, because it requires macOS Monterey – but I suspect it's improved, since I last had my hands on it.

-Chris

Hi Chris - I was interested in CotEditor but I too am on Mojave so I found this - the page of archived versions of the software going back to Panther!

Hey Taj,

Splendid! Thanks.

I looked for an archive on the git page and completely missed the link on the product page.

:man_facepalming:

-Chris

FWIW, I've posted a possible approach:

Thanks Chris, I am now reading this because I finally have the time to absorb it. I have been taking regex courses to learn some basics. This is very helpful.

Sorry for hijacking the thread, but I have a similar but way simpler question.
I have a string: "100 Coke bottles"
Can I split it up and perform a calculation on "100" (for example *2) and then paste the answer as "200 Coke bottles"?

I have these actions I use to mark some text.
Run the macro and all numbers are incremented by one.

You can use this as a template for your use case. Replace the +1 in the CALCULATE with *2.
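
The same idea sketched in Python, for anyone who wants to see the logic spelled out. This is not the macro itself: the function name `scale_numbers` and its `factor` parameter are made up for illustration.

```python
import re


def scale_numbers(text, factor=2):
    """Multiply every whole number found in text by factor; leave the rest alone."""

    def replace(match):
        # match.group(0) is the run of digits that was found.
        return str(int(match.group(0)) * factor)

    # re.sub calls replace() once for each run of digits.
    return re.sub(r"\d+", replace, text)


result = scale_numbers("100 Coke bottles")
print(result)  # 200 Coke bottles
```

The find-a-number, calculate, put-it-back sequence is the same one the Keyboard Maestro actions perform on the selected text.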

In addition to @JimmyHartington's version, you can also use an array instead of regex (i.e. the @ComplexPoint version).
