Sum up all numbers matched by RegEx pattern

LeoB · November 19, 2017, 12:06pm

I'm grabbing my Amazon order history, page by page.

Every page gets appended to a text file.

I end up with a long text file which contains all of my orders.

Now I want to grab the order sum of each one of these orders, and add them up together.

They look like this:

I'll have hundreds of occurrences in the text file.

I can find them with this regex pattern:

(?m)SUMME\nEUR (\d*,\d*)

But what's the most elegant way to have KBM go through the file once, grab each occurrence, and add it to one variable (let's say AmazonSpending2017)?

I thought about running a loop: Grab the first match. Add it to my results variable. Then delete the occurrence. And run the same loop again and again, until the pattern cannot be found anymore.

Is there a more elegant way to do this?

The comma might need to be replaced by a period. I'm using Amazon in Germany, which also doesn't have the CSV reporting tool that the US version has - hence my need for a KBM solution. But it's a problem that interests me in general, because I come across it often - how to deal with multiple occurrences of a regex match.
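
For the general question of handling every occurrence of a regex match in one pass, here is a minimal sketch in plain JavaScript. It is not a Keyboard Maestro macro. The sample text, the function names `sumOrderTotalsInCents` and `formatCents`, and the assumption that every amount has exactly two decimal places and no thousands separator are illustrative, not taken from the real order history. Amounts are summed as integer cents, which sidesteps both the comma-to-period conversion and floating-point rounding.

```javascript
// Illustrative sample in the shape described above (values are made up).
const orderText = [
  "BESTELLUNG AUFGEGEBEN",
  "3. Oktober 2017",
  "SUMME",
  "EUR 20,40",
  "BESTELLUNG AUFGEGEBEN",
  "3. Oktober 2017",
  "SUMME",
  "EUR 21,40",
].join("\n");

// Returns the total in cents so that no floating-point rounding creeps in.
// Assumes two decimal digits and no thousands separator in each amount.
function sumOrderTotalsInCents(text) {
  // g: find every occurrence; m: let ^ and $ match at line boundaries.
  const pattern = /^SUMME\nEUR (\d+),(\d{2})$/gm;
  let cents = 0;
  for (const match of text.matchAll(pattern)) {
    cents += Number(match[1]) * 100 + Number(match[2]);
  }
  return cents;
}

// Formats integer cents with a decimal point, e.g. 4180 -> "41.80".
function formatCents(cents) {
  const whole = Math.floor(cents / 100);
  const fraction = String(cents % 100).padStart(2, "0");
  return `${whole}.${fraction}`;
}

const amazonSpending2017 = formatCents(sumOrderTotalsInCents(orderText));
console.log(amazonSpending2017); // 41.80
```

The global flag does the work that the delete-and-repeat loop would otherwise do: each match is visited exactly once, and the source text is left untouched.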

gglick · November 19, 2017, 12:40pm

I doubt this is the most elegant way, but here's the first thing that came to my mind upon seeing this problem; hopefully it will prove useful. Feel free to ask if you have any questions about how it works!

Sum Up All Numbers Matched by Regex.kmmacros (8.0 KB)

###Results

ccstone · November 19, 2017, 1:45pm

Hey Leonard,

Ha! Gabe beat me to the punch by about an hour.

Gabe ⇢ My approach was almost exactly like yours, although I hadn’t bothered with the formatted output. Good job.

Okay, I’ll demonstrate once again why I like the Satimage.osax so much.

If you forget about the handlers there’s all of 4 lines of code.

(Remember — the script won’t work, unless the Satimage.osax has been installed!)

-Chris

------------------------------------------------------------------------------
# Auth: Christopher Stone
# dCre: 2017/11/19 07:25
# dMod: 2017/11/19 07:30
# Appl: AppleScript + Satimage.osax
# Task: Extract values from text and sum them
# Libs: None
# Osax: Satimage.osax (MUST BE INSTALLED OR THE SCRIPT WON'T WORK!)
# Tags: @Applescript, @Script, @SIO, @Satimage.osax, @Extract, @Values, @Text, @Sum, @Them, @ccstone
------------------------------------------------------------------------------

set sourceText to "
BESTELLUNG AUFGEGEBEN
3. Oktober 2017
SUMME
EUR 20,40
BESTELLUNG AUFGEGEBEN
3. Oktober 2017
SUMME
EUR 21,40
BESTELLUNG AUFGEGEBEN
3. Oktober 2017
SUMME
EUR 22,40
BESTELLUNG AUFGEGEBEN
3. Oktober 2017
SUMME
EUR 23,40
"

set euroList to fndUsing("^SUMME\\nEUR (\\d+,\\d+)", "\\1", sourceText, true, true) of me
set euroList to cng(",", ".", euroList) of me
set theSum to format (sum of (statlist euroList)) into "€0.00"

------------------------------------------------------------------------------
--» HANDLERS
------------------------------------------------------------------------------
on cng(_find, _replace, _data)
   change _find into _replace in _data with regexp without case sensitive
end cng
------------------------------------------------------------------------------
on fndUsing(_find, _capture, _data, _all, strRslt)
   try
      set findResult to find text _find in _data using _capture all occurrences _all ¬
         string result strRslt with regexp without case sensitive
   on error
      false
   end try
end fndUsing
------------------------------------------------------------------------------

gglick · November 19, 2017, 2:16pm

Wow, thanks, Chris! I guess I must have figured out a good solution if the script master himself had the same idea

LeoB · November 19, 2017, 9:24pm

Thank you, guys.

@gglick, using “For each” really seems like the most elegant way to do it with KBM’s building blocks.

And @ccstone, thanks for the pointer to Satimage.osax. I had never heard of it, nor am I experienced in AppleScript. But I’ll have a look at it. Doing such a task in four lines - that does sound elegant indeed.

gglick · November 19, 2017, 10:57pm

Happy to help, @LeoB. “For each” is a fairly new addition to my KM repertoire, and one that I’ve found to have a bit of a learning curve even compared to other aspects of using KM, but it’s incredibly useful and well worth the time it takes to get up to speed with.

peternlewis · November 20, 2017, 5:01am

Sorry to disillusion you, but For Each came in in version 5.1, March 2012.

For Each is fairly simple once you get your head around the concept. Basically, any time you have or want a list of things, For Each is likely the tool to choose. It iterates through the list, setting a variable each time to one entry and performing the actions.

So as soon as you say "sum up all numbers" in your question title, the For Each action should be your starting point (unless you're @ccstone, and then AppleScript is your starting point ;-). The same would be true if you were asking anything like:

How do I do X with all the lines in a file
How do I do X with all the files in a folder
How do I do X with the selected Finder items

Etc.
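
The iterate-over-a-list idea can be sketched outside Keyboard Maestro as well. The following plain JavaScript loop is only an analogy for the concept, not the For Each action itself; the input text and the names `lines` and `amountLines` are made up for illustration. It handles the "all the lines in a file" case: the loop variable is set to one line per pass, and the body plays the role of the actions.

```javascript
// Illustrative input; in practice this would be the contents of a file.
const text = "SUMME\nEUR 20,40\nSUMME\nEUR 21,40";

// The list to iterate over: one entry per line.
const lines = text.split("\n");

const amountLines = [];
for (const line of lines) {
  // "line" is set to one entry on each pass, then the actions run.
  if (line.startsWith("EUR ")) {
    // Keep only the amount that follows the "EUR " prefix.
    amountLines.push(line.slice(4));
  }
}

console.log(amountLines); // [ '20,40', '21,40' ]
```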

gglick · November 20, 2017, 5:13am

Oh, I know that For Each has been around for a few versions now. I just meant that I wasn’t personally aware of it (I clearly didn’t read the 5.1 update notes very carefully, even though I was happily using KM since version 5.0) and didn’t even try using it until this year; hence why it’s a recent addition to my KM repertoire

peternlewis · November 20, 2017, 5:13am

Clearly I should read more closely too

LeoB · November 20, 2017, 5:15pm

Thank you @peternlewis. "For each" had intimidated me. But now that I've used it, I see how powerful it is.

Etched into my mind. Thanks for this rule of thumb.

Few things bring tears of appreciation to my eyes. There's my daughter. There's Tesla. And there is Keyboard Maestro.

Beautiful - that's the only word I have for it.

ComplexPoint · November 20, 2017, 8:13pm

and, of course, we don't need Satimage to use Regex matching in Applescript - the Foundation classes also give us that
( though of course it's always simpler in JavaScript : -)

Sum Numbers Matched by AppleScript Regex.kmmacros (21.3 KB)

JMichaelTX · November 20, 2017, 10:02pm

It's always good to have options.

Though, of course, it is always simpler using RegEx with Satimage than with ASObjC Foundation classes.

Chris' (@ccstone) AppleScript with Satimage is much simpler and much easier to follow and modify.

And, Satimage comes with a host of other great features/functions/commands that all run very fast and make it an excellent choice for most who want to get the most out of AppleScript for the least amount of effort.

Coders can choose the method they find that works best for them.

ComplexPoint · November 21, 2017, 2:16am

though of course it's always simpler in JavaScript : -)

A similar sketch in vanilla JavaScript for Automation ( batteries included – no library installations or imports needed )

Sum Numbers Matched by JavaScript Regex.kmmacros (19.7 KB)

ccstone · November 21, 2017, 2:59pm

Hey Folks,

Here's my script without any handlers

As you can see there are only 3 lines of working code (I'm excepting the assignment statement for sourceText).

Extract values from text and sum them.txt.zip (1.3 KB)

------------------------------------------------------------------------------
# Auth: Christopher Stone
# dCre: 2017/11/19 07:25
# dMod: 2017/11/21 08:55
# Appl: AppleScript + Satimage.osax
# Task: Extract values from text and sum them
# Libs: None
# Osax: Satimage.osax (MUST BE INSTALLED OR THE SCRIPT WON'T WORK!)
# Tags: @Applescript, @Script, @SIO, @Satimage.osax, @Extract, @Values, @Text, @Sum, @Them, @ccstone
------------------------------------------------------------------------------

set sourceText to "
BESTELLUNG AUFGEGEBEN
3. Oktober 2017
SUMME
EUR 20,40
BESTELLUNG AUFGEGEBEN
3. Oktober 2017
SUMME
EUR 21,40
BESTELLUNG AUFGEGEBEN
3. Oktober 2017
SUMME
EUR 22,40
BESTELLUNG AUFGEGEBEN
3. Oktober 2017
SUMME
EUR 23,40
"

# Due to a BUG in the Discourse Forum software please manually remove 1 backslash from "\\\1" below.
set euroList to find text "^SUMME\\nEUR (\\d+,\\d+)" in sourceText using "\\\1" with regexp, all occurrences and string result
set euroList to change "," into "." in euroList with regexp
set theSum to format (sum of (statlist euroList)) into "€0.00"

------------------------------------------------------------------------------

Result: €87.60

-Chris

LeoB · November 21, 2017, 10:09pm

Wonder what I'm doing wrong.

Satimage.osax has just been installed. Yet, when I run the macro, nothing happens.

I'm sure I'm missing something here, but I don't know what it is.

ccstone · November 21, 2017, 10:18pm

Hey Leo,

Bleep!

There's a display bug in the Discourse Forum software that fouls up \\1.

I added an instruction to the code for how to fix the problem if you copy it, and I added a downloadable text file with the pure code.

Try that and let me know if you still have problems.

-Chris

LeoB · November 22, 2017, 3:18pm

That was it. Now everything works. Lovely. Thank you, Chris!
