Splitting up data with regex search and replace

hayleyh · May 18, 2022, 3:01am

Hi there, maybe someone can help me with this regex. I'm trying to split a line of data into the specific variables seen below. The number of items in the list isn't always the same, so I'd like a regex that can flex no matter how many "folders" are added.

So ideally it would handle:
File,Folder,URL

and also:
File,Folder1,Folder2,Folder3,URL

Any thoughts would be much appreciated!

tiffle · May 18, 2022, 6:49am

Why use a regular expression?

If each line is in the format you’ve said - a set of comma-separated items - then that is effectively immediately accessible by KM since each line is just a KM array.

So, suppose you read a line into a KM variable, call it Line. Then:

Line[0] tells you how many items are in the array; call it N
Line[1] gives the first item in the array; in your case that’s the File
Line[2] to Line[N-1] give you the Folders
Line[N] gives you the URL
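To make the index mapping above concrete, here is a minimal sketch in Python rather than Keyboard Maestro (KM macros have no text form to paste here). It only mirrors the idea: Python lists are 0-based while KM arrays are 1-based, so each comment names the KM expression the Python line stands in for. The sample line is the one from the question.

```python
# Illustration only: Python stand-in for KM's comma-separated "array" access.
line = "File,Folder1,Folder2,Folder3,URL"
items = line.split(",")

n = len(items)              # KM: Line[0]  (number of items, N)
file_name = items[0]        # KM: Line[1]  (first item, the File)
folders = items[1:n - 1]    # KM: Line[2] .. Line[N-1]  (the Folders)
url = items[n - 1]          # KM: Line[N]  (last item, the URL)
```

Nothing here depends on how many folders sit between the first and last items; only N changes.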

KM is more than sufficient to handle this without a regex although, of course, you could go about it that way too. As always, there is more than one way to approach this and I’d guess others might chip in with different methods.
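For the regex route mentioned above, one possible shape is "first field, everything in the middle, last field". The sketch below uses Python's re module, not KM's regex actions, and assumes no field (including the URL) contains a comma. The names LINE_PATTERN and split_line are made up for illustration.

```python
import re

# Hypothetical pattern: first field, the whole middle section, last field.
# Assumes commas never appear inside a field.
LINE_PATTERN = re.compile(r"([^,]+),(.+),([^,]+)")

def split_line(line):
    """Return (file, [folders], url) for one comma-separated line."""
    match = LINE_PATTERN.fullmatch(line)
    if match is None:
        raise ValueError(f"expected at least File,Folder,URL: {line!r}")
    file_name, folder_blob, url = match.groups()
    # The middle group still holds every folder, so it is split once more.
    return file_name, folder_blob.split(","), url

one_folder = split_line("File,Folder,URL")
three_folders = split_line("File,Folder1,Folder2,Folder3,URL")
```

Note that the regex alone cannot hand back a variable number of separate folder captures; the middle group still has to be split, which is essentially the array approach again.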

Edit: just to be clear, to get the value of each of those array elements in KM you would refer to it as %Variable%Line[x]%

For more detail refer to the KM wiki page about variables: manual:Variables [Keyboard Maestro Wiki]

hayleyh · May 18, 2022, 6:07pm

Thanks for the response! I'm confused though on how this would dynamically scale no matter how many folders are added? Perhaps I'm not understanding how to write this part:

Line[2] to Line[N-1] give you the Folders

tiffle · May 18, 2022, 8:13pm

Since you know the value of N you can create a loop that runs from 2 to N-1 allowing you to retrieve all the folders regardless of how many there are.
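As a rough illustration of that loop (in Python, not as KM actions), the counter below runs from 2 to N-1 using KM-style 1-based positions. The function name folders_of is hypothetical.

```python
def folders_of(line):
    """Collect the items at 1-based positions 2 .. N-1 of a comma-separated line."""
    items = line.split(",")
    n = len(items)                       # what Line[0] would report in KM
    found = []
    count = 2                            # first folder position (1-based)
    while count <= n - 1:                # stop before position N, the URL
        found.append(items[count - 1])   # shift to Python's 0-based index
        count += 1
    return found

short = folders_of("File,Folder,URL")
long = folders_of("File,Folder1,Folder2,Folder3,URL")
```

The loop bounds come from N, so the same code handles one folder or many without knowing the count in advance.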

hayleyh · May 18, 2022, 8:31pm

Ok how would you suggest creating a loop? Any way I can think of to do it would require me to know the number of folders in advance. This is what I have:

[Image: Screen Shot 2022-05-18 at 1.26.33 PM]

And my output is:
[Image: Screen Shot 2022-05-18 at 1.25.38 PM]

Split Data with Array.kmmacros (3.5 KB)

Appreciate the help!

tiffle · May 18, 2022, 9:47pm

Hi @hayleyh - happy to help.

I'm just leaving for the night but I've thrown together this demo macro that shows the principles of what I mean.

I've used Local variables throughout so don't be distracted by that. This works with global variables but I prefer locals as they don't hang around after the macro terminates.
I've arranged the macro so it processes as many lines as you want to feed it in the variable Local_Lines (green). That means there are 2 loops: an outer one (orange) that picks out each line and an inner one (aqua) that processes the items in each line.
I've made sure that each line has a different number of folders to demonstrate the "scalability" you asked about.
The point of this is that you don't have to name each folder Folder1, Folder2, etc. You just refer to them as %Variable%Local_Line[Local_Count]%, since you know that Local_Count takes the values 2 to Local_N - 1 for the folders.
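The two-loop structure described above might look roughly like this in Python. This is not the attached macro: the variable names echo the ones mentioned in the post, while the sample lines and the results list are invented for illustration.

```python
# Invented sample data: three lines, each with a different number of folders.
local_lines = """\
report.pdf,Clients,https://example.com/a
notes.txt,Work,2022,May,https://example.com/b
photo.jpg,Personal,Trips,https://example.com/c"""

results = []
for local_line in local_lines.splitlines():       # outer loop: one line at a time
    items = local_line.split(",")
    local_n = len(items)                          # KM: Local_Line[0]
    folders = []
    for local_count in range(2, local_n):         # inner loop: positions 2 .. N-1
        folders.append(items[local_count - 1])    # KM: Local_Line[Local_Count]
    results.append({"file": items[0], "folders": folders, "url": items[-1]})
```

Each entry in results keeps its folders as a list, so no Folder1, Folder2 names are needed.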

The output of my demo looks like this since there are three lines containing file, folder and URL information:

[Image: KM 0 2022-05-18_22-40-20]

(I mistakenly added spaces to the start of some folder names originally, sorry about that but the principle is the same - I removed the spaces in this latest version of the macro!)

Here is the demo macro itself:

Test Split into Items.kmmacros (7.1 KB)

You'll need to examine this to understand what's going on so if you have questions that's fine but as I said I'm leaving now...

One last thing - you haven't fully explained what you want to do so this macro of mine may not meet your needs: that's up to you to decide!

Cheers.

hayleyh · May 19, 2022, 3:46am

This is all extremely helpful to see, thanks so much! It works exactly as described and makes sense. Thinking it through for my use case, I think this is enough to work.

I think what I didn't explain well is that I was looking to capture each of these elements into variables. The File and URL variables work perfectly, but for folders I was hoping it could create variables like %Folder1% %Folder2% etc. without me needing to build those out with "Set Variable to Text" in advance. Referencing %Variable%Line[2]% is fine but it's a bit easier for my sanity to put into a variable that describes the data better.

Curious to know if this is possible somehow but like I said, your solution works enough for me. Thanks!

tiffle · May 19, 2022, 7:04am

I understand your point, in which case you should check out a couple of third-party plug-in actions.

There’s Split:

And then there’s Split:

Nige_S · May 19, 2022, 9:11am

While I completely understand where you are coming from, the problem is often "how do I access the variable later"! Using your case as an example, if you generate variables called "Folder1", "Folder2", etc as needed, how do you find the "last folder" for any particular processed line of data? All you can really do is access each potential variable in turn, and when "FolderX" errors (you did blank/remove all your variables after processing the last line, I hope!) you know the "last folder" was the variable you visited before "FolderX"...

I had a quick look at the "Split Text" plug-in @tiffle linked, and it does look to produce "sensibly named" variables on the fly. But it returns the "sensible names" as a list, and that's what you use to get to them -- you're still working with an array, just one step removed!

"Array" is quite a nerdy word -- "list" is much friendlier, and you work with those every day! So in your case above you can have "sensible" variable names of Filename, URL, and FolderList and you think "first folder of FolderList" for FolderList[1], "second folder of..." for FolderList[2], and so on. (On the "helping your sanity" front, at least item 1 is the first item in KM, unlike many other systems where arrays are zero-indexed...)

Lists/arrays are really very powerful and easy to negotiate once you get your head round them, with most things you'll want to do already built-in. And they are practically purpose-made for this kind of "unknown number of things to store and reference" scenario.
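A small Python sketch of the Filename / FolderList / URL idea follows. In KM the FolderList would be a comma-separated text variable read with FolderList[1], FolderList[2] and so on; here it is a Python list, and split_record is a made-up helper name.

```python
def split_record(line):
    """Split one line into descriptively named parts; folders stay together as a list."""
    items = line.split(",")
    return {"Filename": items[0], "FolderList": items[1:-1], "URL": items[-1]}

record = split_record("File,Folder1,Folder2,Folder3,URL")

first_folder = record["FolderList"][0]     # KM: FolderList[1]
last_folder = record["FolderList"][-1]     # KM: FolderList[N], with N = FolderList[0]
folder_count = len(record["FolderList"])   # KM: FolderList[0]
```

The "last folder" question has a direct answer here because the list knows its own length.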

tiffle · May 19, 2022, 9:47am

Exactly what I was going to say but you beat me to it @Nige_S! In my view having something that “dynamically scales” is not compatible with having hard-coded variable names like “Folder1” - you really do need an array.

Well, my shopping lists look nothing like KM’s arrays, which are actually sets of comma-separated values, as I’m sure you know. And to a nerd, KM’s arrays look nothing like actual arrays, as I’m sure you also appreciate - even though KM provides an array-like interface for them. By using the term “list” the KM user quickly finds out that KM has nothing to say about natively accessing items in that list, so where does the user turn for help? Variable arrays! I get that you’re trying to simplify in order to explain, but sometimes it’s better to stick to the dark side where the nerds hang out.

Nige_S · May 19, 2022, 10:04am

I'm also showing my AppleScript side, where you do use list constructs as arrays. My bad!

Although it does amuse me that when the KM manual discusses Variable Arrays, the name of the variable is myList, ListCombined etc. So I guess I'm not alone...

tiffle · May 19, 2022, 10:23am

Ha - that is funny! It must be nerd humour!

I find it funny, in a tragic sort of way, that if you type "array" into the KM wiki search box you get zero hits, while doing the same for "list" returns 5 hits. Go figure!

Nige_S · May 20, 2022, 3:51pm

Really should pay more attention to the manual... Looking at something else and the section on Dynamic Variables caught my eye. After a bit of faffing:

Dynamic Variables Test.kmmacros (4.1 KB)

The Display Dialog action is there to show the (local) variables are actually created and persist til the end of the macro -- I'm not pulling a fast one and repeatedly setting a single variable. Step through with the debugger to see what's happening, and change the "Repeat n times..." for different amounts of variables, one per loop.

I still don't think it's any easier than using an array, but YMMV.

tiffle · May 20, 2022, 4:05pm

And thanks for jogging my (flaky) memory! Several years ago I was asked to automate a management system for several websites. I did so but, knowing the user wanted to add further websites in the future, I generalised the KM macros by using indirection not only in the variables that were used but also in the names of the macros that were being executed and it was all driven by a large KM dictionary. In the end one set of macros could manage over 60 websites and new websites could be added just by editing the dictionary, adding one new macro but leaving the bulk of the other macros intact.

It might sound great to some but it was a royal pita getting it debugged and is probably why I blanked it out of my memory. I’m sitting here now thinking (a) did I really do that and (b) will I have nightmares tonight?

Nige_S · May 20, 2022, 4:17pm

Visions of @tiffle lying on a bed, the revolving ceiling fan turning into the thwock thwock of helicopter rotor blades...

tiffle · May 20, 2022, 4:39pm

Hmmmm… time to call my analyst

Nige_S · May 20, 2022, 6:45pm

I should point out that my macro above isn't using indirection -- dynamic determination of the variable to use. It's actually dynamically declaring variables at runtime. I really should have named it better...

I still wouldn't recommend it, though, for reasons previously discussed!
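The distinction between declaring variables at runtime and indirection can be sketched in Python, as an analogy only and not a rendering of the attached macro. A dictionary called namespace stands in for KM's variable store; all names and values are invented.

```python
folders = ["Clients", "2022", "May"]

# Dynamic declaration: names such as Folder1, Folder2 come into being at runtime.
namespace = {}
for position, folder in enumerate(folders, start=1):
    namespace[f"Folder{position}"] = folder

# Indirection: another value holds the *name* of the variable to read.
which = "Folder2"
selected = namespace[which]

# Finding the "last folder" means probing names until the next one is missing.
position = 1
while f"Folder{position + 1}" in namespace:
    position += 1
last_folder = namespace[f"Folder{position}"]
```

The probing loop at the end is the awkward part raised earlier in the thread: with generated names, the count has to be rediscovered, whereas a list carries it along.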
