How to split up clipboard/variable into multiple variables?

Hello,

I'm rather new to KM, but I hope you can still help me out with what might be a basic problem.
For work reasons, I need to use address information to verify certain things. Most often, the summary will look something like this:

Address   Slangenburg 13
                7608 RS Almelo Nederland
Phone      123456789
Website    https://www.slangenburg.nl
lat. long.   52.376536, 6.691050

I would like to split this information into different variables, such as:
Street name (slangenburg)
street number (13)
Post code (7608 RS)
City (Almelo)
Website
coordinates

Note that not all parts are always present. The address and the coordinates are always there, but the phone and website might be missing.
Sometimes the street number includes a letter at the end (e.g. 13A).

I've managed to select the complete address using
Search Named clipboard using regex
(?s)address(.+)Nederland

I end the match at "Nederland" because the phone isn't always there, although I don't care too much about "Nederland" in the address itself.
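
For readers who want to see what that first pattern captures outside KM, here is a minimal Python sketch. It assumes the search is case-insensitive (the pattern says "address" while the data says "Address"), and it uses Python's re.S flag as the counterpart of (?s). The whitespace in the sample is simplified.

```python
import re

# Sample summary from the post above (leading whitespace simplified).
summary = (
    "Address   Slangenburg 13\n"
    "          7608 RS Almelo Nederland\n"
    "Phone      123456789\n"
    "Website    https://www.slangenburg.nl\n"
    "lat. long.   52.376536, 6.691050\n"
)

# re.S corresponds to (?s): "." also matches line breaks, so (.+) can span
# both address lines. re.I is an assumption about how the search is configured.
match = re.search(r"address(.+)Nederland", summary, re.S | re.I)

# Everything between "Address" and "Nederland", split on whitespace.
address_parts = match.group(1).split()
print(address_parts)
```

The capture is still one undivided chunk of text, which is why separate capture groups are needed to get street, number, post code, and city into their own variables.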

All in all, I can see that anything can be done with this great program, but as of now I don't understand enough. Could someone help me with a direct solution to my problem? Also, I'd love a clear resource to understand the program better.
I used tutsplus a little, but it didn't cover everything I needed. I also came across https://www.regular-expressions.info/ which helped a little.

Thank you, I much appreciate your time and effort!

Given those requirements, I've developed the below Macro.
Note the following:

  1. I revised your post to put your source data into a Code Block, so that ALL characters are properly maintained.
    • In the future, please put ALL source data in a Code Block, and also ALL results in a Code Block.
  2. This is the data I used.
  3. I noticed you have a lot of spaces in your source text. I think I have properly accounted for one or more spaces or tabs using the \h RegEx metacharacter.
  4. I have assumed that your statement "might be missing" means that the entire line containing the Phone or the Website would be missing. If this is incorrect, then the RegEx will need adjusting.
  5. RegEx:
    (?mi)Address\h+(\w+)\h+(\w+)\n\h*(\d+ \w+) (\w+) (\w+)\n(?:Phone\h+(.+)\n)?(?:Website\h+(.+)\n)?lat\. long\.\h+(.+)
    For RegEx details, explanation, and test, see:
    https://regex101.com/r/Vovw15/1/
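
If you want to experiment with the pattern outside KM, the following is a minimal Python sketch of the same idea, not the macro itself. Python's re module does not support \h, so [ \t] is used in its place; the names ADDRESS_RE, FIELDS, and parse_summary are made up for this illustration.

```python
import re

# Python's re module has no \h, so [ \t] stands in for "horizontal whitespace".
ADDRESS_RE = re.compile(
    r"Address[ \t]+(\w+)[ \t]+(\w+)\n"      # street name, street number
    r"[ \t]*(\d+ \w+) (\w+) (\w+)\n"        # post code, city, country
    r"(?:Phone[ \t]+(.+)\n)?"               # optional Phone line
    r"(?:Website[ \t]+(.+)\n)?"             # optional Website line
    r"lat\. long\.[ \t]+(.+)",              # coordinates
    re.M | re.I,                            # same as (?mi)
)

FIELDS = ("street", "number", "postcode", "city", "country",
          "phone", "website", "coordinates")

def parse_summary(text):
    """Return a dict of the captured fields, or None if nothing matches."""
    match = ADDRESS_RE.search(text)
    return dict(zip(FIELDS, match.groups())) if match else None

full = (
    "Address   Slangenburg 13\n"
    "                7608 RS Almelo Nederland\n"
    "Phone      123456789\n"
    "Website    https://www.slangenburg.nl\n"
    "lat. long.   52.376536, 6.691050\n"
)
# Same data with the optional Phone and Website lines removed.
minimal = (
    "Address   Slangenburg 13\n"
    "                7608 RS Almelo Nederland\n"
    "lat. long.   52.376536, 6.691050\n"
)

print(parse_summary(full))
print(parse_summary(minimal))
```

When an optional line is missing, its capture group simply stays empty (None in Python), while the remaining groups keep their positions.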

Please let us know if this works for you, or you have further issues or questions.


Example Output

Data WITH Phone and WebSite Lines

[image: 2019-03-17_18-19-12 (2)]

Data WITHOUT Phone and WebSite Lines

[image: 2019-03-17_18-19-13]


MACRO:   Extract Address, Phone, and Web Site from Text [Example]


#### DOWNLOAD:
<a class="attachment" href="/uploads/default/original/3X/f/3/f377965be1b6f097f1fba975112a4db5e6a3530a.kmmacros">Extract Address- Phone- and Web Site from Text [Example].kmmacros</a> (13 KB)
**Note: This Macro was uploaded in a DISABLED state. You must enable before it can be triggered.**

---

### ReleaseNotes

Author.@JMichaelTX	Requesting Post by @Mudness 

**PURPOSE:**

* **Extract Address and Optionally Phone & WebSite from Clipboard**

**REQUIRES:**

1. **KM 8.2+**
2. **macOS 10.11.6 (El Capitan)**

**==NOTICE: This macro/script is just an _Example_==**

* It has had very limited testing.
* You need to test further before using in a production environment.
* It does not have extensive error checking/handling.
* It may not be complete.  It is provided as an example to show you one approach to solving a problem.

**How To Use**

1. First, complete Macro Setup as instructed below.
2. Select Text that contains Address etc
3. Trigger this macro.

**MACRO SETUP**

* **Carefully review the Release Notes and the Macro Actions**
  * Make sure you understand what the Macro will do.  
  * You are responsible for running the Macro, not me.

1. Assign a Trigger to this macro.
2. Move this macro to a Macro Group that is only Active when you need this Macro.
3. ENABLE this Macro.

* **REVIEW/CHANGE THE FOLLOWING MACRO ACTIONS:**
(all shown in the magenta color)
   * For Production Use, ENABLE the COPY Action and DISABLE the Set Clipboard Action
   * For Testing, enter your sample source text into the Set Clipboard Action

TAGS: @RegEx @Extract @Address @OptionalCaptureGroups

USER SETTINGS:

* Any Action in _magenta color_ is designed to be changed by end-user

ACTION COLOR CODES

* To facilitate the reading, customizing, and maintenance of this macro,
      key Actions are colored as follows:
* GREEN   -- Key Comments designed to highlight main sections of macro
* MAGENTA -- Actions designed to be customized by user
* YELLOW  -- Primary Actions (usually the main purpose of the macro)
* ORANGE  -- Actions that permanently destroy Variables or Clipboards,
OR IF/THEN and PAUSE Actions


**USE AT YOUR OWN RISK**

* While I have given this limited testing, and to the best of my knowledge will do no harm, I cannot guarantee it.
* If you have any doubts or questions:
  * **Ask first**
  * Turn on the KM Debugger from the KM Status Menu, and step through the macro, making sure you understand what it is doing with each Action.

---

![image|523x1453](upload://4mkbmGSbV1qAu36DjFY1aCPTvp2.png)

I am genuinely impressed with how super skilled @JMichaelTX is and how incredibly helpful he is!


Wow Michael, Thank You!! Just amazing, so generous you are..
I've been looking at your macros for a while, trying to understand all of it. I still have some problems, but I'll keep trying to understand what is going on.

Thank you kindly !

What a cool program! And a beautiful Macro, Thank you!!

I've changed a few things, and extended it a bit too.

Now Keyboard Maestro is copying to a named clipboard. Does this have any unintended consequences? Should I clear the clipboard after I've finished researching one business, before starting the next?

This is the regex I'm using now

(?mi)(?:(\w+)\n)?Address\h+(.+)\h+(\w+)\n\h*(\d+ \w+) (.+) (\w+)\n(?:Phone\h+(.+)\n)?(?:Website\h+(.+)\n)?lat,\hLng\h+(.+)

Sometimes the summary includes the name of a business, so I followed your example and added another variable at the beginning

(?:(\w+)\n)?

Sometimes the street and/or the city name contains a "-", so instead of matching word characters with "\w" I now match any character with a dot "."

The lat/lng line is written differently from how I remembered it; the regex now follows the summary as it actually comes to me
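
For anyone following along, here is a minimal Python sketch of the revised regex. The sample text is hypothetical (hyphenated street and city names made up for the test), and [ \t] replaces \h because Python's re module does not support it.

```python
import re

# [ \t] replaces \h, which Python's re module does not support.
PATTERN = re.compile(
    r"(?:(\w+)\n)?"                          # optional business name line
    r"Address[ \t]+(.+)[ \t]+(\w+)\n"        # street (any characters), number
    r"[ \t]*(\d+ \w+) (.+) (\w+)\n"          # post code, city, country
    r"(?:Phone[ \t]+(.+)\n)?"                # optional Phone line
    r"(?:Website[ \t]+(.+)\n)?"              # optional Website line
    r"lat,[ \t]Lng[ \t]+(.+)",               # coordinates
    re.M | re.I,
)

# Hypothetical sample with hyphenated street and city names.
sample = (
    "Address Prins-Hendrikstraat 5\n"
    "2981 AJ Nieuw-Lekkerland Nederland\n"
    "Lat, Lng 51.870098,4.599088\n"
)

name, street, number, postcode, city, country, phone, website, coords = (
    PATTERN.search(sample).groups()
)
print(street, number, postcode, city, coords, sep=" | ")
```

Because (.+) is greedy and then backtracks, the last whitespace-separated word on the Address line always ends up in the number group, whatever the street name contains.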

I renamed the variables from
local__name
to
name
Do you think that will be a problem?

I used the variables in some website searches, and it works!
I was trying to make the macro switch from map to satellite view. I couldn't get it done by changing the URL, so I tried to switch tabs, wait for the page to load, press the map/satellite switcher, and then repeat for the next tab. Somehow this doesn't work: the macro doesn't actually move the mouse to click after the page has loaded. While testing, a few times my laptop seemed ready to lift off, fans whirring as though it was processing at full power. Any idea why it doesn't work?

Perhaps the biggest issue I can't find a solution for is in the address line. Sometimes the street name does not consist of only 1 'word', but of several (e.g. P.C. Hooftstraat or Jonkheer van de wall repelaerstraat).
It could even (although rarely) contain a number at the beginning or end of the street name.
All that can be said with some certainty is that the last part of the "Address" line is the street number, and that whatever sits between "Address" and that number is the street name. How would you write a regex for that?

It's also showing the text in the lower right corner now, thanks to @Tom's topic "Manipulate KM’s Display Text Window".

Many thanks again! You really helped me get into this beautiful program, and to experience this kind community.

1 Extract Address, Phone, and Web Site from Text [Example].kmmacros (18.2 KB)

I'm glad my macro helped you get started. Looks like you're doing very well.
I'll be glad to answer your questions, but it might be a day or so before I have the time. If you don't hear from me within a couple of days, and no one else has answered your questions, feel free to ping me.

All good, take your time. I will keep reading, see if I can find some answers :slight_smile:

I've been fooling around a bit with Keyboard Maestro, and I'm running into a few things I don't understand.

Sometimes when I think I understand how the regex works, it turns out I'm wrong. Sometimes a part of the expression works in one place, but when I repeat it somewhere else the result is not the same.
I've been using regex101 a lot.

For example

regex101: build, test, and debug regex

and

regex101: build, test, and debug regex

From the second regex (Vovw15) I copied the part that allows for an optional website line.
However, in the first regex (L1R5AH), this same expression

(?:Website\h+(.+)\n)?

doesn't do anything

__

Another, more common, problem that I can't seem to solve is with

regex101: build, test, and debug regex

(?mi)Address\h+(.+ (?:(\d))?)\n(\d+(?: \w+)?) (.+) Nederland \nlat,\hLng\h+(.+)

when I use the following address

Address Ridderhof 5
2981 AJ Ridderkerk Nederland
Lat, Lng 51.870098,4.599088

I'm able to express that "AJ" is sometimes present in the postal code, and sometimes it isn't.
In both cases it captures "Ridderkerk" separately because it is followed by "Nederland".

How should I write the regex to express that a street number is sometimes present, and sometimes isn't?
That if it is not present, the street is the only variable on that line,
and if it is present, the last digits go into a new variable?

A similar issue is that sometimes the street name consists of multiple words. Assuming that the street only consists of letters and spaces, how can I write the regex so that everything between "Address" and the possible street number is stored in %Variable%Street%?

Thank you for your time!

OK, you've posed a number of RegEx challenges. The easiest way to solve this is for you to post, in a Code Block, a real-world example of ALL possible cases that you expect to encounter. And, although you have mentioned some rules in the above text, it will help to understand if you can LIST all rules together.

Your Questions from Prior Post

Actually, from what I've seen and posted, KM is copying to the System Clipboard, not a Named Clipboard. If you want to remove the last copy to the System Clipboard, just use this: Delete Past Clipboard action with "0", which means the last or current clipboard entry.

It is not a problem per se. I use "Local" variables so that the global Variable list is not cluttered. "Local" and "Instance" variables are auto-deleted when the Macro in which they were created finishes. The only reason NOT to use "Local" variables is if you want to retain the Variable value for other Macros, or for subsequent executions of the same Macro. See KM Wiki: Variables.

I don't understand your question, or what the objective is. Could you please clarify? Maybe post the manual steps you use to accomplish the task. If it is complicated, then a video or animated GIF would be great.

If I have missed any other questions, please feel free to repost.


I've been using Keyboard Maestro for a while longer now, trying things out in regex101, mostly through trial and error. I've managed to work out a regex that works in most cases:

(?mi)(?:(.+)\n)?Address\h+(?:\n)?(.+)\h(\w+)\n\h*(\d+(?: \w+)?) (.+)\n(\w+)\n(?:Phone\h+(.+)\n)?(?:Website\h+(.+)\n)?lat,\hLng\h+(.+)
Roompot
Address	Schelp weg 7b
4357 DE Domburg de man
Nederland
Phone	+31118583210
Website	https://www.roompot.nl/vakantieparken/nederland/zeeland/hof-domburg/
Lat, Lng	51.55912,3.4873

In this case, it recognizes that the street name may consist of more than one word, and the same goes for the city.

  • Why exactly is that? Is it because of the combination

(.+)\n

and

(\w+)\n

so that everything before the line break ends up in one group?

Sadly, I haven't been able to figure out a way to have it recognize the street number the way I would like.
Say the first lines could be either of these:

HEMA
Address	Schelp weg 7b
HEMA
Address	Schelp weg

How can the regex know that the last part needs to contain a number to count as the street number,
and that otherwise the street number simply isn't there?

And a similar issue with the postal code: how can I say that the letter part of the postal code, if present, consists of 2 capital letters? At the moment it takes whatever comes after the numbers to be part of the postal code.

HEMA
Address	Schelp weg 7b
4357 Domburg de man
HEMA
Address	Schelp weg 7b
4357 DE Domburg de man

I hope I'm putting everything in Block quotes as you asked.

Thank you again for your time :slight_smile:

So, the easiest way to develop the RegEx for a specific part of your text is to limit it to just the part of interest, in this case the address line.
This seems to work whether or not there is a number in the address:
(?mi)^Address\h+([\w ]+)
For details, see:
https://regex101.com/r/XoE5jB/2/
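
If the street name and the street number need to end up in separate variables, one possible variation is sketched below in Python. This is a hypothetical pattern, not the one saved at regex101 above: it assumes the street number, when present, is the last item on the line and starts with a digit (7, 13A, 7b). [ \t] stands in for \h, which Python's re module lacks.

```python
import re

# Hypothetical pattern, not the one from regex101 above:
#   (.+?)               street name, as short as possible
#   (?:[ \t]+(\d\w*))?  optional last token that starts with a digit
ADDRESS_LINE = re.compile(
    r"^Address[ \t]+(.+?)(?:[ \t]+(\d\w*))?[ \t]*$",
    re.M | re.I,
)

def split_address_line(text):
    """Return (street, number); number is None when no street number is present."""
    match = ADDRESS_LINE.search(text)
    return match.groups() if match else None

print(split_address_line("HEMA\nAddress\tSchelp weg 7b"))
print(split_address_line("HEMA\nAddress\tSchelp weg"))
print(split_address_line("Address P.C. Hooftstraat 12"))
```

The lazy (.+?) only grows as far as it must, so a trailing token that starts with a digit is handed to the optional number group; without such a token the whole remainder of the line is the street.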

Same approach for the postal code.

RegEx:
(?mi)^Address\h+([\w ]+)\h*\R(?-i)(\d+(?:\h+[[:upper:]]{2})?)\h+([\w ]+)

See https://regex101.com/r/XoE5jB/4/
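
To illustrate why the case-sensitivity switch matters for the postal code, here is a small Python sketch of just that part of the pattern. It is a translation under assumptions, not the KM regex verbatim: [ \t] replaces \h, [A-Z] replaces [[:upper:]], and the scoped group (?-i:...) replaces (?-i), because Python's re module only supports switching a flag off inside a scoped group. The "De Bilt" line is a hypothetical test case.

```python
import re

# Post code with an optional two-capital-letter suffix, matched case-sensitively
# even though the rest of the pattern ignores case.
CASE_SENSITIVE = re.compile(
    r"^(?-i:(\d+(?:[ \t]+[A-Z]{2})?))[ \t]+([\w ]+)", re.M | re.I
)
# Same pattern without the switch, for comparison.
CASE_INSENSITIVE = re.compile(
    r"^(\d+(?:[ \t]+[A-Z]{2})?)[ \t]+([\w ]+)", re.M | re.I
)

lines = [
    "4357 DE Domburg de man",
    "4357 Domburg de man",
    "4357 De Bilt",  # hypothetical: city name starting with a two-letter word
]

for line in lines:
    print(CASE_SENSITIVE.search(line).groups(),
          CASE_INSENSITIVE.search(line).groups())
```

With the case-insensitive variant, the "De" of "De Bilt" is mistaken for the letter part of the postal code; the case-sensitive variant keeps it in the city.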

Complete RegEx

And just for completeness, here's the RegEx to extract all fields allowing for the variations above:
(?mi)^.+\RAddress\h+([\w ]+)\h*\R(?-i)(\d+(?:\h+[[:upper:]]{2})?)(?i)\h+([\w ]+)\R([\w ]+)\RPhone\h+([\+\d \(\)\-\.]+)\RWebsite\h+(.+)\RLat,\h+Lng\h+([\d\.]+),\h*([\d\.]+)
See regex101: build, test, and debug regex

Note that throughout here I am using \R instead of \n so that it will match ALL new line characters, not just LF.
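
As a rough way to see the effect of \R outside KM, here is a Python sketch of the complete pattern. Several things are assumptions of the translation: Python's re module supports neither \h nor \R, so they are substituted with [ \t] and (?:\r\n|\r|\n), which covers CRLF, CR, and LF only; [A-Z] inside (?-i:...) replaces the (?-i)[[:upper:]] part; and [^\r\n]+ replaces .+ on the first line and the Website line, because Python's dot would otherwise swallow a trailing CR.

```python
import re

# Written with \h and \R as in the post, then rewritten for Python's re module,
# which supports neither. The \R stand-in covers CRLF, CR and LF only.
SOURCE = (
    r"^[^\r\n]+\R"                          # business name line
    r"Address\h+([\w ]+)\h*\R"              # street and number
    r"(?-i:(\d+(?:\h+[A-Z]{2})?))"          # post code, case-sensitive letters
    r"\h+([\w ]+)\R"                        # city
    r"([\w ]+)\R"                           # country
    r"Phone\h+([+\d ()\-.]+)\R"             # phone
    r"Website\h+([^\r\n]+)\R"               # website
    r"Lat,\h+Lng\h+([\d.]+),\h*([\d.]+)"    # latitude, longitude
)
COMPLETE = re.compile(
    SOURCE.replace(r"\h", r"[ \t]").replace(r"\R", r"(?:\r\n|\r|\n)"),
    re.M | re.I,
)

LINES = [
    "Roompot",
    "Address\tSchelp weg 7b",
    "4357 DE Domburg de man",
    "Nederland",
    "Phone\t+31118583210",
    "Website\thttps://www.roompot.nl/vakantieparken/nederland/zeeland/hof-domburg/",
    "Lat, Lng\t51.55912,3.4873",
]

# The same record with Unix (LF) and Windows (CRLF) line endings.
unix_fields = COMPLETE.search("\n".join(LINES)).groups()
windows_fields = COMPLETE.search("\r\n".join(LINES)).groups()
print(unix_fields == windows_fields)
print(unix_fields)
```

In this sketch, swapping the \R stand-in for a plain \n makes the CRLF version of the text fail to match, which is the situation \R is meant to avoid.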

Yes. Thank you.

Does that answer your questions?