Extracting the Date and Place from Source Text

I am trying to use a regex to extract the date and place into separate variables, and I cannot figure it out. The text will always be in the same format. It would be nice to have just one regex, but two would be fine if needed.

Marriage 22 September 1947
I Place Lexington, KY

Hey Roger,

Is this going to be in the midst of other text, or will there only be those two lines to choose from?

-Chris

Just those 2 lines

It could be something as simple as the following, assuming the format never changes:

(\d+\s\w+\s\d+)

That will capture the entire date as a single string. If you need to separate the day, month, and year, that is easily doable as well:

(\d+)\s(\w+)\s(\d+)
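
For anyone who wants to try these two patterns outside Keyboard Maestro, here is a minimal sketch in Python. It assumes Python's `re` module behaves the same way for this simple syntax; the variable names are only illustrative.

```python
import re

line = "Marriage 22 September 1947"

# One capture group: the whole date as a single string.
whole = re.search(r"(\d+\s\w+\s\d+)", line)
print(whole.group(1))  # 22 September 1947

# Three capture groups: day, month, and year separately.
parts = re.search(r"(\d+)\s(\w+)\s(\d+)", line)
day, month, year = parts.groups()
print(day, month, year)  # 22 September 1947
```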


Here's the "simple" solution, and the route I would take:

Extract Date & Place v1.00.kmmacros (6.9 KB)
Keyboard Maestro Export

But there are others.

  • You could process the lines one at a time in a For-Each action.
  • You could pipe the text into the shell and use various shell commands or scripting languages.
  • You could pipe the text into a JavaScript for Automation action and go to town.
  • Etcetera.
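
As a rough illustration of the line-at-a-time idea, here is a sketch in Python rather than a Keyboard Maestro For-Each action. It assumes the place line always starts with "I Place", as in the sample; the variable names are made up for the example.

```python
import re

text = "Marriage 22 September 1947\nI Place Lexington, KY"

date = None
place = None
for line in text.splitlines():
    date_match = re.search(r"\d+\s\w+\s\d+", line)
    place_match = re.match(r"I Place\s+(.+)", line)  # assumed fixed prefix
    if date_match and date is None:
        date = date_match.group(0)
    elif place_match:
        place = place_match.group(1)

print(date)   # 22 September 1947
print(place)  # Lexington, KY
```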

This worked fine for extracting the date, but it did not extract the place.

This worked even better, as it needed only one regex, but (there always seems to be a but) the place is Lexington, KY. The result should not include "I Place" or any other text before the place name. The place could even be Lexington, Fayette, Kentucky, or somewhere in another country.

It will take me some time to figure out what the regex is doing. Could you help me with that?

Thanks
Roger

Whoops, I missed that you needed both pieces of data. This should work for the example you posted.

^(?:.+\s)?(\d+\s\w+\s\d+)\v^(?:.+\s)?(.+,.+)
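
Here is a hedged sketch of how this pattern behaves, translated to Python for testing. Two adjustments are assumptions of the translation: Python's `re` treats `\v` as only the vertical-tab character, so `\n` stands in for it, and `re.MULTILINE` is turned on so the second `^` can match at the start of the second line. The second call uses the three-part place name from earlier in the thread.

```python
import re

# Python stand-in for the pattern above (\v replaced by \n).
PATTERN = re.compile(
    r"^(?:.+\s)?(\d+\s\w+\s\d+)\n^(?:.+\s)?(.+,.+)",
    re.MULTILINE,
)

def extract(text):
    """Return (date, place) or None if the text does not match."""
    m = PATTERN.search(text)
    return m.groups() if m else None

print(extract("Marriage 22 September 1947\nI Place Lexington, KY"))
# ('22 September 1947', 'Lexington, KY')

# The greedy optional prefix swallows as much as it can, so a place
# with two commas loses its first part.
print(extract("Marriage 22 September 1947\nI Place Lexington, Fayette, Kentucky"))
# ('22 September 1947', 'Fayette, Kentucky')
```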

However... with the variables you just mentioned, it is unlikely to work for everything. Ideally, you would have some sort of delimiter between your "irrelevant" data and your "relevant" data that the RegEx could look for. But that might not be possible.

I'm nowhere near as experienced with RegEx as Chris (@ccstone) is, so he will no doubt be able to provide better assistance. But regardless, if you could provide some more examples I wouldn't mind taking a stab at providing a more polished RegEx solution.

-Chris

Hey Roger,

What is the prefix text on the location?

Always "I Place "?

Here's how the regular expression breaks down.

·································································

^.*?(\d+.+)\v+(.+)

·································································

^	Beginning of string (since I have multi-line turned off).
.	Any character.
*	Zero or more.
?	Non-greedy.
(	Start capture group.
\d	Any digit.
+	One or more
.	Any character
+	One or more
)	End capture group
\v	Vertical whitespace.
+	One or more
(	Start capture group.
.	Any character.
+	One or more.
)	End capture group.

How the regex breaks down on regex101.com
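
The breakdown can also be checked with a short Python sketch. One assumption of the translation: Python's `re` does not treat `\v` as "any vertical whitespace", so `[\r\n]` is used in its place. Multi-line mode is left off, matching the note above.

```python
import re

# Python stand-in for ^.*?(\d+.+)\v+(.+)
PATTERN = re.compile(r"^.*?(\d+.+)[\r\n]+(.+)")

text = "Marriage 22 September 1947\nI Place Lexington, KY"
date, second_line = PATTERN.search(text).groups()

print(date)         # 22 September 1947
print(second_line)  # I Place Lexington, KY
```

The second group grabs the whole second line, which is why the prefix still shows up in the place.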

-Chris

Chris

Yes, the prefix is "I Place".

Keep warm; we're expecting a minus 30 wind chill.

Roger Wells
Lexington, KY

Okay, this regular expression should work:

^.*?(\d+.+)\v+I Place(.+)
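
A quick way to test this version outside Keyboard Maestro is the Python sketch below, again with `[\r\n]` standing in for `\v` as an assumption of the translation. In Python, at least, the second capture starts with the space that follows "I Place", so the sketch strips it.

```python
import re

# Python stand-in for ^.*?(\d+.+)\v+I Place(.+)
PATTERN = re.compile(r"^.*?(\d+.+)[\r\n]+I Place(.+)")

text = "Marriage 22 September 1947\nI Place Lexington, KY"
date, raw_place = PATTERN.search(text).groups()

place = raw_place.strip()  # drop the leading space after "I Place"
print(date)   # 22 September 1947
print(place)  # Lexington, KY
```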

Chris, that worked exactly.

I have a question: If the "I Place" had not been a constant, is there a way to substitute a KBM variable in place of "I Place"?

Thanks for your help
Roger

Hey Roger,

It depends upon what you mean by a variable.

Do you mean a variable standing in for an exact string?

If memory serves, this will work:

^.*?(\d+.+)\v+%Variable%YourVariableName%(.+)
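
The same idea, a variable standing in for an exact string, can be sketched in Python. This is only an analogy to the Keyboard Maestro token, not how Keyboard Maestro expands it: `build_pattern` is a hypothetical helper, `re.escape` keeps any special characters in the prefix literal, and `[\r\n]` again stands in for `\v`.

```python
import re

def build_pattern(prefix):
    """Build the date/place regex around an exact prefix string."""
    return re.compile(
        r"^.*?(\d+.+)[\r\n]+" + re.escape(prefix) + r"\s*(.+)"
    )

text = "Marriage 22 September 1947\nI Place Lexington, KY"
date, place = build_pattern("I Place").search(text).groups()
print(date)   # 22 September 1947
print(place)  # Lexington, KY

other = "Marriage 22 September 1947\nLocation: Salt Lake City, Utah"
print(build_pattern("Location:").search(other).group(2))
# Salt Lake City, Utah
```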

Or, do you mean regex syntax that would skip over the constant?

That gets sticky, because of the variable nature of the location.

Suppose your location was:

Some other placeholder Salt Lake City, Utah
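
To make the difficulty concrete, here is a small Python sketch with two illustrative patterns (neither is from the thread) that try to skip an unknown run of leading words. Without a known prefix or delimiter, the regex has no way to tell where the placeholder ends and the place begins.

```python
import re

LINE = "Some other placeholder Salt Lake City, Utah"

# Greedy skip: drops as many leading words as it can, so it eats
# part of the city name too.
greedy = re.search(r"^(?:\S+\s)*(.+,.+)$", LINE).group(1)

# Lazy skip: drops as few words as it can, so nothing is removed.
lazy = re.search(r"^(?:\S+\s)*?(.+,.+)$", LINE).group(1)

print(greedy)  # City, Utah
print(lazy)    # Some other placeholder Salt Lake City, Utah
```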

-Chris