Regex getting only the first match

Desalegn · July 6, 2018, 11:13pm

(?s)\\ex[ig]?\..*?\n\s

Why does this regex match only the first instance in KM?
It gets only the line with "the first dummy text", while other regex tools, including
https://regex101.com/r/jR5VA6/1, get all the matches.

I have to make the match non-greedy by adding "?" after ".*" because I want to filter out the text between the examples. The greedy version matches everything from the first example through the last one, which I don't want. But making the match non-greedy seems to force KM to return only the first match.
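
For readers who want to see the greedy versus non-greedy difference outside KM, here is a minimal Python sketch. The sample text is modeled on the examples posted later in this thread, and Python's `re` engine is only a stand-in: KM's own regex engine may differ in details.

```python
import re

# Sample text modeled on the examples in this thread (an assumption
# about what the real document looks like).
text = r"""Intro text.

\ex.\label{417}The first Dummy sentence\\
CAUS-{\Antic}-sleep\\
`interpretation '

More text.

\ex.\label{417}The second Dummy sentence\\
CAUS-{\Antic}-sleep\\
`interpretation '

Closing text.
"""

# Non-greedy: ".*?" stops at the first newline that is followed by whitespace.
lazy = re.findall(r"(?s)\\ex[ig]?\..*?\n\s", text)

# Greedy: ".*" runs on to the last newline that is followed by whitespace.
greedy = re.findall(r"(?s)\\ex[ig]?\..*\n\s", text)

print(len(lazy), "non-greedy matches")
print(len(greedy), "greedy match(es)")
```

With this sample, the non-greedy pattern yields one match per example, while the greedy pattern yields a single match that also swallows "More text." between them.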

JMichaelTX · July 7, 2018, 1:43am

It is not your RegEx -- it's how the KM Search using Regular Expression action works. It only returns the FIRST match.

To get all matches, you need to use a For Each Action with Substrings Collection. A good example is shown in that link.

You can use the same RegEx pattern to find each match, and for the subsequent Search Regex to extract the variables. OR, you can use a simpler RegEx in the For Each.
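
As a rough analogy only (this is Python, not KM), the sketch below contrasts a single search, which stops at the first match, with a loop over every match followed by a second regex that pulls out the pieces. The sample text and the `detail` pattern are assumptions made for illustration; in KM the loop would be the For Each action with a Substrings collection.

```python
import re

# Sample text modeled on the examples in this thread (assumed).
text = r"""Intro text.

\ex.\label{417}The first Dummy sentence\\
CAUS-{\Antic}-sleep\\
`interpretation '

More text.

\ex.\label{417}The second Dummy sentence\\
CAUS-{\Antic}-sleep\\
`interpretation '

Closing text.
"""

pattern = re.compile(r"(?s)\\ex[ig]?\..*?\n\s")

# A single search returns only the first match.
first = pattern.search(text)

# Iterating over all matches plays the role of the For Each loop.
blocks = [m.group(0) for m in pattern.finditer(text)]

# Hypothetical second regex: capture the sentence between \label{...} and "\\".
detail = re.compile(r"\\label\{(\d+)\}(.*?)\\\\")
sentences = []
for block in blocks:
    found = detail.search(block)
    if found:
        sentences.append(found.group(2))

print(sentences)
```

The same pattern is used both to find each block and, with a second search, to extract parts of it, which mirrors the two options described above.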

Questions?

Desalegn · July 7, 2018, 2:35am

Hi Michael.
Thank you for the reply.

Can you look at this macro? I couldn't get it to work (that is, get all the examples from the whole text).

Extract_examples.kmmacros (5.1 KB)

You can test the macro with the following examples.

   I have a large text above this. I would like to pick just the examples. 

\ex.\label{417}The first Dummy sentence\\
CAUS-{\Antic}-sleep\\
`interpretation '

I have a lot of text here. 


\ex.\label{417}The second Dummy sentence\\
CAUS-{\Antic}-sleep\\
`interpretation '



The situation in here looks like more stuff is coming up.

JMichaelTX · July 7, 2018, 2:46am

I'm not sure what you mean. Exactly HOW does it not work?

If you mean that you are getting more matches than you want, then you will need to tell me what is unique about the two cases that you do want to match.

I looked at your macro, but it does not contain any sample data.
To determine the RegEx, we need:

A great, extensive, example of the real-world data to be searched.
(if it is too large to post, just zip it and upload the zip file)
A way to identify the cases you want to match:
- Provide real-world examples of the data to match.
  OR
- Provide a detailed description of what is unique, what is different, about these cases from all the rest of the text.

We're pretty good here, but reading minds is a skill we're still working on.

Desalegn · July 7, 2018, 3:15am

I cannot attach rtf in this website. The following text is just a sample:

This is a sample text: contains two examples.

\ex.\label{417}The first Dummy sentence\
CAUS-{\Antic}-sleep\
`interpretation '

Assume I have more text here

\ex.\label{417}The second Dummy sentence\
CAUS-{\Antic}-sleep\
`interpretation '

And more texts here.

Desalegn · July 7, 2018, 2:37pm

I am sorry that I didn't make myself clear. The lower part of the text (I put it under Formatted text) is supposed to be a sample, not part of my question.

Using the regex in SublimeText worked for me for now.

JMichaelTX · July 7, 2018, 4:35pm

Sounds like you have solved your problem.
If so, perhaps you could post the solution for the benefit of others.

If not:

Desalegn · July 8, 2018, 4:33pm

I didn't understand how the text concatenation works.
The examples given in http://www.macdrifter.com/2012/03/the-new-keyboard-maestro-for-each-action.html
helped.

This macro extracts only the example sentences written in linguex (LaTeX) format.

The package itself can extract examples (as a handout), but that needs a lot of tweaking and has some inherent weaknesses. The above macro solves the problem.

JMichaelTX · July 8, 2018, 8:30pm

I'm glad the reference helped you resolve your question.

From your macro:

I have a couple suggestions:

Use Local Variables wherever possible when you don't need the Variable later.
- This helps keep your KM Variable environment clean, and free of large Variables.
When you are concatenating data, put the repeating Variable on the same line as the collecting Variable, followed by a linefeed.
- Otherwise, you will have a blank line at the beginning of the list.
- You can also use a manual RETURN instead of the token %LineFeed%

So, it might look like this:

I replaced

Your Variable    With
temp             Local__AllEmails
ExtractEX        Local__ExtractEX
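
The concatenation advice above can be sketched in Python as an illustration (not KM syntax). The variable names and sample items are hypothetical; the point is only where the linefeed goes relative to the collecting variable.

```python
# Hypothetical items standing in for the matches found in each loop pass.
matches = ["first example", "second example"]

# Linefeed placed before each new item: the result starts with a blank line.
collected_leading = ""
for item in matches:
    collected_leading = collected_leading + "\n" + item

# Item on the same line as the collecting variable, followed by a linefeed:
# no blank line at the top of the list.
collected_clean = ""
for item in matches:
    collected_clean = collected_clean + item + "\n"

print(repr(collected_leading))
print(repr(collected_clean))
```

In Python itself one would more likely write `"\n".join(matches)`, but the explicit loop is closer to how a KM variable is built up inside a For Each action.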

Desalegn · July 8, 2018, 9:24pm

Thank you. I have posted the macro here: Extracting Linguistic examples from a Latex document, in case somebody (a linguist) finds it useful.
