Regex: positive lookahead not working

Hi keyboard maestros,

I'm trying to extract all numbers and letters before a colon and am using this regex:

.*(?=:)

See more here:

Unfortunately, in KM, this positive lookahead seems not to work, as the whole line is captured. Any idea how to resolve / work around this?

Thanks a lot
Dwarfy

I'm definitely a novice, but could you not capture your desired characters using
(.*):

That's nearly it, I think. Try (.+?)\: to capture at least one character before a literal colon.

(The asterisk will also match zero characters before the colon, and the colon alone is a range operator.)

Then you don't need to use a RegEx Positive Lookahead.

This works fine in your RegEx101.com example:
(.+):

The data you want is returned in the first Capture Group.

You're right about using a plus instead of an asterisk, but the colon is fine when used in this context. Of course, it never hurts to escape a punctuation character to ensure that it is treated as a literal. I don't see the need for a "?". The plus + will capture ONE or more characters.
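
For anyone who wants to check this outside KM, here is a minimal sketch in Python. It assumes Python's `re` module treats these simple patterns the same way as KM's regex engine, and the sample line is made up for illustration. With only one colon in the text, the escaped, unescaped, and non-greedy variants all capture the same thing:

```python
import re

# Hypothetical sample text containing exactly one colon.
line = "name: Dwarfy"

unescaped = re.search(r"(.+):", line)   # plain colon
escaped = re.search(r"(.+)\:", line)    # escaped colon, same meaning here
lazy = re.search(r"(.+?):", line)       # non-greedy variant

# With a single colon, all three capture groups hold the same text.
print(unescaped.group(1))  # name
print(escaped.group(1))    # name
print(lazy.group(1))       # name
```

The "?" only starts to matter once the text contains more than one colon.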

From RegEx101.com

image

Thank you very much, guys! That did the trick. And I learned how to avoid complexity by using a capture group when a lookahead isn't really needed. Thanks! :slight_smile:

I don't know what you had in your CitationLines so it is hard to say for sure, but to be clear, positive lookahead assertions work just fine.

Note that .*(?=:) will match everything on a single line up until the last colon (which is also true of (.*):).
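
A small Python sketch of that greedy behaviour, assuming Python's `re` module handles these patterns the same way KM does. The sample line is invented, not taken from the original variable:

```python
import re

# Hypothetical line with two colons.
line = "title: part one: part two"

# Greedy .* runs to the end of the line, then backs up to the LAST colon.
lookahead = re.search(r".*(?=:)", line)
captured = re.search(r"(.*):", line)

print(lookahead.group(0))  # title: part one
print(captured.group(1))   # title: part one
```

Both forms stop at the last colon, so the lookahead itself is not what causes the extra text to be matched.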

Capture groups are a fine solution for this, but lookahead assertions do work fine as well.

Really, if you want to capture everything up until the first colon, then you should use either a non-greedy search like (.*?):, or explicitly exclude the : like ([^:]*):. Note that the latter will also match line-ending characters, so if that might be an issue you would need to deal with that possibility.
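
Here is a rough Python sketch of both approaches, again assuming Python's `re` module behaves like KM's engine for these patterns. The sample strings are hypothetical. It also shows the line-ending caveat for the negated character class:

```python
import re

# Hypothetical line with two colons.
line = "title: part one: part two"

lazy = re.search(r"(.*?):", line).group(1)              # non-greedy capture
negated = re.search(r"([^:]*):", line).group(1)         # anything but a colon
lazy_lookahead = re.search(r".*?(?=:)", line).group(0)  # non-greedy lookahead

print(lazy)            # title
print(negated)         # title
print(lazy_lookahead)  # title

# Caveat: [^:] also matches a newline, while . does not (by default).
multi = "first line\nkey: value"
lazy_multi = re.search(r"(.*?):", multi).group(1)
negated_multi = re.search(r"([^:]*):", multi).group(1)

print(repr(lazy_multi))     # 'key'
print(repr(negated_multi))  # 'first line\nkey'
```

If the text can span several lines, adding the newline to the excluded set, as in `([^:\n]*):`, is one way to keep the match on a single line.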

Right on point, Peter. The variable in some instances contained a second ":", and therefore the expression had to be non-greedy (no matter whether using a capture group or positive lookahead).

Thanks for your input, it has been of tremendous help!

That will return an empty capture group if a colon is the first character on the line.
If you want to ensure there is always some text before the colon, then you would need:
(.+?):

But the question mark is needed if there are multiple colons in the text to make sure only the text up to the first colon is captured.

This shows two matches:

image
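
The same points can be reproduced with a short Python sketch. It assumes Python's `re` module matches these patterns the way KM does, and the sample strings are made up for illustration:

```python
import re

# Hypothetical line that starts with a colon.
leading = ":value"

empty_capture = re.search(r"(.*?):", leading)  # matches, but captures nothing
no_match = re.search(r"(.+?):", leading)       # needs at least one character

print(repr(empty_capture.group(1)))  # ''
print(no_match)                      # None

# With several colons, the non-greedy pattern yields one match per colon.
all_matches = re.findall(r"(.+?):", "a:b:c")
print(all_matches)  # ['a', 'b']
```

The trailing "c" is not returned because no colon follows it.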

While lookahead assertions can work, I have found them much more complex to use. So I try to use the simplest solution.