Multiline regex working on regex101.com but not in KM macro

I have multiple chess games in a single PGN file and am trying to extract the first one.
This regex works on regex101.com but not in my macro:
\A\[Event.*(?:\r?\n(?!\[Event\s).*)*

When run in KM I get this:
[Event "FICS rated standard game (FICS, San Jose"]

That is just the first line. My expected result is everything up to (but not including) the second occurrence of that line.
Many thanks!

Sample data:

    [Event "FICS rated standard game (FICS, San Jose"]
    [Site "California USA)"]
    [Date "2009.12.09"]
    [Round "?"]
    [White "Jschmid"]
    [Black "Damo"]
    [Result "1-0"]
    [PlyCount "98"]
    [WhiteElo "1616"]
    [BlackElo "1787"]

    1. d4 d5 2. c4 dxc4 3. Qa4+ Nc6 4. e3 Be6 5. Nf3 Qd7 6. Bxc4 Bxc4 7. Qxc4 h6 8.
    O-O e6 9. a3 Bd6 10. e4 e5 11. Be3 Nf6 12. Nbd2 O-O-O 13. d5 Na5 14. Qc3 b6 15.
    Qd3 Ng4 16. Qa6+ Kb8 17. Qd3 f5 18. exf5 Nxe3 19. Qxe3 Qxf5 20. b4 Nb7 21. Nc4
    e4 22. Nd4 Qxd5 1-0 {Black resigns} 

    [Event "FICS rated standard game (FICS, San Jose"]
    [Site "California USA)"]
    [Date "2009.12.09"]
    [Round "?"]
    [White "Damo"]
    [Black "Dalvero"]
    [Result "0-1"]
    [PlyCount "16"]
    [WhiteElo "1565"]
    [BlackElo "1465"]

    1. e4 e5 2. Nf3 Nc6 3. Bc4 Bc5 4. Nc3 d6 5. h3 Nf6 6. Ng5 O-O 7. d3 Nd4 8. Bd2
    Be6 0-1 {White forfeits by disconnection}

There's some extraneous whitespace in front of each line in that sample text, which caused the regex to fail completely when I first tried it. Once I got rid of that, it seems to work fine for me.

Here's the exact action I used if you want to compare and see what might be different:

Search using Regular Expression.kmactions (1.3 KB)

Thanks @gglick, very interesting. I can't see that whitespace when I paste it into regex101.com, but yes, it also fails to capture anything for me. Unfortunately, my data comes in that way. If you know about such things and it doesn't take too much time, could you suggest a regex that also allows for the whitespace in front of each line?

I can't quite get my head around the logic: 'any text and/or newlines until the negative lookahead is spotted'.
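One way to read the pattern is piece by piece. The sketch below is a Python port, not the KM action itself, written in verbose mode so each part can carry a comment. Two things in it are assumptions rather than part of the original regex: `[^\r\n]*` stands in for `.*` (Python's `.` also matches `\r`, which Keyboard Maestro's regex engine does not by default), and the `[ \t]*` pieces are an addition that tolerates indentation in front of each line. The game names are shortened, made-up sample data.

```python
import re

# Verbose-mode port of the thread's pattern, with optional leading whitespace.
FIRST_GAME = re.compile(r"""
    \A                       # start of the whole text
    [ \t]*\[Event[^\r\n]*    # the first [Event ...] line, possibly indented
    (?:                      # then, repeated as many times as possible:
        (?:\r\n|\r|\n)       #   one line break (CRLF, CR, or LF)
        (?![ \t]*\[Event\s)  #   stop if the next line starts another [Event
        [^\r\n]*             #   otherwise take the rest of that line (may be empty)
    )*
""", re.VERBOSE)

sample = (
    '[Event "Game one"]\n'
    '[Site "?"]\n'
    '\n'
    '1. d4 d5 1-0\n'
    '\n'
    '[Event "Game two"]\n'
    '[Site "?"]\n'
    '\n'
    '1. e4 e5 0-1\n'
)

# The same data with four spaces in front of every line, as in the forum post.
indented = "".join("    " + line for line in sample.splitlines(keepends=True))


def first_game(text):
    """Return the first game in `text`, or None if the pattern does not match."""
    m = FIRST_GAME.match(text)
    return m.group(0) if m else None


print(repr(first_game(sample)))
print(repr(first_game(indented)))
```

Each pass through the group consumes exactly one line break plus the line after it, so the match grows one line at a time and stops at the line break that sits directly in front of the second `[Event` line.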

I don't really understand what you mean by extraneous space. Do you mean in my post above? When I run the macro there is no extra space, but the regex still doesn't work. In my post above I said the data comes in that way, but looking more closely, I don't think there are extra spaces in my original data, even though there are in my first post.

I've attached the example. The capture I get is:
[Event "FICS rated standard game (FICS, San Jose"]

I'm so confused!
Keyboard Maestro Actions.kmactions (2.5 KB)

Yes, just from your above post. You can see the whitespace difference pretty clearly like this:

    [Event "FICS rated standard game (FICS, San Jose"]
    [Site "California USA)"]
    [Date "2009.12.09"]
    [Round "?"]
    [White "Jschmid"]
    [Black "Damo"]
    [Result "1-0"]
    [PlyCount "98"]
    [WhiteElo "1616"]
    [BlackElo "1787"]

    1. d4 d5 2. c4 dxc4 3. Qa4+ Nc6 4. e3 Be6 5. Nf3 Qd7 6. Bxc4 Bxc4 7. Qxc4 h6 8.
    O-O e6 9. a3 Bd6 10. e4 e5 11. Be3 Nf6 12. Nbd2 O-O-O 13. d5 Na5 14. Qc3 b6 15.
    Qd3 Ng4 16. Qa6+ Kb8 17. Qd3 f5 18. exf5 Nxe3 19. Qxe3 Qxf5 20. b4 Nb7 21. Nc4
    e4 22. Nd4 Qxd5 1-0 {Black resigns} 

    [Event "FICS rated standard game (FICS, San Jose"]
    [Site "California USA)"]
    [Date "2009.12.09"]
    [Round "?"]
    [White "Damo"]
    [Black "Dalvero"]
    [Result "0-1"]
    [PlyCount "16"]
    [WhiteElo "1565"]
    [BlackElo "1465"]

    1. e4 e5 2. Nf3 Nc6 3. Bc4 Bc5 4. Nc3 d6 5. h3 Nf6 6. Ng5 O-O 7. d3 Nd4 8. Bd2
    Be6 0-1 {White forfeits by disconnection}

[Event "FICS rated standard game (FICS, San Jose"]
[Site "California USA)"]
[Date "2009.12.09"]
[Round "?"]
[White "Jschmid"]
[Black "Damo"]
[Result "1-0"]
[PlyCount "98"]
[WhiteElo "1616"]
[BlackElo "1787"]

1. d4 d5 2. c4 dxc4 3. Qa4+ Nc6 4. e3 Be6 5. Nf3 Qd7 6. Bxc4 Bxc4 7. Qxc4 h6 8.
O-O e6 9. a3 Bd6 10. e4 e5 11. Be3 Nf6 12. Nbd2 O-O-O 13. d5 Na5 14. Qc3 b6 15.
Qd3 Ng4 16. Qa6+ Kb8 17. Qd3 f5 18. exf5 Nxe3 19. Qxe3 Qxf5 20. b4 Nb7 21. Nc4
e4 22. Nd4 Qxd5 1-0 {Black resigns} 

[Event "FICS rated standard game (FICS, San Jose"]
[Site "California USA)"]
[Date "2009.12.09"]
[Round "?"]
[White "Damo"]
[Black "Dalvero"]
[Result "0-1"]
[PlyCount "16"]
[WhiteElo "1565"]
[BlackElo "1465"]

1. e4 e5 2. Nf3 Nc6 3. Bc4 Bc5 4. Nc3 d6 5. h3 Nf6 6. Ng5 O-O 7. d3 Nd4 8. Bd2
Be6 0-1 {White forfeits by disconnection}

Anyway, I tried the regex in KM with the text included in that action and got the same result you did, with only the first line being matched. However, I also found that pasting the text into BBEdit and then back into KM made it work as intended, i.e. matching everything down to the second event. I don't know exactly why, but I can only conclude that there's something unusual about the line endings in the data you're pulling in. Fortunately, you don't need to know why it's happening to work around it. In my testing, adding a single Filter action that converts all the line endings to Unix ones got the regex working as expected again. You can try it for yourself by enabling and disabling the Filter action in these actions:

Keyboard Maestro Actions.kmactions (3.0 KB)
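For anyone reading along outside Keyboard Maestro, the same workaround can be sketched in Python: convert every line ending to a Unix one first, then run the original regex unchanged. This is only an illustration of the idea, not what the Filter action does internally; `to_unix_line_endings` and `first_game` are hypothetical helper names, and the sample games are shortened, made-up data.

```python
import re

# The regex from the first post, unchanged.
PATTERN = re.compile(r"\A\[Event.*(?:\r?\n(?!\[Event\s).*)*")


def to_unix_line_endings(text):
    """Convert CRLF and lone CR line endings to LF."""
    return re.sub(r"\r\n|\r", "\n", text)


def first_game(text):
    """Normalize line endings, then return the first game (or None)."""
    m = PATTERN.match(to_unix_line_endings(text))
    return m.group(0) if m else None


# Sample data using return-only (CR) line endings.
cr_only = (
    '[Event "Game one"]\r[Site "?"]\r\r1. d4 d5 1-0\r\r'
    '[Event "Game two"]\r[Site "?"]\r\r1. e4 e5 0-1\r'
)

print(repr(first_game(cr_only)))
```

Normalizing up front keeps the regex simple, at the cost of the result carrying LF line endings rather than the ones in the source data.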


Possibly it is a line ending issue.

The regex as written handles linefeed, or return-linefeed, but does not handle return-only line endings, which are definitely possible on the Mac (that is actually the traditional line ending for Macs, and is still relatively common on the clipboard).

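Try replacing \r?\n with \R? or [\r\n|\r|\n]. (Strictly, \R on its own is enough. The bracketed form is a character class, so it matches a single \r, \n, or | character; the grouped alternation would be written (?:\r\n|\r|\n).)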


Thanks for the expert advice, Peter! Replacing \r?\n with \R? seems to have done the trick.


You got in just before me @gglick! Yes, all three solutions work: filtering to Unix line endings, replacing \r?\n with [\r\n|\r|\n], and replacing it with \R?.

Hopefully this will be useful for other people with similar outlier situations. I'm delighted; this has been stumping me for a few days now.
