I'm hoping someone can help me figure out why this Regex is not finding both matches with one Action. It will find each pattern individually, but not both together as 3 separate capture groups (it works on regex101).
I'm assuming it's something to do with the flavor of Regex maybe? Or the fact that I'm only looking for the LAST match of each using Lookahead - combined with the Alternation it's only giving me the FIRST match and then stopping?
@ComplexPoint Thank You for the idea but I think the images I posted are a bit deceiving.
If you check the regex101 link you’ll see that the body text I’m looking through is a log file with multiple lines and groups which are not all separated by commas.
Essentially the log file is read every 3 seconds during a transfer, and the bottom-most % and ETA are read to feed a progress bar. On top of that I’m monitoring the log file for any Errors that show up. I could easily just have a separate Regex Search Action for each pattern, but I’m confused as to why it’s not working with the alternation.
It'll help if you export and upload the actual action, assuming you can't strip down the macro enough to post a demo of the borkage -- images are all well and good but don't show all the action options.
But it looks like the first part of the alternate is matching so the second part is never evaluated. Try this demo, then try it after replacing all ETAs in the text with ETBs so the first alternate fails:
I'm pretty clueless when it comes to RegEx, but I'm guessing the difference between KM's and regex.com's behaviour is that regex.com allows for global (all matches) while KM only returns the first match (see the Search Modifiers section of the action's Wiki page).
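For anyone following along, the first-match versus global difference can be reproduced outside KM. Below is a minimal sketch in Python, whose `re` module is not the ICU engine KM uses but, as far as I know, treats alternation the same way for a pattern this simple. The log text and the pattern are made up for illustration and are not taken from the actual macro.

```python
import re

# Made-up log sample; the real log format may differ.
LOG = """\
Transferred: 1 MiB / 10 MiB, 10%, 500 KiB/s, ETA 18s
ERROR : 01.wav: Failed to copy
Transferred: 5 MiB / 10 MiB, 50%, 500 KiB/s, ETA 10s
"""

# Hypothetical pattern: alternative 1 captures percent and ETA,
# alternative 2 captures an error line.
PATTERN = re.compile(r"(\d+)%.*?ETA (\S+)|(ERROR[^\n]*)")

# A single (non-global) search stops at the first place either alternative matches.
first = PATTERN.search(LOG)

# A global search keeps going and returns one match per hit.
all_matches = [m.groups() for m in PATTERN.finditer(LOG)]

print(first.groups())
for groups in all_matches:
    print(groups)
```

Each individual match only fills the groups of the alternative that matched; the groups of the other alternative come back empty (`None` in Python).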
@DanThomas Thanks for the input. I believe the global modifier is also not assumed by KM and I forgot about that. I have come across this issue before and remember this post Feature request: RegEx search global modifier - #2 by JMichaelTX that explains this. I did try the (?m) and that didn't make a difference - but I'm having a bit more luck with the global modifier and the For Each action - but I haven't quite solved it yet.
@ComplexPoint This is good to know, I'll have to use this more often. In this particular case I only want the bottom most Percentage and ETA - but also the bottom most Error.
That won't help here -- you want to consider the whole string, not go line-by-line. And there is no global modifier for the "Search by Regular Expression" action.
Do you actually need to do it this way? If the text on regex.com is representative it would appear that you could do a more simple pattern, anchored on the end of the string and working backwards:
Ignore me, I've realised this wasn't doing what I thought. For whatever reason -- probably my ignorance! -- the pattern I thought would give a "less greedy" match between ETA and ERROR isn't doing so.
@Nige_S The problem is that there might not 'always' be an ERROR, so I can't anchor the regex to the ERROR. I need the most up to date Percentage and ETA, and the most recent Error if there is one.
To use the Global modifier in the "Search by Reg Ex..." you need to use the "For Each" action with the "Substrings" "matching in". Everything 'works' but the macro gets more unnecessarily complicated.
The Action below will "work" but the problem is that using Alternation means that during the 'Loop 1' it will write the FTP__PercentComplete and FTP__ETA capture groups to their variables, and during 'Loop 2' it will write the FTP__Error capture group - but since 'Loop 2' is the 'other match' it 'deletes' the FTP__PercentComplete and FTP__ETA variables when it writes the FTP__Error variable.
As a result I need three "Set Variable" actions at the beginning to clear the variables, and then further actions at the end of the loop that append to said variables on each pass.
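The overwrite problem is easier to see in code than in a screenshot. Here is a rough Python sketch of the same loop, assuming a made-up log and a hypothetical alternation pattern (neither is the real one). The point is only that each pass should update a variable when its capture group actually took part in the match, instead of writing all three every time.

```python
import re

# Made-up log sample; the real log format may differ.
LOG = """\
Transferred: 1 MiB / 10 MiB, 10%, 500 KiB/s, ETA 18s
ERROR : 01.wav: Failed to copy
Transferred: 5 MiB / 10 MiB, 50%, 500 KiB/s, ETA 10s
"""

# Hypothetical alternation pattern with three capture groups.
PATTERN = re.compile(r"(\d+)%.*?ETA (\S+)|(ERROR[^\n]*)")


def last_values(text):
    """Loop over every match and keep the most recent non-empty value of each group."""
    result = {"percent": "", "eta": "", "error": ""}  # cleared up front
    for match in PATTERN.finditer(text):
        percent, eta, error = match.groups()
        # Only overwrite when this alternative actually matched,
        # so an ERROR match does not wipe the percent and ETA values.
        if percent is not None:
            result["percent"] = percent
            result["eta"] = eta
        if error is not None:
            result["error"] = error
    return result


print(last_values(LOG))
```

In KM terms this corresponds to guarding each "Set Variable" with an "is not empty" check inside the For Each, which is still more actions than a single search.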
Yeah, I should have realised that. I goofed anyway, the pattern wasn't actually matching what I thought it was. Sorry -- I should learn to wait until I'm sure...
@ComplexPoint This is very cool - and it's almost so simple that your brain wants to make it harder to understand than it is, haha.
So this goes through each line - if the line contains Transferred: and ETA then sets the first 2 Variables based on their sequence number in the line, separated by commas. If the line does not contain Transferred or ETA, but contains ERROR it sets the Error variable.
What if I only want the "10s" part and NOT the ETA part?
Also, what if I only want the "ERROR : ..." and everything after this on this line?
I'm assuming I'll have to resort to regex for that?
For the ETA I just need the 10s or whatever comes after ETA
For the error I only need whatever the error is. In this case it's:
ERROR : 01.wav: Failed to copy: Put mkParentDir failed: mkdir "__For Upload" failed: findItem: failed to make FTP connection to "ftp.com:21": tls: first record does not look like a TLS handshake
So for the Error, everything on that line EXCEPT this
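The line-by-line idea can be extended to do both trims without regex. This is a sketch in Python, not the KM actions themselves, and it assumes the fields on a "Transferred:" line are comma-separated and that anything before the word ERROR (the timestamp in this made-up sample) should be dropped. The function name `parse_log` and the sample text are hypothetical.

```python
# Made-up log sample with a made-up timestamp prefix; the real format may differ.
LOG = """\
2024/01/01 12:00:00 INFO : Transferred: 1 MiB / 10 MiB, 10%, 500 KiB/s, ETA 18s
2024/01/01 12:00:03 ERROR : 01.wav: Failed to copy
2024/01/01 12:00:06 INFO : Transferred: 5 MiB / 10 MiB, 50%, 500 KiB/s, ETA 10s
"""


def parse_log(text):
    """Return the last percent, the last ETA value (without 'ETA'), and the last error."""
    percent = eta = error = ""
    for line in text.splitlines():
        if "Transferred:" in line and "ETA" in line:
            for field in (f.strip() for f in line.split(",")):
                if field.endswith("%"):
                    percent = field
                elif field.startswith("ETA"):
                    # Keep only what follows "ETA", e.g. "10s".
                    eta = field[len("ETA"):].strip()
        elif "ERROR" in line:
            # Keep "ERROR : ..." and everything after it on the line.
            error = line[line.index("ERROR"):]
    return percent, eta, error


print(parse_log(LOG))
```

Because later lines overwrite earlier ones, the values left at the end are the bottom-most ones in the log.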
Back to the regex. In my last try I was smart enough to use \R to match any line ending -- and dumb enough to forget you couldn't use \R in a character set. D'oh!
So, looking at this in another way -- the problem is that you want the last matches in the text, but KM's "Search with Regular Expression" only returns the first. Solution -- reverse the line ordering in the text so you search for the first matches!
It seems to work with the limited text sample available, both with and without the error message. Assuming you're tailing the log file so as not to grab too many lines this should be plenty fast enough. If you are going to loop the search then remember to blank the "error" variable (Local_FTP_Error in my macro) before each search.
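In case the reversal trick isn't obvious from the description, here is a small Python sketch of it. It is a simplification: it runs two separate first-match searches on the reversed text instead of the single combined pattern used in the macro, and the sample log, patterns, and function names are all made up for illustration.

```python
import re

# Made-up log sample; the real log format may differ.
LOG = """\
Transferred: 1 MiB / 10 MiB, 10%, 500 KiB/s, ETA 18s
ERROR : 01.wav: Failed to copy
Transferred: 5 MiB / 10 MiB, 50%, 500 KiB/s, ETA 10s
"""

PROGRESS_RE = re.compile(r"(\d+)%.*?ETA (\S+)")  # hypothetical pattern
ERROR_RE = re.compile(r"ERROR[^\n]*")            # hypothetical pattern


def reverse_lines(text):
    """Put the last line first, so a first-match search finds the newest entry."""
    return "\n".join(reversed(text.splitlines()))


def latest(text):
    reversed_text = reverse_lines(text)
    percent = eta = ""
    error = ""  # blanked before every search, as the error may be absent
    progress = PROGRESS_RE.search(reversed_text)
    if progress:
        percent, eta = progress.groups()
    found_error = ERROR_RE.search(reversed_text)
    if found_error:
        error = found_error.group(0)
    return percent, eta, error


print(latest(LOG))
```

The line order is reversed but each line's own text is left alone, so the patterns themselves don't need to change.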
It still might be better to split only the relevant lines from the log text and then search on those. You might be able to split on the INFO lines and take the last or last-but-one element, but you really need to watch the log file while transfers are progressing to see how best to do it.
@Nige_S I really like this idea but the regex is not catching the error. When you make (ERROR[^\n]*) optional with the ? right after, the ERROR match disappears. I took a quick look but haven't solved it.
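A guess at the cause, since the exact pattern isn't shown here: if a lazy `.*?` sits directly in front of the optional group, the engine is happy to let `.*?` match nothing and the optional group match nothing, so the overall match succeeds without ever reaching ERROR. The Python sketch below reproduces that with two hypothetical patterns on made-up text; Python's `re` is not KM's ICU engine, but I'd expect the same behaviour for these constructs.

```python
import re

# Made-up two-line sample; the real text may differ.
TEXT = "50%, ETA 10s\nERROR : 01.wav: Failed to copy"

# Hypothetical pattern 1: lazy gap, then an optional group.
# The lazy gap matches nothing, the optional group is skipped, and the match ends early.
loose = re.compile(r"(\d+)%.*?ETA (\S+).*?(ERROR[^\n]*)?", re.DOTALL)

# Hypothetical pattern 2: the gap is moved inside an optional non-capturing group,
# so the engine has to try reaching ERROR before it gives up on the group.
tight = re.compile(r"(\d+)%.*?ETA (\S+)(?:.*?(ERROR[^\n]*))?", re.DOTALL)

loose_groups = loose.search(TEXT).groups()
tight_groups = tight.search(TEXT).groups()

print(loose_groups)
print(tight_groups)
```

With the second form the match still succeeds when there is no ERROR at all; the third group is simply empty in that case.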