Sorry for the repeat post. I understand this has been addressed before, but I'm learning regex and can't figure it out.
I'm trying to extract text from multiple lines, as seen below.
I'm trying to extract the two reference numbers, as seen below. KM extracts only a single one.
@JMichaelTX has answered your question, but I have to ask, what is your intention with `(.|)*`? That is a very strange construction: it matches exactly the same text as `.*`, but it is dangerously close to being a pathological regex, since it can match an unlimited number of nothings before matching the `.`. For example, the text “hello” could match that regex as:
()()()()()(.)(.)()(.)()()()(.)()()()(.)()()()
And any number of () empty matches could be included in each position.
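As a quick sanity check of the "matches exactly the same as `.*`" claim, here is a minimal sketch in Python. Note that Python's `re` module is not the ICU engine that KM uses, so this only illustrates the equivalence, not how KM copes with the empty alternative internally. The helper name `matched_text` and the sample strings are made up for illustration.

```python
import re

def matched_text(pattern, text):
    """Return the text matched at the start of `text`, or None if no match."""
    m = re.match(pattern, text)
    return m.group(0) if m else None

# Hypothetical sample strings, including an empty one and a two-line one.
samples = ["hello", "", "two words", "first line\nsecond line"]

for text in samples:
    odd = matched_text(r"(.|)*", text)   # the strange construction
    plain = matched_text(r".*", text)    # the usual way to write it
    print(repr(text), "->", repr(odd), repr(plain), odd == plain)
```

Both patterns stop at the first line ending, because the empty alternative never consumes a character.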
OK, well, if you aren't trying to do anything more, `.*` is generally what you want, matching any sequence of zero or more of any character except line ending characters (there are ways to make `.` also match line ending characters, but by default it does not).
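To illustrate the default behavior of `.` and one way to change it, here is a small Python sketch. It assumes the `(?s)` ("dot matches all") inline option, which Python's `re` supports and which ICU also documents; the sample text is hypothetical.

```python
import re

text = "first line\nsecond line"  # hypothetical two-line sample

# By default, `.` stops at the line ending.
default_match = re.match(r".*", text).group(0)

# With the (?s) option, `.` also matches line ending characters.
dotall_match = re.match(r"(?s).*", text).group(0)

print(repr(default_match))
print(repr(dotall_match))
```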
My preference would be to use two Capture Groups with one Search using Regular Expression action, since you probably want to use each Reference# separately.
==Assuming the Reference numbers are defined by a Regex Word==, then this Regex should work: `(?i)reference.+?(\w+).*\R.+?reference.+?(\w+)`
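For anyone who wants to try the two-capture-group idea outside KM, here is a rough Python sketch. Two assumptions to be aware of: Python's `re` has no `\R`, so the sketch substitutes `(?:\r\n|\n|\r)` for it, and the sample text is invented, since the original source text was only posted as an image. The function name `extract_reference_pair` is also just for illustration.

```python
import re

# Python's `re` has no \R, so this stands in for it (an assumption of this sketch).
NEWLINE = r"(?:\r\n|\n|\r)"

# Same pattern as above, with \R replaced by NEWLINE.
PATTERN = re.compile(
    r"(?i)reference.+?(\w+).*" + NEWLINE + r".+?reference.+?(\w+)"
)

# Hypothetical sample text: one Reference# on each of two consecutive lines.
SAMPLE = "Payment Reference# AB123 received\nRefund Reference# CD456 pending"

def extract_reference_pair(text):
    """Return (first, second) reference numbers, or None if there is no match."""
    m = PATTERN.search(text)
    return m.groups() if m else None

print(extract_reference_pair(SAMPLE))
```

Because the pattern needs a line ending followed by a second "reference" on the very next line, text with only one Reference# does not match at all.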
If you want a broader definition of Reference number that must start with a RegEx word character, but then can be anything other than a SPACE, then this would work: `(?i)reference.+?(\w[^ ]+).*\R.+?reference.+?(\w[^ \n\r]+)`
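Here is a hedged Python sketch of how the broader definition differs from the word-only one. As before, `\R` is replaced by `(?:\r\n|\n|\r)` because Python's `re` lacks it, and the sample text (with hyphens and a slash in the reference numbers) is made up for illustration.

```python
import re

NEWLINE = r"(?:\r\n|\n|\r)"  # stand-in for \R, which Python's `re` lacks

# Word-only definition: each reference number is a run of \w characters.
WORD_ONLY = re.compile(
    r"(?i)reference.+?(\w+).*" + NEWLINE + r".+?reference.+?(\w+)"
)

# Broader definition: a word character followed by anything except a space
# (and, for the second number, anything except a space or line ending).
BROAD = re.compile(
    r"(?i)reference.+?(\w[^ ]+).*" + NEWLINE + r".+?reference.+?(\w[^ \n\r]+)"
)

# Hypothetical sample with punctuation inside the reference numbers.
SAMPLE = "Payment Reference# AB-123 received\nRefund Reference# CD-456/7 pending"

word_only_groups = WORD_ONLY.search(SAMPLE).groups()
broad_groups = BROAD.search(SAMPLE).groups()

print(word_only_groups)
print(broad_groups)
```

The word-only pattern stops at the first hyphen, while the broader one keeps going until the next space.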
Below is just an example written in response to your request. You will need to use it as an example and/or change it to meet your workflow automation needs.
Example Output
NOTE: There may be minor errors in the source text, which was obtained by OCR of your image. In the future, please post your source text using a Forum Code Block.
MACRO SETUP
Carefully review the Release Notes and the Macro Actions
Make sure you understand what the Macro will do.
You are responsible for running the Macro, not me.
I want to personally thank you all for answering my queries with such detailed replies, @JMichaelTX @peternlewis @thoffman666. There are so many things for me to learn.
So `\R` lets my regex continue its match onto the next line, but not across all lines. @JMichaelTX, in your macro, is `(?i)` needed when the regex is already case insensitive? @thoffman666, in your answer we capture the string in a group and replace it with the string and a new line. May I ask what the localextractedtext regex is doing?
The second regex search and replace is just to get rid of all the text after the last found Reference Number. You can see that as the difference between the outputs for Preprocessed Text and Extracted Text. There's no need to use two different variables for this. I did so just to make the example easier to understand.
Technically `(?i)` is not needed, but I have just developed a habit of always providing the Regex Options at the beginning of the pattern, because it makes things explicitly clear. Some KM Actions let you choose case matching, but others don't.
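As a small illustration of why a leading `(?i)` is harmless when case-insensitive matching is already switched on, here is a Python sketch. Python's `re.IGNORECASE` flag plays the role of an action-level case setting here, which is only an analogy to the KM option, and the sample text is hypothetical.

```python
import re

text = "Payment REFERENCE# AB123"  # hypothetical sample

# Option stated inside the pattern itself.
inline = re.search(r"(?i)reference", text)

# Option supplied from outside the pattern.
flagged = re.search(r"reference", text, re.IGNORECASE)

# Both at once: redundant, but it still works the same way.
both = re.search(r"(?i)reference", text, re.IGNORECASE)

# Neither: the match is case sensitive and fails on this sample.
case_sensitive = re.search(r"reference", text)

print(inline.group(0), flagged.group(0), both.group(0), case_sensitive)
```

Putting the option in the pattern means the regex behaves the same wherever it is pasted, whatever the surrounding settings happen to be.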