I see I misread your initial post slightly...
Try this pattern:
(?ms)^\n
-Chris
That worked.
For those who encounter the same problem as I did, (?ms) simply tells the system to process multiple lines. Otherwise it seems it will only process the very first line. Correct me if I'm wrong.
Thanks so much!
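For anyone who wants to see the effect of the `m` flag in isolation, here is a minimal sketch using Python's `re` module as a stand-in for Keyboard Maestro's ICU engine. That substitution is an assumption: the two engines agree on this particular pattern but are not identical in general. The sample text and the function name are made up for illustration.

```python
import re

# Hypothetical sample text with two blank lines in the middle.
text = "line 1\n\n\nline 2\n"

def remove_blank_lines(s, multiline=True):
    """Delete every linefeed that sits at the start of a line."""
    pattern = r"(?m)^\n" if multiline else r"^\n"
    return re.sub(pattern, "", s)

# With (?m), ^ matches at the start of every line.
with_m = remove_blank_lines(text)
# Without it, ^ matches only at the very start of the text.
without_m = remove_blank_lines(text, multiline=False)

print(repr(with_m))     # 'line 1\nline 2\n'
print(repr(without_m))  # 'line 1\n\n\nline 2\n'
```

Without the flag the pattern can only ever remove a linefeed that is the very first character of the text, which is why it looked as if only the first line was being processed.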
I would use this to match any newline character (including CRLF), not just linefeed:
SEARCH FOR:
\R+
REPLACE WITH:
\n
It will remove all blank lines.
Thanks JMichaelTX. Yours works too but not if there are more than 1 consecutive line returns.
It removes ALL blank lines in my testing.
See https://regex101.com/r/kOtsRs/1/
But you would need to be running macOS 10.11+
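A rough way to check the consecutive-line-return case outside of KM is sketched below in Python. Python's `re` module has no `\R`, so the sketch assumes an approximation that covers only the three common line endings (CRLF, CR, LF); ICU's `\R` matches a few more characters. The sample string is invented.

```python
import re

# Assumption: approximates ICU's \R with the three common line endings.
NEWLINE = r"(?:\r\n|\r|\n)"

def collapse_line_breaks(s):
    """Replace each run of line breaks with a single linefeed."""
    return re.sub(NEWLINE + "+", "\n", s)

# Mixed endings, including runs of two and three line breaks.
sample = "a\r\n\r\nb\rc\n\n\nd"
print(repr(collapse_line_breaks(sample)))  # 'a\nb\nc\nd'
```

Because of the `+`, a run of any length collapses to one linefeed, so runs of several consecutive line returns are handled in a single pass.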
Hey Guys,
Let's make that just a trifle more robust.
Sometimes horizontal whitespace can sneak into what you think are empty blank lines, and you may also want to prevent the last line from having a linefeed.
JM's pattern has the advantage of simplicity and of taking any line endings and ensuring they end up being linefeeds, but you would need a second pass to remove the last linefeed (if desired).
The regular expression below REQUIRES macOS 10.11 or later.
RegEx ⇢ String ⇢ Test KM Find & Replace RegEx.kmmacros (2.8 KB)
-Chris
A couple points that may help your understanding:
The Flag Options are described in the reg ex help (Help ➤ ICU Regular Expression Reference), and the two in question are:
So the “s” in “(?ms)^\n” is redundant, since you are not using “.” in the pattern. Just “(?m)^\n” will be fine.
Next, ^ and $ are zero-width matches. They match at the start or end of the text, or with (?m) at the start or end of each line, but they do not match any characters per se. Thus replacing them with an empty string accomplishes nothing, because they are already an empty string.
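Both points can be checked with a short sketch. It uses Python's `re` module rather than ICU, on the assumption that the two behave the same for these simple patterns; the sample text is invented.

```python
import re

text = "one\n\ntwo"

# "s" only changes what "." matches, so it makes no difference here.
same_result = re.sub(r"(?ms)^\n", "", text) == re.sub(r"(?m)^\n", "", text)
print(same_result)  # True

# ^ matches a position, not a character: the match is empty.
first = re.search(r"(?m)^", text)
print(first.span())  # (0, 0)

# Replacing an empty match with an empty string leaves the text unchanged.
unchanged = re.sub(r"(?m)^", "", text)
print(unchanged == text)  # True

# Replacing it with something non-empty inserts text at each line start.
prefixed = re.sub(r"(?m)^", "> ", text)
print(repr(prefixed))  # '> one\n> \n> two'
```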
The best option for removing blank lines is:
@JMichaelTX’s solution is cleaner, but fails to remove blank lines at the start of the variable:
The 10.10 solutions do not handle alternative line endings (\r or \r\n), but that is probably not an issue.
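The two limitations mentioned here can be reproduced with a small sketch. It is written in Python as a stand-in for ICU, with `\R` approximated by the three common line endings; both are assumptions, and Python's line-start handling may differ from ICU's in other cases. The sample strings are made up.

```python
import re

NEWLINE = r"(?:\r\n|\r|\n)"  # assumption: stand-in for ICU's \R

# The linefeed-only pattern finds nothing to remove in CRLF text,
# because each line start is followed by \r rather than \n.
crlf_text = "a\r\n\r\nb"
lf_only = re.sub(r"(?m)^\n", "", crlf_text)
print(lf_only == crlf_text)  # True

# Collapsing runs of line breaks still leaves one linefeed at the top.
leading = re.sub(NEWLINE + "+", "\n", "\n\nfirst\nsecond")
print(repr(leading))  # '\nfirst\nsecond'
```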
Well, unless I fouled something up, it looks to me like all of our solutions leave a blank line at the bottom if there were multiple blank lines at the bottom to start with:
Source data for all tests
–––––––––––––––––––––––––––––––––––
This is text line 1
This is text line 2
a normal line
This is text line 3
–––––––––––––––––––––––––––––––––––
last line
Chris' Solution
Peter's Solution
JMichaelTX's solution
Here's my test macro:
#### DOWNLOAD:
<a class="attachment" href="/uploads/default/original/3X/0/6/067f3d3230730ea05ffc38775aa21679aaef6e32.kmmacros">TEST Regex Replace Syntax.kmmacros</a> (3.1 KB)
**Note: This Macro was uploaded in a DISABLED state. You must enable before it can be triggered.**
---
### KM to the rescue!
Get rid of all lines at top and bottom of text! 👍
[Filter action -- Trim Whitespace](https://wiki.keyboardmaestro.com/action/Filter)
There is generally meant to be a line ending character at the end of a list of lines, assuming the lines are text. This may or may not be displayed as a blank line at the end of the text.
So a piece of text with three lines a, b, c would be “a\nb\nc\n”.
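A quick illustration of why that trailing line ending may or may not show up as a blank line, sketched in Python (nothing KM-specific is assumed here):

```python
text = "a\nb\nc\n"

# Treating \n as a line terminator gives three lines.
as_terminator = text.splitlines()
print(as_terminator)  # ['a', 'b', 'c']

# Treating \n as a separator gives a fourth, empty "line" at the end.
as_separator = text.split("\n")
print(as_separator)  # ['a', 'b', 'c', '']
```

Tools that treat the linefeed as a separator are the ones that display an extra blank line at the bottom.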
As you note, if that is not what you want, then there are further tools to solve the problem.
Easy fix:
Chris:
(?m)(^\h*\R)|(\R+\Z)
Interesting point/discussion. For a while I called this stuff "end-of-line" characters. But I think that is wrong, as most references call them "newline" or "line break" characters.
That would imply that the last line would not have one.
I guess the point is you never know for sure what you will get. If you don't want a blank line on the bottom, then you probably need to take action to remove it, if it is there.
Right.
When you assume, you'll get bitten more often.
I nearly always remove vertical head and tail whitespace when I'm massaging text.
\A\s+|\s+\Z
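As a hedged sketch of what this pattern does, here it is in Python. Python's `\Z` matches only at the very end of the text, while ICU's `\Z` can also match just before a final line terminator. For this pattern the greedy `\s+` reaches the end either way, so the result should be the same. The function name and sample are hypothetical.

```python
import re

def trim_outer_whitespace(s):
    """Remove all whitespace before the first and after the last visible character."""
    return re.sub(r"\A\s+|\s+\Z", "", s)

sample = "\n\n  first\n\nlast \n\n"
print(repr(trim_outer_whitespace(sample)))  # 'first\n\nlast'
```

Note that blank lines in the middle of the text are left alone. Only the head and tail are trimmed.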
Here's a tweak to my pattern in post #9 above that will remove the EOL character from the last line if it exists.
(?m)(^\h*\R)|(\R+\Z)
-Chris
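To try the pattern above outside of KM, here is a rough Python translation. Python's `re` module supports neither `\R` nor `\h`, so both are spelled out by hand; those stand-ins, the function name, and the sample text are assumptions made for illustration, not part of the original macro.

```python
import re

NEWLINE = r"(?:\r\n|\r|\n)"  # assumption: stand-in for ICU's \R
HSPACE = r"[ \t]"            # assumption: stand-in for ICU's \h

# First alternative: a line holding only horizontal whitespace.
# Second alternative: the run of line breaks at the very end of the text.
PATTERN = re.compile(rf"(?m)(^{HSPACE}*{NEWLINE})|({NEWLINE}+\Z)")

def strip_blank_lines(s):
    """Remove blank or whitespace-only lines and any trailing line breaks."""
    return PATTERN.sub("", s)

sample = "  \nline 1\n\t\nline 2\n\n\n"
print(repr(strip_blank_lines(sample)))  # 'line 1\nline 2'
```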
My brain is developing an aneurysm trying to understand this ICU GREP stuff. For years I've only used the type of GREP that TextWrangler uses (don't know exactly which one it uses).
For example: In TW /r will find a line return but in KM it must be /R.
It seems that I now have to start learning the type that KM uses.
Two questions:
Might anyone know of a way to change TextWrangler's GREP system to match KM's so I don't go crazy with their differences? If not, can KM's be changed to match TextWrangler's? If not, is there another text editor that has support for ICU GREP?
There's a lot that I don't understand on the ICU User Guide. Just one example would be:
{n,m} Match between n and m times. Match as many times as possible, but not more than m.
I don't understand exactly what they mean by "Match between n and m times". Before I can understand this, I'd first need to know what "n" and "m" exactly mean. I did a Google search but found nothing.
Can someone recommend a book/resource for beginners that I can study?
Again, thanks to all for your help. This is an amazing forum!
Looks familiar.
Both TW (now BBEdit Lite) and KM use RegEx based on PCRE.
So, they are very similar, but with some differences. KM uses the RegEx engine provided by macOS, which is ICU Regular Expressions.
TW has been deprecated, replaced with BBEdit Lite. From the BBEdit User's Manual, Chapter 8, p168:
BBEdit’s grep engine is based on the PCRE library package, which is open source software, written by Philip Hazel, and copyright 1997-2004 by the University of Cambridge, England. For details, see: http://www.pcre.org/
KM uses the ICU Regular Expressions, which conform to Unicode Technical Standard #18. This is, in essence, PCRE: "The regular expression patterns and behavior are based on Perl's regular expressions".
However, as best as I can tell (@peternlewis and @ccstone might know more about this), BBEdit does do some things differently: \r, \n, and \R all simply match a "'hard' line break". [The above items are under review 2018-08-10.]
From the BBEdit User's Manual, Chapter 8, p173:
Personally, while I use BBEdit RegEx a lot to manipulate text, I rarely use it to develop RegEx patterns. For that, I mostly use Regex101.com
Sorry, but that is incorrect.
First of all, you need to use the backslash rather than the forwardslash.
Both BBEdit and KM can use both \r and \R, but they have different meanings, at least in ICU compliant apps like macOS and KM:
I'll try to post some RegEx references later.
Basically, no. BBEdit and TextWrangler (which is defunct now anyway; use BBEdit Lite) use PCRE, whereas Keyboard Maestro uses the system icucore ICU regular expressions.
They are similar in most ways, but they do have their differences - not that I could find a good description of what the differences are.
n and m are numbers you provide. So a{3,5} will match aaaaa or aaaa or aaa (in that order).
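The "as many times as possible, but not more than m" part can be seen in a short sketch. It uses Python's `re` module, where the `{n,m}` quantifier behaves the same way as described in the ICU guide; the test strings are made up.

```python
import re

pattern = re.compile(r"a{3,5}")

# Seven a's are available, but the match stops at the maximum of five.
print(pattern.match("aaaaaaa").group())  # aaaaa

# Four a's fall inside the range, so all four are matched.
print(pattern.match("aaaa").group())  # aaaa

# Two a's are below the minimum of three, so there is no match.
print(pattern.match("aa"))  # None

# In a run of eight, the first match takes five and the next takes three.
print(pattern.findall("aaaaaaaa"))  # ['aaaaa', 'aaa']
```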
Keyboard Maestro uses the system ICU Core Regular Expression engine, which is different to PCRE (Perl Compatible Regular Expressions), although it is very similar. And as you note, BBEdit seems to have adjusted PCRE as well.
It would be nice to have a definitive list of what is different between the two, but I can't seem to find even a set of differences between PCRE and ICU, and then BBEdit is its own world again.
I've always loved those type of statements.
According to ICU Regular Expressions:
"The regular expression patterns and behavior are based on Perl's regular expressions"
To me, it boils down to this:
Hey @project_guru,
A lot of people feel that way when they start fooling with regular expressions.
Me included.
See this section of our wiki page on regular expressions (and below) for a decent listing of resources.
https://wiki.keyboardmaestro.com/Regular_Expressions#Books
If you regularly use BBEdit and/or TextWrangler then you might want my BBEdit Cheat Sheet too:
-Chris
@project_guru, I hope you don't mind that I have revised your topic title to better reflect the question you have asked.
FROM:
GREP not catching "^" beginning of line
TO:
How Do I Use GREP RegEx to Remove Newline Characters (like CR, LF)?
This will greatly help you attract more experienced users to help solve your problem, and will help future readers find your question, and the solution.
Hi Chris, it's very nice of you to share your cheat sheet. Thanks!
I've created a KM action that will open its URL from the Status Menu