How Do I Trim Whitespace at the End of Text Lines?

Please forgive me for asking something that might look trivial to a RegEx pro, which I’m definitely not.

I’d like to trim the whitespace at the end of each line of the text in the System Clipboard.

This RegEx \h*$ looks fine in “regex101”:

CleanShot 2022-02-27@15h32m35

From my understanding, it should be translated into (^m)\h$ for Keyboard Maestro:

But it doesn’t work. :unamused:

What is my mistake? What would be the solution to this seemingly trivial problem?

Thank you very much.

You're close. The way to tell Keyboard Maestro to treat $ as the end of each line instead of the end of the entire string is to start the regular expression with (?m). Also, if you want to delete all the horizontal whitespace characters, you should use the + quantifier, which means "one or more." Therefore,

(?m)\h+$

is what you want.
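For anyone who wants to try the same substitution outside Keyboard Maestro, here is a minimal JavaScript sketch. It rests on two assumptions: JavaScript's regex engine has no `\h` class, so `[ \t]` stands in for it (spaces and tabs only, not every horizontal whitespace character), and the `m` flag plays the role of `(?m)`. The name `trimTrailing` is made up for illustration.

```javascript
// Hypothetical helper: strip trailing spaces and tabs from every line.
// [ \t] approximates \h; the m flag makes $ match at the end of each line.
const trimTrailing = text => text.replace(/[ \t]+$/gm, "");

// Blank lines and leading whitespace are left alone.
console.log(JSON.stringify(trimTrailing("a  \n\n  b\t\nc")));
// → "a\n\n  b\nc"
```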


If you're happy to trim any starting space as well,
then a filter action might serve as an alternative to a regular expression.

Trim whitespace of each line.kmmacros (3.6 KB)

If your next step (as mine often is) is to remove the Return characters to merge everything into one line, then there's a better way.

First, you replace the Return characters with a Space, then you squish the spaces into a single space. Like this:

image

That will do everything all in a single command, and very efficiently.
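The two steps described above (line breaks to spaces, then runs of spaces squished to one) might look roughly like this in JavaScript. This is only a sketch of the idea, not the Keyboard Maestro action itself; `joinLines` is a hypothetical name, and treating `\r\n`, `\r` and `\n` all as line breaks is an assumption.

```javascript
// Hypothetical helper: merge all lines into one.
const joinLines = text =>
    text
        .replace(/\r\n|\r|\n/g, " ")   // step 1: every line break becomes a space
        .replace(/ {2,}/g, " ");       // step 2: squish runs of spaces into one

console.log(joinLines("a  \nb\n\nc"));
// → a b c
```

Because the trailing spaces end up inside a run of spaces, the second step absorbs them, so no separate trimming pass is needed.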

Hi @olivierps. If you want to remove all trailing whitespace and all whitespace-only lines, the following RegEx can be used:

(?m)\s+$

Here are those actions in a working example macro:

DOWNLOAD Macro File:
Remove Trailing Whitespace and Blank Lines [regex].kmmacros (13 KB)
Note: This macro was uploaded in a DISABLED state. It must be ENABLED before it can be run. If it does not trigger, the macro group might also need to be ENABLED.



Thank you very much to all who have answered!
It’s the first suggestion that does what I wanted/needed:

(?m)\h+$

@_jims: the pattern (?m)\s+$ “eats” all the empty lines, and in Markdown we need them for legibility.

@ComplexPoint: in Markdown, whitespace at the beginning of lines has semantic meaning, so I definitely don’t want to trim it.
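The difference between the two patterns can be reproduced in plain JavaScript. This is a sketch under assumptions: the `m` flag replaces `(?m)`, `[ \t]` approximates `\h`, and the function names are invented for the comparison. Because `\s` also matches line breaks, it swallows the newline that ends a line along with any blank lines after it.

```javascript
// \s matches newlines too, so blank lines disappear.
const stripWithS = text => text.replace(/\s+$/gm, "");

// [ \t] (standing in for \h) never crosses a line break, so blank lines survive.
const stripWithH = text => text.replace(/[ \t]+$/gm, "");

const sample = "a  \n\nb  \n";
console.log(JSON.stringify(stripWithS(sample)));  // → "a\nb"
console.log(JSON.stringify(stripWithH(sample)));  // → "a\n\nb\n"
```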


Understood.

In that context I would personally use:

  • .trimEnd() (in a KM Execute JavaScript for Automation action), or
  • rstrip() (in a KM Execute a Shell Script action calling Python)

Beyond very short and trivial patterns, Regular Expressions cost a lot of fiddle, puzzlement and debugging, in exchange for rather limited scope.

This forum is remarkably full of posts where the user has started with one problem, and by choosing Regular Expressions has ended up with two problems, so that some permutation of "Regex" actually figures in the way they name their difficulty :slight_smile:

(Regex problems have moved into the foreground, to the extent that the original problem has become occluded)

Time invested in learning anything beyond the most trivial Regular Expressions is really much better invested in experimenting with a scripting language like Python or JavaScript.

They are more readable, give you more flexibility and power, and waste less time.

Using JS here (just because Keyboard Maestro, and the uncertainties of macOS Python versions and locations, make it easier),

Perhaps something like:

(() => {
    "use strict";

    return Application("Keyboard Maestro Engine")
        .getvariable("espacesSuperflues")
        .split("\n")
        .map(x => x.trimEnd())
        .join("\n");
})();

Trim trailling space of each line.kmmacros (2.8 KB)


I’m completely with you about “solving a problem uncovers another problem whose solving uncovers another problem…” Some call this “productivity porn”, others “yak shaving” (because when you shave a yak, what you get takes more space than the yak itself). I’m well aware of this and I try to avoid it as much as I can… :wink:

In my case, it’s learning JavaScript and how to interface it with KM that seems well beyond my reach because of the time investment it would require. Anyway, as computing is kind of a hobby for me, would you have a good book or introductory tutorial to recommend? (I know the principles of programming, but haven’t actually written anything beyond FileMaker scripts for years.)

The basic moves are just:

and:

One common way into it at the moment is the first 8 or 9 chapters of « Eloquent JavaScript »

[Eloquent JavaScript](https://eloquentjavascript.net/)

(You don't need the remaining chapters, which are about the special case of JavaScript in Browsers)

To be honest (looking at it again),

just the first 5 chapters.

Thank you very much!

(Do you speak French, by the way?)

Yes, since I wasn't sure of your requirements, I added the qualifier when I posted the pattern. Now I see that @drdrang's RegEx pattern is the best for your requirements.

With that said, it never hurts to contribute alternatives since others may see this post in the future and have the need to remove trailing whitespace AND remove all whitespace-only lines.


FYI, I recently shared a macro that is pertinent to this topic: Text Transformation EXAMPLES

JavaScript is not a language I've learned*, but I have been known to borrow JavaScript code from others. I've even made some minor changes when I've felt particularly brave. @ComplexPoint's Markdown Link Tool is one such example.

(*I'm a relatively old chemical/process automation engineer, and through the many years we often had to do complex transformations, particularly when we were moving a configuration from one control system to another. For the really fun stuff, we used awk, first on HP-UX, later on macOS, and then finally on Windows. Oddly enough, there was a third-party awk compiler available on Windows. It was very good.)

I've added @ComplexPoint's JXA method to the example macro. It will be included with the next update.

Macro Group-image

Someone, correct me if I'm wrong, but I think one must use KM global variables within JXA. In the image above, you'll see I added the Set Variable to Text and Execute AppleScript to delete the global variable before and after the Execute JavaScript For Automation action, respectively.

Finally, I'd like to point out that there is one subtle difference between @drdrang's RegEx and @ComplexPoint's JXA. The former does not remove leading and trailing whitespace-only lines.

Output-images


A couple of followup comments:

First, I agree with @ComplexPoint that too many people try to use complex regular expressions to solve problems that are better solved by a few steps of scripting. But in this case, I would argue that trimEnd is basically a regex substitution hiding under another name.
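To illustrate that last point: in JavaScript, `trimEnd` strips whitespace and line terminators, which as far as I know is the same set of characters that `\s` matches, so the two functions below should agree on any single string. The function names are invented for the comparison.

```javascript
// trimEnd on a string...
const viaTrimEnd = s => s.trimEnd();

// ...and the regex substitution it closely resembles (no m flag: end of string only).
const viaRegex = s => s.replace(/\s+$/, "");

const samples = ["abc  ", "abc\t \n", "  abc", "", "a b \u00a0"];
console.log(samples.every(s => viaTrimEnd(s) === viaRegex(s)));
// → true
```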

Having said that, I'm now going to backtrack (yes, that's a regex joke). @olivierps didn't mention in the original question that this was going to be used on Markdown text. Stripping whitespace from the ends of Markdown lines is, in general, not a good idea. I can think of two situations in which whitespace at the end of a line is essential:

  1. Paragraphs within a list item are created by adding a line that contains either four spaces or a tab and nothing else. Although you may think of such a line as beginning with whitespace, it also ends with whitespace and would be ruined by my regex substitution (and by trimEnd, too).
  2. Hard line breaks are created by adding two spaces at the end of a line. These, too, would be ruined by stripping all line-ending whitespace.

I'm guessing @olivierps doesn't use these Markdown constructs or they wouldn't have asked for a solution that strips all trailing whitespace. So maybe the regex I suggested will work for every case @olivierps encounters, but it's not a general solution for Markdown text. If I were going to build a solution for Markdown that respected trailing whitespace for the two special cases above, I would not try to do it with a single regular expression. I would use a script.
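As a rough idea of what such a script could look like, here is a JavaScript sketch. The rules are my own simplification of the two cases above, not a full Markdown parser: whitespace-only lines are kept untouched, and a non-blank line ending in two or more spaces keeps exactly two (an assumption about how a hard break should be normalized). All function names are hypothetical.

```javascript
// Hypothetical per-line rule for Markdown-aware trimming.
const trimMarkdownLine = line => {
    // Case 1: whitespace-only lines (e.g. four spaces inside a list item) stay as-is.
    if (line.trim() === "") return line;

    const trimmed = line.trimEnd();

    // Case 2: two or more trailing spaces mark a hard line break; keep exactly two.
    return / {2,}$/.test(line) ? trimmed + "  " : trimmed;
};

const trimMarkdown = text =>
    text.split("\n").map(trimMarkdownLine).join("\n");

console.log(JSON.stringify(trimMarkdown("item  \nnext \t\n    \nend ")));
// → "item  \nnext\n    \nend"
```

Splitting into lines first keeps each rule readable, which is harder to achieve inside a single regular expression.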
