How do I Remove Blank Lines with Regex?

troy · November 27, 2024, 3:12pm

I found a couple posts on the forum and followed the suggestions to no avail.

While I'm at it, the 'numbered' lines could also start with "#1. ", so I'd also like to remove the "#1. " at the start of those lines, leaving the rest of the text in those lines.

If I could at least get the "1. " and/or the "#1. " removed from the start of those lines, and all the other lines removed completely, that would be great.
If I can't keep the blank line before "Tone:" or "Style:", that's fine; I can manually add two carriage returns.
There will always be the words Audience, Tone and Style in the original block of text.

I appreciate any feedback.
Cheers

Set Variable to Text.kmactions (1.5 KB)

The original TheText source block is:

Audience:
1. Pet Owners  
   - Primary focus on pet owners, as they are the most directly impacted by safe cleaning practices.
2. Environmentalists  
   - Secondary focus for eco-conscious readers interested in sustainable cleaning solutions.  
3. Parents  
   - Tertiary focus for families with young children, as safe cleaning practices benefit both pets and kids

Tone:
1. Friendly  
   - Establishes trust and relatability, especially important for pet owners and families.  
2. Descriptive  
   - Clearly explains the benefits and features of safe cleaning products, enhancing understanding.
3. Empathetic  
   - Addresses concerns about pet and family safety, connecting emotionally with the audience

Style:
1. Guide  
   - Provides actionable advice and step-by-step cleaning methods, ideal for pet owners seeking practical solutions.  
2. How-to  
   - Perfect for breaking down cleaning techniques and DIY solutions for hardwood floors and safe cleaning.  
3. FAQ  
   - Addresses common questions about safe cleaning products, reinforcing the article's authority and utility.

The desired result would be:

Audience:
Pet Owners  
Environmentalists  
Parents  

Tone:
Friendly  
Descriptive  
Empathetic  

Style:
Guide  
How-to  
FAQ

griffman · November 27, 2024, 3:35pm

I'm 100% positive there will be other much more elegant solutions, but here's a brute force approach with four Search/Replace via Regex actions:

Download Macro(s): Process a text block.kmmacros (5.8 KB)

Macro notes

Macros are always disabled when imported into the Keyboard Maestro Editor.
- The user must ensure the macro is enabled.
- The user must also ensure the macro's parent macro-group is enabled.

System information

macOS 14.7.1
Keyboard Maestro v11.0.3

It handles the sample block you posted:

I don't actually know why there's not a blank row at the top; I would have expected the final regex to add one, but it doesn't seem to do so. As noted, not elegant at all, and hard to scale as you add more conditions, but it does work.

(If you downloaded the macro as soon as I posted it, make sure you got the version where the first regex reads like this: (?m)^[1-9]+\.\ + ... I was missing a plus sign after the numbers, so double-digit line numbers were missed.)
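
A small sketch of what that first pattern does with multi-digit numbers, written in JavaScript purely for illustration. The sample lines are made up, and because JavaScript has no leading (?m) flag, multiline mode is set with the /m suffix instead. Note that the class [1-9] does not include 0, so a prefix such as "10. " is still skipped; the [0-9] variant below is an assumed alternative, not what the macro ships with.

```javascript
// Hypothetical sample, chosen to exercise multi-digit prefixes.
const sample = ["9. Nine", "10. Ten", "12. Twelve"].join("\n");

// The pattern quoted above, with multiline mode moved to the /m suffix.
const oneToNine = /^[1-9]+\. +/gm;
// Assumed variant: same pattern, but the digit class also accepts 0.
const anyDigit = /^[0-9]+\. +/gm;

const strippedOneToNine = sample.replace(oneToNine, "");
const strippedAnyDigit = sample.replace(anyDigit, "");

console.log(strippedOneToNine); // Nine / 10. Ten / Twelve (one per line)
console.log(strippedAnyDigit);  // Nine / Ten / Twelve (one per line)
```

If the numbered lists never reach 10, both patterns behave the same.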

-rob.

Airy · November 27, 2024, 4:03pm

I do this all the time, and I use this shell-based solution (the "." is a regex that matches any single character):

grep .

This solution has one additional side benefit: it adds a newline to the end of the last line if the file ends without one. This is extremely important to me, as some shell commands will fail if the last line doesn't end with a newline.
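
For anyone who would rather stay inside a JavaScript action, here is a rough stand-in for what grep . does to a block of text. It is only a sketch: grepDot is a hypothetical helper name, and it assumes lines are separated by "\n". Like grep, it keeps lines that contain only spaces, since a space is still a character.

```javascript
// Rough JavaScript stand-in for `grep .`:
// keep every line that contains at least one character.
function grepDot(text) {
    const kept = text.split("\n").filter(line => line.length > 0);

    // grep writes each matching line followed by a newline,
    // so the result always ends with one (unless nothing matched).
    return kept.map(line => line + "\n").join("");
}

console.log(JSON.stringify(grepDot("a\n\nb"))); // "a\nb\n"
```

Note that this removes every blank line, including the ones between sections.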

ComplexPoint · November 27, 2024, 5:36pm

A bit puzzled by your title, and its relationship with the before and after samples.

You don't seem to want blank lines removed at all, as far as I can see ?

The pattern seems, instead, to be:

bulleted lines filtered out
numbered lines de-numbered
blank lines left in place

Is that about it ?

troy · November 27, 2024, 5:51pm

Well, yes, you are absolutely right.
I was thrown off because I was first able to remove the text from the 'bulleted'/hyphen lines, and then could not remove the blank lines.

Your assessment is correct:
bulleted lines filtered out
numbered lines de-numbered (if a "#1. " could also be removed, that's even better)
blank lines left in place
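
Those three rules, including the optional "#" in front of the number, could be sketched in JavaScript roughly as follows. The function name triageLines and the two patterns are assumptions made for illustration; the sketch treats any line whose first non-space character is a hyphen followed by whitespace as a bulleted line.

```javascript
// Hypothetical helper: apply the three rules to a block of text.
function triageLines(text) {
    const bulleted = /^\s*-\s/;       // optional indent, hyphen, whitespace
    const numbered = /^#?\d+\.\s+/;   // "1. " or "#1. " at the start of a line

    return text
        .split("\n")
        .filter(line => !bulleted.test(line))     // bulleted lines filtered out
        .map(line => line.replace(numbered, ""))  // numbered lines de-numbered
        .join("\n");                              // blank lines left in place
}

const demo = [
    "Audience:",
    "1. Pet Owners  ",
    "   - Primary focus on pet owners.",
    "#2. Parents",
    "",
    "Tone:"
].join("\n");

console.log(JSON.stringify(triageLines(demo)));
// "Audience:\nPet Owners  \nParents\n\nTone:"
```

Trailing spaces on the numbered lines are left alone, as in the desired result.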

ComplexPoint · November 27, 2024, 5:55pm

So lines triaged by prefix, essentially ?

Something like this ?

Lines triaged by prefix.kmmacros (3.4 KB)

JS source:

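// Capture a leading run of digits, dots, hyphens and whitespace,
// then the rest of the line.
const prefixed = /^([0-9\.\-\s]+)(.*)$/u;

return  kmvar.local_Source
        .split("\n")
        .flatMap(line => {
            const match = line.match(prefixed);

            // No prefix (headings, empty lines): keep the line as-is.
            // Prefix starting with a space (bulleted lines): drop the line.
            // Otherwise (numbered lines): keep only the text after the prefix.
            return null === match
                ? [line]
                : match[0].startsWith(" ")
                    ? []
                    : [match[2]]
        })
        .join("\n");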

ComplexPoint · November 27, 2024, 6:15pm

Or perhaps for a simpler pattern, as above, but using something like:

JS source:

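// Capture a "1. "-style number prefix, then the rest of the line.
const numbered = /^([0-9]+\. )(.*)/u;

return kmvar.local_Source
    .split("\n")
    .flatMap(line =>
        // Lines starting with a space (the bulleted lines) are dropped.
        line.startsWith(" ")
            ? []
            : (() => {
                const match = line.match(numbered);

                // Numbered lines keep only the text after the prefix;
                // all other lines pass through unchanged.
                return null === match
                    ? [line]
                    : [match[2]]
            })()
    )
    .join("\n");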

ComplexPoint · November 27, 2024, 6:57pm

For a single-pass (For Each) variant with no JS and just one regular expression:

^\d+\.\s+

Lines prefix-triaged by for Each action.kmmacros (6.5 KB)
