Cut lines with regex help

nikivi · August 25, 2016, 9:29am

I came up with a regex that is supposed to cut the extra lines if the string has two or more lines. The regex is /.*/

This doesn’t work, however. Can someone help me make it work? Sometimes the string I copy will have two or more lines and I want to remove everything after the first.

Thank you.

DanThomas · August 25, 2016, 5:15pm

(?m)(^.*$)([\w\W]*)

(?m) = multi-line, meaning "^" matches beginning of line, "$" matches end of line.
(^.*$) = match an entire line
([\w\W]*) = match the rest of the string
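
For anyone who wants to try the pattern outside Keyboard Maestro, here is a minimal sketch in Python. It assumes Python's re module, which is not the engine Keyboard Maestro uses, and the helper name keep_first_line is made up for illustration. The idea is a search and replace where the whole match is replaced by the first capture group.

```python
import re

# DanThomas's pattern: group 1 is the first line, group 2 is everything after it.
PATTERN = re.compile(r"(?m)(^.*$)([\w\W]*)")


def keep_first_line(text):
    """Replace the whole match with group 1, dropping the remaining lines.

    Hypothetical helper, not part of Keyboard Maestro.
    """
    return PATTERN.sub(r"\1", text, count=1)


print(keep_first_line("line 1\nline 2\nline 3"))
```

In Keyboard Maestro the equivalent would be a search and replace with the first capture group as the replacement.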

JMichaelTX · August 25, 2016, 9:31pm

If you are saying that you always want just the first line, regardless of how many lines are on the Clipboard, then this should do it. It is a slight variation of the solution provided by @DanThomas.

Assuming the clipboard contains this:

line 1
line 2
line 3

Use this RegEx:
(?m)(^.*$)
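
A rough Python equivalent of the search approach, as a sketch only. It assumes Python's re module rather than Keyboard Maestro's engine, and first_line is a made-up helper name. Instead of deleting the extra lines, it searches once and keeps the first capture.

```python
import re


def first_line(text):
    """Return what the first capture group holds, or None if nothing matched.

    Hypothetical helper for illustration.
    """
    match = re.search(r"(?m)(^.*$)", text)
    return match.group(1) if match else None


print(first_line("line 1\nline 2\nline 3"))
```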

DanThomas · August 25, 2016, 9:33pm

Much better than mine. I guess I focused too much on the “delete” part and not enough on the “result” part!

ccstone · August 26, 2016, 1:53am

Once you begin to ken regex, extracting line 1 is easy enough, and TMTOWTDI (there's more than one way to do it).

Deleting anything after line 1 is similarly easy:

Many people don't realize that regex switches (in this case (?s)) need not be at the beginning of a pattern.

-Chris

peternlewis · August 26, 2016, 4:02am

"/" is not part of a regular expression, not in the context of /.*/.

"/" is the delimiter for regular expressions in some contexts (notably Perl), that is, you would have /regex/ where "regex" is the regular expression.

Unless you want to match an explicit "/" character, you would not use them in a regular expression in Keyboard Maestro.

@JMichaelTX’s Search Clipboard is a good solution, and you just need to use the expression (.*) (your .* expression will match the first line (since . matches any character except end of line characters normally), but you need the brackets to "capture" the match).
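
Both points can be checked with a small Python sketch. This is for illustration only: it assumes Python's re module, which is not Keyboard Maestro's engine (though these simple patterns should behave the same way), and the sample strings are made up.

```python
import re

clipboard = "line 1\nline 2\nline 3"  # made-up sample text

# With the slashes included, the engine looks for literal "/" characters,
# so nothing is found in text that contains no slashes.
with_slashes = re.search(r"/.*/", clipboard)

# Without the slashes, and with brackets to capture, "." stops at the first
# end-of-line character, so group 1 is the first line.
first_line = re.search(r"(.*)", clipboard).group(1)

# The slashed pattern only matches when the text really contains slashes.
literal = re.search(r"/.*/", "a/b/c").group(0)

print(with_slashes, first_line, literal)
```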

Regex switches such as (?s) can also be "scoped" within a non-capturing bracket, like this:

\A(.+)(?s:.+)

This would match the entire string (assuming it had started with at least one non-end-of-line character, and contained at least two characters) and would capture the non-end-of-line characters at the start.

The \A is redundant in this case. However if the string might start with an end-of-line character, then the match will fail with the \A or return the first line after leading end-of-line characters without it. Similarly, the match will fail for a single character, or if there are no end-of-line characters, then it will return the first (only) line minus the last character, so better is:

(.*)(?s:.*)

which will always match, and will return the first non-blank line (or an empty string) as the first capture, but may not match the entire string (thus works better with the Search Clipboard than the Search & Replace Clipboard).
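
The edge cases described for \A(.+)(?s:.+) can be tried out with a short Python sketch. It assumes Python 3.6 or later (for the scoped (?s:...) syntax) and Python's re module, which may differ in details from Keyboard Maestro's engine. The names STRICT, NO_ANCHOR, LENIENT and first_capture are made up for illustration.

```python
import re

# \A anchors at the start of the string, (.+) captures non-end-of-line
# characters, and (?s:.+) lets "." match end-of-line characters inside
# that bracket only.
STRICT = re.compile(r"\A(.+)(?s:.+)")
# The same pattern without the \A anchor.
NO_ANCHOR = re.compile(r"(.+)(?s:.+)")
# The more forgiving variant, where both parts may be empty.
LENIENT = re.compile(r"(.*)(?s:.*)")


def first_capture(pattern, text):
    """Return capture group 1, or None when the pattern does not match."""
    match = pattern.search(text)
    return match.group(1) if match else None


samples = ["line 1\nline 2\nline 3", "a", "abc", "\nabc\ndef"]
for text in samples:
    print(
        repr(text),
        repr(first_capture(STRICT, text)),
        repr(first_capture(NO_ANCHOR, text)),
        repr(first_capture(LENIENT, text)),
    )
```

In this sketch the strict pattern fails on the single character and on the string with a leading end-of-line character, and loses the last character of "abc", while the lenient pattern matches every sample.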

You could also do this:

(?s:.*?)(.*)(?s:.*)

which will always match, and will return the first non-blank line (or an empty string) as the first capture, and always match the entire string.

Understanding the meaning of the (?s:REGEX) and the "*?" non-greedy match is worthwhile for extending your regex knowledge.
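
To see the difference between greedy and non-greedy matching, here is a minimal Python sketch. It assumes Python's re module rather than Keyboard Maestro's engine, and the sample text is made up.

```python
import re

text = "one\ntwo\nthree"  # made-up sample text

# (?s) lets "." match end-of-line characters for the whole pattern.
# Greedy ".*" takes as much as it can, so it runs up to the LAST newline.
greedy = re.match(r"(?s)(.*)\n", text).group(1)

# Non-greedy ".*?" takes as little as it can, so it stops at the FIRST newline.
lazy = re.match(r"(?s)(.*?)\n", text).group(1)

print(repr(greedy))
print(repr(lazy))
```

Here the greedy capture holds the first two lines and the non-greedy capture holds only the first.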

Regex, a bit like Keyboard Maestro, goes off into infinity. You don't need to know it all, but the more you know the more options you have available to you.

ccstone · August 26, 2016, 4:57am

That's right – the forward slashes are regex command delimiters used in languages like awk, sed, Perl, and JavaScript.

regex101.com and some others show them surrounding their regular expression pattern field, and it often confuses users who are unfamiliar with the regex code idioms of the various languages.

Patterns for finding text:

#!/usr/bin/env bash

read -r -d '' input <<'EOF'
one
two
three
four
five
EOF

awk '/f/' <<< "$input"
sed -n '/f/p' <<< "$input"
perl -wlne 'if ( /^.*f.*/ ) { print $& }'  <<< "$input"

Output for each command:

four
five

Patterns for replacing text:

#!/usr/bin/env bash

read -r -d '' input <<'EOF'
one
two
three
four
five
EOF

awk '{gsub(/e/,"•")};1' <<< "$input"
sed -E 's/e/•/g' <<< "$input"
perl -wnle 's/e/•/g; print' <<< "$input"

Once you've written 3 million of these you start seeing leaning toothpicks in your sleep...

Some languages allow you to substitute other characters for the forward slashes, so in sed and Perl I often use the exclamation point instead, especially when the regex pattern itself contains forward slashes:

sed -E 's!e!•!g' <<< "$input"
perl -wnle 's!e!•!g; print' <<< "$input"

Output for each command:

on•
two
thr••
four
fiv•

-Chris

nikivi · August 26, 2016, 12:06pm

Thank you all for your help. I’ve used the first solution by Dan and it worked great.

Can’t wait to write my first 3 million lines of this. One day.

JMichaelTX · August 26, 2016, 6:32pm

Good to hear that solved your problem. Please check the "Solved" checkbox at the bottom of Dan's post.

JMichaelTX · August 27, 2016, 5:42pm

Peter, since the period (".") will match only 1 character, don't you need this:
(.+)
to capture the first line?

peternlewis · August 29, 2016, 1:56am

Yes, the “*” is there in the .* that I wrote. Unfortunately, Discourse took it as markdown and hid it. I have added some quoting to my post.

JMichaelTX · August 29, 2016, 2:01am

Sounds vaguely like "the dog ate my homework". LOL

Just kidding Peter.
