thank you
regex101: build, test, and debug regex
Regular expression tester with syntax highlighting, explanation, cheat sheet for PHP/PCRE, Python, GO, JavaScript, Java, C#/.NET, Rust.
thank you
Well, we would need to see the concrete source of the data,
but if, for example, it turned out that:
then vanilla ( regex-free ) KM actions might let you fairly quickly:
Recast each tab-delimited field as a separate line:
and then let a for each action wind through those lines,
just giving each of them, in turn, a variable name like part,
so that the final value with the name part
would be the last segment of the string:
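Outside Keyboard Maestro, the same idea can be sketched in a few lines of Python. This is only an illustration of the approach described above, not the macro itself; the function name last_tab_segment and the sample line are made up for the example, and the sample assumes the fields really are tab-delimited.

```python
def last_tab_segment(line: str) -> str:
    """Return the last tab-delimited field of `line` (hypothetical helper)."""
    # Recast each tab-delimited field as a separate line.
    lines = line.replace("\t", "\n").split("\n")
    part = ""
    # Wind through the lines, reusing one variable name each time,
    # so that after the loop `part` holds the final segment.
    for current in lines:
        part = current
    return part


# Assumed sample: fields separated by tabs.
sample = "90\tHighlight\t2020-11-20, 21:35:17\tLorem ipsum..."
result = last_tab_segment(sample)
print(result)
```

In everyday Python the loop would usually be replaced by sample.rsplit("\t", 1)[-1]; the loop is kept here only to mirror the for each action.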
thanks a lot. I am trying it and will give you a follow-up
Just to keep it simple (I can later create a version with variables) I highlighted the whole text → copy → macro action below → paste
all tabs are replaced with line feed
I think that the text I end up with is basically the equivalent of cliplines.
After that I don't follow. Sorry. thank you
Last tab-delimited segment of a string.kmmacros (20.4 KB)
If:
then this would return what you want in the part variable:
but it may simply be that that isn't exactly the pattern of the source data.
What application do these annotations come from ?
A problem with PDF annotations is that each annotation is preceded by a prefix which makes for very tedious reading, so I want to just delete all prefixes in an RTF file.
The following regex works
^(?\d+)\s(?Highlight)\s\s(?\d{4}-\d+-\d+),\s(?\d+:\d+:\d+)\s(?.+)$
with the following prefix
10 Highlight 2020-11-20, 21:48:47 Lorem ipsum...
But I would like it to work with 2 variants.
1- variant one
90 Highlight 2020-11-20, 21:35:17 Lorem ipsum...
I think that the difference is that it works with a single space after the first number, and it should be a tab instead
2- variant two
Same as above but the name John Smith (ie author's name) is added followed by a tab
2 Highlight John Smith 2020-11-20, 17:19:26 Lorem ipsum...
First, I would like to encourage you to keep using and learning Regular Expressions (RegEx).
IMO, it is one of the most powerful and useful languages, and can be used just about everywhere.
Now, to your request.
Actually, you were pretty close with:
^(?\d+)\s(?Highlight)\s\s(?\d{4}-\d+-\d+),\s(?\d+:\d+:\d+)\s(?.+)$
My solution is:
(?mi)^\d+.+\d{4}-\d{2}-\d{2},\h+\d{1,2}:\d{1,2}:\d{1,2}\h+(.+)
For details, see regex101: build, test, and debug regex
The key to developing a good reliable RegEx is in recognizing the patterns in the data.
In this case, here is what I see:
Note that I used the following RegEx patterns:
\h+ to match one or more TABs and/or SPACES
\d{1,2} to match one or two digits
It might be tempting to include "Highlight" in the pattern, but that word might not apply to all annotations. IAC, it is not needed for a good match.
So, the KM solution is to use Search and Replace action:
Search FOR (using Regex)
(?mi)^\d+.+\d{4}-\d{2}-\d{2},\h+\d{1,2}:\d{1,2}:\d{1,2}\h+(.+)
Replace WITH
\1
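For readers who want to try the same search and replace outside KM, here is a rough Python sketch. Two things are assumptions: Python's re module does not support \h, so [ \t] stands in for it, and the sample lines are rebuilt from the prefixes quoted earlier in the thread with tabs or spaces guessed where the original whitespace is not visible. The function name strip_prefixes is made up for the example.

```python
import re

# Same pattern as the KM search, with \h written as [ \t] because
# Python's re module has no \h (horizontal whitespace) token.
PREFIX = re.compile(
    r"(?mi)^\d+.+\d{4}-\d{2}-\d{2},[ \t]+\d{1,2}:\d{1,2}:\d{1,2}[ \t]+(.+)"
)


def strip_prefixes(text: str) -> str:
    """Replace each annotation line with its captured text (the \\1 group)."""
    return PREFIX.sub(r"\1", text)


# Assumed samples: original form, tab variant, and author-name variant.
sample = "\n".join([
    "10 Highlight  2020-11-20, 21:48:47 Lorem ipsum...",
    "90\tHighlight\t2020-11-20, 21:35:17\tLorem ipsum...",
    "2\tHighlight\tJohn Smith\t2020-11-20, 17:19:26\tLorem ipsum...",
    "testing regex",
])

cleaned = strip_prefixes(sample)
print(cleaned)
```

Lines that do not start with a number followed by a date and time, such as the last one, are left untouched.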
Please feel free to ask any questions.
BTW, the RegEx101.com site is now back up.
You have a great solution. thanks VERY much. Works perfectly. Regex: I am trying but it's hard
thanks very much.
@JMichaelTX @ComplexPoint @tiffle
I don't know what Crystal Meth is like, but 2 days working on a regex must come close.
But hopefully, afterwards, you still remember what you’ve learned
Short regular expressions for personal use are still legal in most jurisdictions, but distribution of longer regexes really deserves some close attention from the regulatory authorities.
Personal desktop use for string mangling was never what Kleene's regular language formalism (Regular language - Wikipedia) was intended for
Sadly, whatever you wrote is seldom even legible the following morning, unless it was extremely short ...
Regular, but sub-economically time-consuming, and write-only.
Regular expressions are addictive, because they are:
(Which creates an insatiable hunger for more practice material, and a temptation to encourage others to become aspirant users, seeking help, and supplying material)
Like many such substances, best used in tiny (very short) quantities, and only after other possibilities have been exhausted.
The Jamie Zawinski point is not far from the mark:
Jamie Zawinski - Wikiquote
Two days is no joke ...
very funny !
Thanks to your email I will plunge into regex, starting with the basics and try to emulate your short and sweet approach.
Great! I have no doubt that you can become productive using RegEx in short order.
Not only is RegEx101.com a great place to develop and test RegEx, it also provides detailed explanations of a RegEx pattern. For example:
See regex101: build, test, and debug regex
In spite of @ComplexPoint's campaign to urge everyone to not use RegEx, using false claims, Regex remains a very powerful language. Like any language, it requires study and practice to master. But there are many simple RegEx patterns that can solve many common problems.
As you know, I also like, and often use, both JavaScript and AppleScript. While string manipulation can easily be done with the powerful built-in functions of JavaScript, there are many string manipulations that can be easily done using one line of RegEx, but would require many, many lines of JavaScript.
Regular expressions (RegEx or RegExp) are extremely powerful, but have an initial steep learning curve that is often intimidating. But once you get over that initial hump, and you continue to write new RegExp, it will become much easier.
I do all of my RegEx development at this free website: Regex101.com
You may also find these sites helpful:
thanks very much for the detailed info. A source of frustration is the fact that a regex may work on regex101, but not in BBEdit or KM search replace regex.
IME, I have found Regex101.com and KM to be very consistent.
However, it does require that you set up Regex101.com correctly, set the needed Regex options in the Regex pattern, and use the proper KM settings.
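As a rough illustration of what setting options inside the pattern means, the Python sketch below compares the same searches with and without the inline m and i options. The sample text is made up, and Python's re module is only a stand-in here; it is not the engine that KM or Regex101's PCRE flavor uses, although these two options behave the same way in both.

```python
import re

# Made-up two-line sample.
text = "1 highlight\n2 Highlight"

# Without (?m), ^ only matches at the very start of the text.
without_m = re.findall(r"^\d+", text)
# With (?m), ^ matches at the start of every line.
with_m = re.findall(r"(?m)^\d+", text)

# Without (?i), the match is case-sensitive.
without_i = re.findall(r"highlight", text)
# With (?i), upper and lower case are treated alike.
with_i = re.findall(r"(?i)highlight", text)

print(without_m, with_m)
print(without_i, with_i)
```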
"(?mi)" is used in all of the above examples, but you should only use the options that you need:
"m" -- Multiline so that "^" and "$" match on each line
"i" -- Case insensitive
Please let me know if you have other questions.
Strange I posted a detailed reply, and even corrected a typo, and it has disappeared.
Yes - I saw your reply and in between then and now (about 30 minutes) it has gone! (It had a lot of images...)
thank you for your comment
So I hear the reply disappeared. All is fine now. Thank you Chris who went through all the trouble of reviving it.
Thanks again very much. I am very sorry to take so much of your time.
I am going crazy simply because I get different results with BBEdit, KM and Regex101
How can I even start to learn in those conditions?
That's what I did for 2 days, jumping from one to the other.
Please note that the BBEdit playground is useful because you can save the regex to a text factory.
The example below uses your excellent annotation clean regex.
Sorry for the many images
Objective
Convert
1 Highlight 2020-11-20, 20:20:30 Lorem ipsum...
testing regex
To
Lorem ipsum...
testing regex
Note that the testing regex line is there just so you don't think that I am simply deleting all characters before character no: 53
Regex used is your excellent one:
(?mi)^\d+.+\d{4}-\d{2}-\d{2},\h+\d{1,2}:\d{1,2}:\d{1,2}\h+(.+)
The regex101 link for the issue below is:
Regex101 allows you to create, debug, test and have your expressions explained for PHP, PCRE, Python, Golang and JavaScript. The website also features a community where you can share useful expressions.
The KM macro works fine
BBEdit playground selects ALL the text including Lorem ipsum
Regex101 does nothing despite making sure I followed your settings guideline
Hey @ronald,
Well - a big part of the problem is that you don't know what you're doing.
That makes the learning process more difficult. Believe me – I almost went bald when I first started trying to learn regex back about '96.
Of course it doesn't help that Keyboard Maestro and BBEdit and RegEx101 can use different syntaxes of regular expressions either.
However – the way you've got RegEx101 configured looks like it should work...
Except that you've got an extra linefeed in the regex pattern. (Good thing you posted a link to the actual saved page!)
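To see why one stray linefeed can make a pattern match nothing, here is a small Python sketch. The thread does not say where in the pattern the linefeed sat, so placing it at the start is purely an assumption for illustration; \h is again written as [ \t] because Python's re module lacks it, and the options are passed as flags rather than inline.

```python
import re

FLAGS = re.MULTILINE | re.IGNORECASE

# The annotation pattern, with \h written as [ \t] for Python.
CLEAN = r"^\d+.+\d{4}-\d{2}-\d{2},[ \t]+\d{1,2}:\d{1,2}:\d{1,2}[ \t]+(.+)"
# Hypothetical: the same pattern with a linefeed pasted in front of it.
STRAY = "\n" + CLEAN

# Assumed sample, with tabs between the fields.
text = "1\tHighlight\t2020-11-20, 20:20:30\tLorem ipsum...\ntesting regex"

clean_match = re.search(CLEAN, text, FLAGS)
# The linefeed is matched literally, so the annotation would have to be
# preceded by a line break; here it is on the first line, so nothing matches.
stray_match = re.search(STRAY, text, FLAGS)

print(clean_match is not None, stray_match is not None)
```

Trimming whitespace from a pasted pattern (STRAY.strip() in this sketch) is one way to rule this out.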
Your test set with the extra linefeed:
Your test set with the extra linefeed removed:
My test set, before I realized you had one saved already:
BBEdit's playground is selecting the entire text, because that's what you told it to do.
Note how I've added a replacement pattern \1, and it shows what the replacement text will be.
If you click the Next button the next line will be highlighted in bright yellow and will show what will be replaced there.
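The difference between what the pattern matches and what the replacement keeps can also be sketched in Python. This is an illustration only, not how BBEdit works internally; the sample text assumes tabs between the fields, and [ \t] replaces \h because Python's re module does not support it.

```python
import re

PATTERN = re.compile(
    r"(?mi)^\d+.+\d{4}-\d{2}-\d{2},[ \t]+\d{1,2}:\d{1,2}:\d{1,2}[ \t]+(.+)"
)

# Assumed sample, with tabs between the fields.
first_line = "1\tHighlight\t2020-11-20, 20:20:30\tLorem ipsum..."
text = first_line + "\ntesting regex"

match = PATTERN.search(text)

# Group 0 is the whole match: prefix plus text, which is why a
# find-only search highlights the entire annotation line.
whole_match = match.group(0)
# Group 1 is only the part inside the parentheses.
captured = match.group(1)
# Expanding the replacement template \1 gives what would be left behind.
replacement = match.expand(r"\1")

print(repr(whole_match))
print(repr(captured))
```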
With stuff like this you need to NOT knock yourself out. If you can't grok the problem/solution in 15-30-60 minutes then you need to reach out for some help.
I've spent hours banging my head against the wall, when what I really needed was a little advice from someone more advanced on the learning curve than me.
Over the years I've gotten pretty good at limiting the amount of blood spilled from all the banging.
-Chris
It's all crystal clear now, in terms of regex101, BBEdit and KM macro and what my mistakes were.
Thank you both for your advice.
In fact I tried learning regex a few times in the past, obviously unsuccessfully.
I think that a fundamental problem with regex is that all the online courses go through a dizzying number of expressions which I would forget as soon as I started the next chapter.
The reality is that the best time to learn regex is when I need it, i.e. when I need it to solve a specific problem. The range of regex expressions that I will end up using is very narrow, and much of what is taught in those courses I will never use.
That being said, I will try again.