Replace full stop new line with pipe character

I want to prepare a list of abbreviations on the clipboard for further use by replacing all trailing full stops followed by a new line character with a pipe character:

Dr.
Mrs.
vs.

Becomes:

Dr|Mrs|vs|

The last pipe is added because the list ends with a new line character. I tried using a negative lookahead for \z (the end of the input), but then the last full stop wasn’t removed.

Is my task possible with one action?

One approach (split and join):

Pipe char from stop and possible new line.kmmacros (2.1 KB)


// Replace every "." with "|", then delete every newline.
return kmvar.local_Source
    .split(".")
    .join("|")
    .split("\n")
    .join("");

Search and Replace for \.\n


That’s indeed the regex that I use. But it’ll add one pipe too many, after the last abbreviation.
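To make the extra pipe visible, here is a minimal JavaScript sketch of the same regex applied to the sample list. The variable names are only for illustration, and JavaScript's regex engine stands in for the one Keyboard Maestro uses:

```javascript
// Sample list from the question: each entry ends with "." and a newline.
const source = "Dr.\nMrs.\nvs.\n";

// Replace every "." + newline pair with a pipe.
const piped = source.replace(/\.\n/g, "|");

// The final ".\n" is replaced as well, which leaves the unwanted trailing pipe.
console.log(piped);
```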

Was your initially stated requirement correct, or do you want only the first two dots to be replaced? Is the desired output actually of the form "Dr|Mrs|vs"?

Edit: I now suspect that you want "Dr|Mrs|vs.", but please clarify.

Why use one complicated action when you can use two simple ones?

Your problem seems to be that the last valid entry in the list will probably (definitely?) end with a ., and there will probably (definitely?) be a trailing linefeed. It's difficult to be certain with only one sample!

So the simplest approach is to delete any trailing periods/linefeeds, then do the search and replace:

image
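Since the screenshot doesn't survive as text, here is a rough JavaScript sketch of that two-step idea. It is an assumption about what the two actions do, not a transcription of them:

```javascript
const source = "Dr.\nMrs.\nvs.\n";

// Step 1: delete any run of periods/linefeeds at the very end of the text.
const trimmed = source.replace(/[.\n]+$/, "");

// Step 2: turn each remaining "." + linefeed pair into a pipe.
const joined = trimmed.replace(/\.\n/g, "|");

console.log(joined);
```

Because the trailing characters are gone before the replace runs, there is no final match left to turn into a pipe.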

I need a list of about 200 abbreviations converted to one regex with no trailing full stop or pipe. I can do this in two actions but out of interest I was wondering if it can be done in one action.

Out of interest I was wondering if it can be done in one action. I now solve it by removing the trailing pipe from the concatenated string.

Your question is an interesting one. Alas I don’t have an answer to it.

The last entry will definitely end with a full stop and a linefeed:

Download abbreviations-en_US.txt.zip @ Dropbox

Edit: Sorry, I posted before thinking it through. This will not remove the final full stop and newline, only not replace it. Will have to think further on this one.


Then I believe searching for \.\n(?=.) should do the trick. The positive lookahead at the end ensures that the full stop and newline are only matched, and so replaced, when another character follows them.
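A quick JavaScript check of that lookahead pattern on the sample list, as a sketch only (JavaScript's engine is assumed to behave like KM's for this pattern):

```javascript
const source = "Dr.\nMrs.\nvs.\n";

// Only replace ".\n" when at least one more character follows it.
const result = source.replace(/\.\n(?=.)/g, "|");

// The last ".\n" has nothing after it, so it is skipped rather than removed.
console.log(JSON.stringify(result));
```

So the lookahead avoids the trailing pipe, but the final full stop and newline are still in the output.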

And that's the problem -- it's not conditional matching that's the issue, it's that @ALYB wants to do conditional replacement, "with a | except for the final match which should be with an empty string".

Probably doable in a single action by shelling out and using sed or similar, but I don't think you can do it with a "normal" regex S'n'R. There's a couple of possibilities here but a) (AFAIK) the KM regex engine doesn't do branch reset, and b) the "replacement pool" method does my head in!
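For comparison, outside a plain search-and-replace action the conditional replacement is straightforward. This is a hedged sketch in JavaScript rather than sed, using the replacer-function form of String.prototype.replace; the function name pipeList is made up for the example:

```javascript
// Replace ".\n" with "|", except for a match that ends the input,
// which is replaced with an empty string instead.
function pipeList(source) {
  return source.replace(/\.\n/g, (match, offset, whole) =>
    offset + match.length === whole.length ? "" : "|"
  );
}

console.log(pipeList("Dr.\nMrs.\nvs.\n"));
```

Note that if the list has no trailing newline, the last full stop is never matched and stays in the output.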


Yes, it does seem impossible within a single regex search and replace. Interesting article though, that I will check out further!


If the processed result is then used in a text field, one could get rid of the trailing pipe using variable array notation though, hiding the step of removing (or not reading in) the trailing pipe at the receiving end:

TEST Search and replace without leaving trailing pipe.kmmacros (3.1 KB)

In the split and join approach, you can limit the range of interest to before the last dot:

const source = kmvar.local_Source;

// Keep only the text before the final ".", then turn each ".\n" into "|".
return source
    .slice(0, source.lastIndexOf("."))
    .split(".\n")
    .join("|");

Pipe char from stop and NL (except at end).kmmacros (2.2 KB)


I've had this thread's quest (for a one-action, KM-native solution) as a brain bug for a while now. On realizing that .\n is a fully valid variable array delimiter, I came up with this kind of "solution", consisting of a single Set Variable to Text action within a For Each. So not a true solution, but it is still quite contained, and is at the very least very native KM™:

TEST Replace full stop new line with pipe character- without leaving trailing pipe.kmmacros (3.6 KB)

Macro Image

But it is a lot more work -- and so takes a lot more time -- as the list gets longer. So I'd go with the simpler, quicker:

image

...which is also pretty much self-documenting -- "change it to a |-delimited pseudo-array then delete the last item".

That also lets us create a more generalised solution that caters for lists that don't always have periods at the end of items, or may not have the trailing linefeed:

Generalised Listing.kmmacros (4.8 KB)

Image
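As a rough idea of what such a generalised version could look like in script form, here is a JavaScript sketch. It is not a transcription of the attached macro, and the function name toPipeList is invented for the example:

```javascript
// Build a pipe-delimited list from lines that may or may not end with "."
// and from input that may or may not end with a linefeed.
function toPipeList(source) {
  return source
    .split("\n")
    .map(line => line.trim().replace(/\.$/, "")) // drop one trailing "." per item
    .filter(item => item.length > 0)             // ignore blank lines
    .join("|");
}

console.log(toPipeList("Dr.\nMrs\nvs. \n"));
```

Only a full stop at the end of an item is removed, so an entry such as "e.g." would keep its internal dot.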


And I fully agree! This plain normal way of solving it, or @ComplexPoint's JavaScript-scripted method, is the sensible way of performing the task. My take was only in the interest of exploring other ways to tackle the same.

Sometimes my curious mind leads me deep down these rabbit holes where my original problem evolves into something that does not really need solving at all. I still do enjoy "pure" problem solving like this though, and sometimes it leads to solutions that actually bring something to the table.

Other times I end up with absolutely silly "solutions", like the one attached below here as a supplement to this anecdote:

Silly anecdotal ''solution'' to this problem that was already perfectly solved from the get-go.

Before I realized that .\n was a fully valid variable array delimiter, I ended up finding a way to build a counter into the calculation field in the variable array notation, like this:

local__processed[CALCULATE(%Variable%local__processed[0]|%) + (CHARACTERS(%Variable%local__processed%) != 0)]|

Totally silly: Yes. But I must admit it, I really enjoyed "solving" this here problem I had created for myself!

TEST Replace full stop new line with pipe character- without leaving trailing pipe.kmmacros (3.8 KB)


Which is where the fun happens!
