RegEx -- Harvest Variables and Trim Trailing White Space

For the given text:

Expand
Bohrer, Phyllis

The following works well. Thank you, @ccstone.

Sometimes there is a space at the end of the first name and/or last name, and the action then fails.
I looked up how to filter white space and tried [ \t]+$, to no avail.
I'd appreciate any help.

  • Could we allow for an optional space at the end of the last name and/or the first name, so that it works whether the space is there or not? Is that possible? Thanks.

Hey @troy,

What I usually do with this sort of problem is run a “sanity” script to prepare input text for find and/or find/replace operations.

So for instance I'll:

Strip leading & trailing vertical whitespace  ⇢  \A\s+|\s+\Z
Strip leading horizontal whitespace           ⇢  ^\h+
Strip trailing horizontal whitespace          ⇢  \h+$
Fix comma spacing                             ⇢  \h*,\h* to “, ”

Every time I encounter an anomaly in my input text I'll add another fix to the sanity filter.

This method keeps me from having to go crazy trying to predict every eventuality in my input text.
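
For anyone who wants to try the idea outside Keyboard Maestro, here is a minimal sketch of such a sanity filter in Python. The function name sanitize is made up for illustration, and because Python's re module does not support \h, the sketch assumes [ \t] is a close enough stand-in for horizontal whitespace.

```python
import re

def sanitize(text: str) -> str:
    """Normalize whitespace before running a find or find/replace (illustrative helper)."""
    # Strip leading and trailing whitespace, including blank lines, from the whole text.
    text = re.sub(r"\A\s+|\s+\Z", "", text)
    # Strip leading horizontal whitespace on every line ([ \t] stands in for \h).
    text = re.sub(r"^[ \t]+", "", text, flags=re.MULTILINE)
    # Strip trailing horizontal whitespace on every line.
    text = re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)
    # Normalize comma spacing to a comma followed by a single space.
    text = re.sub(r"[ \t]*,[ \t]*", ", ", text)
    return text

print(repr(sanitize("Expand\nCampbell , Colleen \n")))
# 'Expand\nCampbell, Colleen'
```

Each new anomaly found in the input becomes one more re.sub line, so the matching pattern itself can stay simple.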

Makes sense?

-Chris

I tried the following to no avail. It did not remove any spaces.
In addition, even if that worked, it would not take care of the internal space after the last name:

Expand
Campbell , Colleen 

That only addresses white space at line endings, correct?

This worked for removing the space before the comma, which is the immediate issue. I hear what you are saying though, to not fix just one issue but be able to resolve a number of issues as they appear.
As always, grateful for your help.

Search with a regular expression like this:

Expand\n(\w+)(\s)?,\s(\w+)(\s)?

The ? means that the (\s) may or may not be there...

Check it out here...

That does not work for me. It does not set the first name and last name variables, and it does not work on regex101.com either, for the text:

Expand
Boland, Lauren

or

Expand
Boland , Lauren

the regex

Expand\n(\w+)\s?,\s(\w+)\s?

This captures the last name and the first name. Keep only what you need to capture inside ( ).

\s* is generally better than \s? for this kind of thing. * means zero or more, ? means 0 or 1.
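
A quick way to see the difference, sketched in Python (the sample text with two spaces before the comma is made up for illustration):

```python
import re

text = "Expand\nBoland  , Lauren"  # two spaces before the comma

optional_one = re.compile(r"Expand\n(\w+)\s?,\s?(\w+)")  # at most one whitespace character
zero_or_more = re.compile(r"Expand\n(\w+)\s*,\s*(\w+)")  # any amount of whitespace

print(optional_one.search(text))           # None
print(zero_or_more.search(text).groups())  # ('Boland', 'Lauren')
```

With \s? the match fails as soon as a second space shows up, while \s* absorbs however many there are.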

Note that \s includes line endings, so you need to be slightly careful when using it. \h matches only horizontal white space, but is not supported until 10.11+, so if you happen to be using 10.10 or earlier, that will not work.
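
To illustrate the line-ending caveat, here is a small Python sketch. Python's re module has no \h, so [ \t] is used as an assumed stand-in for horizontal whitespace, and the sample text is invented.

```python
import re

text = "Expand\nBoland ,\nLauren"  # the first name sits on the following line

with_s = re.search(r"(\w+)\s*,\s*(\w+)", text)          # \s may cross line endings
with_h = re.search(r"(\w+)[ \t]*,[ \t]*(\w+)", text)    # spaces and tabs only

print(with_s.groups())  # ('Boland', 'Lauren')
print(with_h)           # None
```

Whether matching across the line break is wanted depends on the input, which is why the horizontal-only form is the safer default here.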

Here is my solution:

RegEx:
Expand\R([\w'\-]+)\h*,\h*([\w'\-]+)

See https://regex101.com

Differences

  1. I prefer using \h* which will match 0 or more horizontal white spaces.
  2. This pattern will match names that contain "-" and "'", as in:
    O'conner, Phyllis-Adams
  3. It uses \R to match any new line character, not just LF

Note that \h and \R require OS X 10.11+
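
For testing the same idea outside Keyboard Maestro, here is a rough Python translation. It is only a sketch: Python's re module supports neither \h nor \R, so it assumes [ \t] and (?:\r\n|\n|\r) as approximate stand-ins, and the names NAME_PATTERN and harvest_names are made up for illustration.

```python
import re

NAME_PATTERN = re.compile(
    r"Expand(?:\r\n|\n|\r)"  # literal "Expand", then a line break (approximates \R)
    r"([\w'\-]+)"            # group 1: last name, allowing ' and -
    r"[ \t]*,[ \t]*"         # comma with optional spaces or tabs (approximates \h*)
    r"([\w'\-]+)"            # group 2: first name, allowing ' and -
)

def harvest_names(text):
    """Return (first_name, last_name), or None when the text does not match."""
    match = NAME_PATTERN.search(text)
    if match is None:
        return None
    last_name, first_name = match.groups()
    return first_name, last_name

print(harvest_names("Expand\r\nO'conner , Phyllis-Adams "))
# ('Phyllis-Adams', "O'conner")
```

Because the character class stops at whitespace, a trailing space after the first name is simply left out of the capture.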

I just updated my post to correct a very minor error (which doesn't affect the match). I had an extra single quote (') that was not needed.
