Text columnar layout made from PDF to text scraped into single column

One way to anonymize the data is to change every digit to 3 and every letter to c. If there are fixed strings you want to preserve (e.g. “Spouse”), change them to a placeholder first, then change them back afterwards.

For example, say the data is like this:

Temple, Shirley
Spouse: Bill Mayer
Street: Main Street
City, ST, ZIP: Washington, 947564
Email: bla@bla.com

Stick it in BBEdit, and then do the sequence:

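Search and Replace regex [0-9] with 3
Search and Replace Spouse (case sensitive) with 1
Search and Replace Email (case sensitive) with 2
Search and Replace Name (case sensitive) with 4
Search and Replace Street: (case sensitive, colon included so the street name itself is not matched) with 5:
Search and Replace City, ST, ZIP (case sensitive) with 6
Search and Replace regex [a-zA-Z] with c
Search and Replace regex ccc+ with ccc
Search and Replace 1 with Spouse
Search and Replace 2 with Email
Search and Replace 4 with Name
Search and Replace 5 with Street
Search and Replace 6 with City, ST, ZIP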

to get something like:

ccc, ccc
Spouse: ccc ccc
Street: ccc ccc
City, ST, ZIP: ccc, 333333
Email: ccc@ccc.ccc

That way all the required data for figuring out the regex is there, but the confidential information is almost entirely obliterated.
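
If you would rather script it than click through the dialogs, here is a rough Python sketch of the same sequence. It is only an illustration, not a BBEdit feature: the function name `anonymize`, the `LABELS` list and the choice of placeholder digits are my own assumptions, and the labels are matched together with their colon so that a value like "Main Street" is not mistaken for the "Street:" label.

```python
import re

# Labels to keep readable. This list is an assumption taken from the sample
# record; each label includes its colon so values are not matched by accident.
LABELS = ["Spouse:", "Street:", "City, ST, ZIP:", "Email:"]

# Once every digit has been turned into 3, the other nine digits are free
# to act as temporary placeholders.
PLACEHOLDERS = "012456789"


def anonymize(text, labels=LABELS):
    if len(labels) > len(PLACEHOLDERS):
        raise ValueError("at most 9 labels can be protected this way")
    # 1. Turn every digit into 3.
    text = re.sub(r"[0-9]", "3", text)
    # 2. Swap each label for a placeholder digit (case sensitive).
    for digit, label in zip(PLACEHOLDERS, labels):
        text = text.replace(label, digit)
    # 3. Turn every ASCII letter into c.
    text = re.sub(r"[a-zA-Z]", "c", text)
    # 4. Collapse runs of three or more c's so word lengths are hidden.
    text = re.sub(r"ccc+", "ccc", text)
    # 5. Put the labels back.
    for digit, label in zip(PLACEHOLDERS, labels):
        text = text.replace(digit, label)
    return text


sample = """Temple, Shirley
Spouse: Bill Mayer
Street: Main Street
City, ST, ZIP: Washington, 947564
Email: bla@bla.com"""

print(anonymize(sample))
```

Run on the sample record, this should print the same anonymized block as shown above. Note that only ASCII letters and digits are replaced, so accented characters and punctuation pass through untouched.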

Double-check before posting!
