Text Soap macro: a macro to clean up text, such as annotation summaries of PDF files?


I use PDF Expert (it could be any other PDF app, like Adobe Acrobat, etc.) to annotate PDFs, and then export the annotations to a Markdown (or text) file which I keep as a summary.

Unfortunately, the annotation summaries end up being too detailed, as in the example below, which makes them tedious to read.
I am referring to the extra information: Highlight [page x]: where x is the page number.

There is no option in the PDF apps not to display the extra info.

Would there be a way to write a macro which would delete every occurrence of the Highlight [page x]: text throughout the file?

Thanks very much !

example: filename: /Users/ronald1/Documents/ScanSnap/nihms424567.markdown

Highlight [page 1]: Effectiveness of Measures to Eradicate Staphylococcus aureus Carriage in Patients with Community-Associated Skin and Soft Tissue Infections:

Highlight [page 1]: Background—Despite a paucity of evidence, decolonization measures are prescribed for outpatients with recurrent Staphylococcus aureus skin and soft tissue infections (SSTI).

Highlight [page 1]: Participants were randomized to receive no therapeutic intervention (controls) or perform one of three 5-day regimens: 2% mupirocin ointment applied to the nares twice daily, intranasal mupirocin plus daily 4% chlorhexidine body washes, or intranasal mupirocin plus daily dilute bleach water baths.

Here you go:

Text Soap macro.kmmacros (6.7 KB)

It's the two Search and Replace actions that do the work. Normally I'd explain the regular expressions, but they're too long. So instead, here are the links to these examples on regex101.com:


Thank you very much Dan.
I don’t understand where the input and output are defined.

Sorry, I understand now. It is a search and replace. It is up to the user to trigger the macro within the right file.

Right. All I was showing was which action performs the search and replace.

Let me know if you need more information.
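
For readers who want the gist of the search and replace outside Keyboard Maestro, here is a minimal Python sketch. The pattern below is an assumption written for this illustration; it is not the regular expression from the attached macro, and the function name is hypothetical.

```python
import re

# Assumed pattern: the label "Highlight [page N]:" at the start of a line,
# followed by optional spaces or tabs. Not the regex from the attached macro.
HIGHLIGHT_PREFIX = re.compile(r"^Highlight \[page \d+\]:[ \t]*", re.MULTILINE)


def strip_highlight_prefix(text: str) -> str:
    """Remove every line-leading 'Highlight [page N]:' label from text."""
    return HIGHLIGHT_PREFIX.sub("", text)


if __name__ == "__main__":
    sample = (
        "Highlight [page 1]: First annotation.\n"
        "\n"
        "Highlight [page 2]: Second annotation.\n"
    )
    print(strip_highlight_prefix(sample))
```

Anchoring the pattern to the start of a line means the same words appearing in the middle of a sentence are left alone.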

Not to be confused with TextSoap :grinning:

Text Soap app: I did buy it a while back. Too complex for me, and I did try !

Hello Dan,
The macro works perfectly. I would like to create a duplicate macro with two small variations:
1- I would trigger the macro within the markdown or text file itself to clean it up.
2- Some other annotations start with a different label instead of ‘Highlight’ (as in ‘Highlight [page 1]:’), for example ‘Rectangle [page 1]:’. That is no problem: I can adapt your regex and simply change Highlight to Rectangle.
My question is how to structure the macro in such a way that I could automatically do a series of search and replaces.
thanks very much

You’d need to:

  1. Create a macro group that is active when the editor you’re using is the front (active) application.

  2. Create a macro in that group that does the following:
    a. Select all
    b. Copy
    c. Do the search and replace, against the clipboard instead of a variable
    d. Paste

  3. Give yourself some way to trigger the macro. HotKey, Palette, Status Menu, etc.

My question is how to structure the macro in such a way that I could automatically do a series of search and replaces.

In “2. c.” above, just add one Search and Replace action per condition, or get creative and modify the regular expression to support multiple conditions (probably more work than it’s worth).
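
To make the two options concrete, here is a rough Python sketch of both: a series of separate replacements, and a single pattern with alternation. The label list and the patterns are assumptions for illustration only; in Keyboard Maestro the equivalent would be several Search and Replace actions, or one action with a combined regular expression.

```python
import re

# Hypothetical list of annotation labels to strip; extend as needed.
LABELS = ["Highlight", "Rectangle"]


def clean_in_series(text: str) -> str:
    """Option 1: one search and replace per label, applied in sequence."""
    for label in LABELS:
        pattern = rf"^{re.escape(label)} \[page \d+\]:[ \t]*"
        text = re.sub(pattern, "", text, flags=re.MULTILINE)
    return text


# Option 2: a single pattern that matches any of the labels via alternation.
COMBINED = re.compile(
    r"^(?:" + "|".join(re.escape(label) for label in LABELS) + r") \[page \d+\]:[ \t]*",
    re.MULTILINE,
)


def clean_combined(text: str) -> str:
    """Option 2: one search and replace covering every label."""
    return COMBINED.sub("", text)


if __name__ == "__main__":
    sample = "Highlight [page 1]: a note\nRectangle [page 3]:another note\n"
    print(clean_in_series(sample))
    print(clean_combined(sample))
```

The series version is easier to read and extend one label at a time; the combined version does the same work in a single pass.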

I’m intentionally not writing the entire thing for you, for these reasons:

  1. If you know how to do it, I’m just wasting time and perhaps insulting your intelligence.

  2. If you don’t know how to do it but are willing to try, you can probably figure it out yourself, and it’s a good learning experience.

With that said, if you get stuck, just ask! I don’t mind writing it, if need be. :slight_smile:

thank you for your constructive reply. I will start working on it, and … happy new year !

I am sorry Dan. I am lost. I had other work to do and am now coming back to this macro.
What is the input for the macro? I am trying to determine where and how to start the macro.
thanks very much

My suggested macro assumes you’re in whatever editor you use for writing markdown files. So the macro selects all the text in the editor, does the search-and-replace, then pastes the result back into the editor.

but what I get in the end is:

  • text in the editor unchanged
  • a display text box with the same text as in your macro.
I can imagine how irritating this is for you. Very sorry.

If I understand the macro correctly, @DanThomas has set the variable PlainText to your example text. Of course, in real life you want to use the actual text, not the example text. (See Dan’s comment in the macro: “The only thing that’s relevant is the ‘Search and Replace’ action, which I’ve colored Magenta.”)

So, you have at least two basic possibilities to get your actual text “into” the PlainText variable:

  • Use a “Set Variable PlainText to Clipboard” action. (And copy your text to the clipboard before running the macro.)

  • Use a “Read File to…Variable PlainText” action, where the file to read is your exported text file.

The same for the green part of the macro.
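
The second possibility (read the exported file, clean it, keep the result) has a rough analogue outside Keyboard Maestro. The sketch below is only an illustration under assumptions: the pattern, the function name `clean_file`, and the choice to overwrite the file in place are all hypothetical, and the demo runs against a temporary file rather than a real export.

```python
import re
import tempfile
from pathlib import Path

# Assumed pattern covering two label types at the start of a line.
PREFIX = re.compile(r"^(?:Highlight|Rectangle) \[page \d+\]:[ \t]*", re.MULTILINE)


def clean_file(path) -> str:
    """Read a text file, strip annotation labels, and write the result back."""
    target = Path(path)
    original = target.read_text(encoding="utf-8")
    cleaned = PREFIX.sub("", original)
    target.write_text(cleaned, encoding="utf-8")  # overwrites the file in place
    return cleaned


if __name__ == "__main__":
    # Demo on a throwaway file so no real summary is touched.
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "summary.markdown"
        demo.write_text("Highlight [page 1]: First note.\n", encoding="utf-8")
        print(clean_file(demo))
```

Because the file is overwritten, it may be worth trying this on a copy of the export first.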

No, don’t be sorry. I love figuring this kind of thing out.

Here’s the issue: The screenshot you showed me is from the original macro I posted.

After I posted that, I posted an explanation of the macro you should write. Please see this post.

So, follow those steps and create that macro. It should do the trick. If you need help, just ask.

@Tom @DanThomas

Thanks to both of you for your patience and help.

My mistake was obviously dumb: I was just executing the demo !

Dan: I read the post. I want to fix the 'engine' first. The rest (select all, copy to clipboard, etc.) is simple.

As shown below, I cannot find the option to set the PlainText variable to the clipboard.

Once again, there is something that I obviously did not understand.

thanks very much !

You have to look in the Insert Token menu:

Well, you have said you wanted to set the PlainText variable to the clipboard content.

Just select the “Current Clipboard” token as shown in the screenshot. Of course, before that you should remove the demo text from the field.

With the token from the menu inserted, it should look like this:

Thank you Tom.
My mistake was thinking that by changing the token to current clipboard, the content in the box would automatically change to %CurrentClipboard%, whereas it appends %CurrentClipboard% to whatever is already in the box.
Thanks again very much. You are very patient.