Extract text from PDF, do some cleaning up and write to text file

Here is a macro for PDF files with a multi-column layout, where selecting text in one column unintentionally spills over into the neighboring columns. If no other tool gives you a clean extract, you can use this macro to extract the content manually:

  • Select a column (or text block).
  • Press the keyboard shortcut.

The extracted text will be written to a text file.
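
For readers who would rather script the cleanup step outside Keyboard Maestro, here is a minimal Python sketch of the same idea: clean the copied column text and append it to a text file. The cleanup rules (rejoining words hyphenated at a line break, unwrapping lines inside a paragraph, collapsing repeated spaces) are assumptions for illustration, not necessarily what the attached macro does, and the function names and the output path are hypothetical.

```python
import re
from pathlib import Path


def clean_extracted_text(raw: str) -> str:
    """Tidy text copied from a single PDF column (assumed cleanup rules)."""
    # Normalize Windows and old Mac line endings to "\n".
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    # Rejoin words hyphenated across a line break: "extrac-\ntion" -> "extraction".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Blank lines are treated as paragraph breaks.
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = []
    for paragraph in paragraphs:
        # Unwrap the hard line breaks inside a paragraph.
        lines = [line.strip() for line in paragraph.split("\n") if line.strip()]
        joined = re.sub(r"[ \t]+", " ", " ".join(lines))
        if joined:
            cleaned.append(joined)
    return "\n\n".join(cleaned)


def append_to_file(text: str, path: Path) -> None:
    """Append one cleaned block to the output file, followed by a blank line."""
    with path.open("a", encoding="utf-8") as handle:
        handle.write(text + "\n\n")


if __name__ == "__main__":
    sample = "This is an extrac-\ntion test\nwith  two   lines.\n\nSecond para."
    # Hypothetical output file name; change it to suit your setup.
    append_to_file(clean_extracted_text(sample), Path("extracted.txt"))
```

Note that the hyphen rule also merges genuine compounds that happen to break at a line end ("well-" followed by "known" becomes "wellknown"), so a quick read-through of the output is still worthwhile.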

Demo: [image]

Result: [image]

As you can see, some manual work is still needed to add the missing parts (the first lines of the second and third columns).
Extract text from PDF, do some cleaning up and write to text file.kmmacros (4.8 KB)

I have created another macro to mark errors spotted during the extraction process:

Edit: One more addition for the record: in Preview.app you can hold the Option key while dragging to select a vertical block. Very handy for copying columns!


I posted a reply on the Q&A forum. Sorry, but here's a link to it.
