Thanks! I installed the Satimage software. Also, I OCR'd the pdf using PDFPenPro and figured out how to use Hazel to OCR all the files in the folder...
Make sure you have at least one PDF in the “Sample pdfs” folder on the desktop, then run this script from Script Editor.app.
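-- Folder on the desktop that holds the test PDFs
set sourceFolder to alias ((path to desktop as text) & "Sample pdfs")
tell application "Finder"
    -- Take whichever file Finder lists first in that folder
    set thePdfFile to first file of sourceFolder as alias
end tell
-- Quote the POSIX path so spaces in it survive the shell
set thePdfFile to quoted form of (POSIX path of thePdfFile)
-- Prepend the MacPorts (/opt/local) and /usr/local locations so pdftotext can be found;
-- the trailing "-" makes pdftotext write the text to stdout
set shCMD to "
export PATH=/opt/local/bin:/opt/local/sbin:/usr/local/bin:$PATH;
pdftotext -layout " & thePdfFile & " -
"
do shell script shCMD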
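If the script above errors out, it may help to run the same steps from Terminal to see whether pdftotext is actually reachable on that PATH. This is only a rough sketch: the helper name first_pdf is made up for illustration, and it picks the first *.pdf in glob order, which is not necessarily the file Finder would list first.

```shell
#!/bin/sh
# Same PATH the AppleScript exports before calling pdftotext.
export PATH=/opt/local/bin:/opt/local/sbin:/usr/local/bin:$PATH

# first_pdf DIR: print the first *.pdf found in DIR (glob order),
# or return non-zero if there is none. Hypothetical helper for this check.
first_pdf() {
    for f in "$1"/*.pdf; do
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

# Only attempt the conversion when both the tool and a PDF are present.
if command -v pdftotext >/dev/null 2>&1; then
    if pdf=$(first_pdf "$HOME/Desktop/Sample pdfs"); then
        # Show just the first 20 lines of the extracted text.
        pdftotext -layout "$pdf" - | head -n 20
    else
        echo "no PDF found in Sample pdfs" >&2
    fi
else
    echo "pdftotext not found on PATH" >&2
fi
```

If this prints "pdftotext not found on PATH", the AppleScript will fail for the same reason, and the PATH line needs to point at wherever pdftotext was installed.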
I’m using your resume as the model for this, so move any other files out of “Sample pdfs” into a temp folder for this test.
-------------------------------------------------------------------------------------------
# dNam: Kurt Kessler → KM Forum → Working
# dCre: 2016/07/29 13:12
# dMod: 2016/07/29 14:02
-------------------------------------------------------------------------------------------
set sourceFolder to alias ((path to desktop as text) & "Sample pdfs")
tell application "Finder"
    set thePdfFile to first file of sourceFolder as alias
end tell
set thePdfFile to quoted form of (POSIX path of thePdfFile)
set shCMD to "
export PATH=/opt/local/bin:/opt/local/sbin:/usr/local/bin:$PATH;
pdftotext -layout " & thePdfFile & " -
"
set pdfText to do shell script shCMD
set educationText to fndUsing("(?m)^(Education.*\\s.*)(?=(^\\w|\\Z))", "\\1", pdfText, false, true) of me
-------------------------------------------------------------------------------------------
--» HANDLERS
-------------------------------------------------------------------------------------------
on cng(_find, _replace, _data)
    return change _find into _replace in _data with regexp without case sensitive
end cng
-------------------------------------------------------------------------------------------
on fnd(_find, _data, _all, strRslt)
    try
        -- Return the match (or matches); false if nothing is found
        return find text _find in _data all occurrences _all string result strRslt with regexp without case sensitive
    on error
        return false
    end try
end fnd
-------------------------------------------------------------------------------------------
on fndUsing(_find, _capture, _data, _all, strRslt)
    try
        -- Return the _capture template (e.g. "\\1") filled in from the match; false if nothing is found
        return find text _find in _data using _capture all occurrences _all ¬
            string result strRslt with regexp without case sensitive
    on error
        return false
    end try
end fndUsing
-------------------------------------------------------------------------------------------
NOTE that this is only a demonstration. I expect the various resume formats will not be uniform and will require more clever parsing.
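As one possible direction for the more tolerant parsing mentioned in the note, here is a rough shell sketch that pulls a whole section out of the pdftotext output. It rests on an assumption that will not hold for every resume: section headings start in column 1, and the lines belonging to a section are indented or blank. The function name extract_section is made up for illustration.

```shell
# extract_section NAME: read pdftotext output on stdin and print the block
# that starts at the first line beginning with NAME (case-insensitive) and
# ends just before the next non-blank line that starts in column 1.
# ASSUMPTION: headings are flush left and section bodies are indented.
extract_section() {
    awk -v name="$1" '
        BEGIN { name = tolower(name); state = 0 }
        # state 0: still looking for the heading line.
        state == 0 && index(tolower($0), name) == 1 { state = 1; print; next }
        # state 1: a flush-left line is treated as the next heading.
        state == 1 && /^[^[:space:]]/ { state = 2 }
        # Indented or blank lines inside the section are kept.
        state == 1 { print }
    '
}
```

It could be used as `pdftotext -layout resume.pdf - | extract_section Education`, where resume.pdf is a placeholder name. Resumes that do not indent the section body would need a different rule for spotting the next heading.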