Search all PDF files in a folder for a specific word - with HoudahSpot

Both this forum and the HoudahSpot manual contain examples of how to search with HoudahSpot via AppleScript:

From the manual:

tell application id "com.houdah.HoudahSpot4"
	-- Folder(s) to search
	set myLocations to {"/Users/hl/Dropbox/CT/Jobs"}
	set myDocument to search "Rechnungsdaten" locations myLocations
	-- Wait until the search has finished, pausing briefly between checks
	repeat until search completed of myDocument
		delay 0.1
	end repeat
	set myResults to results of myDocument
	set myCount to count of myResults
	display dialog "Found " & myCount & " files"
	set mySelection to selection of myDocument
	if mySelection is not {} then
		-- Process the selected results here
	end if
end tell

I am trying to create a macro that searches a subfolder of the Documents folder (e.g. ~/Documents/PDF) for a certain word (stored in a variable), limited to (1) PDF files that (2) have the string _de-DE in their names. Once all found files are listed in HoudahSpot, I open one of them manually in Skim (or, as a second choice, in Preview) by clicking it in the Results list. I then want to (3) see the first occurrence of the searched word in Skim (or Preview).

The reason I want to use Skim is that this PDF viewer is more scriptable (not that I know how to script it): I want to (4) get the page number of the first occurrence of the searched word, open another (related) PDF, and (5) navigate to the same page number there.

Can anyone tell me:
(1) How to limit the search via AppleScript to PDF files only?
(2) How to limit the search to all (PDF) files that contain the language code _de-DE in their names?
(3) How to navigate to the first occurrence of the searched string when opening a PDF from HoudahSpot's Results list?
(4) How to get the page number of the PDF opened in Skim?
(5) How to navigate to a page number in Skim?
(6) I can instruct macOS to always open all PDF files in Skim, but that's a bit drastic. (I like to keep Preview as the default app for many PDF files.) Is there another way to use Skim when opening results in HoudahSpot?

Thanks for any help!

Edit: I've found that items (1) and (2) can be solved by setting the file type, folder, and language code in HoudahSpot and saving these settings as the default settings.

I can't help you with the script regarding your request @ALYB.

Did you know that you can create your own templates in HoudahSpot?
That way you could integrate a template into a KM macro and then control your PDF viewer with further actions.
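As a rough sketch of one such "further action": a file can be sent to Skim explicitly, without changing the system-wide default PDF viewer. The folder and file name below are hypothetical placeholders, and the sketch assumes Skim is installed.

```applescript
-- Hypothetical path: a PDF inside ~/Documents/PDF. Replace with a real file.
set pdfPath to (POSIX path of (path to documents folder)) & "PDF/example_de-DE.pdf"

-- Coerce the path to a file reference outside the tell block.
set pdfFile to POSIX file pdfPath

tell application "Skim"
	activate
	open pdfFile
end tell
```

In a Keyboard Maestro macro, this would go in an Execute an AppleScript action, with the path supplied by an earlier step instead of being hard-coded.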

ScreenFlow


Thanks for the very professional movie!

Was I able to help you with that, or do you still need the AppleScript, @ALYB?


@appleianer I have set the default search parameters in HoudahSpot. But I'm still looking for a way to extend the AppleScript that I'm using to search PDF files open in Skim:

set textToFind to get the clipboard
tell application "Skim"
	set foundText to find front document text textToFind
	select foundText with animation
end tell

(Thanks @ccstone for that snippet of AppleScript.)

I have already searched the Skim mailing list archive, but I cannot find a ready-made example that shows how to:

  • Return the page number of the page containing the first instance of the search string.
set thePageNo to (get index for thepage)

This requires setting thepage first, which I don't know how to do.

  • Navigate to a page number.

With this information (if the desired action is possible at all), I can finish my macro.

I'm afraid I can't help you there :pensive:

It looks like it is impossible to get the page number of a PDF opened in Skim. So I went for GUI scripting. This is the whole macro:


The source:

Open corresponding target PDF.kmmacros (16.8 KB)

Hey Hans,

This is simple but not at all obvious.

You might even say it was a trifle odious.  :sunglasses:

----------------------------------------------------------------
# Auth: Christopher Stone
# dCre: 2018/09/03 21:21
# dMod: 2018/09/03 21:21 
# Appl: Skim
# Task: Show Page Number Where Text is Found.
# Libs: None
# Osax: None
# Tags: @Applescript, @Script, @Skim, @Show, @Page, @Number, @Where, @Text, @Found
----------------------------------------------------------------

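set textToFind to " health "

tell application "Skim"
   set foundText to find front document text textToFind
   select foundText
   set foundText to item 1 of foundText
   
   try
      # Deliberately trigger an error: the error message names the page of the found text.
      thePage of foundText
   on error errMsg number n
      set oldDelims to AppleScript's text item delimiters
      set AppleScript's text item delimiters to {" of text of ", " of document \""}
      set pageNumOfFoundText to text item 2 of errMsg
      # Restore the delimiters so later code in the same script is not affected.
      set AppleScript's text item delimiters to oldDelims
   end try
   
end tell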

----------------------------------------------------------------

Now that is straightforward:

----------------------------------------------------------------
# Auth: Christopher Stone
# dCre: 2018/09/03 22:49
# dMod: 2018/09/03 22:49 
# Appl: Skim
# Task: Discover Index of Current Page of Front Document.
# Libs: None
# Osax: None
# Tags: @Applescript, @Script, @Skim, @Discover, @Index, @Current, @Page, @Front, @Document
----------------------------------------------------------------

tell application "Skim"
   set thePage to index of current page of front document
end tell

----------------------------------------------------------------

Also pretty straightforward:

-------------------------------------------------------------------------------------------
# Auth: Christopher Stone
# dCre: 2018/09/03 22:43
# dMod: 2018/09/03 22:43 
# Appl: Skim
# Task: Go To a Specific Page in the Front Document.
# Libs: None
# Osax: None
# Tags: @Applescript, @Script, @Skim, @Go, @To, @GoTo, @Go-To, @Specific, @Page, @Front, @Document
-------------------------------------------------------------------------------------------

set pageIndex to 1

tell application "Skim"
   tell front document
      set current page to page pageIndex
   end tell
end tell

-------------------------------------------------------------------------------------------
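Put together, the three snippets above might be chained roughly as follows. This is only a sketch under several assumptions: the search term is a placeholder, the text is actually found in the front document, the error message has the form the first script relies on ("page N of document ..."), and the related PDF is already open in Skim as document 2.

```applescript
set textToFind to "health" -- placeholder search term

tell application "Skim"
	set foundText to find front document text textToFind
	select foundText
	set foundText to item 1 of foundText -- errors if nothing was found
	
	try
		-- Deliberately fails; the error message names the page.
		thePage of foundText
	on error errMsg
		set oldDelims to AppleScript's text item delimiters
		set AppleScript's text item delimiters to {" of text of ", " of document \""}
		set pageLabel to text item 2 of errMsg -- e.g. "page 12"
		set AppleScript's text item delimiters to oldDelims
	end try
end tell

-- Keep only the number from a label such as "page 12".
set pageIndex to (word 2 of pageLabel) as integer

tell application "Skim"
	-- Assumption: the related PDF is the second open document.
	tell document 2
		set current page to page pageIndex
	end tell
end tell
```

If the two PDFs are opened in a different order, the `document 2` reference would need to be replaced, for example by addressing the document by name.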

NOTE – For those who don't already know – AppleScripts are run via Keyboard Maestro's Execute an AppleScript action.

-Chris


Hello Chris,

Beautiful as always. Many thanks! I'm now trying to build the whole macro, and:

I'm having trouble getting the variable pageNumOfFoundText into Keyboard Maestro.

After reading this:

https://wiki.keyboardmaestro.com/AppleScript

I thought that this would be as simple as adding these lines to the script that reports the page number:

tell application "Keyboard Maestro Engine"
	setvariable myKMVar to pageNumOfFoundText
end tell

But obviously, it isn't: when I try this action in KM, I get an empty message:

(I'm trying to remove the part 'page ' from pageNumOfFoundText in KM and use the plain number in another action, to jump to that page number in another PDF.)

----------------------------------------------------------------
# Auth: Christopher Stone
# dCre: 2018/09/03 21:21
# dMod: 2018/09/03 21:21 
# Appl: Skim
# Task: Show Page Number Where Text is Found.
# Libs: None
# Osax: None
# Tags: @Applescript, @Script, @Skim, @Show, @Page, @Number, @Where, @Text, @Found
----------------------------------------------------------------

set textToFind to get the clipboard

tell application "Skim"
	set foundText to find front document text textToFind
	select foundText
	set foundText to item 1 of foundText
	
	try
		thePage of foundText
	on error errMsg number n
		set AppleScript's text item delimiters to {" of text of ", " of document \""}
		set pageNumOfFoundText to text item 2 of errMsg
	end try
	
end tell

tell application "Keyboard Maestro Engine"
	setvariable myKMVar to pageNumOfFoundText
end tell

----------------------------------------------------------------

For now I'll solve this via the clipboard:

set the clipboard to pageNumOfFoundText

This works, but why doesn't this work:

set pageIndex to get the clipboard

tell application "Skim"
	tell front document
		set current page to page pageIndex
	end tell
end tell

Hey Hans,

Here's the correct syntax:

tell application "Keyboard Maestro Engine"
   setvariable "pageNumOfFoundText" to yourAppleScriptVar
end tell

The Keyboard Maestro variable MUST be quoted with double-quotes.

The exception is when you use an AppleScript variable and assign the actual Keyboard Maestro variable name to it:

set kmVarName to "pageNumOfFoundText"

tell application "Keyboard Maestro Engine"
   setvariable kmVarName to yourAppleScriptVar
end tell

I personally would NOT do it this way, because I find it harder to read, but perhaps it helps illustrate that the Keyboard Maestro name MUST be a string rather than an AppleScript variable object.
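A quick way to check that the value really arrived is to read it back with getvariable, the companion command to setvariable. A minimal sketch, assuming Keyboard Maestro Engine is running; the sample value "page 12" is made up for illustration:

```applescript
set yourAppleScriptVar to "page 12" -- made-up sample value

tell application "Keyboard Maestro Engine"
	-- The Keyboard Maestro variable name is passed as a quoted string.
	setvariable "pageNumOfFoundText" to yourAppleScriptVar
	-- Read the same variable back to confirm it was stored.
	set storedValue to getvariable "pageNumOfFoundText"
end tell

display dialog "Keyboard Maestro now holds: " & storedValue
```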

-Chris


I've figured out how to convert the clipboard content to a number:

set theNumber to (the clipboard)
set theNumber to theNumber as number

set pageIndex to theNumber

tell application "Skim"
	tell front document
		set current page to page pageIndex
	end tell
end tell

Hey Hans,

You can streamline that a trifle:

set pageIndex to (the clipboard) as integer

tell application "Skim"
   tell front document
      set current page to page pageIndex
   end tell
end tell

Here's how you can extract the number from “page 1”:

----------------------------------------------------------------
# Auth: Christopher Stone
# dCre: 2018/09/05 17:47
# dMod: 2018/09/05 17:47 
# Appl: Keyboard Maestro Engine
# Task: Use Keyboard Maestro's search and replace in AppleScript.
# Libs: None
# Osax: None
# Tags: @Applescript, @Script, @Keyboard_Maestro_Engine, @Use, @Keyboard_Maestro's, @Search, @Replace
----------------------------------------------------------------

set someText to "page 1"

set pageNumber to kmReplace("\\D", "", someText, true, true, false) of me

----------------------------------------------------------------
--» kmReplace()
----------------------------------------------------------------
--  Task: Find and Replace with RegEx using Keyboard Maestro's AppleScript search command.
--  dMod: 2018/04/06 04:55
----------------------------------------------------------------
on kmReplace(findPattern, replacePattern, dataStr, regExBool, caseBool, tokensBool)
   tell application "Keyboard Maestro Engine"
      set foundDataList to search dataStr for findPattern replace replacePattern ¬
         regex regExBool case sensitive caseBool process tokens tokensBool
   end tell
   -- Return the result explicitly rather than relying on the implicit result.
   return foundDataList
end kmReplace
----------------------------------------------------------------
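For a label that always has the exact form "page N", plain AppleScript may be enough, without calling Keyboard Maestro Engine. A minimal sketch, assuming the text is exactly the word "page", a space, and a number:

```applescript
set someText to "page 1"

-- The second word of "page 1" is "1"; coerce it to an integer.
set pageNumber to (word 2 of someText) as integer
```

The regular-expression version above is the more tolerant choice if the text around the number can vary.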

-Chris


Again, thanks for all help.

This is the whole macro:

And a very simple macro to search via HoudahSpot (I use a template to specify the file type etc.):

And here is a demo:

I think this is a very nice solution that Mac-using translators can be happy with.

Here are the macro files:

Search clipboard content.kmmacros (3.9 KB)

Open corresponding target PDF.kmmacros (13.8 KB)

Hello @appleianer
I am looking for a quick way to call up my HoudahSpot templates, but I can't work out how your macro is built from the video alone.
Thank you!