When writing the footnote for a reference, I use KM to trigger AppleScripts that fetch the title and URL from the front window in Safari. This not only saves wear and tear on me, it also eliminates typos and makes the references I use (from web sites like the Wall Street Journal, Bloomberg, New York Times, etc.) accurate. I trigger these KM macros within Pages or Adobe Acrobat DC.
Is there any way to use scripting to fetch the pub date or the author(s) from the front window in Safari?
There doesn’t seem to be a way to do this with AppleScript, which I’m capable of figuring out from others’ examples.
Over the last 10 evenings, I’ve tried to figure this out with JavaScript or Python, but I’m in over my head and can’t work out how to do it.
Is there an easy way to do this with JavaScript or Python?
I have been wanting to get the same info for a long time. The main issue is that there does NOT appear to be any standard for publishing/identifying/labeling Author and Publication Date.
If the web sites of interest to you do have a standard format or standard HTML codes for these, let us know and maybe we can devise something.
It is actually pretty easy using RegEx to parse this data, as long as it is always in a standard format, like:
The architecture of a web page includes standardised and universal representations of title and URL, but there are no universal conventions which a script can use in a search for a publication date or authorship field.
You could certainly write a script to 'scrape' that particular WSJ page, and with luck your script might conceivably work with some useful proportion of other WSJ pages (if they are adopting consistent internal conventions) but it wouldn't work with pages on other websites.
Thanks for the reply. I use a number of regular sources for 95% of what I write, so I’ve already got partial customized macros with AppleScripts for each of these. I get one to work and then duplicate it and make adjustments for the others.
FYI, I’d figure out how to scrape the date first (and then authors), creating a separate macro for each. Then I would trigger the date macro and the author macro as part of the customized multi-step macro I already use on a particular web site, for example the one that extracts the data from Wall Street Journal articles.
I’d appreciate advice on the easiest way to scrape these data. Between Python, JavaScript and other approaches, I’m not sure where to begin. What would you recommend for someone who (probably like most) is self-taught on this kind of stuff?
Try this script in an Execute JavaScript in Safari Action.
Note that there are several HTML meta tags that report publication date. You can choose the one you prefer.
##JavaScript to Extract Author and Pub Date from WSJ
'use strict';
(function run() { // this will auto-run when script is executed
  var authorMeta = document.querySelector('meta[name="author"]');
  var authorStr = authorMeta ? authorMeta.content : "UNKNOWN";
  var datePubMeta = document.querySelector('meta[itemprop="datePublished"]');
  var datePubStr = datePubMeta ? datePubMeta.content : "UNKNOWN";
  //--- Example of a Bad Meta (not found) ---
  var badMeta = document.querySelector('meta[itemprop="BadMeta"]');
  var badStr = badMeta ? badMeta.content : "UNKNOWN";
  return authorStr + "\n" + datePubStr;
} // END of function run()
)();
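Since several meta tags can report the publication date, one option is to try a list of selectors in order of preference and keep the first one that has content. Below is a minimal sketch of that idea. The selector list and the helper name `firstMetaContent` are my own assumptions for illustration, not something confirmed for WSJ or any other site, so check the page source of your own sources and adjust. The document is passed in as a parameter so the helper can also be tried outside a browser.

```javascript
'use strict';

// Candidate selectors for the publication date, in order of preference.
// ASSUMPTION: these are conventions seen on some news sites, not a standard;
// edit the list to match the pages you actually cite.
var DATE_SELECTORS = [
  'meta[itemprop="datePublished"]',
  'meta[property="article:published_time"]',
  'meta[name="pubdate"]',
  'meta[name="date"]'
];

// Hypothetical helper: returns the content of the first meta tag that
// matches one of the selectors and is non-empty, otherwise the fallback.
function firstMetaContent(doc, selectors, fallback) {
  for (var i = 0; i < selectors.length; i++) {
    var node = doc.querySelector(selectors[i]);
    if (node && node.content) {
      return node.content;
    }
  }
  return fallback;
}

// In an Execute JavaScript in Safari action the call would look like:
//   firstMetaContent(document, DATE_SELECTORS, 'UNKNOWN');
```

Reordering `DATE_SELECTORS` is how you would "choose the one you prefer" for a given site.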
The querySelector approach is good, and although the META name= scheme for author and publication-date is probably rather WSJ-specific, it's possible that you could widen the hits by loosening the query criteria a little.
Something like this, for example (ES6 version only here, so you will need an up-to-date Safari), uses the *= selector to pick up content from meta tags which have author or published anywhere in the name value. You could, of course, cast nets more widely.
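A minimal ES6 sketch of the loosened query described above, assuming only that the pages put "author" or "published" somewhere in a meta tag's `name` attribute. The helper name `looseMeta` is made up for illustration, and the document is passed in as a parameter so the function can be exercised outside a browser too.

```javascript
// Collect the content of every meta tag whose name attribute contains
// the given fragment, e.g. "author" or "published".
// ASSUMPTION: looseMeta is a hypothetical helper, not part of any library.
const looseMeta = (doc, fragment) =>
  Array.from(doc.querySelectorAll(`meta[name*="${fragment}"]`))
    .map(node => node.content)
    .filter(text => Boolean(text)); // drop tags with empty content

// In an Execute JavaScript in Safari action (browser only) you might use:
//   ['author', 'published']
//     .map(fragment => looseMeta(document, fragment).join(', ') || 'UNKNOWN')
//     .join('\n');
```

Because `*=` matches substrings, this can return several hits per page, so expect to pick through the results for each site.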
Dumb problem on my part. When I test the JavaScript in Script Editor I get the same error (on different lines) with the scripts you wrote: "Can't find variable: document"
I thought that "Selecting Safari Tab 1" as step 1 of the KM macro, followed by either of your scripts would "create" the "document" by pointing the scripts to the front/current tab in Safari.
Then, I tried "do JavaScript"......
tell application "Safari"
	tell front window
		tell current tab
			do JavaScript "JMichaelTX_script here OR ComplexPoint_script here"
		end tell
	end tell
end tell
No luck. So, I'm obviously not feeding the front tab URL to the script. I assume I'm making some simple, right-in-front-of-my-nose mistake.
Script Editor JavaScript is not, alas, relevant to browser JavaScript - it doesn’t have any link to the browser’s DOM libraries.
The way to test these snippets is in a Keyboard Maestro Execute JavaScript in Safari action.
JavaScript is an embedded scripting language. The browser embedding is quite separate from and unlinked to the macOS system scripting embedding. Any link would actually be a more or less lethal security breach.
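One way to make that separation visible is to have a snippet check for the browser's `document` object before touching it. This is only a sketch, with `hasDom` and `getAuthor` as made-up names; outside a browser it reports the missing DOM instead of stopping with "Can't find variable: document".

```javascript
'use strict';

// True only when a browser DOM is available to the script.
// typeof is safe to use on an undeclared variable, so this never throws.
function hasDom() {
  return typeof document !== 'undefined' &&
    typeof document.querySelector === 'function';
}

// Hypothetical wrapper around the author lookup used earlier in the thread.
function getAuthor() {
  if (!hasDom()) {
    return 'NO DOM: run this inside the browser, e.g. via Execute JavaScript in Safari';
  }
  var authorMeta = document.querySelector('meta[name="author"]');
  return authorMeta ? authorMeta.content : 'UNKNOWN';
}
```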
@ComplexPoint nailed it. The script was not intended to work directly in Script Editor.
Try this macro:
##Macro Library: @Meta Extract Author and Pub Date from WSJ Meta @Web @HTML @Example
####DOWNLOAD:
<a class="attachment" href="/uploads/default/original/2X/4/41ff40a86d27c1adc2d0ca282afbd8307b91b680.kmmacros">@Meta Extract Author and Pub Date from WSJ Meta @Web @HTML @Example.kmmacros</a> (7.6 KB)
---
###ReleaseNotes
Author: @JMichaelTX
**PURPOSE:**
* **Extract the Author and Publication Date from Meta tags in the WSJ**
HOW TO USE:
1. Open WSJ page in either Safari or Chrome
2. Trigger this Macro
**MACRO SETUP**
* **Carefully review the Release Notes and the Macro Actions**
* Make sure you understand what the Macro will do.
* You are responsible for running the Macro, not me. 😉
.
* Assign a Trigger to this macro. I prefer TBD.
* Move this macro to a Macro Group that is only Active when you need this Macro.
* Enable this Macro (if needed).
.
* **REVIEW/CHANGE THE FOLLOWING MACRO ACTIONS:**
(all shown in the magenta color)
*
TAGS:
USER SETTINGS:
* Any Action in _magenta color_ is designed to be changed by end-user
ACTION COLOR CODES
* To facilitate the reading, customizing, and maintenance of this macro,
key Actions are colored as follows:
* GREEN -- Key Comments designed to highlight main sections of macro
* MAGENTA -- Actions designed to be customized by user
* YELLOW -- Primary Actions (usually the main purpose of the macro)
* ORANGE -- Actions that permanently destroy Variables or Clipboards,
OR IF/THEN and PAUSE Actions
REQUIRES:
1. Keyboard Maestro Ver 7.3+ (don't even ask me about KM 6 support).
2. El Capitan 10.11.6+
* It may work with Yosemite, but I make no guarantees.
**USE AT YOUR OWN RISK**
* While I have given this limited testing, and to the best of my knowledge it will do no harm, I cannot guarantee it.
* If you have any doubts or questions:
* **Ask first**
* Turn on the KM Debugger from the KM Status Menu, and step through the macro, making sure you understand what it is doing with each Action.
---
<img src="/uploads/default/original/2X/a/a204d70d1d06f1043dae6696836ee81f0fcc38ae.png" width="496" height="566">
Nothing happens. JavaScript is enabled in Safari/Preferences/Security. Perhaps I'm overlooking something else that I need to "toggle" to make this work?
If you want to run JavaScript in Safari from the Script Editor, then you will need to use a script something like this.
Very Important: You must ESCAPE all quotes in the JavaScript using \"
set jsStr to "
'use strict';
(function run() { // this will auto-run when script is executed
var authorMeta = document.querySelector('meta[name=\"author\"]')
var authorStr = authorMeta ? authorMeta.content : \"UNKNOWN\"
return authorStr;
} // END of function run()
)();
"
set scriptResults to my doJSInSafari(jsStr)
return scriptResults
on doJSInSafari(javascriptStr)
	try
		tell application "Safari" to do JavaScript javascriptStr in front document
	on error e
		error "Error in handler doJSInSafari()" & return & return & e
	end try
end doJSInSafari
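Escaping by hand gets tedious for longer scripts. As a rough sketch, a small JavaScript helper could do the escaping before you paste the text into the AppleScript string. `escapeForAppleScript` is a hypothetical name, and the sketch assumes only that an AppleScript string literal needs its backslashes and double quotes prefixed with a backslash.

```javascript
'use strict';

// Hypothetical helper: prepare JavaScript source for pasting between the
// double quotes of an AppleScript string literal.
function escapeForAppleScript(jsSource) {
  return jsSource
    .replace(/\\/g, '\\\\') // backslashes first, so the ones added next are not doubled
    .replace(/"/g, '\\"');  // then double quotes
}

var snippet = 'document.querySelector(\'meta[name="author"]\').content';
console.log(escapeForAppleScript(snippet));
// prints: document.querySelector('meta[name=\"author\"]').content
```

Backslashes matter as well as quotes: a JavaScript `"\n"` pasted unescaped would be turned into a literal line break by AppleScript before Safari ever sees it.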
Thank you to you both. It's obvious what a tremendous resource you are to the KM discussion forum - and the others to which you contribute as well.
I add/replace a thousand-plus citations every time I update a textbook, which is an annual process. And since I have several books and am working on other writing projects as well, this has enormous utility to me in terms of accuracy and wear and tear on my hands, wrists, and forearms.
When I get the big "compiled" KM macro working, I'll post the whole thing, including the individual macros that "fold up" into the larger KM macro.
That will be my small contribution to others to repay the assistance that the two of you have kindly shown me.