I'm running a macro that loops over a list of URLs. On each page, it executes a small snippet of JavaScript to grab a couple of pieces of data, then moves on to the next URL. I have thousands of URLs to check, but the macro sometimes hangs after a while, so I've been running it in smaller chunks of several hundred at a time. When it hangs, it does so just before the JavaScript has had a chance to execute on the current page.
The first errors I get are these three messages:
Launch Task /bin/sh -c /usr/bin/osascript '/var/folders/wv/4xzw109n2gvdfknthc1xf6m80000gn/T/Keyboard-Maestro-Script-4239E526-C9EE-47DD-8048-53C29E7DE3E7' failed with exception Failed to set posix_spawn_file_actions for fd -1 at index 0 with errno 9
Task failed with status -1
Launch Task /Applications/Keyboard Maestro.app/Contents/MacOS/Keyboard Maestro Engine.app/Contents/MacOS/CompileAppleScript nothing failed with exception Failed to set posix_spawn_file_actions for fd -1 at index 0 with errno 9
After that, I get this one over and over again until I cancel the macro:
My guess would be the system is running out of some resource related to launching osascript, maybe file descriptors, maybe something else.
How many lines are there in this list? You may need to add a periodic large pause to allow the system to “catch its breath” and clear out the cache of resources.
Got it, thanks, Peter! It's able to complete without issue in the low hundreds of urls, but if I initially select around 500 or more, it eventually chokes.
What is the advantage of running the JavaScript in an anonymous function? The script is very simple:
let label = document.getElementsByClassName('factlist__text')[2].innerText;
let date = document.getElementsByClassName('factlist__text')[3].innerText;
let labelDate = `${label}\t${date}`;
labelDate;
My JavaScript-fu is pretty white belt, but from what I understand, you can pollute the namespace of a web page when you set variables over and over using Apple Events-driven JavaScript.
What happens when you run this instead of the bare code:
(() => {
  let label = document.getElementsByClassName('factlist__text')[2].innerText;
  let date = document.getElementsByClassName('factlist__text')[3].innerText;
  let labelDate = `${label}\t${date}`;
  return labelDate;
})();
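A slightly more defensive variant may also be worth trying. This is only a sketch, and it does not address the osascript launch failure itself. It assumes that the blank results come from pages where the `factlist__text` elements are not present yet, in which case the bare `[2]` and `[3]` lookups would throw. The helper name `extractLabelDate` is made up for illustration; it returns an empty string when the elements are missing, so the macro can tell "not ready" apart from real data.

```javascript
// Hypothetical helper (name is an assumption): returns "label<TAB>date",
// or "" when the expected elements are not on the page yet.
function extractLabelDate(doc) {
  // Query once and reuse the collection instead of searching twice.
  const facts = doc.getElementsByClassName('factlist__text');

  // Indexes 2 and 3 come from the original snippet; bail out if they are missing.
  if (facts.length < 4) {
    return '';
  }

  const label = facts[2].innerText;
  const date = facts[3].innerText;
  return `${label}\t${date}`;
}

// In the browser, the value of this last expression is the script's result.
// The typeof guard only exists so the snippet can also be loaded outside a page.
typeof document !== 'undefined' ? extractLabelDate(document) : '';
```

If the result comes back empty, the macro could pause briefly and run the script again before moving on to the next URL. The whole thing can still be wrapped in an immediately invoked function expression if you want to keep the helper out of the page's namespace.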
Unfortunately neither running the urls from a file nor using the immediately invoked function expression seemed to make a difference.
The main thing I noticed was that previously, if I tried to put too many URLs on the clipboard, the macro would refuse to run at all. With a file, however, I can have as many lines as I want (even if the macro can't finish running all of them).
I tried adding a 30-second, 60-second, 2-minute, and 6-minute break after 50 loops, as well as pauses of a few different lengths after 100 loops, but it doesn't seem to help.
I also tried "unloading" the container variable periodically to see if that building up was causing issues. To do that, after 50 or 100 loops, I'd close the tab that was iterating through the urls to paste what it found thus far in the Sheet, then set that variable to an empty string to reset it. Then I'd reopen the tab, continue on, and return to the Sheet later with a new batch. This didn't seem to help either.
I also tried running the URLs one at a time: copy a single URL to the clipboard, open it in a tab to get the data, come back and enter the data into the Sheet, then get the URL from the next row. This also eventually hung.
Thanks for the additional suggestions. I did notice that when I had the counter running, during a hang, the counter kept going up even though the loop wasn't visibly running – that is, it stopped navigating to the next url in the list, but it would still wait the minimum amount of time for Chrome to load, add a blank line to my data container variable, and then increment the counter.
I've never done cURL before, but I'm up for learning how if you think that might be a better approach. I have tried KM's "Get URL" action, but that doesn't work because the page source only shows certain info when the page actually loads in the browser.
I'm on 10.13.6 High Sierra.
Edit: Just tried to curl the page in question and I get the state of the page before any runtime code has had a chance to execute, which doesn't help.