Get Browser Page Title & URL without Opening Browser?

hayleyh · April 17, 2020, 5:34am

Hi there. I see that there are tokens to get the browser page title and URL when it's the frontmost window, but I'm wondering if it's possible to get this information from just a link. I saw this great post with a macro that lets you download webarchives without opening the browser and got interested: Create Web Archives (Download Web Page) from a List of URLs
@ccstone

I download a lot of video from spreadsheets with custom naming. The only thing keeping me from making this a fully background task is that I use the browser title and full URLs to generate the various names, which means opening each link one by one to get that info.

Any ideas are most welcome.

tiffle · April 17, 2020, 9:44am

I'm pretty sure there are some clever ways of doing this but my simple-minded approach to get the title of the page from its URL without loading it up in a browser is this:

If you've got the URL in a variable already then use that instead of the clipboard.
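The same idea, fetching the page source and searching it for the title element, can be sketched outside Keyboard Maestro. The Python below is only an illustration under assumptions, not the macro from this post: the names `fetch_html` and `title_from_html` are made up, and the pattern assumes the title sits on one line between `<title>` and `</title>`.

```python
import re
import urllib.request
from typing import Optional


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Download the raw page source with a plain HTTP request (no browser, no JavaScript)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def title_from_html(html_text: str) -> Optional[str]:
    """Return the text between <title> and </title>, or None if no match is found."""
    # Non-greedy capture so a second </title> later on the same line is not swallowed.
    match = re.search(r"<title>(.*?)</title>", html_text)
    return match.group(1) if match else None
```

Usage would be along the lines of `title_from_html(fetch_html(url))`, with `url` coming from the clipboard or a variable as described above.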

Hope this helps!

hayleyh · April 17, 2020, 11:58pm

Hey there, this is pretty close! I tested it on a variety of sources, and the only one that doesn't work is Instagram. Any idea why?

The only thing I changed was to filter out the HTML entities, so now they match pretty much exactly the results I was getting before.
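For anyone doing the same step in a script rather than with the Filter action, decoding HTML entities might look like the sketch below. It assumes Python's standard `html` module; `decode_entities` is a hypothetical helper name.

```python
import html


def decode_entities(raw_title: str) -> str:
    """Turn entities such as &amp; or &#39; back into the characters they stand for."""
    return html.unescape(raw_title)
```

For example, a raw title of `Tom &amp; Jerry` would come back as `Tom & Jerry`, which is closer to what the browser shows.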

hayleyh · April 18, 2020, 4:02am

Oh hello, if only I'd looked a bit closer. It seems that for Instagram there's a newline after the opening title tag and before the closing one, so for that one the regex search will be:

<title>
(.*)
</title>

Completely works for me now, thanks a bunch!
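A single pattern can cover both the one-line case and the Instagram-style case if it tolerates whitespace around the title text. This is a sketch in Python, not a Keyboard Maestro regex; the name `title_allowing_newlines` and the decision to collapse internal whitespace are my own assumptions.

```python
import re
from typing import Optional

# \s* absorbs newlines and spaces next to the tags; DOTALL lets the
# captured text itself span lines; IGNORECASE also accepts <TITLE>.
TITLE_RE = re.compile(r"<title[^>]*>\s*(.*?)\s*</title>", re.IGNORECASE | re.DOTALL)


def title_allowing_newlines(html_text: str) -> Optional[str]:
    """Extract the title even when it is wrapped in newlines or split across lines."""
    match = TITLE_RE.search(html_text)
    if match is None:
        return None
    # Collapse any run of whitespace inside the title into a single space.
    return " ".join(match.group(1).split())
```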

tiffle · April 18, 2020, 7:39am

Ooo! I didn’t know that the Filter action could do that! Thanks for the tip and glad I could help. Cheers.

hayleyh · June 9, 2020, 7:37pm

Re-opening this because I'm suddenly having an issue with Twitter links. The Get URL action is not returning anything for the browser title. If anyone has any ideas, that would be excellent.

Random cat video example:

Link: https://twitter.com/damn_elle/status/1269873428804190208
Browser Title expected results: I think we should all watch this cat try ice cream for the first time.

tiffle · June 10, 2020, 10:07pm

Just tried this out again and it seems that what the Get URL action returns is not the same as the source of the web page (which you can look at in a browser). I'm not sure why they're different but it must be to do with the way the browser renders the web page, which is not something available to the Get URL action.

Perhaps @peternlewis or another expert could clarify what's going on.

peternlewis · June 11, 2020, 3:28am

Any number of things could cause a different overall result of the web page as returned to the browser or Keyboard Maestro:

Cookies or other status
Login states (related to cookies)
JavaScript run in the browser after the page is loaded
Web server serving different things depending on implicit details of the request (like the user-agent).

Probably more things I can't think of.

I'd expect you'd get similar results if you used Private Browsing and set the User Agent to something generic.
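On the script side, one way to test the user-agent factor is to send the request with an explicit User-Agent header and no cookies, then compare the returned source. The sketch below uses Python's `urllib`; the `GENERIC_USER_AGENT` value and both function names are assumptions for illustration, and none of this runs the page's JavaScript.

```python
import urllib.request

# Assumed placeholder value; substitute whatever agent string you want to test.
GENERIC_USER_AGENT = "Mozilla/5.0"


def build_request(url: str, user_agent: str = GENERIC_USER_AGENT) -> urllib.request.Request:
    """Create a request carrying an explicit User-Agent header and no cookies."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_with_user_agent(url: str, user_agent: str = GENERIC_USER_AGENT,
                          timeout: float = 10.0) -> str:
    """Fetch the page source as the server returns it for the given User-Agent."""
    with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Fetching the same URL with two different agent strings and comparing the results would show whether the server varies its response on that header.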

peternlewis · June 11, 2020, 7:28am

I checked this particular URL, and if you try to get it with Keyboard Maestro, the site detects the absence of JavaScript. You can try this in Safari if you disable JavaScript:

If you follow the link, you get to https://mobile.twitter.com/greg_doucette/status/1266886293562286086 and unfortunately for you, that does not set the title at all.

So it appears the title is set via JavaScript (which you can actually see if you visit the original link: it goes through several variants before it finishes with the final title).
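A script cannot run the page's JavaScript, but it can at least notice when the fetched source carries no usable title and flag that URL for the browser-based route. The sketch below uses Python's standard `html.parser`; `static_title` and the helper class are hypothetical names, and returning `None` for a missing or empty title is my own convention.

```python
from html.parser import HTMLParser
from typing import List, Optional


class _TitleParser(HTMLParser):
    """Collect the text of the first <title> element, if there is one."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self._seen_title = False
        self.parts: List[str] = []

    def handle_starttag(self, tag, attrs):
        # Only the first title element counts; later ones are ignored.
        if tag == "title" and not self._seen_title:
            self._in_title = True
            self._seen_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.parts.append(data)


def static_title(html_text: str) -> Optional[str]:
    """Return the title present in the static source, or None if missing or empty."""
    parser = _TitleParser()
    parser.feed(html_text)
    parser.close()
    text = " ".join("".join(parser.parts).split())
    return text or None
```

A `None` result here would suggest the title is filled in later by JavaScript, so that link still needs to be opened in a browser.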
