[WEB] Get URLs from Search Engine Results [Example]

MACRO:   Get URLs from Search Engine Results [Example]

~~~ VER: 2.0    2019-10-26 ~~~
Requires: KM 8.2.4+   macOS 10.11 (El Capitan)+
(Macro was written & tested using KM 9.0+ on macOS 10.14.5 (Mojave))

UPDATED: 2019-10-26 18:46 GMT-5
Ver 2.0 is a Major Update

  • Adds option for using Search Engines other than Google
  • Includes LinkedIn with Option for "Other"
  • Prompts for Search Engine to use and other data

UPDATED: 2020-10-26 15:29 GMT-5

  • You will need to change the XPath in this Action because Google has changed the HTML code on their Search Results page:
    • //*[@class="rc"]/div/a

DOWNLOAD Macro File:

Get URLs from Search Engine Results [Example].kmmacros
Note: This Macro was uploaded in a DISABLED state. You must enable it before it can be triggered.


Example Output

[two images: example output]


Release Notes

Author: @JMichaelTX, based on a script by @ccstone

PURPOSE:

  • Get List of URLs (or MD Links) from Search Engine Results
  • Works for Google, LinkedIn, and Other

HOW TO USE:

  1. Do a Search in any Browser Supported by KM
  2. Trigger this macro
  3. Select the Search Engine

WHAT IT DOES:

  1. Scans the Search results, and returns one link per result
  2. IF GS__Return_Data is set to "MD", then it returns Markdown links
    • Default: URL
    • Markdown Format: [Link Text](URL)
  3. The max number of links returned is set by the Variable GS__Max_Links
    • Default: 10
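
The return-format behavior described above can be sketched in JavaScript roughly as follows. This is an illustration only, not the code inside the macro: the function name `formatLinks` and the `{ text, href }` input shape are assumptions, while `returnData` and `maxLinks` stand in for the GS__Return_Data and GS__Max_Links variables.

```javascript
// Hypothetical helper, not taken from the macro itself.
// links:      array of { text, href } objects, one per search result
// returnData: "URL" (default) or "MD"; mirrors GS__Return_Data
// maxLinks:   upper bound on returned links; mirrors GS__Max_Links
function formatLinks(links, returnData = "URL", maxLinks = 10) {
  return links
    .slice(0, maxLinks) // keep at most maxLinks results
    .map(link =>
      returnData === "MD" ? `[${link.text}](${link.href})` : link.href
    )
    .join("\n"); // one link per line
}

// Usage with made-up sample data
const sample = [
  { text: "Example One", href: "https://example.com/1" },
  { text: "Example Two", href: "https://example.com/2" },
];
console.log(formatLinks(sample, "MD", 1)); // prints "[Example One](https://example.com/1)"
```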

MACRO SETUP:

  1. Review the Default Settings in the Prompt and change as desired
    • GS__Max_Links
    • GS__Return_Data
  2. Review the Actions in magenta color, and change as desired
    • GS__XPath for "OTHER"
  3. ADD the Actions you want to process the results (list of URLs)

TAGS: @XPath @Google @Search @Links @JavaScript

USER SETTINGS:
• Any Action in magenta color is designed to be changed by end-user

REF:

ACTION COLOR CODES

• To facilitate the reading, customizing, and maintenance of this macro,
key Actions are colored as follows:
• GREEN -- Key Comments designed to highlight main sections of macro
• MAGENTA -- Actions designed to be customized by user
• YELLOW -- Primary Actions (usually the main purpose of the macro)
• ORANGE -- Actions that permanently destroy Variables or Clipboards

REQUIRES:

(1) Keyboard Maestro Ver 7.2.1+
(2) Yosemite (10.10.5)+
(3) Any Scriptable Browser Supported by KM

@JMichaelTX This is amazing, how tricky would this be to alter it for a different search engine/website?

Depends on how the results page of the other search engine is designed.
The more similar it is to Google, the easier it will be.

Would you mind taking a look at the LinkedIn search engine if it's not too time-intensive?

It's just the //--- SETUP XPATH --- and //--- GET FIRST LINK --- sections that would require alterations, correct?

Just posted a major update.

Just posted an update which should meet your needs.

Amazing, thanks so much Jmichael

@JMichaelTX

I'm currently working on implementing this with a site that lazy-loads the search results. Is there any way in Keyboard Maestro to force-load all the results, other than implementing a scroll-and-pause sequence?

If by "lazy loads" you mean the page stops loading until the user scrolls to the bottom of the current page, then no, I don't know of any other method. This is exactly how this site (KM Forum / Discourse forum) works. It's a PITA. 😉

You might try posing your question on the stackoverflow.com site.

Hi @JMichaelTX, I used this macro a year or so ago for a while, and it worked well, thank you.
I've just had reason to revisit and it is not working for me.
I downloaded the updated version and I get no text in the result window... it's blank.
I'm on Catalina 10.15.7 with Safari 14.0 -
thoughts?
Using Google Search engine, 10 lines and URL as options.
When I disable the last action to delete the variables - GS__LinkList is blank
cheers

@troy, I will take a look at my macro when I have some time, but you should be aware of these issues:
ALERT❗ Before You Update to macOS Catalina 10.15.7 Read This

Hi @JMichaelTX - I see where it is not setting the GS__LinkList but do not know the fix...

Google has changed the HTML on their Search Results page.

You can try this XPath:

//*[@class="rc"]/div/a

That worked for me running Keyboard Maestro 9.0.6 on macOS 10.14.6 (Mojave).
Let me know if this works for you.

However, I did note that it includes some links that I did NOT expect it to.
All of the links in this block:

[image: screenshot of the block of unexpected links]

When I get some time (could be a while) I'll look at excluding those links.

Thank you for your time @JMichaelTX, much appreciated. Yes, I was able to get it to work. As before, I can enter the URLs whose SERP position I want to check, and by going through the results with a 'for each' line action, I get a dialog displaying the placement of the URLs I'm looking for.
Cheers

Thank you @JMichaelTX, where do I find the //*[@class="rc"]/div/a so that I can find it next time it changes and not bother you! - just curious to learn. I looked at the 'page source' code and found nothing.
cheers

You won't find that exact XPath anywhere in the web page. It is something you construct.
The key is identifying some distinguishing HTML attribute, usually a class name, and then constructing the logical path from there to the target element (the anchor "a" in this case).

The way you read this XPath is:

  • Find the First (or next) element that has a class name of "rc"
  • Then the first "div" sub-element
  • Then the first "a" sub-element

like this:
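
To try an XPath like this outside the macro, here is a minimal sketch using the standard `document.evaluate` DOM call. The function name `getLinksByXPath` and its parameters are made up for illustration, and this is not necessarily how the macro itself runs the XPath. Note that when evaluated as a whole, the expression matches every element whose class attribute is exactly "rc", then each child "div", then each child "a", so it can return many links.

```javascript
// Hypothetical helper for experimenting in a browser console.
// doc:      the page's document object
// xpath:    e.g. '//*[@class="rc"]/div/a'
// maxLinks: stop after this many matches
function getLinksByXPath(doc, xpath, maxLinks = 10) {
  // Same numeric value as XPathResult.ORDERED_NODE_SNAPSHOT_TYPE
  const ORDERED_NODE_SNAPSHOT_TYPE = 7;
  const result = doc.evaluate(xpath, doc, null, ORDERED_NODE_SNAPSHOT_TYPE, null);
  const urls = [];
  for (let i = 0; i < result.snapshotLength && urls.length < maxLinks; i++) {
    urls.push(result.snapshotItem(i).href); // href of each matched anchor
  }
  return urls;
}

// In a browser console on a results page:
//   getLinksByXPath(document, '//*[@class="rc"]/div/a').join("\n")
```

If the call returns an empty array, the class name in the XPath most likely no longer exists in the page; the browser's element inspector shows the class names currently in use.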

Did they change the Xpath again? I am again not getting the GS__LinkList.
I took a look to see if I could figure it out but could not.
Appreciate your time and expertise.
Cheers

Hey Troy,

You're searching Google?

I think this is picking up all the links...

(Array.from(document.querySelectorAll("div.yuRUbf > a"), x => x.href)).join('\n')

-Chris

Hi Chris, I set the GS__XPath variable in the macro to
(Array.from(document.querySelectorAll("div.yuRUbf > a"), x => x.href)).join('\n')
and ran the macro, cancelling it after it sets the GS__LinkList and that variable is blank.

I'd really like to use this macro, used to use it a year ago all the time.
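
One possible reason for the blank result: the line Chris posted is a JavaScript expression built on a CSS selector, while GS__XPath, going by its name and the earlier posts, holds an XPath expression. That is an assumption, since the macro's internals aren't shown in this thread. Under that assumption, a rough XPath counterpart of `div.yuRUbf > a` can be built as in the sketch below; the helper name `xpathForClassChild` is hypothetical.

```javascript
// Hypothetical helper: builds an XPath roughly equivalent to the CSS
// selector "parentTag.className > childTag".
// The concat/normalize-space idiom matches className as one whole word
// inside a multi-valued class attribute, as the CSS ".class" syntax does.
function xpathForClassChild(parentTag, className, childTag) {
  return `//${parentTag}[contains(concat(" ", normalize-space(@class), " "), " ${className} ")]/${childTag}`;
}

console.log(xpathForClassChild("div", "yuRUbf", "a"));
// prints: //div[contains(concat(" ", normalize-space(@class), " "), " yuRUbf ")]/a
```

Whether Google's markup still uses the yuRUbf class is something to check in the browser's element inspector; class names like this can change without notice.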