Need Macro/JavaScript to Extract Contents of Code Block

While we are in mid-August optimisation and tweaking territory, glancing again at my JS code I notice a few things that could be adjusted or tightened up.

  1. If we're only looking for one match, no need to heat the CPU in boreal summer by searching for more. Instead of leaving it to the default XPathResult.ANY_TYPE and using .iterateNext() to collect just the first (if any) match, we can look up the Result Type constants, specify XPathResult.FIRST_ORDERED_NODE_TYPE, and collect any match with .singleNodeValue (a property rather than a method, so no parentheses).
  2. XPATH expressions are chains of alternating 'Steps' and 'Filters'. We can broaden out the filter here with as many 'or' operators as we need. The simplest example might be: ./ancestor-or-self::*[self::code or self::pre]
  3. We can include text collection inside the function with return oSeln.toString().

(and then perhaps, drop it into an Execute JS in Safari (and/or Chrome) action, directing the output to the clipboard).

Select and copy code.kmmacros (19.2 KB)

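(function () {
  // Find the nearest enclosing <code> or <pre> of the selection start.
  // anchorNode is null when nothing is selected, so guard against that.
  var oSeln = window.getSelection(),
    nodeCode = oSeln.anchorNode ? document.evaluate(
      './ancestor-or-self::*[self::code or self::pre]',
      oSeln.anchorNode,
      null, XPathResult.FIRST_ORDERED_NODE_TYPE, null
    ).singleNodeValue : null,
    rngDoc = nodeCode ?
      document.createRange() : null;

  // If a code container was found, extend the selection to all of it.
  if (nodeCode) {
    oSeln.removeAllRanges();
    rngDoc.selectNode(nodeCode);
    oSeln.addRange(rngDoc);
  }

  // Text of the (possibly extended) selection.
  return oSeln.toString();
})();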

Hmmm, I don't know about you, but I'm avoiding the hot, humid dog days of August by hunkering down in my cold basement with lots of cold beer on ice. I'm fully optimized on being cool :sunglasses:

But thanks for the update. It looks good as it adds more coverage.
I knew you could do it. :wink:

Same code seems to work in Chrome as well.
Or am I missing something?

I just tried it on the Evernote forum, and it worked well. :+1:
I'll test it on my other candidate sites.

Yes – that kind of code should always work unchanged in either browser.

OK, this is working for:

Fails for:

Sorry I can't be more technical help right now -- but I'm glad to be the gopher, researcher, & tester. :smile:

If you need anything other than coding JS, let me know.

OK, I have updated my macro, which opens the code in AS Editor, with your latest JS:

BRW Open Web Page Code Block in Apple Script Editor.kmmacros (27.6 KB)

The cases really look a bit too diverse for a single XPath.

MacScripter:

//blockquote/div/p

Veritrope:

//span[@class='coMULTI']
//div[@id="content"]/*[self::pre or (self::p and @class='code')]

GitHub is a bit special. You might be better off clicking the RAW button:

(The raw page uses a <pre>)

On the pretty pages, extend-select would capture all the line numbers as well, so if you really wanted to copy from there you would have to write a slightly different, row-by-row function (taking the code cell but not the number cell). Perhaps something roughly like:

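(function () {
  // Collect the code cells (not the line-number cells) in document order.
  var xrLines = document.evaluate(
      '//td[contains(@class, "blob-code")]',
      document,
      null, XPathResult.ORDERED_NODE_ITERATOR_TYPE, null
    ),
    oLine = xrLines.iterateNext(),
    lstLines = [];

  // One array entry per source line.
  while (oLine) {
    lstLines.push(oLine.textContent);
    oLine = xrLines.iterateNext();
  }

  return lstLines.join('\n');
})();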

So, is it possible to build a SWITCH statement that includes these cases?

Yes, I think you should be able to switch/branch on the .URL of the front tab of the active browser …

That’s too specific.

I’d rather switch based on what’s available on the page in question.
Is this possible?

Not sure I’m following you. As far as I can see the XPATH patterns vary by website …

You can know the paths on which particular sites hold their code, but I don’t think there’s any way of travelling in the opposite direction – looking at the myriad branches and pathways of an HTML tree, and intuiting that some locations are holding code.

But I’m probably misunderstanding you : - )

What I’d like to do, if possible, is develop a number of switch cases that capture the majority of popular web sites that offer code snippets.

It would be great if, as we (or the users of this macro) identify more cases of popular sites, we could easily add them to the JS cases.

Does this make sense?

For the cases where extend-select works, and all that needs to change is the XPATH you could use the pattern of this macro:

http://forum.keyboardmaestro.com/uploads/default/original/2X/5/516deefedadf787efaf3c2436fcabc18ecdd1fd5.kmmacros

to place the XPath which matches a particular site or set of sites in a KM variable, and execute a standard function in the relevant browser, with only the XPath changing.

I think you'd have to do a lookup on the URL of the site you were visiting to see whether it kept code on a known XPath, or was a member of a group of sites whose code sections were on a particular path.
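A rough sketch of what such a lookup might look like, in case it helps. The XPaths are the ones quoted earlier in this thread; the hostnames, the table name dctXPaths and the function name xpathForHost are just my assumptions for illustration, not anything taken from the macro itself.

```javascript
// Hypothetical lookup table: site domain -> XPath for that site's code.
// XPaths are those quoted earlier in the thread; hostnames are assumed.
var dctXPaths = {
  'macscripter.net': '//blockquote/div/p',
  'veritrope.com':
    '//div[@id="content"]/*[self::pre or (self::p and @class="code")]',
  'github.com': '//td[contains(@class, "blob-code")]'
};

// Generic fallback: nearest enclosing <code> or <pre> of the selection.
var strDefault = './ancestor-or-self::*[self::code or self::pre]';

// xpathForHost :: String -> String
// Returns the XPath registered for a hostname, or the generic fallback.
function xpathForHost(strHost) {
  var lstKeys = Object.keys(dctXPaths), i, k;

  for (i = 0; i < lstKeys.length; i++) {
    k = lstKeys[i];
    // Match the bare domain, or any subdomain of it (e.g. www.)
    if (strHost === k || strHost.slice(-(k.length + 1)) === '.' + k) {
      return dctXPaths[k];
    }
  }
  return strDefault;
}

// In the browser, something like:
//   var strXPath = xpathForHost(window.location.hostname);
```

Adding a newly identified site would then just mean adding one line to the table, with the standard function left untouched.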

Thanks. I think we're getting closer. :smile:

But I would like to avoid tracking on an URL basis, unless there is no other choice. IOW, I'd like to examine the structure of the current page, and determine if it fits one of a number of "common" setups for displaying code snippets.

I'm sure there will be some sites that will never fit, but hopefully those are in the minority of sites.

So, what do you think? Can we do something like this?

Alas no : - )

Page structure analysis and automatic identification of code would be a very big project.

Another approach, assuming that you have selected (or are hovering over) a node in the code that interests you, would simply be to cycle through a list of XPATHs, reaping a harvest if a hit is found, and reporting [perplexity|novelty] if no harvest is forthcoming.
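Something roughly like this, perhaps. It is only a sketch: the names firstHit, evalFromSelection and lstCandidates are made up for the example, and the second and third patterns are my own guesses at selection-relative versions of the site-specific paths above, so they would need checking against the real pages.

```javascript
// firstHit :: [String] -> (String -> Node | null) -> {xpath, node} | null
// Tries each XPath in turn and returns the first one that yields a node.
function firstHit(lstXPaths, fnEval) {
  var i, node;

  for (i = 0; i < lstXPaths.length; i++) {
    node = fnEval(lstXPaths[i]);
    if (node) {
      return { xpath: lstXPaths[i], node: node };
    }
  }
  return null; // no harvest
}

// Browser-only evaluator: runs an XPath relative to the node in which
// the current selection starts (null if nothing is selected).
function evalFromSelection(strXPath) {
  var oNode = window.getSelection().anchorNode;

  return oNode ? document.evaluate(
    strXPath, oNode, null,
    XPathResult.FIRST_ORDERED_NODE_TYPE, null
  ).singleNodeValue : null;
}

// Candidate patterns, tried in order (assumed, selection-relative).
var lstCandidates = [
  './ancestor-or-self::*[self::code or self::pre]',
  './ancestor-or-self::td[contains(@class, "blob-code")]',
  './ancestor-or-self::p[@class="code"]'
];

// In the browser, something like:
//   var dctHit = firstHit(lstCandidates, evalFromSelection);
//   dctHit ? dctHit.node.textContent : 'No known code container found';
```

Passing the evaluator in as an argument keeps the cycling logic separate from the browser calls, so the list can be extended (or tested with a stub) without touching the rest.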

If I understand you, that is what I was talking about.
If the user selects a word at the top of the code block, then can’t you look at the containers immediately above it, and determine if they fit one of our cases?

Hey JM,

No, I'm using Rob's JavaScript code. But all can be done in one fell swoop with an AppleScript without resorting to driving the UI. It's faster and less prone to error.

-Chris

Thanks, but it's working really well the way it is in the macro. I'm not seeing any performance issues -- it's quite fast.
For the moment, I think keeping the actions separate gives us more flexibility, and makes it easier to change, both for us and for other users. IMO, it's best for non-script users to stay out of the script code, so that's why I like stuff like the output script header to be a KM variable. Easier to see and change, and no danger of screwing up code. :smile:

Here's a great example. If you're running Mavericks or earlier, then the macro needs to open AppleScript Editor. Whereas with Yosemite and later it needs to open Script Editor.

It has been kindly pointed out to me that this approach is not working with Safari at the moment.

The horse seems to fall at the starting gate, with window.getSelection()

There is a workaround, but unfortunately its name is Chrome ...

(or Firefox, very new and snappy now, with parts re-written in Rust, but not supported by the power and convenience of a Keyboard Maestro action)

(Not sure if Apple made some kind of security decision, or if someone was just sleepy one afternoon. It doesn't seem to be specific to the rights of code introduced from outside, in the manner of a KM Action - window.getSelection now pretends to work but really just returns a blank sheet even from Safari's own JS Console.)

The better news is that it looks fixed in Safari Technology Preview

No idea what that means over the next year for the consumer Safari, or for Keyboard Maestro's Execute JS in Safari action.

JM found a method that works with Safari 11.0.1 (12604.3.5.1.1).

Get Selected Discourse Code Block on Web Page & Open in Editor

I've tested it out, and it works well.

-Chris
