Paste and highlight

PS I notice this lurking in my library. It looks slow but it may give you a start with encoding special characters for the HTML stage of the script:

on EscapeChars(str)
	-- QUOTE < > & ETC
	set strEncoded to (do shell script "python -c 'import sys; from xml.sax.saxutils import quoteattr; print(quoteattr(sys.argv[1]))' " & ¬
		quoted form of str)
	
	-- ENCODE DIACRITICS AND SPECIAL CHARACTERS
	set lstChars to characters of strEncoded
	repeat with i from 1 to length of lstChars
		set lngCode to id of item i of lstChars
		if lngCode > 127 then set item i of lstChars to ("&#" & lngCode as string) & ";"
	end repeat
	lstChars as Unicode text
end EscapeChars
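
For comparison, here is a minimal JavaScript sketch of the same two steps (escape the markup characters, then turn anything above ASCII 127 into a numeric entity) done in one pass, without shelling out to Python. The name escapeChars and the character map are illustrative assumptions, not part of the macro, and unlike quoteattr it does not wrap the result in quotation marks.

```javascript
// Hypothetical helper (not from the macro): escape HTML-significant
// characters, then encode everything above ASCII 127 as a numeric entity.
function escapeChars(str) {
    var map = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;' };
    return Array.from(str) // iterate by code point, so emoji etc. stay whole
        .map(function (ch) {
            if (map[ch]) return map[ch];
            var code = ch.codePointAt(0);
            return code > 127 ? '&#' + code + ';' : ch;
        })
        .join('');
}
```

For example, escapeChars('café <b>') gives 'caf&#233; &lt;b&gt;'.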

Wow this is great! When I have some time today I’ll play around with it.

Have fun. FWIW I’ve just updated the version above to slot in the EscapeChars() function.

That should, I think, deal with most of the special character issues, but inserting a CSS fragment to get the formatting you want is still an exercise for the reader : -)

UPDATE

Oops. Hubris. Just updated again. I had skipped the AppleScript keyword my before the call to EscapeChars().

Fixed now in the version above.

Rob, you are just tooooo awesome! :sunglasses: You never cease to amaze me.
I wasn't expecting you to go this far. Many thanks!

Rob, just tested it, and it works like a charm!

I added one Action so I would know something happened. :smile:

Notification ‘Web Link Completed’
Play Sound: Hero.
Ready to paste into any rich-text document.

Now I've got to work on the CSS so I can get the style I want.

Thanks again.


Thanks Rob for creating this macro, it works great with Evernote. And thanks tomslug and JMichaelTX for your input too.

Rob, you have a great macro/applescript. But I want to copy, and retain, the rich text from the web page.

In your AppleScript you have:

-- LIST OF COPIED PARAGRAPHS
set lstParas to paragraphs of (my EscapeChars(the clipboard as Unicode text))

I think this converts the clipboard to plain text, correct?

This has the effect of dropping [BR] / LF in the text that is put on the clipboard at the end. Also, any links in the selected/copied text are lost.

I want to retain the formatting of the web page to be pasted later.
Can you suggest the best approach to do this?
Is it as simple as copying the clipboard to a KM variable, and then, somehow, combining that with the rich-text hyperlink?

To be clear, I'm not asking for you to do this for me, but just give me some hints on how to do it. :wink:

The key moment of that script is when HTML is rewritten by textutil to RTF

The first draft just gathers the plain text of the paragraphs, and wraps <p> </p> around them.
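
As a rough illustration of that first-draft approach, here is a minimal JavaScript sketch. The function name paragraphsToHtml and its parameters are hypothetical (the macro itself does this step in AppleScript), and it assumes the text, title and URL have already been HTML-escaped.

```javascript
// Hypothetical sketch: wrap each non-empty line of plain text in <p>…</p>,
// then append one more paragraph holding a link to the source page.
// Assumes text, title and url are already HTML-escaped.
function paragraphsToHtml(text, title, url) {
    var paras = text.split(/\r\n|\r|\n/)
        .filter(function (line) { return line.trim() !== ''; })
        .map(function (line) { return '<p>' + line + '</p>'; });
    paras.push('<p><a href="' + url + '">' + title + '</a></p>');
    return paras.join('\n');
}
```

The string it returns is the kind of HTML that would then be handed to textutil. Because only plain text goes in, any links or inline formatting in the original selection are already gone at this point.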

To preserve elements of formatting, you could try feeding the original HTML source of the paragraphs (with HTML for the link appended) straight to textutil

Two routes to capturing the HTML source come to mind:

  1. From the DOM (executing Javascript in the browser), starting with window.getSelection(), and then working through the selected nodes, getting the .outerHTML property of the parent elements.
  2. By processing the HTML content of the clipboard object. The first step there is to convert it from a binary to a textual representation, and some code is sketched at: http://macscripter.net/viewtopic.php?id=31297

Good luck !

PS there’s also this:

http://forums.omnigroup.com/showthread.php?t=18784

Here's my version of ComplexPoint's macro. It's really minor changes since I took the easy way out trying to retain the full style of the selected/clipped web page text.

It's still not working as I'd like. When I combine the web clip clipboard with the AppleScript hyperlink clipboard, the web clip loses its formatting.

Sorry, Rob, I haven't had time to look into your suggestions/links. I was hoping by using two different clipboards I could make it easy. Maybe not.

It's clearly a work-in-progress.

Anyway here's my version:

Macro File: Copy from web as linked RTF (Ver 1.1).kmmacros (32.3 KB)

I need some help adding Opera to the browsers in this macro. I’ve copied the existing If All Conditions Met Execute Actions for Chrome part and pasted it below, then changed the settings to Opera.

The problem I’m having is with AppleScript content. I copied the Chrome AppleScript code and pasted it into AppleScript editor replacing “Google Chrome” in the tell application line with “Opera”

However, when I compile it I get the following error: “Syntax Error Expected end of line but found property”

Any ideas on what I’m doing wrong anyone?

Expanding the disclosure triangles at the top should reveal the browser-specific stage which copies text through the menu, and captures URL and document title through Applescript.

You will need to find a couple of lines for doing that with Opera, analogous to the variants here:

Thanks Rob. Reading the original of the GitHub post, it looks like Opera doesn’t support AppleScript anymore. There is no dictionary for Opera in AppleScript either.

At least I understand why it doesn’t work now.

In addition to following the copied text with a link to (and the name of) the web page, you could also precede it with the most recent heading title by pasting this snippet into an Execute Javascript in [ Chrome | Safari ] action, and capturing its output into a KM variable.

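(function (strXPath) {
	// Evaluate the XPath relative to the node where the selection starts
	var xr = document.evaluate(
		strXPath,
		window.getSelection().anchorNode,
		null, XPathResult.ANY_TYPE, null
	);

	// Nearest preceding heading, or an empty string if there is none
	var n = xr.iterateNext();
	return n ? n.textContent : '';

}).apply(null, ['./preceding::*[self::h1 or self::h2 or self::h3 or self::h4 or self::h5 or self::h6][1]']);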

How did you get on with preserving formatting while appending an RTF link ?

I've just taken a closer look at the Safari and Chrome clipboards – they both store RTF and HTML representations in a binary format that can be unpacked, but FWIW I notice that the Chrome clipboard has slightly fewer quirks and glitches, and is more script-accessible.

If you copy this slightly challenging moment of the page at:
http://plato.stanford.edu/entries/category-theory/

The Chrome clipboard RTF is clean and correct, but if you paste from the Safari clipboard into TextEdit, for example, you get some noise:

FWIW Chrome's clipboard HTML also produces cleaner RTF output from textutil than Safari's, which additionally imposes one extra scripting hoop – its binary-encoded HTML is itself inside a binary WebArchive plist.

Thanks. But this is starting to hurt my head. :pensive:

If you don't mind, can you give me the code to get the RTF from both using AppleScript (and JavaScript if you wish -- I'll use later) ?

Here's my thought: Pull the RTF of the selection on the web page, and combine that with the RTF of your hyperlink.

What do you think?

Sure – I’ll tidy them into something intelligible at the weekend.

My personal preference might be for working with HTML representations until the last moment – a bit more legible, and more scope for applying custom styles – but either route should be possible.

In the simplest case, in which you have copied something from a Chrome page, you could get to the HTML by writing:

-- HTML OF COPIED TEXT
set classHTML to "«class HTML»"
set classWebA to "«class weba»"
set strCMD to "osascript -e 'the clipboard as " & classHTML & "' | perl -ne 'print chr foreach unpack(\"C*\",pack(\"H*\",substr($_,11,-3)))'"
set strHTML to (do shell script strCMD)

and that perl unpacking should also work with «class RTF » (you need the space after the F, incidentally)
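
In case the perl one-liner looks opaque: osascript prints the clipboard as something like «data HTML3C703E…», and the unpacking just strips that wrapper and turns each pair of hex digits back into a byte. A minimal JavaScript sketch of the same idea follows. The name decodeClipboardData is hypothetical, and it assumes the bytes are UTF-8 text, which may not hold for every clipboard flavour.

```javascript
// Hypothetical helper: decode osascript's «data XXXX<hex>» output.
// XXXX is the four-character class code (e.g. "HTML" or "RTF ").
// Assumes the payload is UTF-8; decodeURIComponent throws if it is not.
function decodeClipboardData(raw) {
    var match = /«data (.{4})([0-9A-Fa-f]*)»/.exec(raw);
    if (!match) return null; // not in the expected «data …» form
    return {
        type: match[1],
        // "3C70" -> "%3C%70" -> "<p"
        text: decodeURIComponent(match[2].replace(/../g, '%$&'))
    };
}
```

For example, decodeClipboardData('«data HTML3C703E»').text is '<p>'.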

I’ll aim to share something more civilised on Sunday :- )

PS «class weba» is the Safari format, which I would probably advise against – more processing and more glitches.

Many thanks!

Maybe I've been doing something wrong, but in my macros, when I copy from a web page and combine it with another RTF clipboard, it loses much of the web page clip.

If I just do a straight copy/paste, the web page formatting is properly retained. BTW, I'm mostly pasting into Evernote. I guess I should also try TextEdit.

Hi and good day,

I wonder how to adapt this script* / macro* in two ways:

* The script/macro in post #19, "Paste and highlight", from July 14.

1st: to have the Variable “nameAndUrl” to the top (instead of at the end).

2nd: to have the url (onlyURL) at the end, so it’ll be visible.
(It’s sometimes useful to see the URL; nameAndUrl is not pasted as HTML into, for example, Facebook.)

I have a little FetchURL macro with KM, but I can’t wrap my head around how to use 2 clipboards after each other. The URL goes into a Serial Clipboard but this one overwrites the results from the original (ComplexPoint)-script.

What should I do here?
It’s only difficult for the beginners, I know!

/
with best regards,
Omar K N
Stockholm, Sweden

yeah, confused me too

Based on my experiments with pasting and selecting, I would recommend turning around after the ⇧← sequence and repeating the same number of → (without the ⇧), because that’s where the cursor ends up after a paste in a normal application. The whole left/right dance ends up looking pretty silly, especially with long strings, but there doesn’t appear to be any other way to handle it.