Copying System Clipboard as Styled Text Converts It to Plain Text

To see the set of macOS UTIs (Uniform Type Identifiers) for the pasteboard flavours in your clipboard at any given moment, you can use a macro like:

Many applications define their own UTI strings, for specific data types, but the key system-defined types for general interchange include:

  • public.utf8-plain-text
  • public.html
  • public.rtf

Keyboard Maestro can list out the clipboard flavors with the SystemClipboardFlavors token, but that is about it.

There are tools that will show information about the clipboard flavors.

Apple lists some of the flavors but there is no one list of all possible clipboard flavors.

And for a line-by-line listing of just the UTIs currently in the system clipboard, you could write something like this:

UTI flavors in clipboard (listed one per line).kmmacros (1.9 KB)

UTIs are just standardized names for particular data formats, so your question is effectively about how each named format encodes a combination of label and url.

In turn, all of this goes back to the origin of "hyperlinks" in HTML, where the url is stored as an attribute value, and the label as the content of the element.

Plain text is a flat, single-layered data model (no layering into tag contents vs tag attribute values), so the solution which has stuck is the Markdown pattern of two consecutive bracket types: [label](url).
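As a rough illustration of that Markdown pattern, here is a minimal sketch in plain JavaScript. The name mdLink is hypothetical (it is not from any macro in this thread), and the escaping rules are only an assumption about what a CommonMark-style reader needs:

```javascript
// mdLink :: String -> String -> String
// A Markdown link built from a label and a url.
// (Hypothetical helper, for illustration only.)
const mdLink = label => url => {
    // Square brackets in the label are backslash-escaped
    // so that they cannot close the label early.
    const safeLabel = label.replace(/[\[\]]/g, "\\$&");

    // A url containing spaces or parentheses is wrapped
    // in angle brackets, which CommonMark allows.
    const safeUrl = /[\s()]/.test(url) ? `<${url}>` : url;

    return `[${safeLabel}](${safeUrl})`;
};
```

For example, mdLink("New Tab")("chrome://newtab/") gives [New Tab](chrome://newtab/).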

The least tractable of the 3 usual suspects is the RTF encoding, which is described here: What is the RTF syntax for a hyperlink? - Stack Overflow
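For what it's worth, the field syntax described in that Stack Overflow answer can be sketched as a string-building function. This is only an illustration under assumptions: rtfEscape, rtfLink and rtfDoc are hypothetical names, the wrapper is the barest possible RTF header, and none of this has been tested against what any particular app accepts from the clipboard.

```javascript
// rtfEscape :: String -> String
// Backslash-escapes the three RTF control characters, and
// writes non-ASCII UTF-16 code units as \uN? (signed 16-bit).
const rtfEscape = s =>
    s.split("").map(c => {
        const n = c.charCodeAt(0);

        return "\\{}".includes(c)
            ? "\\" + c
            : n > 127
                ? `\\u${n > 32767 ? n - 65536 : n}?`
                : c;
    }).join("");

// rtfLink :: String -> String -> String
// An RTF hyperlink field: the url goes in the field
// instruction, the label in the field result.
// (Assumes the url contains no double quotes.)
const rtfLink = label => url =>
    `{\\field{\\*\\fldinst{HYPERLINK "${rtfEscape(url)}"}}` +
    `{\\fldrslt{${rtfEscape(label)}}}}`;

// rtfDoc :: String -> String
// A minimal RTF document wrapper around a body fragment.
const rtfDoc = body => `{\\rtf1\\ansi ${body}}`;
```

Something like rtfDoc(rtfLink("New Tab")("chrome://newtab/")) should then yield a complete, if very bare, RTF string.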

Apple does define a dedicated public.url type, and a smallish number of applications do include a pasteboard item of that type in the clipboard. Generally, though, the most tractable format, and the best supported with tooling, is the original source of the notion of a "link": public.html. From an <a> tag there, you can take either:

  1. the value of the href attribute, (the url) or
  2. the text content (the label).
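A minimal sketch of pulling both parts out of HTML source text, assuming simple, well-formed <a> tags with quoted href values. The name linksFromHtml is hypothetical, and a regular expression is a shortcut here, not a real HTML parser (entities such as &amp; are left undecoded):

```javascript
// linksFromHtml :: String -> [{label :: String, url :: String}]
// The url and label of each <a href="..."> element
// found in a string of HTML source.
const linksFromHtml = html => {
    const rgx = new RegExp(
        "<a\\b[^>]*?\\bhref\\s*=\\s*([\"'])(.*?)\\1[^>]*>" +
        "([\\s\\S]*?)</a>",
        "gi"
    );

    return Array.from(html.matchAll(rgx)).map(m => ({
        // The value of the href attribute.
        url: m[2],
        // The text content, with any nested tags stripped.
        label: m[3].replace(/<[^>]+>/g, "").trim()
    }));
};
```

The input could be, for instance, the string returned for public.html by a clipboard-reading script.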

I'll leave the clever stuff to the two smarter-than-me people who've already posted...

But to start working towards a solution I'd begin with a single known app -- Notes or TextEdit look like a good choice -- and get the macro working there. Then switch the destination to your app, to make sure that whatever you did in the rebuild didn't break clipboard handling for incoming data. Then switch the source to the Chrome extension to see if you can munge one of the data types it provides into something that works as input.

Given that the only "useful" format in your list appears to be HTML, you may be looking at an additional HTML->RTF step, perhaps using textutil.

Remember, of course, that there's a difference between:

  1. A pasteable public.html clipboard, and
  2. the HTML source of that clipboard (which you may need as raw material for further processing).

See, for example, JS of this pattern:
(assuming the – now default – modern syntax of Keyboard Maestro JavaScript for Automation actions)

Expand disclosure triangle to view JS source
return (() => {
    "use strict";

    ObjC.import("AppKit");

    // Rob Trew @2021

    // Either the public.html component of the clipboard,
    // (if any) **as HTML source text**, 
    // or an explanatory message.
    const main = () =>
        either(
            alert("Pasting HTML")
        )(
            html => html
        )(
            clipOfTypeLR("public.html")
        );

    // ----------------------- JXA -----------------------

    // alert :: String => String -> IO String
    const alert = title =>
        s => {
            const sa = Object.assign(
                Application("System Events"), {
                includeStandardAdditions: true
            });

            return (
                sa.activate(),
                sa.displayDialog(s, {
                    withTitle: title,
                    buttons: ["OK"],
                    defaultButton: "OK"
                }),
                s
            );
        };


    // clipOfTypeLR :: String -> Either String String
    const clipOfTypeLR = utiOrBundleID => {
        const
            clip = ObjC.deepUnwrap(
                $.NSString.alloc.initWithDataEncoding(
                    $.NSPasteboard.generalPasteboard
                        .dataForType(utiOrBundleID),
                    $.NSUTF8StringEncoding
                )
            );

        return 0 < clip.length
            ? Right(clip)
            : Left(
                "No clipboard content found " + (
                    `for type '${utiOrBundleID}'`
                )
            );
    };

    // --------------------- GENERIC ---------------------

    // Left :: a -> Either a b
    const Left = x => ({
        type: "Either",
        Left: x
    });


    // Right :: b -> Either a b
    const Right = x => ({
        type: "Either",
        Right: x
    });

    // either :: (a -> c) -> (b -> c) -> Either a b -> c
    const either = fl =>
        // Application of the function fl to the
        // contents of any Left value in e, or
        // the application of fr to its Right value.
        fr => e => e.Left ? (
            fl(e.Left)
        ) : fr(e.Right);

    return main();
})();

I think, however, for shell commands, pbpaste may take care of that, as in:

pbpaste | textutil [etc etc]

Paste any HTML component of clipboard as HTML source.kmmacros (5.6 KB)

Thanks, @ComplexPoint, that macro is very useful. To see what's happening, I created a sample window in Chrome which has this listing of its tabs in Tabs Outliner:

image

Selecting the Window line and pressing ⌘C put this on the Clipboard, using your Clipboard View macro:

image

You can see how the HTML text and data flavors use nested <ul> tags to create the indenting and use <a href=...>New Tab</a> tags for the hyperlinks while the plain text flavors use simply "New Tab (chrome://newtab/)" with spaces for the indenting.
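If only the plain text flavor were available, the label and URL could in principle be recovered from lines of that "Label (url)" shape. A rough sketch, assuming every link line ends with a URL in parentheses and that indentation is leading spaces (parseOutlineLine is a hypothetical name, and the pattern is a guess based on the one sample above):

```javascript
// parseOutlineLine :: String ->
//     {indent :: Int, label :: String, url :: String}
// Indent width, label, and trailing parenthesised url
// of one plain-text outline line. The url is an empty
// string where the line does not end in "(url)".
const parseOutlineLine = line => {
    const m = /^(\s*)(.*?)\s*\((\S+)\)\s*$/.exec(line);

    return m
        ? { indent: m[1].length, label: m[2], url: m[3] }
        : {
            indent: line.length - line.trimStart().length,
            label: line.trim(),
            url: ""
        };
};
```

Mapping this over the lines of the plain text flavor would give a list of records that could be re-emitted in another format, though taking the links from the HTML flavor is likely to be more reliable.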

When I run the Apply Style to Clipboard action on the clipboard, I get:

image

As you can see, the plain text flavors have not changed, but the HTML has been removed and replaced by public.rtf flavors. And it is clear that the plain text string or data has been used as the source — indenting has been done with spaces, and the URLs are in parentheses.

This is all confirmation (done prior) of @Nige_S's suggestion:

I can strip the clipboard down to only the HTML using Remove Clipboard Flavors and I can run that through either textutil (thanks for the suggestion) or pandoc (which I already have) to get the RTF string. I can do that silently via a shell command. So far it seems OK.

BUT how do I get the RTF output of textutil or pandoc into the clipboard as a public.rtf flavor? Everything I can think of looks like it will put the RTF into the clipboard as a plain text flavor of the RTF code or the text of a pathname to an RTF file.

Any ideas?

you can pipe the textutil output into pbcopy

.... textutil [etc. etc] | pbcopy

...which is part of your OS install, so you have that too.

Since your data is already on the clipboard, something like

pbpaste -Prefer public.html | textutil -stdin -convert rtf | pbcopy

...may be enough.

Thanks. This is looking doable. I hope I can try it tonight.

It looks like some debugging is on the agenda. Using the following macro fragment:

image

I got this for the public.html flavor in the clipboard, displayed by @ComplexPoint's Clipboard View macro:

image

Clearly some HTML in there. But when I run it through textutil, I get nothing:

image

As I said, some debugging to do.

Are you giving textutil anything to work with there?

Does the pbpaste -Prefer option know about UTI names like public.html?

At the macOS 14.5 command line, man pbpaste shows me these options:

-Prefer {txt | rtf | ps}

I think you should be able to try something like:

pbpaste | textutil -format html -convert rtf -stdin -stdout | pbcopy -Prefer rtf

but even before that, just check what is actually coming out of pbpaste at that stage.


PS it may be easier to work with the Input from System Clipboard option of the Keyboard Maestro Execute Shell Script action.

For example:

Copy HTML as RTF.kmmacros (3.2 KB)

Thanks @ComplexPoint,

I had just figured out that I need both -stdin and -stdout options to textutil to get the earlier shell command to work. But the output looks like an RTF-formatted version of the HTML code text. I suspect that's in part due to the erroneous option to -Prefer.

I'll give your new shell command line a try, as well as try the experiments you suggest.

Yeah -- that'll teach me not to believe everything I read on the interwebz...

I've been going round in circles most of the morning, but the best I've come up with so far to get an RTF representation of what's copied from the Chrome extension is to use @ComplexPoint's clipboard viewer to extract the string of the public.html entry, then pass that through textutil:

Tab Outliner Copy to RTF.kmmacros (24.0 KB)

Image

What I don't know, having never played around with RTF or styles in KM, is what to do after that! I see there's a new "Styled Text to/from RTF plain text" Filter in KM v11, but that does appear to be "plain RTF" only -- I'm back to an empty string when using it on our formatted RTF.

I'm kinda losing track of the ultimate goal here. But doing the clipboard shuffle to first paste some styled boilerplate and then paste the data copied from the Chrome extension may be the sensible way to go. "Re-styling" already-styled RTF seems to be... problematic.

Yeah, that's the kind of scripting suggestions I've been getting from ChatGPT whenever I try to do anything out of the ordinary. It looks reasonable, but it's fictional.

A piece in the Scientific American today makes a case for more careful terminology:

ChatGPT Isn’t ‘Hallucinating’—It’s Bullshitting! | Scientific American

To be clear before the dog-piling starts -- this wasn't ChatGPT. It was me spotting someone using public.rtf and then me leaping to the use of public.html without properly testing.

So a totally human error followed by some blame deflection... My bad :(