Convert HTML to Markdown in buffer for pasting?

Here is a slightly different approach using Dom Christie's Turndown (a Node module which, when run in a browser, uses the browser's DOM to convert HTML -> MD).

In Keyboard Maestro, we can use a slightly adjusted version of it in either an Execute JavaScript in Google Chrome or an Execute JavaScript in Safari action. Either way, the corresponding browser needs to be running.

Chrome seems to run it significantly faster.

(EDIT: That may just have been because I had JS debugging enabled in Safari.)

(Note also that using Safari requires Develop > Allow JavaScript from Apple Events to be enabled in the Safari menu system.)

Behaviour:

The macro is intended to:

  1. paste HTML or RTF as Markdown, and to
  2. paste any plain UTF-8 text in the clipboard with its format unchanged.
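
For the plain-text case, the JXA source in this post wraps the clipboard text in `<pre>` tags before Turndown sees it. If that text may itself contain `<`, `>` or `&`, escaping it first should keep the browser from parsing it as markup. A minimal sketch, assuming that extra step is wanted (`escapeHTML` and `preWrapped` are hypothetical helper names, not part of the macro):

```javascript
// escapeHTML :: String -> String
// Hypothetical helper: replaces the three characters which an
// HTML parser would otherwise treat as markup.
// The ampersand is replaced first, so that the entities
// produced by the later replacements are not escaped again.
const escapeHTML = s =>
    s.replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');

// preWrapped :: String -> String
// Hypothetical helper: the escaped text inside <pre> tags.
const preWrapped = s =>
    `<pre>${escapeHTML(s)}</pre>`;
```

In the JXA source, `preWrapped(clipFromUTI('public.utf8-plain-text'))` could then stand in for the template string in the plain-text branch.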

Options

Turndown provides some Markdown options. This draft of the Execute JavaScript action sets just two of them, at the top of the code; others can be added in the same way.

const main = () =>
    TurndownService({
        // https://www.npmjs.com/package/turndown#options
        headingStyle: 'atx',
        bulletListMarker: '-'
    }).turndown(
        document.kmvar.clipHTML
    );
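
As a sketch of how further options might be added, here is a fuller options object. The option names and values are taken from the Turndown README linked above; the name `turndownOptions` is only illustrative, and the particular choices are assumptions rather than settings used by the macro.

```javascript
// turndownOptions :: Dict
// Illustrative name. Keys and values as listed in the Turndown README.
const turndownOptions = {
    // '# Heading' style, rather than the default underlined (setext) style.
    headingStyle: 'atx',
    // '-' for bullets, rather than the default '*'.
    bulletListMarker: '-',
    // Fenced code blocks, rather than the default indented blocks.
    codeBlockStyle: 'fenced',
    // *emphasis*, rather than the default _emphasis_.
    emDelimiter: '*',
    // [text](url) links, which is also the default.
    linkStyle: 'inlined'
};

// In the browser action, this object would take the place of the
// literal passed to TurndownService:
//
//     TurndownService(turndownOptions).turndown(
//         document.kmvar.clipHTML
//     );
```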

Paste as Markdown (using Turndown thru browser).kmmacros (70.2 KB)

JS Source – JXA extraction of clipboard text for Turndown

(() => {
    'use strict';

    ObjC.import('AppKit');

    const main = () => {
        const
            ts = ObjC.deepUnwrap(
                $.NSPasteboard.generalPasteboard
                .pasteboardItems.js[0].types
            );
        return elem(
            'public.html', ts
        ) ? (
            Right(clipFromUTI('public.html'))
        ) : elem(
            'public.rtf', ts
        ) ? (
            htmlFromRTFLR([
                'doctype', 'html', 'body', 'xml',
                'style', 'p', 'font', 'head', 'span'
            ], clipFromUTI('public.rtf'))
        ) : elem(
            'public.utf8-plain-text', ts
        ) ? (
            Right(`<pre>${clipFromUTI('public.utf8-plain-text')}</pre>`)
        ) : Left('No HTML, RTF or UTF8 text in clipboard');
    };

    // GENERIC FUNCTIONS ----------------------------

    // https://github.com/RobTrew/prelude-jxa

    // Left :: a -> Either a b
    const Left = x => ({
        type: 'Either',
        Left: x
    });

    // Right :: b -> Either a b
    const Right = x => ({
        type: 'Either',
        Right: x
    });

    // bindLR (>>=) :: Either a -> (a -> Either b) -> Either b
    const bindLR = (m, mf) =>
        undefined !== m.Left ? (
            m
        ) : mf(m.Right);

    // elem :: Eq a => a -> [a] -> Bool
    const elem = (x, xs) => xs.includes(x);

    // CLIPBOARD

    // clipFromUTI :: String -> String
    const clipFromUTI = strUTI =>
        ObjC.deepUnwrap(
            $.NSString.alloc.initWithDataEncoding(
                $.NSPasteboard.generalPasteboard
                .dataForType(strUTI),
                $.NSUTF8StringEncoding
            )
        );

    // RTF -> HTML

    // htmlFromRTFLR :: [String] -> String -> Either String String
    const htmlFromRTFLR = (exceptTags, strRTF) => {
        const
            as = $.NSAttributedString.alloc
            .initWithRTFDocumentAttributes($(strRTF)
                .dataUsingEncoding($.NSUTF8StringEncoding), 0
            );
        return bindLR(
            typeof as
            .dataFromRangeDocumentAttributesError !== 'function' ? (
                Left('String could not be parsed as RTF')
            ) : Right(as),

            // Function bound if Right value obtained above:
            rtfAS => {
                let error = $();
                const htmlData = rtfAS
                    .dataFromRangeDocumentAttributesError({
                            'location': 0,
                            'length': rtfAS.length
                        }, {
                            DocumentType: 'NSHTML',
                            ExcludedElements: exceptTags
                        },
                        error
                    );
                return Boolean(ObjC.unwrap(htmlData) && !error.code) ? Right(
                    ObjC.unwrap($.NSString.alloc.initWithDataEncoding(
                        htmlData,
                        $.NSUTF8StringEncoding
                    ))
                ) : Left(ObjC.unwrap(error.localizedDescription));
            }
        );
    };

    // MAIN ---
    return main();
})();
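
The script above returns an Either value: a `Left` holding an error message, or a `Right` holding HTML. As a rough sketch of how such a value might be consumed, the plain JavaScript below repeats the `Left`, `Right` and `bindLR` definitions so that it runs on its own. The `either`, `nonEmptyLR` and `report` functions are assumptions added for illustration and are not part of the macro.

```javascript
// Left :: a -> Either a b
const Left = x => ({
    type: 'Either',
    Left: x
});

// Right :: b -> Either a b
const Right = x => ({
    type: 'Either',
    Right: x
});

// bindLR (>>=) :: Either a -> (a -> Either b) -> Either b
const bindLR = (m, mf) =>
    undefined !== m.Left ? (
        m
    ) : mf(m.Right);

// either :: (a -> c) -> (b -> c) -> Either a b -> c
// Illustrative: applies fl to a Left value, or fr to a Right value.
const either = (fl, fr, e) =>
    undefined !== e.Left ? (
        fl(e.Left)
    ) : fr(e.Right);

// nonEmptyLR :: String -> Either String String
// Hypothetical extra check: rejects strings containing only white space.
const nonEmptyLR = s =>
    0 < s.trim().length ? (
        Right(s)
    ) : Left('Clipboard HTML was empty');

// report :: Either String String -> String
// Either the HTML itself, or a message prefixed with 'Error: '.
const report = e =>
    either(
        msg => `Error: ${msg}`,
        html => html,
        bindLR(e, nonEmptyLR)
    );
```

Because `bindLR` passes a `Left` through untouched, the first failure in a chain of checks is the one that gets reported, and the later functions are never called on bad input.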
