Insert Indents & Line Breaks into Text on Clipboard

Enjoy it in good health :)

RTF to single line of plain text ?

I'll think about it. Perhaps at the weekend, now.

Yes - back to just plain text. I have it working here in the most inefficient, inelegant, brute-force way possible, except that it doesn't remove the numbers at the front.

And the RTF is in the clipboard ? Or in a file ?

The RTF is selected in the window that I'm using, so the macro would copy it to the clipboard, make the needed changes, and then paste.

The pattern of what gets into the clipboard is quite variable and application-dependent, but when rich text is copied, there will typically also be a plain text version in the clipboard too.

It might be worth experimenting with something which looks in the clipboard for any plain text content alongside the RTF, normalises line breaks and multiple spaces to single spaces, and then resets the clipboard to contain just that 'cleaned' plain text.

Here's a first sketch to try:

Test of clipboard cleaning.kmmacros (20.5 KB)

JS Source for KM Execute JXA action
(() => {
    'use strict';

    // Rob Trew 2020
    // Ver 0.0

    ObjC.import('AppKit');

    // ----------------------- TEST -----------------------
    // main :: IO ()
    const main = () =>
        either(msg => msg)(
            txt => copyText(
                txt.replace(/\s+/g, ' ')
            )
        )(
            clipTextLR()
        );

    // ----------------------- JXA ------------------------

    // clipTextLR :: () -> Either String String
    const clipTextLR = () => {
        const v = ObjC.unwrap($.NSPasteboard.generalPasteboard
            .stringForType($.NSPasteboardTypeString));
        return Boolean(v) && v.length > 0 ? (
            Right(v)
        ) : Left('No utf8-plain-text found in clipboard');
    };

    // copyText :: String -> IO String
    const copyText = s => {
        // String copied to general pasteboard.
        const pb = $.NSPasteboard.generalPasteboard;
        return (
            pb.clearContents,
            pb.setStringForType(
                $(s),
                $.NSPasteboardTypeString
            ),
            s
        );
    };

    // -------------------- GENERIC FUNCTIONS --------------------
    // https://github.com/RobTrew/prelude-jxa

    // Left :: a -> Either a b
    const Left = x => ({
        type: 'Either',
        Left: x
    });

    // Right :: b -> Either a b
    const Right = x => ({
        type: 'Either',
        Right: x
    });

    // either :: (a -> c) -> (b -> c) -> Either a b -> c
    const either = fl =>
        fr => e => 'Either' === e.type ? (
            undefined !== e.Left ? (
                fl(e.Left)
            ) : fr(e.Right)
        ) : undefined;

    // MAIN ---
    return main();
})();

Got it! Just tried it but unfortunately it doesn't seem to work. Here is the output, which looks peculiar (the 1.1 is now on the left, etc.).

If it's any help, this is all for QuickBooks Online, and I'm just copying, editing, and pasting text from a field in Safari.

In that case, the next step is to find out what exactly gets into the clipboard, and in what form.

Clipboard Viewer (all textually representable data as JSON).kmmacros (21.5 KB)

If you could:

  • Copy something from a sample field,
  • run the macro above, and show me what it finds.

(perhaps by DM here if it's big, etc. You may need to scroll down the output dialog to get all of the formats which it finds)

then we should be able to write something which extracts a more useful string.

Here is the output:

{
"public.utf8-plain-text as string": "‏‎ ‎ 1.1 ‏Lorem ipsum dolor sit amet, consectetur adipiscing elit Fusce accumsan urna a\n tellusultrices, sit amet rutrum mi aliquam Aliquam viverra mauris at lobortis rutrum Aenean\n miturpis, rhoncus ac leo in, hendrerit efficitur nisi Vivamus metus est, posuere nec\n sempereu, laoreet sed nibh Vestibulum vitae bibendum ipsum In semper enim sed arcu\n lobortisposuere Vestibulum eu magna non enim ornare lobortis In nec volutpat sem",
"public.utf8-plain-text as data": "‏‎ ‎ 1.1 ‏Lorem ipsum dolor sit amet, consectetur adipiscing elit Fusce accumsan urna a\n tellusultrices, sit amet rutrum mi aliquam Aliquam viverra mauris at lobortis rutrum Aenean\n miturpis, rhoncus ac leo in, hendrerit efficitur nisi Vivamus metus est, posuere nec\n sempereu, laoreet sed nibh Vestibulum vitae bibendum ipsum In semper enim sed arcu\n lobortisposuere Vestibulum eu magna non enim ornare lobortis In nec volutpat sem",
"org.nspasteboard.AutoGeneratedType as string": "\u0001",
"org.nspasteboard.AutoGeneratedType as data": "\u0001",
"org.nspasteboard.source as propertyList": "com.stairways.keyboardmaestro.engine",
"org.nspasteboard.source as string": "com.stairways.keyboardmaestro.engine",
"org.nspasteboard.source as data": "com.stairways.keyboardmaestro.engine"
}

This is after copying from a web page ?

It looks a bit unexpected - no sign of rich text or HTML ...

Could it be because I'm copying the text in a text form field on a webpage, rather than the webpage itself?

Yes, could be. It looks as if the copy is perhaps being made by a KM macro ?

The next test would be to see what (if anything) is returned by the following snippet, when the clipboard contains something copied from that form on the web-site.

If the output looks tractable, it may just be a question of chopping off the leading label, and pruning out the line breaks.
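
As a rough illustration of that "chop off the label, prune the line breaks" idea, here is a minimal sketch in plain JavaScript. It assumes that the label is a dotted number such as 1.1, possibly surrounded by spaces, no-break spaces and zero-width direction marks. The name `stripLabelAndBreaks` and the sample string are hypothetical, and are not taken from the macros in this thread.

```javascript
// Hypothetical helper, not part of the posted macros.
// Assumes the leading label looks like "1.1" (digits separated by dots).
const stripLabelAndBreaks = s =>
    s
    // Drop direction marks (U+200E, U+200F) wherever they occur.
    .replace(/[\u200E\u200F]/g, '')
    // Drop an optional leading label and the white space around it.
    .replace(/^\s*\d+(\.\d+)*\s*/, '')
    // Collapse line breaks and runs of white space (\s includes NBSP).
    .replace(/\s+/g, ' ')
    .trim();

// Sample shaped like the clipboard contents shown earlier.
const labelledClip =
    '\u200F\u200E \u200E 1.1 \u200FLorem ipsum a\n tellus, sit amet';

console.log(stripLabelAndBreaks(labelledClip));
// → Lorem ipsum a tellus, sit amet
```

One limitation of this approach: a paragraph which genuinely begins with a number would lose that number too, so the label pattern may need tightening for real data.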

JS Source
(() => {
    'use strict';

    ObjC.import('AppKit');

    const main = () =>
        either(msg => msg)(txt => txt)(
            clipOfTypeLR('public.utf8-plain-text')
        );


    // ----------------------- JXA ------------------------

    // clipOfTypeLR :: String -> Either String String
    const clipOfTypeLR = utiOrBundleID => {
        const
            strClip = ObjC.deepUnwrap(
                $.NSString.alloc.initWithDataEncoding(
                    $.NSPasteboard.generalPasteboard
                    .dataForType(utiOrBundleID),
                    $.NSUTF8StringEncoding
                )
            );
        return Boolean(strClip) && 0 < strClip.length ? (
            Right(strClip)
        ) : Left(
            'No clipboard content found for type "' +
            utiOrBundleID + '"'
        );
    };


    // ---------------- GENERIC FUNCTIONS -----------------
    // https://github.com/RobTrew/prelude-jxa

    // Left :: a -> Either a b
    const Left = x => ({
        type: 'Either',
        Left: x
    });

    // Right :: b -> Either a b
    const Right = x => ({
        type: 'Either',
        Right: x
    });

    // either :: (a -> c) -> (b -> c) -> Either a b -> c
    const either = fl =>
        fr => e => 'Either' === e.type ? (
            undefined !== e.Left ? (
                fl(e.Left)
            ) : fr(e.Right)
        ) : undefined;

    // MAIN ---
    return main();
})();

Here is the output - not sure if this is what we need.

So to figure out why the prefix is displayed at the end of the first line (a side-effect of the special indent spacing character, perhaps ?) let's have a look at the underlying sequence of codes:

Here is a version which aims to display a hex dump, while skipping any non-printing characters.

As before, try copying from the web page, running this script, and showing us the output.

JS Source – hex dump version
(() => {
    'use strict';

    ObjC.import('AppKit');

    // Rob Trew 2020
    // Hex dump of plain text clipboard contents.

    const main = () =>
        either(msg => msg)(hexdump(16))(
            clipOfTypeLR('public.utf8-plain-text')
        );

    // hexdump :: Int -> String -> String
    const hexdump = intCols => s =>
        unlines(map(row => {
            const
                txt = concat(map(tpl => {
                    const c = fst(tpl);
                    return isAlphaNum(c) ? (
                        c
                    ) : ' ';
                })(row));
            return (
                `(${justifyLeft(intCols)(' ')(txt)}) ` +
                `(${unwords(map(snd)(row))})`
            );
        })(
            chunksOf(intCols)(
                map(ap(Tuple)(
                    compose(
                        justifyRight(2)(' '),
                        showHex,
                        ord
                    )
                ))(chars(s))
            )
        ));


    // ----------------------- JXA ------------------------

    // clipOfTypeLR :: String -> Either String String
    const clipOfTypeLR = utiOrBundleID => {
        const
            strClip = ObjC.deepUnwrap(
                $.NSString.alloc.initWithDataEncoding(
                    $.NSPasteboard.generalPasteboard
                    .dataForType(utiOrBundleID),
                    $.NSUTF8StringEncoding
                )
            );
        return Boolean(strClip) && 0 < strClip.length ? (
            Right(strClip)
        ) : Left(
            'No clipboard content found for type "' +
            utiOrBundleID + '"'
        );
    };


    // ---------------- GENERIC FUNCTIONS -----------------
    // https://github.com/RobTrew/prelude-jxa

    // Left :: a -> Either a b
    const Left = x => ({
        type: 'Either',
        Left: x
    });


    // Right :: b -> Either a b
    const Right = x => ({
        type: 'Either',
        Right: x
    });


    // Tuple (,) :: a -> b -> (a, b)
    const Tuple = a =>
        b => ({
            type: 'Tuple',
            '0': a,
            '1': b,
            length: 2
        });


    // ap :: (a -> b -> c) -> (a -> b) -> a -> c
    const ap = f =>
        // Applicative instance for functions.
        // f(x) applied to g(x).
        g => x => f(x)(
            g(x)
        );


    // chars :: String -> [Char]
    const chars = s =>
        s.split('');


    // chunksOf :: Int -> [a] -> [[a]]
    const chunksOf = n =>
        xs => enumFromThenTo(0)(n)(
            xs.length - 1
        ).reduce(
            (a, i) => a.concat([xs.slice(i, (n + i))]),
            []
        );


    // compose (<<<) :: (b -> c) -> (a -> b) -> a -> c
    const compose = (...fs) =>
        // A function defined by the right-to-left
        // composition of all the functions in fs.
        fs.reduce(
            (f, g) => x => f(g(x)),
            x => x
        );


    // concat :: [[a]] -> [a]
    // concat :: [String] -> String
    const concat = xs => (
        ys => 0 < ys.length ? (
            ys.every(x => 'string' === typeof x) ? (
                ''
            ) : []
        ).concat(...ys) : ys
    )(list(xs));


    // either :: (a -> c) -> (b -> c) -> Either a b -> c
    const either = fl =>
        fr => e => 'Either' === e.type ? (
            undefined !== e.Left ? (
                fl(e.Left)
            ) : fr(e.Right)
        ) : undefined;


    // enumFromThenTo :: Int -> Int -> Int -> [Int]
    const enumFromThenTo = x1 =>
        x2 => y => {
            const d = x2 - x1;
            return Array.from({
                length: Math.floor((y - x2) / d) + 2
            }, (_, i) => x1 + (d * i));
        };


    // fst :: (a, b) -> a
    const fst = tpl =>
        // First member of a pair.
        tpl[0];


    // intToDigit :: Int -> Char
    const intToDigit = n =>
        n >= 0 && n < 16 ? (
            '0123456789ABCDEF'.charAt(n)
        ) : '?';


    // isAlphaNum :: Char -> Bool
    const isAlphaNum = c => {
        const n = c.codePointAt(0);
        return (48 <= n && 57 >= n) || (
            /[A-Za-z\u00C0-\u00FF]/.test(c)
        );
    };

    // justifyLeft :: Int -> Char -> String -> String
    const justifyLeft = n =>
        // The string s, followed by enough padding (with
        // the character c) to reach the string length n.
        c => s => n > s.length ? (
            s.padEnd(n, c)
        ) : s;

    // justifyRight :: Int -> Char -> String -> String
    const justifyRight = n =>
        // The string s, preceded by enough padding (with
        // the character c) to reach the string length n.
        c => s => n > s.length ? (
            s.padStart(n, c)
        ) : s;


    // list :: StringOrArrayLike b => b -> [a]
    const list = xs =>
        // xs itself, if it is an Array,
        // or an Array derived from xs.
        Array.isArray(xs) ? (
            xs
        ) : Array.from(xs);


    // map :: (a -> b) -> [a] -> [b]
    const map = f =>
        // The list obtained by applying f
        // to each element of xs.
        // (The image of xs under f).
        xs => [...xs].map(f);


    // ord :: Char -> Int
    const ord = c =>
        c.codePointAt(0);


    // quotRem :: Int -> Int -> (Int, Int)
    const quotRem = m => n =>
        Tuple(Math.floor(m / n))(
            m % n
        );


    // showHex :: Int -> String
    const showHex = n =>
        showIntAtBase(16)(
            intToDigit
        )(n)('');


    // showLog :: a -> IO ()
    const showLog = (...args) =>
        console.log(
            args
            .map(JSON.stringify)
            .join(' -> ')
        );


    // showIntAtBase :: Int -> (Int -> Char) -> Int -> String -> String
    const showIntAtBase = base => toChr => n => rs => {
        const go = ([n, d], r) => {
            const r_ = toChr(d) + r;
            return 0 !== n ? (
                go(Array.from(quotRem(n)(base)), r_)
            ) : r_;
        };
        return 1 >= base ? (
            'error: showIntAtBase applied to unsupported base'
        ) : 0 > n ? (
            'error: showIntAtBase applied to negative number'
        ) : go(Array.from(quotRem(n)(base)), rs);
    };


    // snd :: (a, b) -> b
    const snd = tpl => tpl[1];


    // unlines :: [String] -> String
    const unlines = xs =>
        // A single string formed by the intercalation
        // of a list of strings with the newline character.
        xs.join('\n');


    // unwords :: [String] -> String
    const unwords = xs =>
        // A space-separated string derived
        // from a list of words.
        xs.join(' ');

    // MAIN ---
    return main();
})();
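
For anyone who wants to try the dump outside Keyboard Maestro, the same idea can be sketched much more compactly with built-in string methods only. This is an approximation of the script above rather than a replacement for it: `hexDumpCompact` is a hypothetical name, it takes a string directly instead of reading the clipboard, and it follows the same conventions (alphanumerics shown as text, everything else as a space, hex codes padded to two characters with a space).

```javascript
// Compact sketch of the hex dump, using only built-in methods.
// hexDumpCompact is a hypothetical name, not a prelude-jxa function.
const hexDumpCompact = cols => s => {
    // UTF-16 code units, as produced by chars() in the script above.
    const units = s.split('');
    const rows = [];
    for (let i = 0; i < units.length; i += cols) {
        const row = units.slice(i, i + cols);
        // Text column: alphanumerics as they are, anything else a space.
        const txt = row
            .map(c => /[0-9A-Za-z\u00C0-\u00FF]/.test(c) ? c : ' ')
            .join('')
            .padEnd(cols, ' ');
        // Hex column: upper-case codes, at least two characters wide.
        const hex = row
            .map(c => c.charCodeAt(0)
                .toString(16)
                .toUpperCase()
                .padStart(2, ' '))
            .join(' ');
        rows.push(`(${txt}) (${hex})`);
    }
    return rows.join('\n');
};

// Eight columns, applied to a short made-up sample.
console.log(hexDumpCompact(8)('\u200F 1.1\u00A0Lorem\n'));
```

In a JXA context the string argument could be supplied by `clipOfTypeLR`, exactly as in the full script.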

Got it! Here's my process. Go to QBO and put the lorem ipsum text into a text field (on an estimate, in this case). Then I run the KM macro that you developed to make it indented. Then I select the resulting text and run a KM script with that latest bit of JS. Here's the result:

( 1 1 ) (200F 200E 20 200E 20 20 20 20 20 20 20 31 2E 31 A0 A0)
( Lorem ipsum do) (A0 200F 4C 6F 72 65 6D 20 69 70 73 75 6D 20 64 6F)
(lor sit amet co) (6C 6F 72 20 73 69 74 20 61 6D 65 74 2C 20 63 6F)
(nsectetur adipis) (6E 73 65 63 74 65 74 75 72 20 61 64 69 70 69 73)
(cing elit Fusce ) (63 69 6E 67 20 65 6C 69 74 20 46 75 73 63 65 20)
(accumsan urna a ) (61 63 63 75 6D 73 61 6E 20 75 72 6E 61 20 61 A)
( ) (A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0)
(tellusultrices ) (74 65 6C 6C 75 73 75 6C 74 72 69 63 65 73 2C 20)
(sit amet rutrum ) (73 69 74 20 61 6D 65 74 20 72 75 74 72 75 6D 20)
(mi aliquam Aliqu) (6D 69 20 61 6C 69 71 75 61 6D 20 41 6C 69 71 75)
(am viverra mauri) (61 6D 20 76 69 76 65 72 72 61 20 6D 61 75 72 69)
(s at lobortis ru) (73 20 61 74 20 6C 6F 62 6F 72 74 69 73 20 72 75)
(trum Aenean ) (74 72 75 6D 20 41 65 6E 65 61 6E A A0 A0 A0 A0)
( mitu) (A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 6D 69 74 75)
(rpis rhoncus ac) (72 70 69 73 2C 20 72 68 6F 6E 63 75 73 20 61 63)
( leo in hendrer) (20 6C 65 6F 20 69 6E 2C 20 68 65 6E 64 72 65 72)
(it efficitur nis) (69 74 20 65 66 66 69 63 69 74 75 72 20 6E 69 73)
(i Vivamus metus ) (69 20 56 69 76 61 6D 75 73 20 6D 65 74 75 73 20)
(est posuere nec) (65 73 74 2C 20 70 6F 73 75 65 72 65 20 6E 65 63)
( ) ( A A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0)
( sempereu laore) (A0 73 65 6D 70 65 72 65 75 2C 20 6C 61 6F 72 65)
(et sed nibh Vest) (65 74 20 73 65 64 20 6E 69 62 68 20 56 65 73 74)
(ibulum vitae bib) (69 62 75 6C 75 6D 20 76 69 74 61 65 20 62 69 62)
(endum ipsum In s) (65 6E 64 75 6D 20 69 70 73 75 6D 20 49 6E 20 73)
(emper enim sed a) (65 6D 70 65 72 20 65 6E 69 6D 20 73 65 64 20 61)
(rcu ) (72 63 75 A A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0)
( lobortisposu) (A0 A0 A0 A0 6C 6F 62 6F 72 74 69 73 70 6F 73 75)
(ere Vestibulum e) (65 72 65 20 56 65 73 74 69 62 75 6C 75 6D 20 65)
(u magna non enim) (75 20 6D 61 67 6E 61 20 6E 6F 6E 20 65 6E 69 6D)
( ornare lobortis) (20 6F 72 6E 61 72 65 20 6C 6F 62 6F 72 74 69 73)
( In nec volutpat) (20 49 6E 20 6E 65 63 20 76 6F 6C 75 74 70 61 74)
( sem ) (20 73 65 6D)

I have no idea where those 200E and 200F (RTL?) characters are coming from.

Good – thanks for running that test.

Getting late on Friday evening here in Europe, but I will post something Saturday night to drop that exotic prefix and return the rest without the linefeeds.

Thank you @ComplexPoint, for all of your help.

Are those exotic prefixes introduced when we use that indent script?

One guess would be that the text field uses them as part of display alignment, or inserts them in response to that special spacing character (it might be worth looking up any role that character plays in Unicode RTL processing).

They are certainly not introduced by the script itself.
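
For reference, the unusual codes in the dump have standard Unicode names: U+200E is LEFT-TO-RIGHT MARK, U+200F is RIGHT-TO-LEFT MARK, U+00A0 is NO-BREAK SPACE, and U+000A is LINE FEED. The two marks are zero-width and affect only text direction, which would be consistent with the label being displayed on the unexpected side. A small sketch for listing them in a string (the helper name `describeInvisibles` and its tiny lookup table are made up for this illustration):

```javascript
// Names for the code points that stand out in the hex dump above.
// The table is deliberately tiny: only these four are looked up.
const knownNames = {
    0x0A: 'LINE FEED',
    0xA0: 'NO-BREAK SPACE',
    0x200E: 'LEFT-TO-RIGHT MARK',
    0x200F: 'RIGHT-TO-LEFT MARK'
};

// describeInvisibles :: String -> [String]  (hypothetical helper)
const describeInvisibles = s =>
    Array.from(s)
    .map(c => c.codePointAt(0))
    // Keep only the code points listed in the table.
    .filter(n => n in knownNames)
    .map(n => {
        const hex = n.toString(16).toUpperCase().padStart(4, '0');
        return `U+${hex} ${knownNames[n]}`;
    });

// The start of the dumped text: 200F 200E 20 200E 20 ... 31 2E 31 A0
console.log(
    describeInvisibles('\u200F\u200E \u200E 1.1\u00A0').join('\n')
);
```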

Have a good weekend !

A first sketch of something to remove the special hanging indent from text in the clipboard:

  • Copy some material to the clipboard from that text field,
  • try evaluating the following JS

(Once the code has settled down, you can put it in a (KM) Execute JavaScript for Automation action).


JS Source
(() => {
    'use strict';

    // Plain text para in clipboard purged of:
    // - hanging and regular indents including
    //   nbsp, rtl and ltr characters,
    // - line breaks.

    // Rob Trew 2020

    // First draft:
    // Ver 0.00

    // Run with relevant content in clipboard.

    ObjC.import('AppKit');

    // main :: IO ()
    const main = () =>
        either(msg => msg)(withoutHangingIndent)(
            clipTextLR()
        );


    // ------ TEXT PURGED OF SPECIAL HANGING INDENT -------

    // withoutHangingIndent :: String -> String
    const withoutHangingIndent = s => {
        // Without any hanging indent line prefixes
        // which include nbsp rtl and ltr characters,
        // and without any line breaks.
        const
            nbsp = chr(readHex('A0')),
            rtl = chr(readHex('200F')),
            ltr = chr(readHex('200E'));
        return unwords(map(
            compose(
                // Reassembled ([Char] -> String)
                stringFromList,

                // Without any irrelevant low bytes.
                filter(le(' ')),

                // Longest suffix that contains no
                // NBSP, RTL, or LTR characters.
                takeWhileR(
                    compose(not, flip(elem)([
                        nbsp, rtl, ltr
                    ]))
                )
            )
        )(lines(s)));
    };


    // ----------------------- JXA ------------------------

    // clipTextLR :: () -> Either String String
    const clipTextLR = () => {
        const
            v = ObjC.unwrap($.NSPasteboard.generalPasteboard
                .stringForType($.NSPasteboardTypeString));
        return Boolean(v) && v.length > 0 ? (
            Right(v)
        ) : Left('No utf8-plain-text found in clipboard.');
    };


    // ---------------- GENERIC FUNCTIONS -----------------
    // https://github.com/RobTrew/prelude-jxa

    // Just :: a -> Maybe a
    const Just = x => ({
        type: 'Maybe',
        Nothing: false,
        Just: x
    });


    // Left :: a -> Either a b
    const Left = x => ({
        type: 'Either',
        Left: x
    });


    // Nothing :: Maybe a
    const Nothing = () => ({
        type: 'Maybe',
        Nothing: true,
    });


    // Right :: b -> Either a b
    const Right = x => ({
        type: 'Either',
        Right: x
    });


    // ap :: (a -> b -> c) -> (a -> b) -> a -> c
    const ap = f =>
        // Applicative instance for functions.
        // f(x) applied to g(x).
        g => x => f(x)(
            g(x)
        );


    // chr :: Int -> Char
    const chr = x =>
        // The character at code x.
        String.fromCharCode(x);


    // compose (<<<) :: (b -> c) -> (a -> b) -> a -> c
    const compose = (...fs) =>
        // A function defined by the right-to-left
        // composition of all the functions in fs.
        fs.reduce(
            (f, g) => x => f(g(x)),
            x => x
        );


    // either :: (a -> c) -> (b -> c) -> Either a b -> c
    const either = fl =>
        // Application of the function fl to the
        // contents of any Left value in e, or
        // the application of fr to its Right value.
        fr => e => 'Either' === e.type ? (
            undefined !== e.Left ? (
                fl(e.Left)
            ) : fr(e.Right)
        ) : undefined;


    // elem :: Eq a => a -> [a] -> Bool
    const elem = x =>
        // True if xs contains an instance of x.
        xs => {
            const t = xs.constructor.name;
            return 'Array' !== t ? (
                xs['Set' !== t ? 'includes' : 'has'](x)
            ) : xs.some(eq(x));
        };


    // eq (==) :: Eq a => a -> a -> Bool
    const eq = a =>
        // True when a and b are equivalent in the terms
        // defined below for their shared data type.
        b => a === b;


    // filter :: (a -> Bool) -> [a] -> [a]
    const filter = p =>
        // The elements of xs which match
        // the predicate p.
        xs => [...xs].filter(p);


    // flip :: (a -> b -> c) -> b -> a -> c
    const flip = op =>
        // The binary function op with its arguments reversed.
        1 < op.length ? (
            (a, b) => op(b, a)
        ) : (x => y => op(y)(x));


    // init :: [a] -> [a]
    const init = xs => (
        // All elements of a list except the last.
        ys => 0 < ys.length ? (
            ys.slice(0, -1)
        ) : undefined
    )(list(xs));


    // le :: Ord a => a -> a -> Bool
    const le = x =>
        // True if x <= y;
        y => x <= y;


    // lines :: String -> [String]
    const lines = s =>
        // A list of strings derived from a single
        // newline-delimited string.
        0 < s.length ? (
            s.split(/[\r\n]/)
        ) : [];


    // list :: StringOrArrayLike b => b -> [a]
    const list = xs =>
        // xs itself, if it is an Array,
        // or an Array derived from xs.
        Array.isArray(xs) ? (
            xs
        ) : Array.from(xs);


    // map :: (a -> b) -> [a] -> [b]
    const map = f =>
        // The list obtained by applying f
        // to each element of xs.
        // (The image of xs under f).
        xs => [...xs].map(f);


    // not :: Bool -> Bool
    const not = b =>
        // Negation of the expression b.
        !b;


    // ord :: Char -> Int
    const ord = c =>
        // Unicode ordinal value of the character.
        c.codePointAt(0);


    // readHex :: String -> Int
    const readHex = s =>
        // Integer value of hexadecimal expression.
        parseInt(s, 16);


    // stringFromList :: [Char] -> String
    const stringFromList = cs =>
        // A String derived from a list of characters.
        cs.join('');


    // takeWhileR :: (a -> Bool) -> [a] -> [a]
    const takeWhileR = p =>
        // The longest suffix of xs in which
        // all elements satisfy p.
        xs => {
            const ys = list(xs);
            let i = ys.length;
            while (i-- && p(ys[i])) {}
            return ys.slice(i + 1);
        };


    // unlines :: [String] -> String
    const unlines = xs =>
        // A single string formed by the intercalation
        // of a list of strings with the newline character.
        xs.join('\n');


    // unwords :: [String] -> String
    const unwords = xs =>
        // A space-separated string derived
        // from a list of words.
        xs.join(' ');


    // words :: String -> [String]
    const words = s =>
        // List of space-delimited sub-strings.
        s.split(/\s+/);

    // MAIN ---
    return main();
})();
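
The core of `withoutHangingIndent` can also be condensed into a few lines of plain JavaScript, which may be easier to experiment with outside Keyboard Maestro. This is only a sketch of the same per-line rule (keep whatever follows the last NBSP, RTL mark or LTR mark, then join the lines with spaces). The name `dropHangingIndent` and the sample string are hypothetical, and the function takes a string argument instead of reading the clipboard.

```javascript
// Condensed sketch of the per-line rule in withoutHangingIndent.
// dropHangingIndent is a hypothetical name, not from the script above.
const dropHangingIndent = s =>
    s.split(/[\r\n]/)
    .map(line => {
        // Position of the last NBSP, RTL mark or LTR mark
        // in the line (-1 if there is none).
        const i = Math.max(
            line.lastIndexOf('\u00A0'),
            line.lastIndexOf('\u200F'),
            line.lastIndexOf('\u200E')
        );
        // Keep what follows it, minus any characters below space.
        return Array.from(line.slice(i + 1))
            .filter(c => ' ' <= c)
            .join('');
    })
    .join(' ');

// Made-up sample shaped like the first two lines of the hex dump.
const indentedClip =
    '\u200F\u200E \u200E 1.1\u00A0\u00A0\u200FLorem ipsum a\n' +
    '\u00A0\u00A0\u00A0tellus, sit amet';

console.log(dropHangingIndent(indentedClip));
// → Lorem ipsum a tellus, sit amet
```

Note that, as in the full script, the rule keeps only the text after the last special character on each line, so a no-break space in the middle of a sentence would discard everything before it on that line.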

That works perfectly! I even put it into a KM macro and that works too. Thank you!
