Macro to sort multi-level outline

I'm looking for a macro that would sort multi-level outlines while preserving the indentation (2 spaces " " per level)

From:

111
333
  aaa
  ccc
  bbb
222

to:

111
222
333
  aaa
  bbb
  ccc

Just putting this out there in case someone already has the macro or can point me in the right direction. I will update this thread with the macro as I believe it might be useful for others too.

(adding these keywords for future forum searches: sort list, sort outline, preserve indentation)

Eh, that didn't work out. Attaching an image
notes

Found an extension on sublime to do that. Indent sort. Here is the result

https://packagecontrol.io/packages/Indent%20Respectful%20Sort#:~:text=Indent%20Respectful%20Sort%20is%20a,or%20all%204%20spaces%20etc.)

KM does NOT have that capability in any of its native Actions. You would have use a script or external app to perform it.

Then you might be able to write a macro that does this (untested):

  1. Set Clipboard to the source text
  2. Paste into Sublime
  3. Tell Subline to do the sort
  4. Copy the results
  5. Set a KM Variable to the Clipboard which contains the sorted results

The first thing I would check is to see if Sublime supports AppleScript.
That could make this very easy.

As an aside, it looks like what you have there is a tree structure.

I recently added to filterCSV the ability to sort the children of selected nodes.

It might well do what you want but:

  1. Iā€™m not sure it will sort the top level nodes - as I didnā€™t test that.
  2. Iā€™m not sure it will emit a file in your format - though Iā€™m sure I could add that.

It certainly can take in the format youā€™ve shown - tab/space-indented tree. (Again, if not, I can fix it.)

filterCSV is A Python 3 script and is command line / stream driven so should fit in well with KM. Donā€™t be put off by either ā€œfilterā€ or ā€œCSVā€ in the name.

As I say, just an aside but might be helpful here.

That should be feasible

For wider use, we would need a script that detects the indent unit (2 spaces is rarer than tabs or 4 spaces, for example)

But indentation would need to be consistent (a mixture of tabs and spaces would prevent correct reading of the outline levels, and might scramble the sorting.

So first, a check for consistent indents. Lets use JavaScript:

JS ā€“ detecting mixed spaces and tabs in the indentation
// consistentIndentChar :: String -> 
// Either String String
const consistentIndentChar = outlineText => {
    const indentChars = lines(outlineText)
        .reduce(
            (cs, s) => 1 < cs.length ? cs : [
                ...takeWhile(c => ' \t'.includes(c))(s)
            ].reduce(
                (a, c) => a.includes(c) ? (
                    a
                ) : a.concat(c),
                cs
            ),
            []
        ),
        lng = indentChars.length;
    return 1 < lng ? (
        Left('Mixed indentation: both tabs + spaces')
    ) : 1 > lng ? (
        Left('No indentation detected')
    ) : Right(indentChars[0]);
};

Then we would need to know how long the indent unit is. One tab ? Four spaces ? Two spaces ?

Again in JavaScript:

JS ā€“ Minimum indent unit
// unitIndent :: String -> String
const unitIndent = outlineText =>
    // The smallest non-zero indent
    // (tab or spaces) of any line
    // in outlineText.
    minimumBy(comparing(length))(
        lines(outlineText)
        .flatMap(s => {
            const
                indentation = takeWhile(
                    char => ' \t'.includes(char)
                )(s);
            return 0 < indentation.length ? (
                [indentation]
            ) : [];
        })
    );

So what is the exact structure of this outline, if we think of it as a forest (a list of trees) ?

How do we parse the list of lines to a list of nested trees ? Sometimes, when confronted with a text pattern recognition problem, we think "I know, I'll use regular expressions", but in the case of trees (and other nested structures, like bracketing or html tagging) there is a problem ā€“ nested patterns are not directly representable in regular languages.

An alternative is to compose:

  • a forest parser,
  • from an indented tree parser,
  • from an indented line parser, which is built from:
    • an indent unit parser
    • a non-EOL character stream parser,
    • an EOL parser,
    • and an EOS parser.

(clicking all the pieces together with some generic parser combinator functions)

Again, in JavaScript:

The forest parser
// forestOf :: String -> Parser Forest String
const forestOf = indentUnit =>
    // Parser for a list of
    // top-level outlines.
    many(treeAtIndentOf(indentUnit)(0));
The indented tree parser
// treeAtIndentOf :: String Int ->
// Parser (Tree String)
const treeAtIndentOf = indentUnit =>
    // Parser for an outline block at a
    // given level of indentation.
    // An indented line and all lines
    // further indented beneath it.
    n => bindP(
        lineAtIndentOf(indentUnit)(n)
    )(txt => fmapP(Node(txt))(
        many(treeAtIndentOf(indentUnit)(1 + n))
    ));
The indented line parser
// lineAtIndentOf :: String -> Int -> Parser String
const lineAtIndentOf = indentUnit =>
    // Parser for a line at a minimum depth
    // of indentation.
    n => bindP(
        count(n)(string(indentUnit))
    )(_ => bindP(
        fmapP(concat)(many(noneOf('\n\r')))
    )(x => bindP(
        many(
            altP(oneOf('\n\r'))(
                eos()
            )
        )
    )(_ => pureP(x))));

So combining these, a first rough draft, (check that it does what you expect, before you apply it to real work) which aims to:

  • copy the selected text
  • parse it to a forest (a list of nested trees of lines at different indent levels)
  • sort the lines at each level of each tree lexically (A-Z, rather than numerically ā€“ that would be a different task)
  • and paste the nested and sorted forest structure as a reflattened (reserialized) series of lines

Sort each level of selected text outline (lexically).kmmacros (36.0 KB)


For an introduction to constructing parsers as a composition of simpler parser functions, see:
Functional Parsing - Computerphile - YouTube


Full JS Source of rough draft
(() => {
    'use strict';

    ObjC.import('AppKit');

    // Lexical sort of any text outline 
    // in the clipboard.

    // Generics (except parser combinators) at:
    // https://github.com/RobTrew/prelude-jxa

    // Rob Trew @ 2020
    // Ver 0.01 

    // main IO ()
    const main = () => either(
        alert('Outline sorting')
    )(
        sortedOutline => (
            copyText(sortedOutline),
            sortedOutline
        )
    )(bindLR(
        clipTextLR()
    )(outline => bindLR(
        consistentIndentChar(outline)
    )(_ => {
        const
            // What does the indent unit of this outline
            // seem to be (assuming that it's consistent) ?
            // tab ? 4 spaces ? some other # of spaces ?
            indentUnit = unitIndent(outline),

            // Parse of outline to a list of trees.
            parsedAsForest = parse(
                forestOf(indentUnit)
            )(outline);

        return 0 < parsedAsForest.length ? (() => {
            const
                // A single virtual tree containing
                // the whole forest,
                virtualTreeRoot = Node({})(
                    fst(parsedAsForest[0])
                ),

                // sorted recursively at each level,
                sortedTree = foldTree(
                    x => xs => Node(x)(
                        sortBy(comparing(root))(xs)
                    )
                )(virtualTreeRoot);

            // and reserialized as a string.
            return Right(
                indentedTextFromForest(indentUnit)(root)(
                    sortedTree.nest
                )
            );
        })() : Left(
            'Input could not be parsed as an outline'
        );
    })));

    // ------------- OUTLINE-PARSING FUNCTIONS -------------

    // forestOf :: String -> Parser Forest String
    const forestOf = indentUnit =>
        // Parser for a list of
        // top-level outlines.
        many(treeAtIndentOf(indentUnit)(0));


    // treeAtIndentOf :: String Int ->
    // Parser (Tree String)
    const treeAtIndentOf = indentUnit =>
        // Parser for an outline block at a
        // given level of indentation.
        // An indented line and all lines
        // further indented beneath it.
        n => bindP(
            lineAtIndentOf(indentUnit)(n)
        )(txt => fmapP(Node(txt))(
            many(treeAtIndentOf(indentUnit)(1 + n))
        ));


    // lineAtIndentOf :: String -> Int -> Parser String
    const lineAtIndentOf = indentUnit =>
        // Parser for a line at a minimum depth
        // of indentation.
        n => bindP(
            count(n)(string(indentUnit))
        )(_ => bindP(
            fmapP(concat)(many(noneOf('\n\r')))
        )(x => bindP(
            many(
                altP(oneOf('\n\r'))(
                    eos()
                )
            )
        )(_ => pureP(x))));


    // consistentIndentChar :: String -> 
    // Either String String
    const consistentIndentChar = outlineText => {
        const indentChars = lines(outlineText)
            .reduce(
                (cs, s) => 1 < cs.length ? cs : [
                    ...takeWhile(c => ' \t'.includes(c))(s)
                ].reduce(
                    (a, c) => a.includes(c) ? (
                        a
                    ) : a.concat(c),
                    cs
                ),
                []
            ),
            lng = indentChars.length;
        return 1 < lng ? (
            Left('Mixed indentation: both tabs + spaces')
        ) : 1 > lng ? (
            Left('No indentation detected')
        ) : Right(indentChars[0]);
    };


    // unitIndent :: String -> String
    const unitIndent = outlineText =>
        // The smallest non-zero indent
        // (tab or spaces) of any line
        // in outlineText.
        minimumBy(comparing(length))(
            lines(outlineText)
            .flatMap(s => {
                const
                    indentation = takeWhile(
                        char => ' \t'.includes(char)
                    )(s);
                return 0 < indentation.length ? (
                    [indentation]
                ) : [];
            })
        );

    // ---------- RE-SERIALIZATION OF PARSE TREES ----------

    // indentedTextFromForest :: (Tree -> String) ->
    //      [Tree] -> String
    const indentedTextFromForest = singleIndent =>
        fnText => trees => {
            const go = indent => tree =>
                indent + fnText(tree) +
                (tree.nest.length > 0 ? (
                    '\n' + tree.nest
                    .map(go(singleIndent + indent))
                    .join('\n')
                ) : '');
            return trees.map(go('')).join('\n');
        };


    // ------- GENERAL COMPOSABLE PARSING FUNCTIONS --------

    // Parser :: String -> [(a, String)] -> Parser a
    const Parser = f =>
        // A function lifted into a Parser object.
        ({
            type: 'Parser',
            parser: f
        });


    // altP (<|>) :: Parser a -> Parser a -> Parser a
    const altP = p =>
        // p, or q if p doesn't match.
        q => Parser(s => {
            const xs = p.parser(s);
            return 0 < xs.length ? (
                xs
            ) : q.parser(s);
        });


    // apP <*> :: Parser (a -> b) -> Parser a -> Parser b
    const apP = pf =>
        // A new parser obtained by the application 
        // of a Parser-wrapped function,
        // to a Parser-wrapped value.
        p => Parser(
            s => pf.parser(s).flatMap(
                vr => fmapP(vr[0])(
                    p
                ).parser(vr[1])
            )
        );


    // bindP (>>=) :: Parser a -> 
    // (a -> Parser b) -> Parser b
    const bindP = p =>
        // A new parser obtained by the application of 
        // a function to a Parser-wrapped value.
        // The function must enrich its output, lifting it 
        // into a new Parser.
        // Allows for the nesting of parsers.
        f => Parser(
            s => p.parser(s).flatMap(
                tpl => f(tpl[0]).parser(tpl[1])
            )
        );


    // char :: Char -> Parser Char
    const char = x =>
        // A particular single character.
        satisfy(c => x == c);


    // count :: Int -> Parser a -> Parser [a]
    const count = n =>
        // A list of n successive instances of p.
        p => sequenceP(
            replicate(n)(p)
        );


    // eos :: Parser String
    const eos = () =>
        // Matches the end of the input string.
        Parser(
            s => 0 < s.length ? (
                []
            ) : [Tuple('')('')]
        );


    // fmapP :: (a -> b) -> Parser a -> Parser b  
    const fmapP = f =>
        // A new parser derived by the structure-preserving 
        // application of f to the value in p.
        p => Parser(
            s => p.parser(s).flatMap(
                vr => Tuple(f(vr[0]))(vr[1])
            )
        );


    // liftA2P :: (a -> b -> c) -> 
    // Parser a -> Parser b -> Parser c
    const liftA2P = op =>
        // The binary function op, lifted
        // to a function over two parsers.
        p => apP(fmapP(op)(p));


    // many :: Parser a -> Parser [a]
    const many = p => {
        // Zero or more instances of p.
        // Lifts a parser for a simple type of value 
        // to a parser for a list of such values.
        const some_p = p =>
            liftA2P(
                x => xs => [x].concat(xs)
            )(p)(many(p));
        return Parser(
            s => (
                0 < s.length ? (
                    altP(some_p(p))(pureP([]))
                ) : pureP([])
            ).parser(s)
        );
    };


    // noneOf :: String -> Parser Char
    const noneOf = s =>
        // Any character not found in the
        // exclusion string.
        satisfy(c => !s.includes(c));


    // oneOf :: [Char] -> Parser Char
    const oneOf = s =>
        // One instance of any character found
        // the given string.
        satisfy(c => s.includes(c));


    // parse :: Parser a -> String -> [(a, String)]
    const parse = p =>
        // The result of parsing s with p.
        s => {
            // showLog('s', s)
            return p.parser(s);
        };


    // pureP :: a -> Parser a
    const pureP = x =>
        // The value x lifted, unchanged, 
        // into the Parser monad.
        Parser(s => [Tuple(x)(s)]);


    // satisfy :: (Char -> Bool) -> Parser Char
    const satisfy = test =>
        // Any character for which the 
        // given predicate returns true.
        Parser(
            s => 0 < s.length ? (
                test(s[0]) ? [
                    Tuple(s[0])(s.slice(1))
                ] : []
            ) : []
        );


    // string :: String -> Parser String
    const string = s =>
        // A particular string.
        fmapP(cs => cs.join(''))(
            sequenceP(s.split('').map(char))
        );


    // sequenceP :: [Parser a] -> Parser [a]
    const sequenceP = ps =>
        // A single parser for a list of values, derived
        // from a list of parsers for single values.
        Parser(
            s => ps.reduce(
                (a, q) => a.flatMap(
                    vr => q.parser(snd(vr)).flatMap(
                        first(xs => fst(vr).concat(xs))
                    )
                ),
                [Tuple([])(s)]
            )
        );


    // ------------------------ JXA ------------------------

    // alert :: String => String -> IO String
    const alert = title => s => {
        const sa = Object.assign(
            Application('System Events'), {
                includeStandardAdditions: true
            });
        return (
            sa.activate(),
            sa.displayDialog(s, {
                withTitle: title,
                buttons: ['OK'],
                defaultButton: 'OK'
            }),
            s
        );
    };


    // clipTextLR :: () -> Either String String
    const clipTextLR = () => (
        // ObjC.import('AppKit')
        v => Boolean(v) && 0 < v.length ? (
            Right(v)
        ) : Left('No utf8-plain-text found in clipboard.')
    )(
        ObjC.unwrap($.NSPasteboard.generalPasteboard
            .stringForType($.NSPasteboardTypeString))
    );


    // copyText :: String -> IO String
    const copyText = s => {
        const pb = $.NSPasteboard.generalPasteboard;
        return (
            pb.clearContents,
            pb.setStringForType(
                $(s),
                $.NSPasteboardTypeString
            ),
            s
        );
    };


    // ----------------- GENERIC FUNCTIONS -----------------
    // https://github.com/RobTrew/prelude-jxa

    // Left :: a -> Either a b
    const Left = x => ({
        type: 'Either',
        Left: x
    });


    // Node :: a -> [Tree a] -> Tree a
    const Node = v =>
        // Constructor for a Tree node which connects a
        // value of some kind to a list of zero or
        // more child trees.
        xs => ({
            type: 'Node',
            root: v,
            nest: xs || []
        });


    // Right :: b -> Either a b
    const Right = x => ({
        type: 'Either',
        Right: x
    });


    // Tuple (,) :: a -> b -> (a, b)
    const Tuple = a =>
        b => ({
            type: 'Tuple',
            '0': a,
            '1': b,
            length: 2
        });


    // bindLR (>>=) :: Either a ->
    // (a -> Either b) -> Either b
    const bindLR = m =>
        mf => undefined !== m.Left ? (
            m
        ) : mf(m.Right);


    // comparing :: (a -> b) -> (a -> a -> Ordering)
    const comparing = f =>
        x => y => {
            const
                a = f(x),
                b = f(y);
            return a < b ? -1 : (a > b ? 1 : 0);
        };


    // concat :: [[a]] -> [a]
    // concat :: [String] -> String
    const concat = xs => (
        ys => 0 < ys.length ? (
            ys.every(Array.isArray) ? (
                []
            ) : ''
        ).concat(...ys) : ys
    )(list(xs));


    // either :: (a -> c) -> (b -> c) -> Either a b -> c
    const either = fl =>
        // Application of the function fl to the
        // contents of any Left value in e, or
        // the application of fr to its Right value.
        fr => e => 'Either' === e.type ? (
            undefined !== e.Left ? (
                fl(e.Left)
            ) : fr(e.Right)
        ) : undefined;


    // first :: (a -> b) -> ((a, c) -> (b, c))
    const first = f =>
        // A simple function lifted to one which applies
        // to a tuple, transforming only its first item.
        xy => Tuple(f(xy[0]))(
            xy[1]
        );


    // foldTree :: (a -> [b] -> b) -> Tree a -> b
    const foldTree = f => {
        // The catamorphism on trees. A summary
        // value obtained by a depth-first fold.
        const go = tree => f(tree.root)(
            tree.nest.map(go)
        );
        return go;
    };


    // fst :: (a, b) -> a
    const fst = tpl =>
        // First member of a pair.
        tpl[0];


    // length :: [a] -> Int
    const length = xs =>
        // Returns Infinity over objects without finite
        // length. This enables zip and zipWith to choose
        // the shorter argument when one is non-finite,
        // like cycle, repeat etc
        'GeneratorFunction' !== xs.constructor.constructor.name ? (
            xs.length
        ) : Infinity;


    // lines :: String -> [String]
    const lines = s =>
        // A list of strings derived from a single
        // newline-delimited string.
        0 < s.length ? (
            s.split(/[\r\n]/)
        ) : [];


    // list :: StringOrArrayLike b => b -> [a]
    const list = xs =>
        // xs itself, if it is an Array,
        // or an Array derived from xs.
        Array.isArray(xs) ? (
            xs
        ) : Array.from(xs || []);


    // minimumBy :: (a -> a -> Ordering) -> [a] -> a
    const minimumBy = f => xs =>
        list(xs).reduce((a, x) => undefined === a ? x : (
            0 > f(x)(a) ? x : a
        ), undefined);


    // replicate :: Int -> a -> [a]
    const replicate = n =>
        // A list of n copies of x.
        x => Array.from({
            length: n
        }, () => x);


    // root :: Tree a -> a
    const root = tree => tree.root;

    // showLog :: a -> IO ()
    const showLog = (...args) =>
        console.log(
            args
            .map(JSON.stringify)
            .join(' -> ')
        );


    // snd :: (a, b) -> b
    const snd = tpl =>
        // Second member of a pair.
        tpl[1];


    // sortBy :: (a -> a -> Ordering) -> [a] -> [a]
    const sortBy = f =>
        xs => list(xs).slice()
        .sort((a, b) => f(a)(b));


    // takeWhile :: (a -> Bool) -> [a] -> [a]
    // takeWhile :: (Char -> Bool) -> String -> String
    const takeWhile = p => xs =>
        xs.constructor.constructor.name !==
        'GeneratorFunction' ? (() => {
            const n = xs.length;
            return xs.slice(
                0, 0 < n ? until(
                    i => n === i || !p(xs[i])
                )(i => 1 + i)(0) : 0
            );
        })() : takeWhileGen(p)(xs);


    // until :: (a -> Bool) -> (a -> a) -> a -> a
    const until = p => f => x => {
        let v = x;
        while (!p(v)) v = f(v);
        return v;
    };

    // MAIN ---
    return main();
})();
1 Like

The ā€œdetect indentationā€ thing I did in filterCSV. For exactly the same reasons. And, obviously, I had to tackle the inconsistent indentation issue. Ironic as filterCSV is written in Python. :slight_smile:

BTW I added ā€˜emit tabbed-/space-indentedā€ to my todo list for filterCSV. It should be very simple to do, though requiring the indentation character string to carry through from input to output would be a bit more work. (I might get this done today, in fact.)

2 Likes

It turned out to be easier to support than I expected, with lots of flexibility. Documentation is here in the filterCSV repository.

It was fun to add, I think it might be more generally useful, and I don't mind whether anybody uses it or not.

1 Like