I'm looking for a macro that would sort multi-level outlines while preserving the indentation (2 spaces " " per level)
From:
111
333
aaa
ccc
bbb
222
to:
111
222
333
aaa
bbb
ccc
Just putting this out there in case someone already has the macro or can point me in the right direction. I will update this thread with the macro as I believe it might be useful for others too.
(adding these keywords for future forum searches: sort list, sort outline, preserve indentation)
As an aside, it looks like what you have there is a tree structure.
I recently added to filterCSV the ability to sort the children of selected nodes.
It might well do what you want but:
Iām not sure it will sort the top level nodes - as I didnāt test that.
Iām not sure it will emit a file in your format - though Iām sure I could add that.
It certainly can take in the format youāve shown - tab/space-indented tree. (Again, if not, I can fix it.)
filterCSV is A Python 3 script and is command line / stream driven so should fit in well with KM. Donāt be put off by either āfilterā or āCSVā in the name.
As I say, just an aside but might be helpful here.
For wider use, we would need a script that detects the indent unit (2 spaces is rarer than tabs or 4 spaces, for example)
But indentation would need to be consistent (a mixture of tabs and spaces would prevent correct reading of the outline levels, and might scramble the sorting.
So first, a check for consistent indents. Lets use JavaScript:
JS ā detecting mixed spaces and tabs in the indentation
Then we would need to know how long the indent unit is. One tab ? Four spaces ? Two spaces ?
Again in JavaScript:
JS ā Minimum indent unit
// unitIndent :: String -> String
const unitIndent = outlineText =>
// The smallest non-zero indent
// (tab or spaces) of any line
// in outlineText.
minimumBy(comparing(length))(
lines(outlineText)
.flatMap(s => {
const
indentation = takeWhile(
char => ' \t'.includes(char)
)(s);
return 0 < indentation.length ? (
[indentation]
) : [];
})
);
So what is the exact structure of this outline, if we think of it as a forest (a list of trees) ?
How do we parse the list of lines to a list of nested trees ? Sometimes, when confronted with a text pattern recognition problem, we think "I know, I'll use regular expressions", but in the case of trees (and other nested structures, like bracketing or html tagging) there is a problem ā nested patterns are not directly representable in regular languages.
An alternative is to compose:
a forest parser,
from an indented tree parser,
from an indented line parser, which is built from:
an indent unit parser
a non-EOL character stream parser,
an EOL parser,
and an EOS parser.
(clicking all the pieces together with some generic parser combinator functions)
Again, in JavaScript:
The forest parser
// forestOf :: String -> Parser Forest String
const forestOf = indentUnit =>
// Parser for a list of
// top-level outlines.
many(treeAtIndentOf(indentUnit)(0));
The indented tree parser
// treeAtIndentOf :: String Int ->
// Parser (Tree String)
const treeAtIndentOf = indentUnit =>
// Parser for an outline block at a
// given level of indentation.
// An indented line and all lines
// further indented beneath it.
n => bindP(
lineAtIndentOf(indentUnit)(n)
)(txt => fmapP(Node(txt))(
many(treeAtIndentOf(indentUnit)(1 + n))
));
The indented line parser
// lineAtIndentOf :: String -> Int -> Parser String
const lineAtIndentOf = indentUnit =>
// Parser for a line at a minimum depth
// of indentation.
n => bindP(
count(n)(string(indentUnit))
)(_ => bindP(
fmapP(concat)(many(noneOf('\n\r')))
)(x => bindP(
many(
altP(oneOf('\n\r'))(
eos()
)
)
)(_ => pureP(x))));
So combining these, a first rough draft, (check that it does what you expect, before you apply it to real work) which aims to:
copy the selected text
parse it to a forest (a list of nested trees of lines at different indent levels)
sort the lines at each level of each tree lexically (A-Z, rather than numerically ā that would be a different task)
and paste the nested and sorted forest structure as a reflattened (reserialized) series of lines
(() => {
'use strict';
ObjC.import('AppKit');
// Lexical sort of any text outline
// in the clipboard.
// Generics (except parser combinators) at:
// https://github.com/RobTrew/prelude-jxa
// Rob Trew @ 2020
// Ver 0.01
// main IO ()
const main = () => either(
alert('Outline sorting')
)(
sortedOutline => (
copyText(sortedOutline),
sortedOutline
)
)(bindLR(
clipTextLR()
)(outline => bindLR(
consistentIndentChar(outline)
)(_ => {
const
// What does the indent unit of this outline
// seem to be (assuming that it's consistent) ?
// tab ? 4 spaces ? some other # of spaces ?
indentUnit = unitIndent(outline),
// Parse of outline to a list of trees.
parsedAsForest = parse(
forestOf(indentUnit)
)(outline);
return 0 < parsedAsForest.length ? (() => {
const
// A single virtual tree containing
// the whole forest,
virtualTreeRoot = Node({})(
fst(parsedAsForest[0])
),
// sorted recursively at each level,
sortedTree = foldTree(
x => xs => Node(x)(
sortBy(comparing(root))(xs)
)
)(virtualTreeRoot);
// and reserialized as a string.
return Right(
indentedTextFromForest(indentUnit)(root)(
sortedTree.nest
)
);
})() : Left(
'Input could not be parsed as an outline'
);
})));
// ------------- OUTLINE-PARSING FUNCTIONS -------------
// forestOf :: String -> Parser Forest String
const forestOf = indentUnit =>
// Parser for a list of
// top-level outlines.
many(treeAtIndentOf(indentUnit)(0));
// treeAtIndentOf :: String Int ->
// Parser (Tree String)
const treeAtIndentOf = indentUnit =>
// Parser for an outline block at a
// given level of indentation.
// An indented line and all lines
// further indented beneath it.
n => bindP(
lineAtIndentOf(indentUnit)(n)
)(txt => fmapP(Node(txt))(
many(treeAtIndentOf(indentUnit)(1 + n))
));
// lineAtIndentOf :: String -> Int -> Parser String
const lineAtIndentOf = indentUnit =>
// Parser for a line at a minimum depth
// of indentation.
n => bindP(
count(n)(string(indentUnit))
)(_ => bindP(
fmapP(concat)(many(noneOf('\n\r')))
)(x => bindP(
many(
altP(oneOf('\n\r'))(
eos()
)
)
)(_ => pureP(x))));
// consistentIndentChar :: String ->
// Either String String
const consistentIndentChar = outlineText => {
const indentChars = lines(outlineText)
.reduce(
(cs, s) => 1 < cs.length ? cs : [
...takeWhile(c => ' \t'.includes(c))(s)
].reduce(
(a, c) => a.includes(c) ? (
a
) : a.concat(c),
cs
),
[]
),
lng = indentChars.length;
return 1 < lng ? (
Left('Mixed indentation: both tabs + spaces')
) : 1 > lng ? (
Left('No indentation detected')
) : Right(indentChars[0]);
};
// unitIndent :: String -> String
const unitIndent = outlineText =>
// The smallest non-zero indent
// (tab or spaces) of any line
// in outlineText.
minimumBy(comparing(length))(
lines(outlineText)
.flatMap(s => {
const
indentation = takeWhile(
char => ' \t'.includes(char)
)(s);
return 0 < indentation.length ? (
[indentation]
) : [];
})
);
// ---------- RE-SERIALIZATION OF PARSE TREES ----------
// indentedTextFromForest :: (Tree -> String) ->
// [Tree] -> String
const indentedTextFromForest = singleIndent =>
fnText => trees => {
const go = indent => tree =>
indent + fnText(tree) +
(tree.nest.length > 0 ? (
'\n' + tree.nest
.map(go(singleIndent + indent))
.join('\n')
) : '');
return trees.map(go('')).join('\n');
};
// ------- GENERAL COMPOSABLE PARSING FUNCTIONS --------
// Parser :: String -> [(a, String)] -> Parser a
const Parser = f =>
// A function lifted into a Parser object.
({
type: 'Parser',
parser: f
});
// altP (<|>) :: Parser a -> Parser a -> Parser a
const altP = p =>
// p, or q if p doesn't match.
q => Parser(s => {
const xs = p.parser(s);
return 0 < xs.length ? (
xs
) : q.parser(s);
});
// apP <*> :: Parser (a -> b) -> Parser a -> Parser b
const apP = pf =>
// A new parser obtained by the application
// of a Parser-wrapped function,
// to a Parser-wrapped value.
p => Parser(
s => pf.parser(s).flatMap(
vr => fmapP(vr[0])(
p
).parser(vr[1])
)
);
// bindP (>>=) :: Parser a ->
// (a -> Parser b) -> Parser b
const bindP = p =>
// A new parser obtained by the application of
// a function to a Parser-wrapped value.
// The function must enrich its output, lifting it
// into a new Parser.
// Allows for the nesting of parsers.
f => Parser(
s => p.parser(s).flatMap(
tpl => f(tpl[0]).parser(tpl[1])
)
);
// char :: Char -> Parser Char
const char = x =>
// A particular single character.
satisfy(c => x == c);
// count :: Int -> Parser a -> Parser [a]
const count = n =>
// A list of n successive instances of p.
p => sequenceP(
replicate(n)(p)
);
// eos :: Parser String
const eos = () =>
// Matches the end of the input string.
Parser(
s => 0 < s.length ? (
[]
) : [Tuple('')('')]
);
// fmapP :: (a -> b) -> Parser a -> Parser b
const fmapP = f =>
// A new parser derived by the structure-preserving
// application of f to the value in p.
p => Parser(
s => p.parser(s).flatMap(
vr => Tuple(f(vr[0]))(vr[1])
)
);
// liftA2P :: (a -> b -> c) ->
// Parser a -> Parser b -> Parser c
const liftA2P = op =>
// The binary function op, lifted
// to a function over two parsers.
p => apP(fmapP(op)(p));
// many :: Parser a -> Parser [a]
const many = p => {
// Zero or more instances of p.
// Lifts a parser for a simple type of value
// to a parser for a list of such values.
const some_p = p =>
liftA2P(
x => xs => [x].concat(xs)
)(p)(many(p));
return Parser(
s => (
0 < s.length ? (
altP(some_p(p))(pureP([]))
) : pureP([])
).parser(s)
);
};
// noneOf :: String -> Parser Char
const noneOf = s =>
// Any character not found in the
// exclusion string.
satisfy(c => !s.includes(c));
// oneOf :: [Char] -> Parser Char
const oneOf = s =>
// One instance of any character found
// the given string.
satisfy(c => s.includes(c));
// parse :: Parser a -> String -> [(a, String)]
const parse = p =>
// The result of parsing s with p.
s => {
// showLog('s', s)
return p.parser(s);
};
// pureP :: a -> Parser a
const pureP = x =>
// The value x lifted, unchanged,
// into the Parser monad.
Parser(s => [Tuple(x)(s)]);
// satisfy :: (Char -> Bool) -> Parser Char
const satisfy = test =>
// Any character for which the
// given predicate returns true.
Parser(
s => 0 < s.length ? (
test(s[0]) ? [
Tuple(s[0])(s.slice(1))
] : []
) : []
);
// string :: String -> Parser String
const string = s =>
// A particular string.
fmapP(cs => cs.join(''))(
sequenceP(s.split('').map(char))
);
// sequenceP :: [Parser a] -> Parser [a]
const sequenceP = ps =>
// A single parser for a list of values, derived
// from a list of parsers for single values.
Parser(
s => ps.reduce(
(a, q) => a.flatMap(
vr => q.parser(snd(vr)).flatMap(
first(xs => fst(vr).concat(xs))
)
),
[Tuple([])(s)]
)
);
// ------------------------ JXA ------------------------
// alert :: String => String -> IO String
const alert = title => s => {
const sa = Object.assign(
Application('System Events'), {
includeStandardAdditions: true
});
return (
sa.activate(),
sa.displayDialog(s, {
withTitle: title,
buttons: ['OK'],
defaultButton: 'OK'
}),
s
);
};
// clipTextLR :: () -> Either String String
const clipTextLR = () => (
// ObjC.import('AppKit')
v => Boolean(v) && 0 < v.length ? (
Right(v)
) : Left('No utf8-plain-text found in clipboard.')
)(
ObjC.unwrap($.NSPasteboard.generalPasteboard
.stringForType($.NSPasteboardTypeString))
);
// copyText :: String -> IO String
const copyText = s => {
const pb = $.NSPasteboard.generalPasteboard;
return (
pb.clearContents,
pb.setStringForType(
$(s),
$.NSPasteboardTypeString
),
s
);
};
// ----------------- GENERIC FUNCTIONS -----------------
// https://github.com/RobTrew/prelude-jxa
// Left :: a -> Either a b
const Left = x => ({
type: 'Either',
Left: x
});
// Node :: a -> [Tree a] -> Tree a
const Node = v =>
// Constructor for a Tree node which connects a
// value of some kind to a list of zero or
// more child trees.
xs => ({
type: 'Node',
root: v,
nest: xs || []
});
// Right :: b -> Either a b
const Right = x => ({
type: 'Either',
Right: x
});
// Tuple (,) :: a -> b -> (a, b)
const Tuple = a =>
b => ({
type: 'Tuple',
'0': a,
'1': b,
length: 2
});
// bindLR (>>=) :: Either a ->
// (a -> Either b) -> Either b
const bindLR = m =>
mf => undefined !== m.Left ? (
m
) : mf(m.Right);
// comparing :: (a -> b) -> (a -> a -> Ordering)
const comparing = f =>
x => y => {
const
a = f(x),
b = f(y);
return a < b ? -1 : (a > b ? 1 : 0);
};
// concat :: [[a]] -> [a]
// concat :: [String] -> String
const concat = xs => (
ys => 0 < ys.length ? (
ys.every(Array.isArray) ? (
[]
) : ''
).concat(...ys) : ys
)(list(xs));
// either :: (a -> c) -> (b -> c) -> Either a b -> c
const either = fl =>
// Application of the function fl to the
// contents of any Left value in e, or
// the application of fr to its Right value.
fr => e => 'Either' === e.type ? (
undefined !== e.Left ? (
fl(e.Left)
) : fr(e.Right)
) : undefined;
// first :: (a -> b) -> ((a, c) -> (b, c))
const first = f =>
// A simple function lifted to one which applies
// to a tuple, transforming only its first item.
xy => Tuple(f(xy[0]))(
xy[1]
);
// foldTree :: (a -> [b] -> b) -> Tree a -> b
const foldTree = f => {
// The catamorphism on trees. A summary
// value obtained by a depth-first fold.
const go = tree => f(tree.root)(
tree.nest.map(go)
);
return go;
};
// fst :: (a, b) -> a
const fst = tpl =>
// First member of a pair.
tpl[0];
// length :: [a] -> Int
const length = xs =>
// Returns Infinity over objects without finite
// length. This enables zip and zipWith to choose
// the shorter argument when one is non-finite,
// like cycle, repeat etc
'GeneratorFunction' !== xs.constructor.constructor.name ? (
xs.length
) : Infinity;
// lines :: String -> [String]
const lines = s =>
// A list of strings derived from a single
// newline-delimited string.
0 < s.length ? (
s.split(/[\r\n]/)
) : [];
// list :: StringOrArrayLike b => b -> [a]
const list = xs =>
// xs itself, if it is an Array,
// or an Array derived from xs.
Array.isArray(xs) ? (
xs
) : Array.from(xs || []);
// minimumBy :: (a -> a -> Ordering) -> [a] -> a
const minimumBy = f => xs =>
list(xs).reduce((a, x) => undefined === a ? x : (
0 > f(x)(a) ? x : a
), undefined);
// replicate :: Int -> a -> [a]
const replicate = n =>
// A list of n copies of x.
x => Array.from({
length: n
}, () => x);
// root :: Tree a -> a
const root = tree => tree.root;
// showLog :: a -> IO ()
const showLog = (...args) =>
console.log(
args
.map(JSON.stringify)
.join(' -> ')
);
// snd :: (a, b) -> b
const snd = tpl =>
// Second member of a pair.
tpl[1];
// sortBy :: (a -> a -> Ordering) -> [a] -> [a]
const sortBy = f =>
xs => list(xs).slice()
.sort((a, b) => f(a)(b));
// takeWhile :: (a -> Bool) -> [a] -> [a]
// takeWhile :: (Char -> Bool) -> String -> String
const takeWhile = p => xs =>
xs.constructor.constructor.name !==
'GeneratorFunction' ? (() => {
const n = xs.length;
return xs.slice(
0, 0 < n ? until(
i => n === i || !p(xs[i])
)(i => 1 + i)(0) : 0
);
})() : takeWhileGen(p)(xs);
// until :: (a -> Bool) -> (a -> a) -> a -> a
const until = p => f => x => {
let v = x;
while (!p(v)) v = f(v);
return v;
};
// MAIN ---
return main();
})();
The ādetect indentationā thing I did in filterCSV. For exactly the same reasons. And, obviously, I had to tackle the inconsistent indentation issue. Ironic as filterCSV is written in Python.
BTW I added āemit tabbed-/space-indentedā to my todo list for filterCSV. It should be very simple to do, though requiring the indentation character string to carry through from input to output would be a bit more work. (I might get this done today, in fact.)