Need help with logic to keep just two lines of text that match

dglancy · January 9, 2024, 12:27am

The goal is to iterate through lines of text and compare the last two characters of each. Where lines match, only the first two matching lines should be written to a file. In my example the third line ending in B8 is ignored as desired, but the last three lines are output instead of two. The lines of text will always be sorted by their last two characters.

I prefer to use just KM actions to accomplish this goal.

Group.kmactions (8.3 KB)

griffman · January 9, 2024, 5:26am

Is this the output you'd expect to find in the file when done?

112 656  16  unblocked //*[@id="tileid66"] B8
125 494  78  unblocked //*[@id="tileid67"] B8
144 494 147  unblocked //*[@id="tileid70"] B9
193 440 278  unblocked //*[@id="tileid69"] B9
101  62  16  unblocked //*[@id="tileid16"] C5
231 494 423  unblocked //*[@id="tileid18"] C5
216 494 354  unblocked //*[@id="tileid134"] DR
233  62 499  unblocked //*[@id="tileid135"] DR
130 224 147  unblocked //*[@id="tileid73"] N1
232 548 430  unblocked //*[@id="tileid72"] N1
127 116 154  unblocked //*[@id="tileid76"] N2
174 764 257  unblocked //*[@id="tileid77"] N2
195 494 285  unblocked //*[@id="tileid142"] S"
199 116 361  unblocked //*[@id="tileid140"] S"
126 548  85  unblocked //*[@id="tileid113"] WS
146 602 154  unblocked //*[@id="tileid112"] WS

I just want to make sure I understand the objective.

-rob.

guxianbang · January 9, 2024, 9:51am

head -2

ComplexPoint · January 9, 2024, 10:35am

Yes, these questions are always best asked by showing an (input, output) pair.

Iterating is certainly a possible solution, but the goal appears to be a derived list which includes only the first two lines in each group (where grouping is defined by sharing the last two characters).

(Note, incidentally, that there may be a subtlety in your data source: it comes from Windows, or a Microsoft application, perhaps? The lines are delimited not by a single standard macOS/Unix "\n" linefeed, but by a two-byte 0d 0a (CR LF) pair.)
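
To illustrate why the delimiter matters, here is a minimal sketch using a made-up two-line sample (the sample string and variable names are assumptions, not taken from your file). If the text is split on "\n" alone, each line keeps a trailing "\r", and the "last two characters" are no longer the ones you expect:

```javascript
// Hypothetical sample: two lines joined by CR LF, as in the source data.
const sample = "112 656  16  unblocked B8\r\n125 494  78  unblocked B8\r\n";

// Splitting on "\n" alone leaves a trailing "\r" on each line,
// so the last two characters become "8\r" rather than "B8".
const naiveKeys = sample
    .split("\n")
    .filter(Boolean)
    .map(x => x.slice(-2));

// Splitting on any of the three line-ending conventions avoids that.
const cleanKeys = sample
    .split(/\r\n|\n|\r/u)
    .filter(Boolean)
    .map(x => x.slice(-2));
```

Here naiveKeys holds "8\r" twice, while cleanKeys holds "B8" twice. A final line that lacks the CR would then fail to match the rest of its group under the naive split.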

The components of your derived output list seem to be:

An input string delimited by CR LF pairs
A grouping of the delimited lines by their last two (non-delimiting) characters
Taking only the first two lines of each such group
A concatenation of the pruned groups (lists of lists) down to a single flat list
A concatenation of that flat list to a single string (LF delimited for macOS? CR LF for compatibility with some other context?)

All of those ingredients (splitting, grouping, taking a subset, flattening, forming a delimited string from a list) can be found, in one way or another, in KM action blocks or in a scripting language.

I happen, in this context, to reach first for Keyboard Maestro Execute JavaScript for Automation actions, so, perhaps, assuming Keyboard Maestro version 11, and assuming an output string delimited only by LF, one approach might look something like:

First two lines from each group defined by last two characters.kmmacros (4.5 KB)


    const main = () =>
        groupOn(x => x.slice(-2))(
            lines(kmvar.local_Source).filter(Boolean)
        )
        .flatMap(lineGroup => lineGroup.slice(0, 2))
        .join("\n");


    // --------------------- GENERIC ---------------------

    // groupOn :: (a -> b) -> [a] -> [[a]]
    const groupOn = f =>
        // A list of lists, each containing only elements
        // which return equal values for f,
        // such that the concatenation of these lists is xs.
        xs => 0 < xs.length
            ? groupBy(a => b => a[0] === b[0])(
                xs.map(x => [f(x), x])
            )
            .map(gp => gp.map(ab => ab[1]))
            : [];


    // groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
    const groupBy = eqOp =>
        // A list of lists, each containing only elements
        // equal under the given equality operator, such
        // that the concatenation of these lists is xs.
        xs => 0 < xs.length
            ? (() => {
                const [h, ...t] = xs;
                const [groups, g] = t.reduce(
                    ([gs, a], x) => eqOp(a[0])(x)
                        ? [gs, [...a, x]]
                        : [[...gs, a], [x]],
                    [[], [h]]
                );

                return [...groups, g];
            })()
            : [];


    // lines :: String -> [String]
    const lines = s =>
        // A list of strings derived from a single string
        // which is delimited by \n or by \r\n or \r.
        0 < s.length
            ? s.split(/\r\n|\n|\r/u)
            : [];

    return main();
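
For readers more at home with loops than with function composition, the same result can be sketched with a counter, relying on the stated fact that the lines arrive sorted by their last two characters. This is an illustrative alternative, not the code in the macro; firstTwoPerGroup is a hypothetical name.

```javascript
// firstTwoPerGroup is a hypothetical name, not part of the macro above.
// Assumes the lines are already sorted by their last two characters.
const firstTwoPerGroup = text => {
    const kept = [];
    let previousKey = null;
    let countInGroup = 0;

    for (const line of text.split(/\r\n|\n|\r/u)) {
        // Skip the empty strings left by blank or trailing lines.
        if (line === "") continue;

        const key = line.slice(-2);

        // Same key as the previous line: one more in this group.
        // Different key: a new group starts, so the count resets to 1.
        countInGroup = key === previousKey ? countInGroup + 1 : 1;
        previousKey = key;

        // Keep only the first two lines of each group.
        if (countInGroup <= 2) kept.push(line);
    }

    return kept.join("\n");
};
```

Given the five lines "a B8", "b B8", "c B8", "d B9", "e B9", this drops only "c B8".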

dglancy · January 9, 2024, 2:07pm

Yes, that is the desired output. Note that there could be 2 to 4 lines that end with the same two characters.

griffman · January 9, 2024, 2:21pm

ComplexPoint's solution, which I assume works but haven't tested, will be cleaner (and probably way faster) than what I was going to do, which was a variation on the looping that you were using. I'd go with that one :).

-rob.

dglancy · January 9, 2024, 3:43pm

Thanks. His solution will work and will be fast. I am trying to understand the design logic so that I can apply it in the future. I do not read JS or AS, etc., and to show my age, I am trying to forget Fortran 4 and MS Basic. I am hoping to stick with Keyboard Maestro and some shell commands, hence my request in the initial post. Any help is appreciated.

griffman · January 9, 2024, 3:48pm

Yeah, I am nowhere near a JavaScript expert, and I haven't got a clue how his script works.

-rob.

ComplexPoint · January 9, 2024, 4:37pm

@peternlewis

There may be a Maestronic way of grouping collection items (by some shared feature, like the last two characters, in the case of text lines) that I am failing to immediately spot, but if not, I wonder whether some kind of GROUP action might work well with any MAP action that you think of adding, at some point ?

Perhaps, in the case of grouping lines, one could let the user choose a single-character group delimiter for the grouped output, so that Keyboard Maestro Variable Arrays (with custom delimiters) could yield access to groups as well as to individual lines ?

ComplexPoint · January 9, 2024, 7:02pm

Using Keyboard Maestro For Each actions (and KM variable arrays) without shell commands (which tend to cost too much debugging time) we can:

Obtain a list of (one-based) indexes for the first lines of each group
Use these indexes with Keyboard Maestro Variable Arrays to build the pruned list

Starting index of each group- used with LF delimiter for variable array.kmmacros (12 KB)
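
In JS terms, the two passes might be sketched roughly as below. This is a model of the idea rather than a transcription of the macro; groupStartIndexes and pruneByIndexes are hypothetical names, and the indexes are one-based to mirror Keyboard Maestro variable arrays.

```javascript
// groupStartIndexes and pruneByIndexes are hypothetical names.

// Pass 1: the one-based index of every line whose last two characters
// differ from those of the line before it (the first line always counts).
const groupStartIndexes = lines =>
    lines.flatMap((line, i) =>
        0 === i || line.slice(-2) !== lines[i - 1].slice(-2)
            ? [i + 1]
            : []
    );

// Pass 2: for each one-based start index n, take line n and line n + 1.
// Assumption: every group has at least two lines.
const pruneByIndexes = (lines, starts) =>
    starts.flatMap(n => [lines[n - 1], lines[n]]);
```

For the lines "a B8", "b B8", "c B8", "d B9", "e B9", the first pass yields the indexes 1 and 4, and the second pass yields lines 1, 2, 4 and 5.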

ComplexPoint · January 9, 2024, 7:35pm

And of course, if you prefer, you can directly refactor from two passes to a single composite pass, once the two-pass version seems to be working.

(i.e. just one For Each action)

Single pass "Functor Law" refactor.kmmacros (10.3 KB)
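
The fused version can be sketched in JS as a single loop which tests for a group start and appends to the accumulator in the same step. Again this is only a model of the idea under the same assumption (at least two lines per group); pruneSinglePass is a hypothetical name.

```javascript
// pruneSinglePass is a hypothetical name.
// One loop does both jobs: detect a group start, then append that
// line and the one after it to the accumulator.
const pruneSinglePass = lines => {
    const kept = [];

    lines.forEach((line, i) => {
        const startsGroup =
            0 === i || line.slice(-2) !== lines[i - 1].slice(-2);

        if (startsGroup) {
            // Assumption: every group has at least two lines, so
            // lines[i + 1] exists and belongs to the same group.
            kept.push(line, lines[i + 1]);
        }
    });

    return kept.join("\n");
};
```

The design point is simply that the intermediate list of indexes is never built; each start index is used as soon as it is found.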

dglancy · January 10, 2024, 4:38am

I appreciate the time you took to solve this request. Many thanks. It will take me some study to follow the process but it should help me in future efforts.

ComplexPoint · January 10, 2024, 5:34am

The key lies in KM Variable Arrays, in which:

index 0 holds the array length,
and we can use custom single-character delimiters, including \n ( %Linefeed% ).

This gives us line indexing, and for each group of lines defined by a shared affix, we can obtain the line number of the first line in the group (i.e. any line which doesn’t share an affix with its previous sibling).

The pruned copy of the input can be built by iteratively appending Lines[groupIndex] and Lines[1 + groupIndex] to an initially empty accumulator.

The main outstanding detail is whether or not you want to:

keep that redundant MS-style \r ( %Return% ) in each of your line delimiters, or
normalise your output to the macOS default of \n (%Linefeed%) only.

The macro above also assumes (perhaps implausibly) that each group of affix-sharing lines will have at least two members. You may need to add a conditional to avoid appending Lines[1 + groupIndex] to the accumulator in cases where a particular affix is only found in one isolated line, which is followed immediately by a new group.
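
The shape of that conditional, sketched in JS rather than in KM actions: linesForGroup is a hypothetical helper, and groupIndex is one-based, as in a Keyboard Maestro variable array.

```javascript
// linesForGroup is a hypothetical helper; groupIndex is one-based.
// Returns the lines to append for the group starting at groupIndex.
const linesForGroup = (lines, groupIndex) => {
    const first = lines[groupIndex - 1];

    // The line after the group's first line; undefined past the end.
    const second = lines[groupIndex];

    // Append the second line only if it exists and ends with the
    // same two characters as the first.
    return undefined !== second && second.slice(-2) === first.slice(-2)
        ? [first, second]
        : [first];
};
```

With the lines "a B8", "b B9", "c B9", the group starting at index 1 contributes only "a B8", and the group starting at index 2 contributes "b B9" and "c B9".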
