Arrange words in string in a specific order

Saul · October 20, 2021, 1:55pm

Hello and thanks for reading this, any help will be greatly appreciated.

I have a column in Excel 2019 (for Mac) populated with a combination of the following words:

Eternal
Oldies
Special
Anytime
2Long1
2Long2
2Long3
Nostalgia
Quiet
(and about 10- 15 additional ones).

A cell in that column can contain any combination, anywhere from 1 word per cell (any word) to any number of words in any order.
For example:

Anytime Quiet
Instrumental Quiet Anytime
Special
Anytime
Children
Anytime Shorten
Celtic International
Nostalgia Classical Dance

I already took care of deleting leading spaces/non printing characters/double/triple spaces/etc with Excel functions but…

I would like to be able to define the order of the words in all cells (EG whenever “Anytime” is present, then it will be the 1st word, 2nd word would be Eternal and so on, according to my choice).

(I tried to learn Regex for that but… not there yet).

ComplexPoint · October 20, 2021, 6:37pm

Show works where tell fails – you always need to show us a before and after:

Input – a screen image of an area of Excel cells before the custom sorting, and
Ouput – a screen image of what those cells would look like after the custom sorting.

ComplexPoint · October 20, 2021, 6:44pm

And if there is a custom sort order, you need to show us a list of:

All the words that can occur, in
the order they would appear in if every word appeared in the same cell.

A particular format, which you can use directly in recent builds of Excel, is an Array Constant string, in which you could pair each possible word with an integer giving its sorting rank.

For one of the subsets you show above for example, and with an Array Constant in which:

Strings need to be quoted,
columns are divided by commas,
rows are divided by semicolons,
and the whole in enclosed in braces:

{"Eternal",1;"Oldies",2;"Special",3;"Anytime",4;"2Long1",5;"2Long2",6;"2Long3",7;"Nostalgia",8;"Quiet",9}

from which a cell like:

={"Eternal",1;"Oldies",2;"Special",3;"Anytime",4;"2Long1",5;"2Long2",6;"2Long3",7;"Nostalgia",8;"Quiet",9}

would overspill the individual cell to create a range like:

and you could

store that ranking table in your spreadsheet as a named value (which doesn't have to physically appear on any of the sheets) using Formulas > Define Name
Reference it in an Excel SORTBY function expression.

(Even easier to use if your build of Excel is recent enough to define the LAMBDA function, in terms of which the string-splitting, sorting, and rejoining can all be defined together)

If your version of Excel doesn't support the LAMBDA function, then the other approach (still needing a table of keywords paired with their ordinal values) would be to:

capture the table in KM through the clipboard,
split, sort, and rejoin each relevant cell
paste back

Saul · October 21, 2021, 12:09am

Hello ComplexPoint.
Thanks for your help.
I loved that you pointed me to this direction and not gave me a solution right away.
I'll try to figure out how to work with it and be back when I am done.
Thanks again!

ComplexPoint · October 21, 2021, 7:14pm

For later, once you have had a look at this yourself, an Excel LAMBDA approach (which might not be viable if your Excel build is not recent) might have the rough shape of this worksheet,

customSortedWordString.xlsx.zip (8.6 KB)

in which a couple of cells contain strings which consists of several (space delimited) names of letters in the Greek alphabet (in no particular order).

The second column shows the same strings reordered, so that the sequence of latter names is not in Roman AZ order, but in the order of the Greek Alphabet, as in:

={"Alpha";"Beta";"Gamma";"Delta";"Epsilon";"Zeta";"Eta";"Theta";"Iota";"Kappa";"Lambda";"Mu";"Nu";"Xi";"Omicron";"Pi";"Rho";"Sigma";"Tau";"Upsilon";"Phi";"Kai";"Psi";"Omega"}

The formula looks like:

and the definitions under Formulas > Define Name are:

Expand disclosure triangle to view Excel LAMBDA definitions

APPENDROWS
=LAMBDA(xs,
    LAMBDA(ys,
        LET(
            nx, ROWS(xs),
            rowIndexes, SEQUENCE(nx + ROWS(ys)),
            colIndexes, SEQUENCE(
                1,
                MAX(COLUMNS(xs), COLUMNS(ys))
            ),

            IFERROR(
                IF(rowIndexes <= nx,
                    INDEX(xs, rowIndexes, colIndexes),
                    INDEX(ys, rowIndexes - nx, colIndexes)
                ),
                NA()
            )
        )
    )
)


greekAlpha
={"Alpha";"Beta";"Gamma";"Delta";"Epsilon";"Zeta";"Eta";"Theta";"Iota";"Kappa";"Lambda";"Mu";"Nu";"Xi";"Omicron";"Pi";"Rho";"Sigma";"Tau";"Upsilon";"Phi";"Kai";"Psi";"Omega"}


sortWordsByTable
=LAMBDA(tokenTable,
    LAMBDA(s,
        LET(
            ws, wordsCol(s),

            TEXTJOIN(" ",TRUE,SORTBY(ws,MATCH(ws,tokenTable,0)))
        )
    )
)


SPLITS
=LAMBDA(needle,
    LAMBDA(haystack,
        LET(
            mbi, IFERROR(
                FIND(needle, haystack, 1),
                -1
            ),

            IF(mbi < 1,
                haystack,
                APPENDROWS(
                    MID(haystack, 1, mbi - 1)
                )(
                    SPLITS(needle)(
                        MID(haystack, mbi + 1, LEN(haystack) - mbi)
                    )
                )
            )
        )
    )
)


wordsCol
=SPLITS(" ")

But an Excel-only solution may not be what you are looking for here, (and it looks as if LAMBDA may only be defined in Beta-channel Excel 365)

A more KM-based approach might involve an Execute JavaScript for Automation action, or the equivalent for AppleScript.

ComplexPoint · October 21, 2021, 9:34pm

Copying several lines with Keyboard Maestro,
and custom sorting their word order in the clipboard, might broadly look like this, in a JS flavour. (AppleScript would also be possible, though it doesn't have much sort logic built in):

Custom word order sort demo.kmmacros (9.2 KB)

Expand disclosure triangle to view JS source

(() => {
    "use strict";

    ObjC.import("AppKit");

    // main :: IO ()
    const main = () =>
        either(
            alert("Custom word sort")
        )(
            sortedLines => sortedLines
        )(
            bindLR(
                clipTextLR()
            )(
                txt => Right(
                    // Restoring Microsoft \r\n
                    // line breaks.
                    msUnlines(
                        lines(txt).map(
                            compose(
                                unwords,
                                sortOn(
                                    ordinalVal(alphaBeta)
                                ),
                                words
                            )
                        )
                    )
                )
            )
        );

    // ---------------- CUSTOM SORT ORDER ----------------

    const alphaBeta = {
        "alpha": 1,
        "beta": 2,
        "gamma": 3,
        "delta": 4,
        "epsilon": 5,
        "zeta": 6,
        "eta": 7,
        "theta": 8,
        "iota": 9,
        "kappa": 10,
        "lambda": 11,
        "mu": 12,
        "nu": 13,
        "xi": 14,
        "omicron": 15,
        "pi": 16,
        "rho": 17,
        "sigma": 18,
        "tau": 19,
        "upsilon": 20,
        "phi": 21,
        "kai": 22,
        "psi": 23,
        "omega": 24
    };

    // ordinalVal :: Dict -> String -> Int
    const ordinalVal = dict =>
        s => dict[toLower(s)] || 0;

    // ----------------------- JXA -----------------------

    // alert :: String => String -> IO String
    const alert = title =>
        s => {
            const sa = Object.assign(
                Application("System Events"), {
                    includeStandardAdditions: true
                });

            return (
                sa.activate(),
                sa.displayDialog(s, {
                    withTitle: title,
                    buttons: ["OK"],
                    defaultButton: "OK"
                }),
                s
            );
        };


    // clipTextLR :: () -> Either String String
    const clipTextLR = () => {
        // Either a message, (if no clip text is found),
        // or the string contents of the clipboard.
        const
            v = ObjC.unwrap(
                $.NSPasteboard.generalPasteboard
                .stringForType($.NSPasteboardTypeString)
            );

        return Boolean(v) && 0 < v.length ? (
            Right(v)
        ) : Left("No utf8-plain-text found in clipboard.");
    };

    // --------------------- GENERIC ---------------------

    // Left :: a -> Either a b
    const Left = x => ({
        type: "Either",
        Left: x
    });


    // Right :: b -> Either a b
    const Right = x => ({
        type: "Either",
        Right: x
    });


    // Tuple (,) :: a -> b -> (a, b)
    const Tuple = a =>
        b => ({
            type: "Tuple",
            "0": a,
            "1": b,
            length: 2,
            *[Symbol.iterator]() {
                for (const k in this) {
                    if (!isNaN(k)) {
                        yield this[k];
                    }
                }
            }
        });


    // bindLR (>>=) :: Either a ->
    // (a -> Either b) -> Either b
    const bindLR = m =>
        mf => m.Left ? (
            m
        ) : mf(m.Right);


    // comparing :: (a -> b) -> (a -> a -> Ordering)
    const comparing = f =>
        x => y => {
            const
                a = f(x),
                b = f(y);

            return a < b ? -1 : (a > b ? 1 : 0);
        };


    // compose (<<<) :: (b -> c) -> (a -> b) -> a -> c
    const compose = (...fs) =>
        // A function defined by the right-to-left
        // composition of all the functions in fs.
        fs.reduce(
            (f, g) => x => f(g(x)),
            x => x
        );


    // either :: (a -> c) -> (b -> c) -> Either a b -> c
    const either = fl =>
        // Application of the function fl to the
        // contents of any Left value in e, or
        // the application of fr to its Right value.
        fr => e => e.Left ? (
            fl(e.Left)
        ) : fr(e.Right);


    // lines :: String -> [String]
    const lines = s =>
        // A list of strings derived from a single
        // string delimited by newline and or CR.
        0 < s.length ? (
            s.split(/[\r\n]+/u)
        ) : [];


    // sortOn :: Ord b => (a -> b) -> [a] -> [a]
    const sortOn = f =>
        // Equivalent to sortBy(comparing(f)), but with f(x)
        // evaluated only once for each x in xs.
        // ('Schwartzian' decorate-sort-undecorate).
        xs => xs.map(
            x => Tuple(f(x))(x)
        )
        .sort(uncurry(comparing(x => x[0])))
        .map(x => x[1]);


    // toLower :: String -> String
    const toLower = s =>
        // Lower-case version of string.
        s.toLocaleLowerCase();


    // uncurry :: (a -> b -> c) -> ((a, b) -> c)
    const uncurry = f =>
        // A function over a pair, derived
        // from a curried function.
        (...args) => {
            const [x, y] = Boolean(args.length % 2) ? (
                args[0]
            ) : args;

            return f(x)(y);
        };


    // msUnlines :: [String] -> String
    const msUnlines = xs =>
        // A single string formed by the intercalation
        // of a list of strings with Microsoft line breaks.
        xs.join("\r\n");


    // unwords :: [String] -> String
    const unwords = xs =>
        // A space-separated string derived
        // from a list of words.
        xs.join(" ");


    // words :: String -> [String]
    const words = s =>
        // List of space-delimited sub-strings.
        s.split(/\s+/u);


    // MAIN ---
    return main();
})();

AppleScript variant:

Custom word order sort demo (AppleScript variant).kmmacros (18.7 KB)

Expand disclosure triangle to view AppleScript source

use framework "Foundation"
use scripting additions

property alphaBeta : {alpha:1, beta:2, gamma:3, delta:4, epsilon:5, zeta:6, eta:7, theta:8, iota:9, kappa:10, lambda:11, mu:12, nu:13, xi:14, omicron:15, |pi|:16, rho:17, sigma:18, tau:19, upsilon:20, phi:21, kai:22, psi:23, omega:24}

-- Words in each line reordered
on run
    script customWordSorting
        script ordinalValue
            on |λ|(k)
                tell lookupDict(toLower(k), alphaBeta)
                    if its |Nothing| then
                        0
                    else
                        its |Just|
                    end if
                end tell
            end |λ|
        end script
        
        on |λ|(txt)
            script wordsReordered
                on |λ|(textLine)
                    unwords(sortOn(ordinalValue, my |words|(textLine)))
                end |λ|
            end script
            
            -- Microsoft line endings are "\r\n"
            |Right|(intercalate(return & linefeed, ¬
                (map(wordsReordered, paragraphs of txt))))
        end |λ|
    end script
    
    
    -- Words in each text line in the clipboard 
    -- reordered to a custom sequence for known tokens.
    -- (given in alphaBeta)
    either(alert("Custom sorted words in line"), ¬
        identity, ¬
        bindLR(clipTextLR(), customWordSorting))
end run


-- Just :: a -> Maybe a
on Just(x)
    -- Constructor for an inhabited Maybe (option type) value.
    -- Wrapper containing the result of a computation.
    {type:"Maybe", Nothing:false, Just:x}
end Just


-- Left :: a -> Either a b
on |Left|(x)
    {type:"Either", |Left|:x, |Right|:missing value}
end |Left|


-- Nothing :: Maybe a
on Nothing()
    -- Constructor for an empty Maybe (option type) value.
    -- Empty wrapper returned where a computation is not possible.
    {type:"Maybe", Nothing:true}
end Nothing


-- Right :: b -> Either a b
on |Right|(x)
    {type:"Either", |Left|:missing value, |Right|:x}
end |Right|


-- Tuple (,) :: a -> b -> (a, b)
on Tuple(a, b)
    -- Constructor for a pair of values, possibly of two different types.
    {type:"Tuple", |1|:a, |2|:b, length:2}
end Tuple


-- alert :: String -> String -> IO ()
on alert(strTitle)
    script
        on |λ|(s)
            tell current application
                activate
                display dialog s with title strTitle buttons {"OK"} default button "OK"
                return s
            end tell
        end |λ|
    end script
end alert


-- bindLR (>>=) :: Either a -> (a -> Either b) -> Either b
on bindLR(m, mf)
    if missing value is not |Left| of m then
        m
    else
        mReturn(mf)'s |λ|(|Right| of m)
    end if
end bindLR


-- clipTextLR :: () -> Either String String
on clipTextLR()
    set v to unwrap(current application's ¬
        NSPasteboard's generalPasteboard's ¬
        stringForType:("public.utf8-plain-text"))
    if v is not missing value then
        |Right|(v)
    else
        |Left|("No utf8-plain-text found in clipboard.")
    end if
end clipTextLR


-- composeList :: [(a -> a)] -> (a -> a)
on composeList(fs)
    script
        on |λ|(x)
            script go
                on |λ|(f, a)
                    mReturn(f)'s |λ|(a)
                end |λ|
            end script
            foldr(go, x, fs)
        end |λ|
    end script
end composeList


-- concatMap :: (a -> [b]) -> [a] -> [b]
on concatMap(f, xs)
    set lng to length of xs
    set acc to {}
    tell mReturn(f)
        repeat with i from 1 to lng
            set acc to acc & (|λ|(item i of xs, i, xs))
        end repeat
    end tell
    if {text, string} contains class of xs then
        acc as text
    else
        acc
    end if
end concatMap


-- either :: (a -> c) -> (b -> c) -> Either a b -> c
on either(lf, rf, e)
    if missing value is |Left| of e then
        tell mReturn(rf) to |λ|(|Right| of e)
    else
        tell mReturn(lf) to |λ|(|Left| of e)
    end if
end either


-- flatten :: NestedList a -> [a]
on flatten(t)
    -- A flat list derived from a nested list.
    if list is class of t then
        concatMap(my flatten, t)
    else
        t
    end if
end flatten


-- foldr :: (a -> b -> b) -> b -> [a] -> b
on foldr(f, startValue, xs)
    tell mReturn(f)
        set v to startValue
        set lng to length of xs
        repeat with i from lng to 1 by -1
            set v to |λ|(item i of xs, v, i, xs)
        end repeat
        return v
    end tell
end foldr


-- identity :: a -> a
on identity(x)
    -- The argument unchanged.
    x
end identity


-- intercalate :: String -> [String] -> String
on intercalate(delim, xs)
    set {dlm, my text item delimiters} to ¬
        {my text item delimiters, delim}
    set s to xs as text
    set my text item delimiters to dlm
    s
end intercalate


-- lines :: String -> [String]
on |lines|(xs)
    paragraphs of xs
end |lines|


-- lookupDict :: a -> Dict -> Maybe b
on lookupDict(k, dct)
    -- Just the value of k in the dictionary,
    -- or Nothing if k is not found.
    set ca to current application
    set v to (ca's NSDictionary's dictionaryWithDictionary:dct)'s objectForKey:k
    if missing value ≠ v then
        Just(item 1 of ((ca's NSArray's arrayWithObject:v) as list))
    else
        Nothing()
    end if
end lookupDict


-- map :: (a -> b) -> [a] -> [b]
on map(f, xs)
    -- The list obtained by applying f
    -- to each element of xs.
    tell mReturn(f)
        set lng to length of xs
        set lst to {}
        repeat with i from 1 to lng
            set end of lst to |λ|(item i of xs, i, xs)
        end repeat
        return lst
    end tell
end map


-- mReturn :: First-class m => (a -> b) -> m (a -> b)
on mReturn(f)
    -- 2nd class handler function lifted into 1st class script wrapper. 
    if script is class of f then
        f
    else
        script
            property |λ| : f
        end script
    end if
end mReturn


-- sortOn :: Ord b => (a -> b) -> [a] -> [a]
-- sortOn :: Ord b => [((a -> b), Bool)]  -> [a] -> [a]
on sortOn(f, xs)
    -- Sort a list by comparing the results of a key function applied to each
    -- element. sortOn f is equivalent to sortBy(comparing(f), xs), but has the
    -- performance advantage of only evaluating f once for each element in
    -- the input list. This is called the decorate-sort-undecorate paradigm,
    -- or Schwartzian transform.
    -- Elements are arranged from from lowest to highest.
    
    -- In this Applescript implementation, f can optionally be [(a -> b)]
    -- or [((a -> b), Bool)]) to specify a compound sort order
    
    --    xs:  List of items to be sorted. 
    --          (The items can be records, lists, or simple values).
    --
    --    f:    A single (a -> b) function (Applescript handler),
    --          or a list of such functions.
    --          if the argument is a list, any function can 
    --          optionally be followed by a bool. 
    --          (False -> descending sort)
    --
    --          (Subgrouping in the list is optional and ignored)
    --          Each function (Item -> Value) in the list should 
    --          take an item (of the type contained by xs) 
    --          as its input and return a simple orderable value 
    --          (Number, String, or Date).
    --
    --          The sequence of key functions and optional 
    --          direction bools defines primary to N-ary sort keys.
    script keyBool
        on |λ|(x, a)
            if boolean is class of x then
                {asc:x, fbs:fbs of a}
            else
                {asc:true, fbs:({Tuple(x, asc of a)} & fbs of a)}
            end if
        end |λ|
    end script
    set {fs, bs} to {|1|, |2|} of unzip(fbs of foldr(keyBool, ¬
        {asc:true, fbs:{}}, flatten({f})))
    
    set intKeys to length of fs
    set ca to current application
    script dec
        property gs : map(my mReturn, fs)
        on |λ|(x)
            set nsDct to (ca's NSMutableDictionary's ¬
                dictionaryWithDictionary:{val:x})
            repeat with i from 1 to intKeys
                (nsDct's setValue:((item i of gs)'s |λ|(x)) ¬
                    forKey:(character id (96 + i)))
            end repeat
            nsDct as record
        end |λ|
    end script
    
    script descrip
        on |λ|(bool, i)
            ca's NSSortDescriptor's ¬
                sortDescriptorWithKey:(character id (96 + i)) ¬
                    ascending:bool
        end |λ|
    end script
    
    script undec
        on |λ|(x)
            val of x
        end |λ|
    end script
    
    map(undec, ((ca's NSArray's arrayWithArray:map(dec, xs))'s ¬
        sortedArrayUsingDescriptors:map(descrip, bs)) as list)
end sortOn


-- toLower :: String -> String
on toLower(str)
    -- String in lower case. 
    tell current application
        ((its (NSString's stringWithString:(str)))'s ¬
            lowercaseStringWithLocale:(its NSLocale's currentLocale())) as text
    end tell
end toLower


-- unwords :: [String] -> String
on unwords(xs)
    set {dlm, my text item delimiters} to ¬
        {my text item delimiters, space}
    set s to xs as text
    set my text item delimiters to dlm
    return s
end unwords


-- unwrap :: NSValue -> a
on unwrap(nsValue)
    if nsValue is missing value then
        missing value
    else
        set ca to current application
        item 1 of ((ca's NSArray's arrayWithObject:nsValue) as list)
    end if
end unwrap


-- unzip :: [(a,b)] -> ([a],[b])
on unzip(xys)
    set xs to {}
    set ys to {}
    repeat with xy in xys
        set end of xs to |1| of xy
        set end of ys to |2| of xy
    end repeat
    return Tuple(xs, ys)
end unzip


-- words :: String -> [String]
on |words|(s)
    set ca to current application
    (((ca's NSString's stringWithString:(s))'s ¬
        componentsSeparatedByCharactersInSet:(ca's ¬
            NSCharacterSet's whitespaceAndNewlineCharacterSet()))'s ¬
        filteredArrayUsingPredicate:(ca's ¬
            NSPredicate's predicateWithFormat:"0 < length")) as list
end |words|

Saul · October 22, 2021, 2:06am

Thanks again ComplexPoint.
Unfortunately the LAMBDA function does not work with the XL version I'm using (2019).
I am going to have to take it slow (life gets in the middle) but I will be back either to tell you "problem solved" or if I'll need more help.
Stay safe.

Saul · October 25, 2021, 1:55am

Hi again. Still working on it. I think I have the solution

Saul · October 25, 2021, 11:04pm

Hi ComplexPoint.

I solved the problem, but, you were right. “Show works where tell fails – you always need to show us a before and after….” (Next time)…

So… here it is. From the beginning (also, just in case someone else will come across the same problem and hopefully benefit from my post):

Just wanted to mention: I’m sure there are many shorter/more efficient/elegant/etc. ways to do that. Mine is just that, mine. I have very limited knowledge and with that+ the few tools that I have it was hard to come up with the solution.

But thanks to you ComplexPoint I went to the Internet and turned it upside down till I found the answers. Just for you to know… I’m very proud about that!

I use Excel 2019 with a mac.

1- The beginning: getting rid of spaces

I had a column of cells with different combination of words in each (A). Some words had multiple spaces between them, some started with a space and some ended with an unneeded one.

I used the formula =TRIM(SUBSTITUTE(A3, CHAR(202), " ")) TO GET RID OF UNWANTED CHARACTERS. To get the number for the character(s) for the CHAR function I used the CODE function.

2- The middle:

After I had a “clean” table I separated the strings with the formula =IF(ISNUMBER(SEARCH(B$2,$A3)),B$2,"")

3- The end…

Eventually I wanted to arrange the cells with the words in a specific order (my choosing) for all the cells in the column.

With the help of the TEXTJOIN function I “united” all words in the row in column H, arranged by the order I wanted.

So, again, thanks a lot ComplexPoint!

Saul · October 28, 2021, 12:24pm

Explaining to my wife how this works in XL I realized there is a "shorter/simpler version"

I had a column of cells with different combination of words in each (A). Some words had multiple spaces between them, some started with a space and some ended with an unneeded one. I wanted to get rid of those.

Additionally, I wanted to arrange the cells with the words in a specific order (my choosing) for all the cells in the column.

So, first I separated the strings with the formula =IF(ISNUMBER(SEARCH(B$2,$A3)),B$2,"") in cell B3, and then joined all the words in the row, arranged in the order I decided with the help of the TEXTJOIN function (=TEXTJOIN(" ",TRUE,B3:G3) in cell H3.

The whole process:

Arrange words in string in a specific order

Options