Macro Choices: Sort List of FileNames

Continuing the discussion from Using RegEx to sort text in a variable?:

The above topic has generated much discussion, and several excellent solutions (macros/scripts) to properly sort a list of file names, particularly if you want to sort only on the file name without the extension.

Solutions Offered in Original Topic

(click on link to view post where macro is offered)

@Tom pointed out in his 2nd solution the need to handle file names that contain periods, which his solution does.

I have to say that @Tom’s 2nd solution, using Perl, is very powerful while also being compact. It used only 11 lines of code! :+1:
His code is also very readable and maintainable.

I was hoping that a JXA (JavaScript) solution would also be simple and compact, but unfortunately I find both JXA solutions hard to read.

So, just as a matter of interest for myself, and anyone else that might like a JXA solution, I have developed a JXA script that is fairly compact (only 26 lines of code), and, IMO, very readable (others may disagree).


<img src="/uploads/default/original/2X/9/9fa034801953dfe7d1745afcb7087c7530a06e23.gif" width="70" height="17"> 2017-09-18 19:28 CT

### My JXA Script/Macro to Sort List of File Names
Revised the script so that it does a 2nd Level sort on file ext

* Handles Periods in the file name
* Sorts FIRST on the file name, THEN on the extension
* Makes use of a RegEx pattern in the JavaScript `split()` function to split the full file name into an array with [FileName, Ext], for each row, but also handles files without an extension.
* Uses @Tom's list of files plus these:
  * Water 01.jpg
  * clip 03.wav
  * File With No Ext
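
To make the last-period split concrete, here is a minimal sketch (not the macro itself). The sample names come from this topic; the variable names `lastDot`, `samples`, and `parts` are just for illustration.

```javascript
// Split each name at its LAST period only. The lookahead (?=[^.]*$)
// matches a period that has no other period after it.
var lastDot = /\.(?=[^.]*$)/;

var samples = ['Water 01.jpg', 'photo.2017.09.18.png', 'File With No Ext'];

var parts = samples.map(function (name) {
    return name.split(lastDot);
});

// A name without any period comes back as a one-element array.
console.log(JSON.stringify(parts));
// → [["Water 01","jpg"],["photo.2017.09.18","png"],["File With No Ext"]]
```

One thing to watch: a name that starts with a period, such as `.bashrc`, splits into an empty root plus `bashrc`.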

This script uses ES5 JavaScript, compatible with macOS 10.11.6+ (El Capitan+).  No Babel required.  :wink:

**As always, please feel free to ask any questions, and make any suggestions to improve my macro/script.**

---

### Example Output
Using Revised List of Files

<img src="/uploads/default/original/2X/f/f251989403dd5e056e464d1a6d4e74c28d264408.png" width="359" height="348">

Note how, now, "Water 01.jpg" sorts before "Water 01.wav".

---



### MACRO:&nbsp;&nbsp;&nbsp;@Sort List of FileNames using JXA [Example]

~~~ VER: 2.0&nbsp;&nbsp;&nbsp;&nbsp;2017-09-18 ~~~

#### DOWNLOAD:
<a class="attachment" href="/uploads/default/original/2X/1/10a5cfb22d570a95699794b20a654091805ed65a.kmmacros">@Sort List of FileNames using JXA [Example].kmmacros</a> (7.2 KB)
**Note: This Macro was uploaded in a DISABLED state. You must enable before it can be triggered.**

---


<img src="/uploads/default/original/2X/e/ef59a9b3fd53f6596ce422906f7dad9b6e0b4121.png" width="565" height="1181">

JXA Script (ES5, El Capitan+ Compatible)

'use strict';
(function run() {      // this will auto-run when script is executed

/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
PURPOSE:  Sort List of FileNames on FileName Then Extension
VER:      2.0    2017-09-18
AUTHOR:    @JMichaelTX
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/

var fileNamesStr = Application("Keyboard Maestro Engine").getvariable('SFN__FileList');

var scriptResults = "TBD"

//--- Convert String List to JS Array ---
var fileList = fileNamesStr.split(/\r\n|\r|\n/);   // treat CRLF as ONE line break

//--- Convert Each Row (FileName) into Array of (FileName), (Ext) ---
//     so that we can sort based ONLY on root file name

//--- Make use of RegEx Pattern in the JS split() function to split on last period ---
var table = fileList.map(function(lineStr) {
        return lineStr.split(/\.(?=[^.]*$)/);
    })
    
//--- Sort File List on Root FileName [0], Then on File Extension [1] ---
var sortedTable = table.sort( sortMultiCol([0,1]) );

//--- Re-Join Multi-Column Array into String for Return ---
scriptResults = sortedTable.map(function(e) { return e.join('.');}).join('\n');
//console.log(scriptResults);

return scriptResults;

//~~~~~~~~~~~~~~~~ END OF MAIN SCRIPT ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

//~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
function sortMultiCol(pColsToSortArr) {
/*      Ver 2.0    2017-09-18
---------------------------------------------------------------------------------
  PURPOSE:  Sort Array on One or More Columns, In Specified Order
            Numbers will be sorted as numbers.
            Numbers in quotes will be sorted as strings.
            
  USAGE:    someArray.sort( sortMultiCol([3,1]) );  // [3,1] is just an example.
            
  PARAMETERS:
    • pColsToSortArr    | Array of Integers  |  Column Numbers to Sort
    
  RETURNS:  Comparison Function for use with Array.sort()
  
  AUTHOR:    JMichaelTX
  REF:
    1.  robbmj, 2015-03-10, https://stackoverflow.com/a/28969807/915019 
    
  TODO:  Add option for sort direction.
---------------------------------------------------------------------------------
*/    
  var resCompare, iCol;

  return function(a, b) {
  
        //--- Loop Through All Requested Cols Until Comparison ≠ 0 ---
        for (var idx = 0; idx < pColsToSortArr.length; idx++) {
            iCol = pColsToSortArr[idx];
            
            //--- A Missing Column (e.g. File with No Ext) Compares as '' ---
            var aVal = (a[iCol] === undefined) ? '' : a[iCol];
            var bVal = (b[iCol] === undefined) ? '' : b[iCol];
        
            if (typeof aVal === 'number' && typeof bVal === 'number') {
                //--- Compare as Numbers ---
                resCompare = aVal - bVal;
                
              //--- Compare as Strings (localeCompare() has other options) ---
            } else {resCompare = String(aVal).localeCompare(String(bVal));}   // locale-aware
            
            //--- Keep Looping Until Comparison ≠ 0, or All Cols have been Compared ---
            if (resCompare !== 0) { break;  } // for
        }  // END for
        return resCompare;
    };
} //~~~~~~~~~~~~~~~ END OF function sortMultiCol ~~~~~~~~~~~~~~~~~~~~~~~~~~~
})();  // auto-run function when script is executed.
//~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ END OF FILE ~~~~~~~~~~~~~~~~~~~~~~~~~~

That wasn’t my understanding. In fact, if you add “Water 06.mpg” to the bottom of your list, it will stay there – which is not the expected behavior of a macro that sorts file names. You should handle extensions, too.

I think the point of delimiting the extension was just to distinguish the final period from preceding periods in the filename (which are legal). That was the task the unix sort version of the macro failed (although it will sort that .mpg properly).

Perhaps I misunderstood the requirements, but my macro behaves the same as @Tom's #2 solution. I just ran @Tom's macro, and here are the results:

You said "Water 06.mpg", but I assumed you meant ".jpg".

I believe that both my macro and @Tom's macro sort the file names the same way the Finder does: ignoring the file extension.

The Finder on my machine sorts by base filename and where the base filenames are identical (as with ‘Water 06’) sorts within them by extension. Which is what I would expect of a sort.

The sample data kept expanding as we discussed this so there wasn’t any formal specification (not even filenames for that matter). Which is why I thought a prompt for the regex might be useful, accommodating a different approach where necessary.

(I did mean to use the “mpg” extension because “m” would sort before “w” in “.wav” but “jpg” works the same way. The point was to have identical base filenames with different extensions.)

I stand corrected. My Finder sorts the same way.
It is easy enough to add a 2nd level sort to my JXA script. I'll do that when I have a few minutes.

Example Sort by Finder

This is my Perl script from the other thread with a secondary sort level:

[example] Sort Lines (Perl ST with Secondary Sort).kmmacros (2.1 KB)

The script:

perl -s -e 'print join( "\n", map { $_->[0] } sort { lc($a->[1]) cmp lc($b->[1]) || lc($a->[0]) cmp lc($b->[0]) } map { [$_, /(.+)\./] } split("\n", $i) )' -- -i="$KMVAR_tmp"

or more readable:

perl -s -e '
  print 
    join( 
      "\n", map { $_->[0] } 
      sort { 
        lc($a->[1]) cmp lc($b->[1]) || 
        lc($a->[0]) cmp lc($b->[0]) 
      } 
      map { [$_, /(.+)\./] } 
      split("\n", $i) 
    )
' -- -i="$KMVAR_tmp"



By the way, @JMichaelTX, do you remember that "nice" topic about getting the root domain out of domain names like "files.google.co.uk" or "www.cs.tut.fi"?

Eventually we'll run into the same problem here:

Sorting is relatively easy if all file names have exactly one extension:

photo.2017.09.18.png  [ext.: .png]
photo.original.tiff   [ext.: .tiff]
photo.jpg             [ext.: .jpg]

You get the "semantically" relevant part, i.e. the file name root, simply by going always for the last dot.

But, similar to multi-part TLDs (co.uk, etc.), we also have file names with multiple extensions:

photo.2017.09.18.png      [ext.: .png]
photo.original.tiff.zip   [ext.: .tiff.zip]
photo.original.gray.tiff  [ext.: .tiff]
photo.tar.xz              [ext.: .tar.xz]

So I guess, without knowledge of valid file name extensions there is no way to solve this, not via regex, not via splitting.
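
To illustrate what "knowledge of valid file name extensions" could look like in practice, here is a rough sketch. The `COMPOUND_EXTS` list and the `splitNameExt` function are hypothetical, made up for this example; you would have to maintain such a list yourself, and anything not on it falls back to the last-period split.

```javascript
// HYPOTHETICAL list: only the multi-part extensions you choose to recognise.
var COMPOUND_EXTS = ['tar.xz', 'tar.gz', 'tiff.zip'];

// Returns [root, ext]; ext is '' when the name has no extension.
function splitNameExt(fileName) {
    var lower = fileName.toLowerCase();

    // 1. Try the known multi-part extensions first.
    for (var i = 0; i < COMPOUND_EXTS.length; i++) {
        var suffix = '.' + COMPOUND_EXTS[i];
        var cut = fileName.length - suffix.length;
        if (cut > 0 && lower.slice(cut) === suffix) {
            return [fileName.slice(0, cut), fileName.slice(cut + 1)];
        }
    }

    // 2. Otherwise split on the last period only.
    var parts = fileName.split(/\.(?=[^.]*$)/);
    return [parts[0], parts.length > 1 ? parts[1] : ''];
}

console.log(JSON.stringify(splitNameExt('photo.tar.xz')));
// → ["photo","tar.xz"]
console.log(JSON.stringify(splitNameExt('photo.original.gray.tiff')));
// → ["photo.original.gray","tiff"]
```

This does not solve the general problem, it only moves the decision into a list that someone has to keep up to date.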

Well done, @Tom! :+1:

I have just updated my OP with a revised macro/script that ADDs the 2nd level sort on file extension.

My thanks to @mrpasini for pointing out the need for the 2nd level sort on file extension. He was absolutely correct!

But, equally important (to me) is that it pushed me into doing some R&D to develop a new JavaScript Multi-Column Array sort function. My research showed quite a few solutions for this, but all of them were hard-coded for a max number of columns.

I’m sure someone else must have previously developed what I just did, but I couldn’t find it. For those interested in JavaScript (and JXA of course), here is my sort function.
As part of this I also discovered the JavaScript `localeCompare()` method, which is very cool. Among other things, by default it compares strings in a way that, for sorting purposes, is effectively case insensitive.
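
For anyone curious about the difference, here is a small sketch comparing a plain `<` / `>` sort with `localeCompare()`. The sample names are made up for illustration. Note that the `{ sensitivity: 'base' }` options argument is an assumption about the runtime: it needs a JavaScript engine with Intl support, and an older JXA environment may simply ignore it (the script in this topic calls `localeCompare()` without options).

```javascript
var names = ['water 02', 'Clip 01', 'clip 03', 'Water 01'];

// Plain comparison by character code: uppercase letters sort before lowercase.
var byCodePoint = names.slice().sort(function (a, b) {
    return a < b ? -1 : (a > b ? 1 : 0);
});

// localeCompare() asked explicitly to ignore case (and accents).
var byLocale = names.slice().sort(function (a, b) {
    return a.localeCompare(b, undefined, { sensitivity: 'base' });
});

console.log(byCodePoint.join(', '));
// → Clip 01, Water 01, clip 03, water 02
console.log(byLocale.join(', '));
```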

As always, everyone please feel free to ask questions and post bugs/issues/improvements that you may find.

JavaScript Multi-Col Sort Function

//~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
function sortMultiCol(pColsToSortArr) {
/*      Ver 2.0    2017-09-18
---------------------------------------------------------------------------------
  PURPOSE:  Sort Array on One or More Columns, In Specified Order
            Numbers will be sorted as numbers.
            Numbers in quotes will be sorted as strings.
            
  USAGE:    someArray.sort( sortMultiCol([3,1]) );  // [3,1] is just an example.
            
  PARAMETERS:
    • pColsToSortArr    | Array of Integers  |  Column Numbers to Sort
    
  RETURNS:  Comparison Function for use with Array.sort()
  
  AUTHOR:    JMichaelTX
  REF:
    1.  robbmj, 2015-03-10, https://stackoverflow.com/a/28969807/915019 
    
  TODO:  Add option for sort direction.
---------------------------------------------------------------------------------
*/    
  var resCompare, iCol;

  return function(a, b) {
  
        //--- Loop Through All Requested Cols Until Comparison ≠ 0 ---
        for (var idx = 0; idx < pColsToSortArr.length; idx++) {
            iCol = pColsToSortArr[idx];
            
            //--- A Missing Column (e.g. File with No Ext) Compares as '' ---
            var aVal = (a[iCol] === undefined) ? '' : a[iCol];
            var bVal = (b[iCol] === undefined) ? '' : b[iCol];
        
            if (typeof aVal === 'number' && typeof bVal === 'number') {
                //--- Compare as Numbers ---
                resCompare = aVal - bVal;
                
              //--- Compare as Strings (localeCompare() has other options) ---
            } else {resCompare = String(aVal).localeCompare(String(bVal));}   // locale-aware
            
            //--- Keep Looping Until Comparison ≠ 0, or All Cols have been Compared ---
            if (resCompare !== 0) { break;  } // for
        }  // END for
        return resCompare;
    };
} //~~~~~~~~~~~~~~~ END OF function sortMultiCol ~~~~~~~~~~~~~~~~~~~~~~~~~~~
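
Here is a short usage sketch for a multi-column comparator of this kind. To keep it runnable on its own, it uses a condensed stand-in named `makeColumnComparator` (a hypothetical name, not the function above), and the third column of numbers in `rows` is invented purely to show the numeric branch.

```javascript
// Condensed, hypothetical stand-in for a multi-column comparator factory.
function makeColumnComparator(cols) {
    return function (a, b) {
        var res = 0;
        // Stop at the first column whose comparison is not 0.
        for (var i = 0; i < cols.length && res === 0; i++) {
            var x = a[cols[i]];
            var y = b[cols[i]];
            if (typeof x === 'number' && typeof y === 'number') {
                res = x - y;                       // numbers compare as numbers
            } else {
                // A missing column (e.g. no extension) compares as ''.
                res = String(x === undefined ? '' : x)
                    .localeCompare(String(y === undefined ? '' : y));
            }
        }
        return res;
    };
}

// Each row is [root, ext, madeUpNumber]; some rows have no extension.
var rows = [
    ['Water 01', 'wav', 3],
    ['clip 03',  'wav', 1],
    ['Water 01', 'jpg', 2],
    ['Water 01'],
    ['File With No Ext']
];

// Sort on root, then on extension.
var sorted = rows.slice().sort(makeColumnComparator([0, 1]));

// Sort the first three rows on the numeric column only.
var byNumber = rows.slice(0, 3).sort(makeColumnComparator([2]));

console.log(JSON.stringify(sorted));
console.log(JSON.stringify(byNumber));
```

Within the three "Water 01" rows, the one without an extension comes first, then `jpg`, then `wav`.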
