Help building a macro to compress Finder selection with external script + show progress bar

Seems I am not clear enough.

Here is the folder. I need to select the folder and it has to compress it by working on every included file (or by chunks) in parallel. I also mean the files in the Media folder.

We are still testing! And iterating. And that means working on this a bit at a time, not leaping in with your total workflow.

The macro at present works on the items selected in Finder. So select the contents of LR fts to edit 0915, not the folder itself. The way you are running it now you are only spawning a single instance of zip, with no parallelisation -- that will be obvious if you run Activity Monitor, filtering with zip, when you'll only see one process listed.
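To make the "one zip per selected item" idea concrete outside Keyboard Maestro, here is a minimal Python sketch. It is not the macro itself: `zip_item` and `zip_selection` are hypothetical names, and threads stand in for the separate `zip` processes the macro spawns.

```python
import zipfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def zip_item(item: Path, out_dir: Path) -> Path:
    """Zip one file or folder (recursively) into out_dir/<name>.zip."""
    archive = out_dir / (item.name + ".zip")
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        if item.is_dir():
            # Walk the folder so nested files are included, like zip -r.
            for path in sorted(item.rglob("*")):
                if path.is_file():
                    zf.write(path, path.relative_to(item.parent).as_posix())
        else:
            zf.write(item, item.name)
    return archive


def zip_selection(items, out_dir, workers=4):
    """Start one zip job per selected item, at most `workers` at a time."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: zip_item(Path(p), out_dir), items))
```

Selecting the folder itself passes a single item, so only one job ever runs; selecting its contents passes many items, which is what lets the jobs overlap.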

It's much quicker and easier to test with a Finder selection as your input because you can select a representative subset of your data to work on. Once this bit is working as you want you can sort out the progress feedback. The final step would be "now do this to the entire contents of a folder selected in Finder", because that's when any errors will be most painful...

@Rustam_Himadiiev -- here's something to ponder while you are testing...

For the best overall speed you want to maximise the number of CPU cores used for as long as you can. What does that mean for the order in which Finder items should be processed, and how might we achieve that ordering?

This decision feeds back to the "how do we get the file listing" step -- another reason for using Finder selection now while putting off the "proper" way to do it until later :wink:

I see! At the moment, it successfully creates the corresponding number of zip files and one master zip file :slight_smile:

Now turn on "Display Progress" for the "For Each" action:

Does that give you enough feedback or do you still want to use a "Custom HTML Prompt"? If the latter, what information do you want the Prompt to show?

Looks great! It's totally enough!

And if it's enough to just show "I'm doing something" for the zipmerge action you can set "Display Progress" to -1 to give a back-and-forth effect (remembering to change the shell action from "asynchronously" to "Ignore results"):

image

If you want a more descriptive progress you could instead "For Each" through all your intermediate zip files, adding them to the master one at a time, with "Display Progress" selected:

image
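For anyone reading without the screenshot, the "add them one at a time and report after each" idea looks roughly like this in Python. It is a sketch only: `merge_with_progress` and its `report` callback are hypothetical names, and unlike zipmerge this version decompresses and recompresses each entry rather than copying it across unchanged.

```python
import zipfile


def merge_with_progress(archives, master, report=print):
    """Copy each intermediate archive into `master`, reporting after each one."""
    total = len(archives)
    with zipfile.ZipFile(master, "w", zipfile.ZIP_DEFLATED) as out:
        for done, archive in enumerate(archives, start=1):
            with zipfile.ZipFile(archive) as src:
                for info in src.infolist():
                    # Reuse the entry's metadata (name, date, compression type).
                    out.writestr(info, src.read(info.filename))
            report(f"{done}/{total} merged: {archive}")
```

Because the loop knows how many archives there are and which one it is on, it can drive a real progress bar instead of the back-and-forth one.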

So, next step -- given any thought to this?

For the best overall speed you want to maximise the number of CPU cores used for as long as you can. What does that mean for the order in which Finder items should be processed, and how might we achieve that ordering?

We need to get the list of all the files from the selected folder(s) and process them in parallel, say 8 files at a time. Say, alphabetically. But I have no idea how that can be achieved. :slight_smile:

The "For Each" loop already takes care of that -- we just have to manage the order of the items in the list.

Consider the following:

...and assume we are only using 2 cores. Compare processing alphabetically to by size (largest first) and it'll go something like

      Alphabetically                      By Size
      Core 1      Core 2                  Core 1      Core 2
      a.txt       b.txt                   others      d.txt
      -           c.txt                   -           -
      d.txt       -                       -           -
      -           -                       -           -
      -           e.txt                   -           h.txt
      -           f.txt                   -           -
      g.txt       -                       -           -
      -           h.txt                   -           -
      -           -                       -           c.txt
      others      -                       -           -
      -           -                       -           -
      -           idle                    -           g.txt
      -           idle                    -           -
      -           idle                    -           -
      -           idle                    -           b.txt
      -           idle                    -           -
      -           idle                    -           f.txt
      -           idle                    a.txt       e.txt
      -           idle
      -           idle
      -           idle
      -           idle
      -           idle
      -           idle
      -           idle
      -           idle
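The chart can be reproduced with a small simulation. This is a sketch under assumptions: the job lengths are my own reading of the chart (one row = one time unit), not measurements, and `makespan` is a hypothetical helper that hands each job, in list order, to whichever core falls free first.

```python
import heapq


def makespan(durations, cores=2):
    """Finish time when each job, in list order, goes to the next free core."""
    free_at = [0] * cores  # min-heap of the times at which each core is free
    heapq.heapify(free_at)
    for d in durations:
        start = heapq.heappop(free_at)      # earliest core to become free
        heapq.heappush(free_at, start + d)  # that core is busy until start + d
    return max(free_at)


# Assumed job lengths in arbitrary time units, read loosely off the chart.
jobs = {"a.txt": 1, "b.txt": 2, "c.txt": 3, "d.txt": 4, "e.txt": 1,
        "f.txt": 1, "g.txt": 3, "h.txt": 4, "others": 17}

alphabetical = [jobs[name] for name in sorted(jobs)]
largest_first = sorted(jobs.values(), reverse=True)

print(makespan(alphabetical))   # 26
print(makespan(largest_first))  # 18
```

The total work is 36 units, so 18 is the best two cores can do. Starting the biggest job first keeps it from being the last thing running while the other core sits idle.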

"For Each" does have a "by file size" sort option -- unfortunately, the file size of a directory is 0 so that doesn't help...

Is there a way to get a directory listing, sorted by size, that includes the contents of directories in the calculation?
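One possible answer, sketched in Python rather than as a Keyboard Maestro action: add up the sizes of everything beneath each item, then sort on that total. `total_size` and `largest_first` are hypothetical helper names, and the sizes are logical file sizes, not space used on disk.

```python
from pathlib import Path


def total_size(path: Path) -> int:
    """Size in bytes of a file, or of every file beneath a folder."""
    if path.is_file():
        return path.stat().st_size
    return sum(f.stat().st_size for f in path.rglob("*") if f.is_file())


def largest_first(folder):
    """Immediate children of `folder`, biggest total size first."""
    children = list(Path(folder).iterdir())
    return sorted(children, key=total_size, reverse=True)
```

With this ordering a folder counts for as much as its contents do, so a large "Media" folder would be started before the small files next to it.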

Does it look into subfolders too?

We can use 6 or 8

It doesn't need to -- it recursively zips (zip -r) every item, so includes the contents of any folders.

There is a problem, though. Because it only "chunks" 1 level down it will work best when you have a folder that has lots of items at the next level in -- a folder that contains twenty folders and files, for example. Your screenshot implies you've only a couple of items -- the "Media" folder will be processed on one core, the "LR fts..." file on another, the rest of your cores will remain idle.

That is not what we want! :frowning: So back to the drawing board...

Another approach, and the one I think your shell script takes, is to recursively list all the items inside the source folder and divide that into as many lists as you have CPU cores to use. You then chuck each sub-list at a core to zip the items in that list and, when everything's finished, merge the zips into one "master" zip file.
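The "list everything, then divide into one list per core" step might look like the sketch below. Balancing the lists by file size is my assumption, following on from the largest-first discussion; the original shell script may split differently. `list_files` and `split_by_size` are hypothetical names.

```python
import heapq
from pathlib import Path


def list_files(folder):
    """Every regular file beneath `folder`, however deeply nested."""
    return [p for p in Path(folder).rglob("*") if p.is_file()]


def split_by_size(sized_files, parts):
    """Split (name, size) pairs into `parts` lists of roughly equal total size.

    Greedy: take files largest first, always adding to the lightest list so far.
    """
    heap = [(0, i) for i in range(parts)]  # (bytes assigned so far, list index)
    lists = [[] for _ in range(parts)]
    ordered = sorted(sized_files, key=lambda pair: pair[1], reverse=True)
    for name, size in ordered:
        total, i = heapq.heappop(heap)
        lists[i].append(name)
        heapq.heappush(heap, (total + size, i))
    return lists
```

A call such as `split_by_size([(p, p.stat().st_size) for p in list_files(folder)], 8)` would give eight lists, each of which can then be handed to its own zip job.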

More fun with the "For Each" action and, again, executing the zip "Execute a shell script" action asynchronously lets us parallelise:

Parallel By-File Archiver Demo.kmmacros (10.3 KB)

Image

This time you select the folder whose contents you want to archive then run the macro -- the zip files still go to /tmp while we're testing.

A downside is that, because we're zipping lots of files in single shell actions, we lose some of the progress reporting -- back to the back-and-forth bar to at least show something is happening...

If this is close to what you want then we can look at working it into your "two window" solution -- if it's still not right, shout!

Yes, that's the problem with that approach.

I like it! It's totally fine in general and I like how it shows the progress. Currently it doesn't merge though. It shows the Merging Zip Files... but doesn't do it.

UPDATE. I get it. Wrong path.