KM Execute Shell Script Action with Diacritic Variable Names / Values

Nige_S · November 9, 2022, 8:19pm

So the difference is probably the shell -- use the same as you use in the Terminal and see if that helps.

If that posted shell script is complete then you didn't use a shebang line. IIRC when there's no shebang the default user shell is used which, in your case, is probably zsh. So try adding

#!/bin/zsh

...as the first line of your "Execute Shell Script" action (remembering that you may also have to set environment variables).

magobaol · November 13, 2022, 6:03pm

Thanks Nige_S, but unfortunately adding the shebang doesn't change the result.

Nige_S · November 13, 2022, 7:53pm

Yep, I see what you mean -- there appears to be a difference in the encoding of the à when passed by variable or typed into the KM "Execute Shell Script" action. If you run the following:

String with accent test copy.kmmacros (3.3 KB)


...then the displayed output looks the same:

[Screenshot: displayedAccents]

...but if you open the ~/Desktop/compOutput.txt file in BBEdit and "Zap Gremlins..." replacing with "HTML Entities" you get:

Produttivita&#768;
Produttivit&#224;

The first ends in "a" followed by U+0300 "Combining Grave Accent", which modifies the preceding character (the "a"), while the second ends in the single character U+00E0 "Latin Small Letter A With Grave".
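To make the difference visible without BBEdit, here is a minimal Python sketch that rebuilds the two strings from the entity values above and lists their trailing code points. The variable names are just for illustration, and Python itself is an assumption; it isn't part of the original macros.

```python
import unicodedata

# The two strings, written with explicit escapes so the difference is unambiguous.
decomposed = "Produttivita\u0300"   # "a" followed by U+0300 (&#768;)
precomposed = "Produttivit\u00e0"   # single character U+00E0 (&#224;)

for label, text in (("decomposed", decomposed), ("precomposed", precomposed)):
    # Show the length and the names of the last two code points of each string.
    tail = [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text[-2:]]
    print(label, len(text), tail)

# They render identically, but a plain comparison sees different code points.
print(decomposed == precomposed)  # False
```

The lengths differ by one (13 code points versus 12), which is why a straight string comparison in the script fails even though both display as "Produttività".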

Rather than mess around any further I suggest you avoid the problem entirely by passing in the "fixed" string in your shell script as a KM variable as well -- that way, your diacritics will all be treated the same:

String with accent test pass both.kmmacros (3.6 KB)


@peternlewis -- am I on the right track, or spouting my usual nonsense?

magobaol · November 13, 2022, 9:07pm

Thanks for the idea, but I can't do that. The fixed string I need to compare the parameter against is inside a database that is also used by other programs.
I guess I could try to add a "KM_value field" to the table, but definitely not my favourite option.
I'll think about it.

Thanks anyway!

Francesco

Nige_S · November 13, 2022, 10:19pm

How were you going to reference it from your shell script (if, indeed, you were)? Perhaps you can pull it into a KM variable then push it back out in the script action:

String with accent test from shell.kmmacros (4.1 KB)


If you can't pull it from the DB then you might be able to build a "translation table" and convert between character entities. Difficult to suggest more without knowing more details of your proposed implementation. Plus I'm already w-a-y outside my Unicode abilities!

peternlewis · November 14, 2022, 2:03am

There are many characters in Unicode that can be encoded in more than one way while representing essentially the same character (and that is ignoring case discrepancies).

In this case there is the precomposed à character (U+00E0), and the sequence of a plain "a" followed by the combining grave accent (U+0300).

If you want to ensure that the strings match despite these differences, you need to either precompose or decompose both strings before comparing them, or use a comparison that ignores the precomposed/decomposed distinction.
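As a rough illustration of the "precompose both strings first" approach, here is a small Python sketch using the standard library's `unicodedata.normalize`. The helper name `same_text` is hypothetical, and using Python (rather than something inside the shell script itself) is an assumption about what is available on the machine.

```python
import unicodedata

def same_text(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after normalizing both to the same Unicode form.

    "NFC" precomposes (a + U+0300 becomes U+00E0); "NFD" decomposes.
    Either works as long as both sides use the same form.
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

decomposed = "Produttivita\u0300"   # "a" + combining grave accent
precomposed = "Produttivit\u00e0"   # single precomposed character

print(decomposed == precomposed)                  # False
print(same_text(decomposed, precomposed))         # True
print(same_text(decomposed, precomposed, "NFD"))  # True
```

This leaves the value stored in the database untouched; only the copies used for the comparison are normalized.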
