Prompt with List and German umlauts

snapitup · July 2, 2025, 11:36am

Hello everyone,

I have a specific German problem with German umlauts: I very often use the "Prompt with List" action to select an entry from a predefined list. The list is read from a file, for example.

Now the list also contains entries with German umlauts "ä", "ö" and "ü". Unfortunately, the "Prompt with List" action cannot find these entries if you enter the umlaut as a search criterion. The entry can be seen in the list if it is unfiltered, but you cannot filter for entries with umlauts.

I have already searched the forum but couldn't find anything.

Am I doing something wrong or does anyone know a solution? Many thanks in advance!

Michael.

hemicyon · July 2, 2025, 12:32pm

Hm. Prompt with List gives results from umlaut letters for me. Off the top of my head, I wonder if your list source is a file that is maybe using another code format for the characters? Would it be possible for you to upload your macro or a sample of the file perhaps? Is it just the one macro/file having the issue?

ComplexPoint · July 2, 2025, 1:27pm

Show is always better than tell.

( hard to be sure of what you're doing without seeing the macro itself )

Nige_S · July 2, 2025, 3:03pm

That's my guess too.

A demo of the difference between ümlaut-the-single-character and ümlaut-the-combining-diaeresis, showing the behaviour OP describes:

Umlaut-ish demo.kmmacros (2.5 KB)

snapitup · July 2, 2025, 4:11pm

OK, thank you very much for now. I don't know how to recognize which code format a ".md" file has. Here is my setting:

The list entries are loaded from "Listing.md", I don't know how to upload this Markdown file here; .md files cannot be selected.

In any case, the file has the following content:

/Users/xyz/Documents/111-25 ABC - XYZ/Journal 111-25.xlsx__111-25 ABC - XYZ
/Users/xyz/Documents/112-25 DEF - UVW/Journal 112-25.xlsx__112-25 DEF - UVW
/Users/xyz/Documents/113-25 GHI - ÖMK/Journal 113-25.xlsx__113-25 GHI - ÖMK
/Users/xyz/Documents/111-24 ABC - XYZ/Journal 111-24.xlsx__111-24 ABC - XYZ
/Users/xyz/Documents/114-24 LLM - ÜVR/Journal 114-24.xlsx__114-24 LLM - ÜVR
/Users/xyz/Documents/112-24 DEF - UVW/Journal 112-24.xlsx__112-24 DEF - UVW

The macro is very simple:

Trigger Listing.kmmacros (2,9 KB)

As soon as I call up the macro, the following prompt appears:

If I enter "Ü", the entry is found:

However, if I enter "Ö", the entry is not found.

Now when uploading I see that in the lines above in the code area the "Ö" is in red, but the "Ü" is in black... Is this perhaps due to the different coding?

I get the lines of the Markdown file using the following macro:

Pfad der markierten Datei in Zwischenablage.kmmacros (2,6 KB)

I manually copied the text after "__" in each line from the path in front of it, to keep the prompt lines short. It is therefore possible that the incorrect encoding in the path was copied into the label as well.

So is it possible that the macro for determining the path of a file writes a path to the clipboard in the case of an umlaut, which codes the umlauts in such a way that they are no longer recognized as umlauts in the prompt?

Nige_S · July 2, 2025, 4:46pm

The Ö in that listing is indeed an O followed by a combining diaeresis, as shown by "Zap Gremlins..." in BBEdit:

O\0x0308

...which is why you won't find it with the usual single-character "O-with-an-umlaut", which is

\0xD6

It looks like something to do with how Finder returns the paths -- TBH I'm surprised the Ü was found, since that appears broken as well for me.
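To make the difference concrete, here is a small Python sketch (Python is only a convenient way to inspect code points here; it is not part of any macro in this thread). It compares the single-character Ö with the O-plus-combining-diaeresis form. The variable names are made up for illustration.

```python
import unicodedata

precomposed = "\u00D6"   # Ö as one code point
decomposed = "O\u0308"   # O followed by COMBINING DIAERESIS, as in the listing

# The two strings render the same but are different sequences of code points.
print(precomposed == decomposed)                  # False
print([f"U+{ord(c):04X}" for c in precomposed])   # ['U+00D6']
print([f"U+{ord(c):04X}" for c in decomposed])    # ['U+004F', 'U+0308']

# Canonical composition (NFC) turns the decomposed form into the single character.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True
```

A plain string comparison sees two different strings, which matches the filtering behaviour described above.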

What are your language and keyboard settings, and your macOS version?

hemicyon · July 2, 2025, 4:54pm

Not to add to the confusion or mystery, but I'm able to find both that ü and ö...

macOS 15.4.1
US keyboard layout using these keyboard inputs.

British English is the default input, and I always have to type umlaut characters with ⌥U then the appropriate vowel. Wish I had a German layout keyboard to test with.

snapitup · July 2, 2025, 5:23pm

macOS 15.5 and keyboard input source only:

I have the same problem in another macro. The input list is not loaded from a file, but created directly using a slightly modified AppleScript provided by JMichaelTX in 2021 called “Build KM Prompt List of File Name and Path”:

It has the following content:

property ptyScriptName : "Build KM Prompt List of File Name and Path"
property ptyScriptVer : "1.1" -- CHG to exclude invisible files
property ptyScriptDate : "2021-02-18"
property ptyScriptAuthor : "JMichaelTX"

--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
--»	Get Data from KM
--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

set kmInst to system attribute "KMINSTANCE"
tell application "Keyboard Maestro Engine" to set sourceFolderPath to getvariable "DND_CF__TargetFolderPath" instance kmInst

--set sourceFolderPath to "/Volumes/LaCie 2Big 3TB/VIDEO/Projects - Video/PRJ - Trump In His Own Words/Source Media"

set folderSelected to POSIX file sourceFolderPath as alias

tell application "System Events"
	
	--- Get Lists of File Path and File Names of Selected Folder ---
	set filePathList to (POSIX path of folders of folderSelected)
	set fileNameList to (name of folders of folderSelected)
	
	--- Merge File Path and Name into List for KM Prompt with List ---
	set filePromptList to {}
	repeat with iFile from 1 to count of filePathList
		set fileName to item iFile of fileNameList
		if (fileName does not start with ".") then
			set end of filePromptList to item iFile of filePathList & ¬
				"__" & fileName
		end if
	end repeat
	
end tell -- System Events

set text item delimiters to linefeed

return filePromptList as text

Here, too, the entries are delivered to me with incorrect umlauts.

So the question is: How can I prevent this? Or is there a way to check whether incorrect umlauts are present in the prompt before output, possibly using a filter or a regex...?

Nige_S · July 3, 2025, 9:55am

The simple answer is: Only use simple ASCII characters in your file names -- a-zA-Z0-9-_. Which is not a good answer...

Handling Unicode characters is a pain. Your Mac does what it can to help, especially when moving text via the Clipboard, but there are a lot of variables at play so you'll have to test each situation for yourself.
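One way to test a given list is to look for combining characters directly. The following is a minimal Python sketch, not a KM action; the function name and the sample text are hypothetical. It flags every line that contains at least one combining mark.

```python
import unicodedata

def lines_with_combining_marks(text: str) -> list[str]:
    """Return the lines of `text` that contain at least one combining character."""
    return [
        line
        for line in text.splitlines()
        # unicodedata.combining() is non-zero for combining marks such as U+0308
        if any(unicodedata.combining(ch) for ch in line)
    ]

# Sample labels: the second uses O + combining diaeresis, the third a single-character Ü.
sample = "112-25 DEF - UVW\n113-25 GHI - O\u0308MK\n114-24 LLM - \u00DCVR"
flagged = lines_with_combining_marks(sample)
print(len(flagged))  # 1
```

Only the line with the combining diaeresis is flagged; the line with the single-character Ü passes.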

Since we're talking about Finder paths and KM...

These selected files:

...processed with this action:

...then pasted into BBEdit and processed with "Zap Gremlins... Replace with Code" give this result:

/Users/nigel/Desktop/Umlaut Tests/A\0x0308-1.txt
/Users/nigel/Desktop/Umlaut Tests/a\0x0308.txt
/Users/nigel/Desktop/Umlaut Tests/O\0x0308-1.txt
/Users/nigel/Desktop/Umlaut Tests/o\0x0308.txt
/Users/nigel/Desktop/Umlaut Tests/U\0x0308-1.txt
/Users/nigel/Desktop/Umlaut Tests/u\0x0308.txt

So it looks like Finder always uses combining diaeresis. We want to keep those in our paths, so they remain valid, but change them in the Prompt's displayed text so we can filter using keystrokes (which are single code points, not combined characters).
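A rough sketch of that idea in Python rather than KM actions, assuming each list line has the form path__label and that the path itself does not contain "__". It uses `unicodedata.normalize` with the NFC form to compose the label and leaves the path untouched. The function name is made up for illustration.

```python
import unicodedata

SEPARATOR = "__"  # separator between path and label, as in the listing earlier in the thread

def normalise_label(line: str) -> str:
    """Leave the path part unchanged and compose only the label after the separator."""
    path, sep, label = line.partition(SEPARATOR)
    return path + sep + unicodedata.normalize("NFC", label)

line = "/Users/xyz/Documents/113-25 GHI - O\u0308MK/Journal 113-25.xlsx__113-25 GHI - O\u0308MK"
fixed = normalise_label(line)
path, _, label = fixed.partition(SEPARATOR)

print("O\u0308" in path)   # True: the path keeps the combining form
print("\u00D6" in label)   # True: the label now has the single-character Ö
```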

The simple method is a brute-force Search and Replace for each in turn, using what we know already (above), a quick bit of zapping of typed characters in BBEdit:

Ä    \0xC4
ä    \0xE4
Ö    \0xD6
ö    \0xF6
Ü    \0xDC
ü    \0xFC

...and the fact that eg \0xC4 is the token %C4% in KM.
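For readers who want to try the same brute-force replacement outside KM, here is a hedged Python sketch. The table mirrors the six pairs above; the names `TABLE` and `compose_umlauts` are hypothetical, and this is an illustration of the idea, not a copy of the macro.

```python
# Tab-separated table: base letter, then the single character it becomes
# when followed by U+0308 (COMBINING DIAERESIS).
TABLE = "A\t\u00C4\na\t\u00E4\nO\t\u00D6\no\t\u00F6\nU\t\u00DC\nu\t\u00FC"

def compose_umlauts(text: str) -> str:
    """Replace each 'letter + combining diaeresis' pair with its single-character form."""
    for row in TABLE.splitlines():
        base, composed = row.split("\t")
        text = text.replace(base + "\u0308", composed)
    return text

print(compose_umlauts("113-25 GHI - O\u0308MK") == "113-25 GHI - \u00D6MK")  # True
```

Only the six German umlauts are covered; any other combining sequence is left as it is.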

Putting that all together gives us something like:

Filter by Umlauted Characters.kmmacros (4.8 KB)


I'm no Unicode guru, so I'm not going to claim this is the best way of doing things! But it uses simple KM actions, so should be easily adjustable to do whatever you need -- if anything in it isn't clear, just ask

snapitup · July 3, 2025, 11:10am

Nige_S:

The simple method is a brute-force Search and Replace for each in turn, using what we know already (above), a quick bit of zapping of typed characters in BBEdit:
Ä    \0xC4
ä    \0xE4
Ö    \0xD6
ö    \0xF6
Ü    \0xDC
ü    \0xFC
...and the fact that eg \0xC4 is the token %C4% in KM.

Putting that all together gives us something like:

Wow, thank you! I have tested the script and it works. I will adapt it later and make sure that it also works with my specific use cases. If I still have questions, I'll get back to you.

Until then, however, one thing in your macro is not yet clear to me: the Search and Replace action on the Local_display variable searches for a regex term that contains a tab ("\t"), and the replacement contains the tab as well. When I look at Local_display, there are no tabs before the umlauts. Do you have an explanation, just so I understand it better? I don't have BBEdit and therefore can't test it. Many thanks!

Nige_S · July 3, 2025, 12:18pm

Yep -- it isn't part of the Search or Replace terms, it's part of the text token within them

If you look carefully at the % symbols you'll see that the search is two tokens, a Variable and a Unicode token:

%Variable%Local_term[1]\t%%0308%
<-------Variable---------><-Uni>

KM has a neat trick where it can treat any string as an array, using the character(s) you provide as the element separators.

%Variable%Local_term[1]\t%

...can be read as

"Take the text stored in the variable Local_term and, using \t as the separator, return the first element."

Local_term holds a line from our tab-delimited Local_snrTerms table, so the first time through the loop it will contain

A<tab>%C4%

...so the search field token

%Variable%Local_term[1]\t%%0308%

...will evaluate to

A%0308%

...while the replace field token takes the second element (the [2]):

%Variable%Local_term[2]\t%

...and evaluates to

%C4%

So this action

...can be explained as:

"For each line in our snrTerms table, search for the letter in column 1 followed by a combining diaeresis and replace it with the Unicode token in column 2."
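The token evaluation can be imitated in a few lines of Python. This is only a rough stand-in for how the tokens above resolve: `km_element` is a hypothetical helper, and KM's handling of other index values is not modelled.

```python
def km_element(text: str, index: int, delimiter: str) -> str:
    """Rough stand-in for %Variable%name[index]delimiter%: 1-based element of a delimited string."""
    return text.split(delimiter)[index - 1]

local_term = "A\t%C4%"  # first row of the tab-delimited table

# Search field: element 1 followed by the literal Unicode token text.
search = km_element(local_term, 1, "\t") + "%0308%"
# Replace field: element 2.
replace = km_element(local_term, 2, "\t")

print(search)   # A%0308%
print(replace)  # %C4%
```

The tab appears only as the delimiter argument, never in the resulting search or replace text.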

We could have 6 individual, hard-coded, "Search and Replace" actions:

etc...

But the tab-delimited "table" is a convenient way to store our search-and-replace terms, and the "For Each" plus using variables in the search and replace fields lets us process all our terms in just two actions.

These "pseudo arrays" can be really useful when processing text -- we're also using them in the "Append Variable" action:

%Variable%Local_paths[Local_i]\n%__%Variable%Local_display[Local_i]\n%%LineFeed%

...where our separator is \n, so that can be read as "Line i of Local_paths, two underscores, line i of Local_display, and a linefeed".
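Read that way, the "Append Variable" step behaves roughly like the loop below. This is a Python sketch with hypothetical sample data, with variable names borrowed from the macro for readability.

```python
# Hypothetical sample data: paths keep the combining form, labels are already composed.
local_paths = "/tmp/A\u0308-1.txt\n/tmp/o\u0308.txt"
local_display = "\u00C4-1.txt\n\u00F6.txt"

paths = local_paths.split("\n")
labels = local_display.split("\n")

prompt_list = ""
for i in range(1, len(paths) + 1):  # counting from 1, like Local_i in the token above
    # "Line i of paths, two underscores, line i of display, and a linefeed"
    prompt_list += paths[i - 1] + "__" + labels[i - 1] + "\n"

print(prompt_list, end="")
```

Each output line keeps the original path before "__" and carries the filterable label after it.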

See the "How to Use Custom Array Delimiter" section of "Using Variables" in the manual for more.

snapitup · July 3, 2025, 2:33pm

OMG, yes, now that I've gone through the % characters again, it's clear that the \t is not part of a regex but belongs to the variable token. What I learned today is that there are "custom array delimiters" that are used in pseudo arrays. Once again, a huge thank you for the great explanation!

And just to make sure I've understood it correctly:

By default the comma is the delimiter. So you could also have defined:

... and then you could have left out "\t" in the Search & Replace action, right?

Yes, I tested it, it works. So either by chance or I've got it

Thank you, thank you, thank you and have a great day!

Nige_S · July 3, 2025, 3:45pm

Yep, you've got it.

To me, the tab-delimiting makes the text more readable in this case -- plus it's how the text came out of BBEdit.

Which reminds me -- BBEdit has a free trial, and at the end of 30 days it reverts to "lite" mode. That remains free to use and has all the features most people need -- it's what I used for this experiment.

You can download BBEdit here: Bare Bones Software | Download BBEdit 15

snapitup · July 3, 2025, 7:52pm

Indeed, the text is much easier to read with tabs! Great that I got it right!

Thanks for the tip with BBEdit! Then I'll have a closer look at the file system listings produced by my Mac
