How to Detect Invisible Unicode Characters in AppleScript?

A command in my AppleScript gets a string like this:

ERR-The book  Gen cannot be found.

As you can see, several characters in it are not displayed properly.
When I set the clipboard to the string and convert it into Unicode character values, I get this:

E\x00R\x00R\x00-\x00T\x00h\x00e\x00 \x00b\x00o\x00o\x00k\x00 \x00\x1c G\x00e\x00n\x00\x1d  \x00c\x00a\x00n\x00n\x00o\x00t\x00 \x00b\x00e\x00 \x00f\x00o\x00u\x00n\x00d\x00.\x00

In other words, the string contains many \x00 (NUL) characters.
When I paste it into https://www.regextester.com, I can search for them.

Searching for `\u0000` also finds them.

However, when I use

if myStr contains "\\u0000" then 

or

if myStr contains "\\x00" then

Neither works: the test detects neither "\u0000" nor "\x00".

The issue with Keyboard Maestro is this: if I pass the result to a KM variable, the string is cut off and I get only the first letter, E. If I set the clipboard to the string, I get the entire string, including the "\x00" characters.

The result in Script Debugger:
(On the left, all results show only "E").

Someone pointed out to me that this happens because UTF-16 is being parsed as UTF-8, but I don't know how to fix it.

You're looking for a literal string there – not a meta-character for unicode or hex.
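To illustrate the point, here is a minimal sketch (the variable names are made up for this example). AppleScript does not expand \u or \x escapes inside string literals, so "\\u0000" is just a backslash followed by five ordinary characters, and `contains` looks for exactly that text.

```applescript
-- In AppleScript source, "\\" is an escaped backslash, so this literal
-- is the six characters \ u 0 0 0 0, not a single NUL character.
set theLiteral to "\\u0000"

set literalLength to length of theLiteral -- 6
set literalCodePoints to id of theLiteral -- {92, 117, 48, 48, 48, 48}

return {literalLength, literalCodePoints}
```

None of those code points is 0, which is why the test never matches the NUL characters in the string.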

Ask on the Script Debugger forum.

-Chris

Thanks, Chris. I got the answer from elsewhere and thought that was the solution.

I've posted there.
This has driven me nuts.

I have solved it by using a handler (from developer.apple.com):

on decodeCharacterHexString(theCharacters)
    copy theCharacters to {theIdentifyingCharacter, theMultiplierCharacter, theRemainderCharacter}
    set theHexList to "123456789ABCDEF"
    if theMultiplierCharacter is in "ABCDEF" then
        set theMultiplierAmount to offset of theMultiplierCharacter in theHexList
    else
        set theMultiplierAmount to theMultiplierCharacter as integer
    end if
    if theRemainderCharacter is in "ABCDEF" then
        set theRemainderAmount to offset of theRemainderCharacter in theHexList
    else
        set theRemainderAmount to theRemainderCharacter as integer
    end if
    set theASCIINumber to (theMultiplierAmount * 16) + theRemainderAmount
    return (ASCII character theASCIINumber)
end decodeCharacterHexString

I can then call it:

if myStr contains (decodeCharacterHexString("%00")) then

Hey @martin,

ASCII character is a bit anachronistic these days.

See: Character ID.

Here's something simple:

Sending mail - #2 by kai - AppleScript | Mac OS X - MacScripter

But you should really ask on the Script Debugger forum to get up-to-date information. AppleScript has come a long way since the handler you found was written.

-Chris

Hi Chris,

Thank you so much for pointing me to Character ID!

I was very ignorant of pretty much everything.

This works great:

if myStr contains character id 0 then

Edit: I should change (ascii character 0) to character id 0.
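For anyone who also needs to pass the string on to a Keyboard Maestro variable, here is a rough sketch of one way to remove the NUL characters once they have been detected. The handler name stripNulls and the sample string are made up for illustration, and I am assuming that text item delimiters handle character id 0 like any other character. Note that this is lossy: it only cleans up the plain-ASCII part, so anything else in the string (such as the curly quotes around Gen) would still be mangled. Decoding the data as UTF-16 at the source would be the proper fix.

```applescript
-- Hypothetical helper: returns theText with every NUL (U+0000) removed.
on stripNulls(theText)
    set savedDelimiters to AppleScript's text item delimiters
    -- Split the text wherever a NUL occurs...
    set AppleScript's text item delimiters to character id 0
    set theParts to text items of theText
    -- ...then join the pieces back together with nothing in between.
    set AppleScript's text item delimiters to ""
    set cleanedText to theParts as text
    -- Restore the delimiters so other code is not affected.
    set AppleScript's text item delimiters to savedDelimiters
    return cleanedText
end stripNulls

-- Sample input built by hand: "ERR" with a NUL after each letter.
set nul to character id 0
set myStr to "E" & nul & "R" & nul & "R" & nul

if myStr contains nul then
    set myStr to stripNulls(myStr)
end if
return myStr
```

Saving and restoring the text item delimiters keeps the handler from changing how the rest of the script splits and joins text.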