How to Reverse Characters in a Variable

Tom · November 8, 2018, 3:02pm

If you only have single-byte characters in your input (for example [a-z][A-Z][0-9]) you can also get away with a very nice and short Perl solution:

[Screenshot: KM Shell Script action with input from a variable]

#!/usr/bin/env perl

use strict;
use warnings;

$_ = reverse <>;
print;

Or as a command-line one-liner:

perl -pe '$_ = reverse'


[Skip the rest of this post if you are not interested in reversing multibyte UTF-8 characters.]

The pitfall (or: “what I have learned today”) is that this fails to varying degrees if the string contains multibyte UTF-8 characters like ä, é, ợ, đ etc.

For simpler cases like Käse, résumé etc. it seems to be sufficient to explicitly set the stdin mode to “utf8” with

binmode STDIN, ':utf8';

With this modification, the Perl script from above correctly produces esäK and émusér.
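To see what the decoding step changes, here is a minimal sketch that does the decoding by hand with the core Encode module instead of binmode. The helper name reverse_code_points is made up for illustration, and the input is assumed to be valid UTF-8 bytes.

```perl
#!/usr/bin/env perl

use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: takes raw UTF-8 bytes and returns raw UTF-8 bytes
# with the code points (not the bytes) in reverse order.
sub reverse_code_points {
    my ($bytes) = @_;
    my $chars = decode('UTF-8', $bytes);            # bytes -> Perl characters
    return encode('UTF-8', scalar reverse($chars)); # reverse, then back to bytes
}

# "Käse" as UTF-8 bytes: ä is the two-byte sequence C3 A4.
my $bytes = "K\xC3\xA4se";

print reverse_code_points($bytes), "\n";
```

Without the decode step, reverse works on the individual bytes and swaps the two bytes of ä, which is no longer valid UTF-8. That seems to be where the garbage output comes from.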

But with more “sophisticated” strings like Rượu đế this is not enough.

So far the only solutions I have found that work with any kind of string are these:

1) — Splitting the string with the \X special (“Match Unicode eXtended grapheme cluster”):

#!/usr/bin/env perl

use strict;
use warnings;

binmode STDIN, ':utf8';

$_ = join('', reverse <> =~ /\X/g);
print;

From perldoc.perl.org
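A small self-contained sketch of the difference between reversing code points and reversing grapheme clusters. It assumes a decomposed résumé (e followed by the combining acute accent U+0301), written with escapes so that no input decoding is involved. The helper names are hypothetical, and the output is printed as code points to avoid any dependence on the terminal encoding.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# "résumé" with decomposed accents: e followed by U+0301 (combining acute).
my $word = "re\x{301}sume\x{301}";

# Reverse code point by code point (what a plain scalar reverse does).
sub reverse_by_code_point {
    my ($s) = @_;
    return scalar reverse($s);
}

# Reverse grapheme cluster by grapheme cluster, using \X.
sub reverse_graphemes {
    my ($s) = @_;
    return join '', reverse $s =~ /\X/g;
}

# Render a string as a list of U+XXXX code points.
sub show {
    my ($s) = @_;
    return join ' ', map { sprintf('U+%04X', ord($_)) } split //, $s;
}

print show(reverse_by_code_point($word)), "\n";
print show(reverse_graphemes($word)),     "\n";
```

In the first line of output each U+0301 ends up in front of its e, so the accent attaches to whatever precedes it. In the second line each e and its accent stay together as one unit.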

2) — Or using the Unicode::GCString module and reversing the result as list:

#!/usr/bin/env perl

use strict;
use warnings;
use Unicode::GCString;

binmode STDIN, ':utf8';

my $str = Unicode::GCString->new(<>);
$_ = join '', reverse @{ $str->as_arrayref };
print;

From Stack Overflow

As a test case I have used the nice Vietnamese word Rượu đế, found in a related discussion on perlmonks.org.

The correctly reversed string is ếđ uợưR.

Depending on the technique used, I’ve seen two different wrong results:

ÌÌeÄ u£ÌÌoÌuR when using the multibyte incompatible script at the top of this post, and
́̂eđ ựơuR when only setting the binmode to utf8.

If you want to try the scripts above:

All the scripts are written to be used in a KM Shell Script action with input from a variable (stdin), as shown in the screenshot at the top.
You should also set the output to a (non-local) variable, because the KM Shell Script Results window doesn’t show all characters correctly. (At least on my computer.)

Edit/PS:

If you want to test a script with the Vietnamese test string, you have to copy it from the following line (do not copy any of the inline instances in the text!):

Rượu đế

Although the string looks different here, this will maintain the original composition of the characters. Once you paste the string into a Unicode-compliant editor (with a compliant font) on your computer it will again look like “Rượu đế”, but with the correct composition.

Alternatively you can copy the string from inside this macro.
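To check which composition a string actually has after copying, a sketch like the following can list its code points. It uses the core Unicode::Normalize module. The helper code_points is hypothetical, and using NFD here is only an assumption about what a decomposed form looks like, not necessarily the exact form of the test string above.

```perl
#!/usr/bin/env perl

use strict;
use warnings;
use Unicode::Normalize qw(NFD);

# Hypothetical helper: list the code points of a string as U+XXXX.
sub code_points {
    my ($chars) = @_;
    return join ' ', map { sprintf('U+%04X', ord($_)) } split //, $chars;
}

my $composed   = "\x{111}\x{1EBF}";  # "đế" with a precomposed ế
my $decomposed = NFD($composed);     # ế becomes e + U+0302 + U+0301

print code_points($composed),   "\n";
print code_points($decomposed), "\n";
```

If the second form is what arrives in your variable, only the grapheme-aware scripts will keep the accents on the right letters.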
