Here's what GPT-4 has to say about it. I'm not near my Mac at the moment so I can't try it yet, but have a look...
In Perl, handling emojis can be tricky due to their multi-byte nature in UTF-8 encoding. The script you have does not explicitly handle variable-width characters like emojis, which may lead to alignment issues.
Here’s why: Perl's length function counts characters, but until the input has been decoded from UTF-8, Perl treats every byte as a separate character, so length effectively returns the number of bytes. Even on decoded text it counts code points, not grapheme clusters or display columns, which is what you'd be interested in for proper display width when dealing with emojis or other wide characters.
Since Perl v5.6, Unicode support has been available, but you need to explicitly tell Perl to treat the input as Unicode. To properly handle emojis, you would need to ensure that your script:
- Reads the input as UTF-8.
- Counts characters in terms of Unicode grapheme clusters rather than bytes.
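To make the difference between those counts concrete, here is a minimal sketch using only core Perl. The sample string is made up for illustration: an "e" followed by a combining acute accent, then a thumbs-up emoji, written as raw UTF-8 bytes the way they would arrive from an environment variable.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Raw UTF-8 bytes (illustrative sample, not from the original script):
# "e" + COMBINING ACUTE ACCENT (U+0301), then THUMBS UP SIGN (U+1F44D)
my $bytes = "e\xCC\x81\xF0\x9F\x91\x8D";

# Undecoded: Perl sees one "character" per byte
my $byte_count = length($bytes);

# Decoded: length now counts Unicode code points
my $chars      = decode('UTF-8', $bytes);
my $char_count = length($chars);

# \X matches one extended grapheme cluster, so count the matches
my $cluster_count = () = $chars =~ /\X/g;

print "bytes=$byte_count chars=$char_count clusters=$cluster_count\n";
# prints "bytes=7 chars=3 clusters=2"
```

Note that none of these three numbers is the display width. That depends on how wide the terminal or font draws each cluster, which is what Unicode::GCString's columns() estimates.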
Here's how you can modify the script to better handle Unicode characters such as emojis:
#!/usr/bin/perl
use strict;
use warnings;
use utf8; # Indicate that this script is written in UTF-8
use Encode qw(decode encode);
use Unicode::GCString;

# Get the value of the Keyboard Maestro variable "Local__Input"
my $input = decode('UTF-8', $ENV{'KMVAR_Local__Input'} // '');
# Get the value of the Keyboard Maestro variable "Local__Spaces"
my $spaces = $ENV{'KMVAR_Local__Spaces'} // 2;
# Get the value of the Keyboard Maestro variable "Local__Delimiter"
my $delimiter = decode('UTF-8', $ENV{'KMVAR_Local__Delimiter'} // ','); # default to a comma if no delimiter is provided

# Split input into lines
my @lines = split(/\n/, $input);

# Split each line into columns; \Q...\E treats the delimiter as literal text
# (so "|" or "." work), and -1 keeps trailing empty fields
my @rows = map { [ split(/\Q$delimiter\E/, $_, -1) ] } @lines;

# Calculate the display width of the widest cell in each column
my @max_lengths;
foreach my $row (@rows) {
    for my $i (0 .. $#$row) {
        my $width = Unicode::GCString->new($row->[$i])->columns();
        $max_lengths[$i] = $width if !defined $max_lengths[$i] || $width > $max_lengths[$i];
    }
}

# Calculate the total width of each column
my @widths = map { $_ + $spaces } @max_lengths;

# Print the aligned columns. printf's %-Ns pads by character count, not by
# display columns, so pad each cell manually from its measured width.
foreach my $row (@rows) {
    my $out = '';
    for my $i (0 .. $#$row) {
        my $pad = $widths[$i] - Unicode::GCString->new($row->[$i])->columns();
        $out .= $row->[$i] . (' ' x $pad);
    }
    # Encode back to UTF-8 bytes to avoid "Wide character" warnings
    print encode('UTF-8', $out), "\n";
}
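One detail worth checking when you try this: printf and sprintf pad to a number of characters, not display columns, so a cell containing a wide character can still come out too wide. Below is a small sketch of the difference. pad_cell is a hypothetical helper (not part of any module), and the width of 2 for the emoji is an assumption about how the terminal draws it; in the real script that number would come from Unicode::GCString's columns().

```perl
use strict;
use warnings;

# Hypothetical helper: pad $text with spaces up to $target display columns,
# given its already-measured display width $width.
sub pad_cell {
    my ($text, $width, $target) = @_;
    my $pad = $target - $width;
    $pad = 0 if $pad < 0;          # never pad by a negative amount
    return $text . (' ' x $pad);
}

my $emoji = "\x{1F44D}";           # one character, assumed to be two columns wide

my $via_sprintf = sprintf('%-4s', $emoji);   # pads to 4 characters: emoji + 3 spaces
my $via_helper  = pad_cell($emoji, 2, 4);    # pads to 4 columns:    emoji + 2 spaces

printf "sprintf added %d spaces, pad_cell added %d spaces\n",
    length($via_sprintf) - 1, length($via_helper) - 1;
# prints "sprintf added 3 spaces, pad_cell added 2 spaces"
```

With sprintf the cell ends up occupying five columns instead of four, which is enough to push everything after it out of alignment.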
Here's what's added/changed:
- use utf8; tells Perl that the script's own source code is written in UTF-8. It affects literals in the source, not the input data.
- use Encode qw(decode encode); allows you to explicitly decode and encode input/output as UTF-8.
- use Unicode::GCString; loads the Unicode::GCString module, which calculates the display width of a string, taking into account Unicode grapheme clusters.
To run this script, you would need the Unicode::GCString module installed, which can be done via CPAN:
cpan Unicode::GCString
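If you're not sure whether the module is already there, a quick check along these lines should tell you. It's only a sketch: it tries to load the module at run time and reports the result instead of dying.

```perl
use strict;
use warnings;

# Try to load the module at run time; eval traps the failure if it is missing
my $have_gcstring = eval { require Unicode::GCString; 1 } ? 1 : 0;

print $have_gcstring
    ? "Unicode::GCString is available\n"
    : "Unicode::GCString is missing; install it with: cpan Unicode::GCString\n";
```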
With these changes, the script will be better suited to handle emojis and other complex Unicode characters when aligning text in columns.