Filter Action (Line Count) behaviour is odd

When I use the "Filter Variable with Line Count" action, something odd happens. The filter always returns a value equal to the number of carriage returns in the variable, unless the variable contains zero carriage returns. I would expect an empty variable to return a line count of zero, but instead it returns a line count of one, as I think you can see from this image (i.e., notice that the condition evaluates as TRUE, meaning "Data" contains no CRs, yet the Filter action still returns a line count of "1").


Notice how the empty variable "Data" above returns a line count of "1". But if you add a single carriage return to the same variable, it still returns a line count of one. It took me an hour to troubleshoot this behavior, as I never imagined an empty variable could possibly have the same line count as a variable with one line.

This is difficult for me as I'm trying to count how many lines come back from a shell command, and in my way of thinking, "no lines of text should constitute a different line count than one line of text". But instead this action returns the same value "1" whether the command returns 0 or 1 line of output. Is this behavior actually desirable?

As a secondary proof that the value should return 0, I observe that the following shell command returns a value of 0, not 1:

ls | grep ZZZZZ | wc -l

If I'm wrong on anything here, I apologize in advance. I have made mistakes before.

Perhaps it's just a matter of convention, to which one can adapt?

In AppleScript terms, for example:

set a to length of {} --> 0
set b to length of {""} --> 1

Split functions applied to an empty string often return a list containing one empty string. In JavaScript terms, for example:

// Empty string -> list of paragraphs
''.split(/\r\n/) //-->  [""]

and

''.split(/\r\n/).length //--> 1

or equivalently:

[""].length  //--> 1

I think you are saying (if I understand you) that the AppleScript methodology supports my point of view while the JavaScript methodology (which I don't follow) does not. Sure. And I showed that the UNIX methodology also supports my point of view. My impression was that Keyboard Maestro used UNIX as some of its underlying implementation, so I had expected it to follow that methodology. I can certainly adapt, but I think the UNIX methodology is more appropriate for KM. Obviously Peter has the final say. And he has to worry about breaking existing products.

Absolutely – both approaches are perfectly rational. We would probably all agree that

"a"

is one line.

After that, there is just a definitional choice as to whether we are going to call the following 1 line or zero lines:

""

My personal feeling is that neither is particularly preferable, and both are perfectly understandable. Perhaps consistency and predictability are all we really need?


We understand each other. We see both alternatives. You know what, I didn't actually test what UNIX would return for "wc -l" on a file with a single character and no carriage return. I'm guessing it would return a 1. I still find it unnatural for KM to return a 1 for all three cases: "", "a", and "a\n". Do any other languages line up with KM on all three cases?

Do any other languages line up ...

I think it may turn on the data structure used to represent strings – there is always a fundamental choice to be made between:

  • Two tier: container + contained, e.g. a string is a (possibly empty) list of characters.
  • Single tier: (no container) a string is a string and can only be long or short.

In the first case, the notion of 'zero' lines is natural (an empty container).
In the second case, there is a built-in ambiguity – is a very short string no string at all?


Don't guess!  :wink:

With a file containing 1 line of “dddd” and no linefeed:

wc -l ~/Downloads/test.txt

Result:

0 /Users/myUser/Downloads/test.txt 

This is very disagreeable but quite Unixy.

By contrast awk will give a “correct” answer for “dddd” or “dddd\n”

awk 'END{print FNR}' ~/Downloads/test.txt

Result:

1

You have to understand the idiosyncrasies of your tools...
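To make the difference concrete, here is a rough JavaScript emulation of the two counts. The function names `wcLines` and `awkRecords` are made up for illustration, and the sketch assumes `\n` is the only line terminator, which is how both tools behave by default:

```javascript
// Emulates `wc -l`: counts newline characters, nothing else.
function wcLines(text) {
  return (text.match(/\n/g) || []).length;
}

// Emulates awk's record count (FNR in the END block) with the default
// newline separator: a final unterminated line still counts as a record.
function awkRecords(text) {
  if (text === '') return 0; // empty input: awk reads no records
  const newlines = (text.match(/\n/g) || []).length;
  return text.endsWith('\n') ? newlines : newlines + 1;
}

console.log(wcLines('dddd'), awkRecords('dddd'));     // 0 1
console.log(wcLines('dddd\n'), awkRecords('dddd\n')); // 1 1
console.log(wcLines(''), awkRecords(''));             // 0 0
```

Both tools agree that empty input is zero; they only disagree about the unterminated last line.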

There is NO WAY ON EARTH Keyboard Maestro should return 1 for an empty variable.

So your puzzlement is quite understandable.

@peternlewis -- This is surely a bug -- yes?

On the other hand.

“dddd”

And:

“dddd[EOL]”

Are quite understandably a single line -- even though in the latter case you see a cursor available for text-entry on line 2 -- line 2 does not yet exist.

-Chris

It returns the number of lines used, not the number of carriage returns/line feeds (EOL characters).

Zero or more characters without any EOLs takes one line.

Zero or more non-EOL characters followed and terminated by an EOL still takes one line.

Zero or more non-EOL characters followed by an EOL followed by one or more non-EOL characters takes two lines.

Zero or more non-EOL characters followed by an EOL followed by zero or more non-EOL characters followed and terminated by an EOL still takes two lines.

The current implementation is, I believe (from memory):

  • if the text does not end with an EOL, then add one
  • return the number of EOLs
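The two steps above can be sketched in JavaScript as follows. This is only an illustration of the rule as described, not Keyboard Maestro's actual code; the name `lineCountAsDescribed` is hypothetical, and treating `\r\n`, `\r` and `\n` each as a single EOL is an assumption:

```javascript
// Sketch of the rule as described in this post (not Keyboard Maestro's source).
// Assumption: an EOL is \r\n, \r, or \n.
function lineCountAsDescribed(text) {
  // Step 1: if the text does not end with an EOL, add one.
  const terminated = /[\r\n]$/.test(text) ? text : text + '\n';
  // Step 2: return the number of EOLs (always at least one after step 1).
  return terminated.match(/\r\n|\r|\n/g).length;
}

console.log(lineCountAsDescribed(''));       // 1
console.log(lineCountAsDescribed('a'));      // 1
console.log(lineCountAsDescribed('a\n'));    // 1
console.log(lineCountAsDescribed('a\nb'));   // 2
console.log(lineCountAsDescribed('a\nb\n')); // 2
```

Because step 1 always leaves at least one EOL in place, the result can never be zero, which is why the empty variable reports one line.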

An argument could be made for zero characters being zero lines, but that's not what it does, and since it's an ambiguous case, it's not going to change.

If the difference is important to you then you'll need to either special case it, or calculate it according to your own rules of what you think the answer should be in all the different cases.
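As one possible way to special-case it, here is a minimal JavaScript sketch that treats an empty string as zero lines and otherwise counts lines the same way. The function name `lineCountZeroForEmpty` is hypothetical, and the EOL set (`\r\n`, `\r`, `\n`) is an assumption; the same logic could run in an Execute JavaScript for Automation action or be redone with an If Then Else on an empty variable.

```javascript
// Hypothetical helper: an empty string counts as zero lines;
// anything else counts "lines used", so a trailing EOL adds no extra line.
function lineCountZeroForEmpty(text) {
  if (text === '') return 0; // the special case
  const parts = text.split(/\r\n|\r|\n/);
  // A trailing EOL leaves an empty final element; it terminates the
  // last line rather than starting a new one, so drop it.
  if (parts[parts.length - 1] === '') parts.pop();
  return parts.length;
}

console.log(lineCountZeroForEmpty(''));     // 0
console.log(lineCountZeroForEmpty('a'));    // 1
console.log(lineCountZeroForEmpty('a\n'));  // 1
console.log(lineCountZeroForEmpty('a\nb')); // 2
```

Note that a variable holding only a single EOL still counts as one (empty) line under this rule; whether that is right depends on your own definition.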

Alrighty. Thanks. I felt it was a point I had the right to inquire about. I think it was legitimately confusing.

It's perfectly valid to ask, and you're right, it is definitely ambiguous what the result should be.

Peter, could you please add all that to the KM Wiki?
Thanks.

I added a link to this topic and a comment that the definition of characters, words and lines is ambiguous at best.