How To Use the KM Percent Encode for URL Filter

JMichaelTX · September 3, 2019, 4:50pm

How To Use the KM Percent Encode a URL Filter

Do not apply the filter to the entire URL, since some "special" characters must not be encoded.
It is best to apply the filter only to the parts of the URL that you supply yourself, such as values from KM Variables.
Be aware that you must not encode the URL query characters ? = & when they are acting as query syntax (separating the query from the path, a name from its value, or one pair from the next).
For more info see: Percent-encoding (URL encoding) -- Wikipedia
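
As an illustration outside of KM, here is a minimal Python sketch of the same rule. It uses `urllib.parse.quote` as a stand-in for the KM filter (an assumption: the two do not encode exactly the same character set, since `quote` leaves letters, digits and `_.-~` alone). The host `www.example.com` and the variable `search_term` are hypothetical.

```python
from urllib.parse import quote

# Hypothetical value, standing in for the contents of a KM Variable.
search_term = "cats & dogs?"

# Right: encode only the part you are supplying, then assemble the URL.
# safe="" means "/" is encoded too, so nothing in the value stays special.
good_url = "https://www.example.com/search?q=" + quote(search_term, safe="")

# Wrong: encoding the finished URL also encodes ":", "/", "?" and "=",
# which were meant to keep their special meaning.
bad_url = quote("https://www.example.com/search?q=" + search_term, safe="")

print(good_url)  # https://www.example.com/search?q=cats%20%26%20dogs%3F
print(bad_url)   # https%3A%2F%2Fwww.example.com%2Fsearch%3Fq%3Dcats%20%26%20dogs%3F
```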

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Below is a discussion where @peternlewis clarifies the use of the KM filter Percent Encode a URL. Some of us had misunderstood how to use this filter.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Peter, that is very unexpected behavior. I'm pretty sure most people would expect this filter action to encode the entire URL. I certainly did.

The KM Wiki does NOT mention this restriction.

From:
KM Wiki Percent Encode Filter action

Percent Encode or Decode a URL.

As of Ver 9.0: Percent Encode for URL will encode all non-alphanumeric characters.

peternlewis · September 4, 2019, 4:28am

You can encode the dot (.) and slash (/), but in this case, they have meaning as themselves. If the folder path component had a slash (/) in it (as opposed to slash being used to separate components), then encoding it as %2F would probably be necessary.

Encoding strings, encoding URLs, encoding scripts, etc, is always an exceptionally complex issue. Some parts must be encoded, some parts must not.

The purpose of the filter is to remove the “special” nature of every character in the encoded component. But you cannot use that on the entire URL, since some characters actually do have a special meaning.

You need to know which characters have special meaning and which don't in order to know which parts need to be encoded.

This hasn't changed from the pre-9.0 variant - in all cases, you have to encode only the special characters that must not keep their special meaning.

In the URL above, the colon, slash and dot all have special meanings and cannot be encoded. The parts between the dots and slashes are plain text, and any special characters in them need to be encoded (and as I mentioned above, if one of those components contained a colon, slash or dot that was meant to be part of its name, then it probably would need to be encoded).
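
The point about a slash inside a component name can be sketched in Python. This is only an illustration under assumptions: `urllib.parse.quote` stands in for the KM filter, and the helper `build_path` and the folder names are hypothetical.

```python
from urllib.parse import quote, unquote

def build_path(components):
    """Join path components with "/", encoding each component on its own.

    Hypothetical helper: a "/" inside a component becomes %2F, while the
    "/" characters added by the join remain real separators.
    """
    return "/".join(quote(part, safe="") for part in components)

# The middle folder name contains a literal slash.
components = ["Reports", "2019/Q3", "summary.txt"]
path = build_path(components)
print(path)  # Reports/2019%2FQ3/summary.txt

# Splitting on the real separators and decoding recovers the original names.
recovered = [unquote(part) for part in path.split("/")]
```

Had the slash in `2019/Q3` been left unencoded, the path would split into four components instead of three.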

The same thing applies for http: URLs. The colons, slashes and ? have a meaning, as do potentially the = and & in the query. Other characters will generally need encoding.
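
For the query part, a rough Python sketch (again using `urllib.parse.quote` as an assumed stand-in for the KM filter, with a hypothetical host and hypothetical parameter names) encodes each name and value separately and only then adds the = and & that are meant to be special:

```python
from urllib.parse import quote, parse_qsl

# Hypothetical parameters; the first value contains "&", "=" and spaces
# that must NOT be read as query syntax.
params = {"title": "Q&A = fun", "page": "1"}

# Encode names and values individually, then join with the real "=" and "&".
query = "&".join(
    f"{quote(name, safe='')}={quote(value, safe='')}"
    for name, value in params.items()
)
url = "https://www.example.com/search?" + query
print(url)  # https://www.example.com/search?title=Q%26A%20%3D%20fun&page=1

# A parser sees exactly two pairs, with the original value intact.
pairs = parse_qsl(query)
```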

The filter is not magic, it just facilitates the act of encoding the characters. It doesn't know which bits might need to be encoded and which bits don't - and indeed you cannot know that after the URL is constructed. You must do that while constructing the URL. Otherwise you cannot tell which characters of the URL are intended to be special and which not.

For example, take a URL without any encoding applied to it, like this:

https://www.stairways.com/folder?query

There is no way to know whether that refers to the page folder with the query string query, or to a page named folder?query (which is a perfectly valid file name on many systems).
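
A short Python sketch shows how a parser resolves this ambiguity, and how encoding the component at construction time removes it. `urllib.parse` is used here purely as an illustration and is not how KM itself works.

```python
from urllib.parse import quote, unquote, urlsplit

# Unencoded: a parser has to assume "?" starts the query.
plain = urlsplit("https://www.stairways.com/folder?query")
print(plain.path, plain.query)  # /folder query

# If the page really is named "folder?query", encode that name
# while building the URL.
encoded_url = "https://www.stairways.com/" + quote("folder?query", safe="")
encoded = urlsplit(encoded_url)
print(encoded_url)            # https://www.stairways.com/folder%3Fquery
print(unquote(encoded.path))  # /folder?query
```

Once the URL string exists, the two intentions can only be told apart if the encoding was applied when the URL was constructed.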
