Use Case
- Use to extract just the Domain Name from a valid URL
- This is a very tough problem because of the wide variety of URL formats
- It has been discussed in the KM Forum here:
- There are solutions offered in both of those, and in other places on the Internet
- Having studied and tested all that, I think I may have a highly reliable solution.
- But, I need YOUR HELP to test and make sure, and to adjust as needed.
- So please download and test with your most unusual URLs -- try to break the macro!
@peternlewis, since you're a Regex guru, and we've had a number of discussions about URLs, maybe you could review/test this if you have time.
MACRO: Extract Domain from URL [Example]
-~~~ VER: 1.0 2020-07-12 ~~~
Requires: KM 8.2.4+ macOS 10.11 (El Capitan)+
(Macro was written & tested using KM 9.0+ on macOS 10.14.5 (Mojave))
DOWNLOAD Macro File:
Extract Domain from URL [Example].kmmacros
Note: This Macro was uploaded in a DISABLED state. You must enable before it can be triggered.
Example Output
Method
In the large majority of cases this Regex will return the correct Domain Name in the Local__Domain variable.
(?i)\/\/(?:([^\/]+?)\.)?(([^\/]+)\.([A-Z]{2,}))
- https://regex101.com/r/SlSFOy/4
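For anyone who wants to poke at the pattern outside of KM, here is a minimal Python sketch of the same RegEx. It is only an illustration, not part of the macro: the named groups (`server`, `domain`, `sld`, `tld`) are my stand-ins for the Local__ variables, the helper name `split_url` is hypothetical, and the example.com URL is just a placeholder test case.

```python
import re

# Same pattern as above; the only change is that the four capture groups
# are given names that mirror the KM variables (an illustrative choice).
DOMAIN_RE = re.compile(
    r"(?i)//(?:(?P<server>[^/]+?)\.)?"
    r"(?P<domain>(?P<sld>[^/]+)\.(?P<tld>[A-Z]{2,}))"
)


def split_url(url):
    """Return the server/domain/sld/tld captures as a dict, or None if no match."""
    match = DOMAIN_RE.search(url)
    return match.groupdict() if match else None


if __name__ == "__main__":
    for test_url in ("https://www.example.com/path", "https://controldesign.co.uk"):
        print(test_url, "->", split_url(test_url))
```

When no Server Name is present and the optional group is skipped, `server` comes back as `None`.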
The only issue with the above RegEx arises when a two-character country-code TLD (Top Level Domain) is paired with a short second-level label, as in "co.uk".
- When this occurs, the RegEx will miss the main part of the SLD (Second Level Domain) IF no actual Server Name (like "www") is provided.
- For example: https://controldesign.co.uk
The RegEx returns:
* Local__Server: "controldesign" -- which is incorrect
* Local__Domain: "co.uk" -- which is incorrect
* Local__SLD: "co"
* Local__TLD: "uk"
- In this case, the Regex thinks "controldesign" is the Server name, which is incorrect.
- When that happens, the Local__SLD value will be ≤ 3 chars because it contains only part of the country-code domain (here, "co").
- So, the IF/THEN works like this:
IF ((Length(TLD) = 2) AND (Length(SLD) ≤ 3))
THEN prepend the Local__Server to Local__Domain
Note: The KM function for Length is "CHARACTERS()"
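The IF/THEN correction can be sketched in Python like this. Again, this is only an illustration of the logic described above, not the macro itself: `extract_domain` is a hypothetical helper name, Python's `len()` stands in for KM's CHARACTERS(), and the check that a Server Name was actually captured before prepending is my own assumption about how an empty Server should be handled.

```python
import re

# The RegEx from the Method section, unchanged.
# Groups: 1 = Server, 2 = Domain, 3 = SLD, 4 = TLD
DOMAIN_RE = re.compile(r"(?i)//(?:([^/]+?)\.)?(([^/]+)\.([A-Z]{2,}))")


def extract_domain(url):
    """Return the Domain Name found in url, or None if the RegEx does not match."""
    match = DOMAIN_RE.search(url)
    if match is None:
        return None
    server, domain, sld, tld = match.groups()

    # Correction step: a 2-char TLD with an SLD of 3 chars or fewer suggests
    # the real domain name was captured as the Server, so prepend it.
    # Assumption: skip the prepend when no Server was captured at all.
    if len(tld) == 2 and len(sld) <= 3 and server:
        domain = f"{server}.{domain}"
    return domain


if __name__ == "__main__":
    for test_url in (
        "https://controldesign.co.uk",
        "https://www.controldesign.co.uk/page",
        "https://www.example.com/path",
    ):
        print(test_url, "->", extract_domain(test_url))
```

Because the rule keys only on lengths, a URL with a Server Name in front of a genuinely short domain (three characters or fewer) on a two-letter TLD would also trigger it, so URLs of that shape are worth adding to your test cases.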
ReleaseNotes
Author: @JMichaelTX
PURPOSE:
- Provide a Method to Extract the Domain Name from a URL
NOTICE: This macro/script is just an Example
- It is provided only for educational purposes, and may not be suitable for any specific purpose.
- It has had very limited testing.
- You need to test further before using in a production environment.
- It does not have extensive error checking/handling.
- It may not be complete. It is provided as an example to show you one approach to solving a problem.
HOW TO USE
- Run as is to see the results of the test URLs that have been provided.
- ADD All of Your URL Test Cases to the Below Action in Magenta color.
REQUIRES:
- KM 8.0.2+
- But it can be written in KM 7.3.1+
- It is KM8 specific just because some of the Actions have changed to make things simpler, but equivalent Actions are available in KM 7.3.1.
- macOS 10.11.6 (El Capitan)
- KM 8 Requires Yosemite or later, so this macro will probably run on Yosemite, but I make no guarantees.
MACRO SETUP
- Carefully review the Release Notes and the Macro Actions
- Make sure you understand what the Macro will do.
- You are responsible for running the Macro, not me.
- Assign a Trigger to this macro.
- Move this macro to a Macro Group that is only Active when you need this Macro.
- ENABLE this Macro.
REVIEW/CHANGE THE FOLLOWING MACRO ACTIONS:
- ALL Actions that are shown in the magenta color
- ADD your URL Test Cases to this Action:
- ADD All of Your URL Test Cases to the Below
USE AT YOUR OWN RISK
- While I have given this limited testing, and to the best of my knowledge it will do no harm, I cannot guarantee it.
- If you have any doubts or questions:
- Ask first
- Turn on the KM Debugger from the KM Status Menu, and step through the macro, making sure you understand what it is doing with each Action.