I finally decided to give this a go, and it turns out that page element detection or OCR is not needed at all. All you need to do is blindly click the "Skip Ads" button, whether or not it is there. It is even faster, because the button can be clicked even when it is not visible...
YouTube ad skipper.kmmacros (6.3 KB)
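
For anyone who does not want to open the macro, here is a rough sketch of the "click it blindly" idea in TypeScript. This is not the contents of the attached macro, only an illustration under assumptions: the function name `clickSkipIfPresent`, the `QueryRoot` and `Clickable` types, and both CSS selectors are hypothetical, and YouTube's real class names may differ or change.

```typescript
// Minimal structural types so the sketch does not depend on a real browser DOM.
interface Clickable {
  click(): void;
}

interface QueryRoot {
  querySelector(selector: string): Clickable | null;
}

// ASSUMPTION: these selectors are illustrative guesses, not taken from the macro.
const SKIP_SELECTORS: string[] = [".ytp-ad-skip-button", ".ytp-skip-ad-button"];

// Try each selector and click the first match, with no check for visibility.
// Returns true if something was clicked, false if no button was found.
function clickSkipIfPresent(
  root: QueryRoot,
  selectors: string[] = SKIP_SELECTORS
): boolean {
  for (const selector of selectors) {
    const button = root.querySelector(selector);
    if (button !== null) {
      // No visibility test on purpose: an element that exists can be clicked
      // from script even while it is still hidden.
      button.click();
      return true;
    }
  }
  // Nothing matched, so do nothing. That is the "no matter if it is there" part.
  return false;
}
```

In a browser, a thin wrapper around `document.querySelector` that casts the result to `HTMLElement` could be passed as `root`, and the function could be run on a short timer. Since a call that finds no button does nothing, running it repeatedly should be harmless.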