I picked one at random. Notice that in your 3 examples only one adds any meaningful information in the description -- that pictures will be downloaded from a camera. The really useful info is in the specification which, for KM, is much more completely documented on each action's Wiki page.
No. An image on the screen is not an object -- it's an ordered collection of pixels, each having particular characteristics. The action works best when comparing a screenshot of some or all of a screen to the current screen.
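To make the "ordered collection of pixels" point concrete, here's a toy sketch in Python. To be clear, this is not KM's actual matching algorithm (I haven't seen it), and `find_image`, `screen` and the sample data are made up for illustration. It just shows why an image cut straight from the screen is the easiest thing to find on the screen:

```python
def find_image(screen, needle):
    """Return (row, col) of the first exact match of needle in screen, or None.

    Both arguments are lists of rows, each row a list of pixel values.
    Hypothetical helper for illustration only, not a KM API.
    """
    screen_h, screen_w = len(screen), len(screen[0])
    needle_h, needle_w = len(needle), len(needle[0])
    # Slide the needle over every position where it fits entirely on the screen.
    for r in range(screen_h - needle_h + 1):
        for c in range(screen_w - needle_w + 1):
            # Compare row by row; every pixel must be identical.
            if all(screen[r + dr][c:c + needle_w] == needle[dr]
                   for dr in range(needle_h)):
                return (r, c)
    return None


screen = [
    [0, 0, 0, 0],
    [0, 1, 2, 0],
    [0, 3, 4, 0],
]
screenshot_crop = [[1, 2], [3, 4]]  # cut straight from the "screen"
altered_copy = [[1, 2], [3, 5]]     # same picture, one pixel different

print(find_image(screen, screenshot_crop))  # (1, 1)
print(find_image(screen, altered_copy))     # None
```

The crop taken from the screen matches pixel for pixel; the copy that differs by a single pixel doesn't. A real matcher will be more forgiving than this, but the principle is the same: you're comparing pixels, not looking for an "object".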
What is a "screen"?
Ah -- now, perhaps, we've found out what your problem is.
The "screen" is what (probably -- other output methods are available) you are looking at this post on. It's that thing you take a "screenshot" of, the thing you really shouldn't use a "screen cleaner" on (they can ruin the coating), that you might use a "screen reader" to interpret... You might, incorrectly, call it the "display" or "monitor" or "magic moving picture".
That is not only colloquially correct -- since KM runs on Apple devices it is also technically correct. Now that you know what a screen is, perhaps you'll understand what the action does.
What does it do by default?
Here's the default action, as freshly added (it autofills a recently-used variable, so yours will be different in that respect):
So, by default, it will look in "all screens" for an image that you paste into the image well. That's something anyone who knows what a screen is can understand without having to go and look up "viewport" -- probably 99.999% of users, plus you now. And now that you know what a screen is, you'll also understand "main screen" and possibly "screen with index". The "Window" variations should be self-explanatory, as should "area".
The number of users who won't understand the above terms but will understand "viewport" is vanishingly small -- and has probably vanished altogether now you know what a screen is.
What image types are acceptable for this action? No information on that at all.
Yes, I'd like to see that as well. But that's me being a nerd -- if you can paste it into the image well, you can use it. If you can pick it with the file picker, you can use it. If the OS puts a bitmap version on the clipboard when you copy it, you can use it. If it's an area of the screen, you can use it. So while it would be nice to have an explicit note as to what formats can or can't be used, it doesn't really matter in day-to-day use.