Sorry I forgot to mention that - I meant to, but it slipped my mind. @JMichaelTX would tell you that this is another advantage of using XPath, and he would be 100% correct. Food for thought...
As Peter mentioned, if you can't see it, neither can KM. Look at it like this: if you took a screenshot, could you see the image you're looking for? If not, then neither can KM. This includes the window being covered by another app.
Consider KM as your personal assistant. You may not be pressing the keys, but something is. Sometimes the action happens so fast, you can't see it. Other times, like looking for Menu items, it actually does happen in a way that's not visible. But for the most part, you see it.
If it bothers you enough to want to do something about it, here are some image-finding tips I've discovered over time that can speed things up:
- The smaller the area KM has to search, the quicker it can find the image. So if, for example, you have multiple monitors, searching a specific monitor, or even better, just the foreground application, will be quicker. Sometimes this doesn't work, though; it depends on the application.
- You could even restrict the area to be searched to a section of the screen, using x/y coordinates and width/height. On the one hand, this is guaranteed to break if you move things around. But if you always run the app full screen, it's less likely to break. And of course, if it does break, you can just change the area, or change back to searching the entire screen(s). (@JMichaelTX is rolling his eyes right now, but seriously, it's an easy fix.)
- This one's tricky, but depending on the image being searched for, it can make a difference. Consider this image:

Now compare it to this image (I've added a border to it to make it clear in this post, but pretend the image doesn't have a border):

There's a lot of white space around this image, and that white space is going to match a lot of other white space on the screen, so it will take a while to find the unique part. I don't know if KM looks from the top left down, or the bottom right up, or something else, but the point is the same: trim the image until the most distinctive portions are at the edges.
OK, I said this could make a difference, but the difference may not be noticeable. But I've had instances where I noticed the time difference, and even though it was in tenths of a second, perception is everything.
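
To put rough numbers on the tips above, here's a small Python sketch. To be clear, this is not how KM works internally (I don't know its algorithm); it just assumes a naive search that tries every possible position in the search area. The function names `candidate_positions` and `trim_uniform_border` are made up for this post, and the "image" is a tiny grid where 0 stands for white and 1 for ink.

```python
def candidate_positions(area_w, area_h, img_w, img_h):
    """Count the top-left positions a naive sliding-window search would test.

    Assumption: one test per pixel offset. KM's real search may be smarter.
    """
    if img_w > area_w or img_h > area_h:
        return 0
    return (area_w - img_w + 1) * (area_h - img_h + 1)


def trim_uniform_border(pixels):
    """Crop away outer rows and columns that only contain the corner colour.

    `pixels` is a list of equal-length rows of comparable pixel values.
    The top-left pixel is assumed to be the background colour.
    """
    background = pixels[0][0]
    rows = [y for y, row in enumerate(pixels)
            if any(p != background for p in row)]
    cols = [x for x in range(len(pixels[0]))
            if any(row[x] != background for row in pixels)]
    if not rows:  # the image is one flat colour, so there is nothing to keep
        return pixels
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in pixels[top:bottom + 1]]


if __name__ == "__main__":
    # A 40x40 image searched on a whole 1920x1080 screen vs. a 400x300 region.
    print(candidate_positions(1920, 1080, 40, 40))
    print(candidate_positions(400, 300, 40, 40))

    # A small icon padded with white space on every side.
    icon = [
        [0, 0, 0, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0],
    ]
    for row in trim_uniform_border(icon):
        print(row)
```

Under that naive assumption, the restricted region has roughly twenty times fewer positions to test than the full screen, and the trimmed icon no longer has blank edges that match every patch of white on the screen. Whether that adds up to a difference you can actually notice in KM is another matter, as I said.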