MoveMouseToWord Macro (v1.0)

This macro moves the mouse to the location on the main screen of the "word" that you pass to it. I generally use this approach to replace my Find Image actions when they attempt to locate a "word" on the screen. This means that, with no changes, it can find a location on the screen even if the colours of the word change (e.g., Dark Mode, Light Mode, and more). It also reduces the size of your macro file, since it lets you reduce the number of images saved, which can even make the KM Editor more responsive.

The new Apple OCR code is very fast (about 600% as fast as the old OCR in KM v10, and MUCH more accurate) so this routine is viable as a means to move the mouse. This macro has two separate methods for locating a word. The first method, which is extremely fast, is that it "remembers" the location of any word, using a dictionary, and does a quick check to see if the word is still in the same location. If so, it is practically instantaneous. The second method is to do a modified "binary search" of the screen looking for the "word", and on a 2020 M1 Mac, this takes between 5 and 10 seconds. That may seem slow, but in most cases its memory will be valid, and this 5-10 seconds will rarely have to be repeated.
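To make the two methods concrete, here is a minimal Python sketch of the "check the remembered location first, search only if that fails" flow. It is not the macro itself: `locate`, `word_is_at`, and `full_search` are hypothetical names, and the two callbacks stand in for the quick OCR check and the slow screen search.

```python
def locate(word, cache, word_is_at, full_search):
    """Try the remembered location first, then fall back to a full search.

    cache       -- dict mapping word -> last known (x, y) location
    word_is_at  -- hypothetical quick check: is the word still at this point?
    full_search -- hypothetical slow search: returns (x, y) or None
    """
    remembered = cache.get(word)
    if remembered is not None and word_is_at(word, remembered):
        return remembered, "cache"      # fast path, no screen search needed

    found = full_search(word)           # slow path
    if found is None:
        cache.pop(word, None)           # forget a location that is no longer valid
        return None, "miss"

    cache[word] = found                 # remember it for next time
    return found, "search"
```

The slow search only runs when the remembered location is missing or stale, which is why repeated calls for a word that has not moved are cheap.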

This macro is written as a subroutine and it has two parameters: (1) the word, and (2) the approximate font size of the word. The second parameter helps the algorithm know when to stop searching. Without that second value, it might stop far too early and miss the correct location of the word.

I just rewrote this entire macro from scratch, in an hour or two, because my older version was using the OCR in KM v10, which meant that the full search tended to take 30-45 seconds, which was annoyingly slow. I also rewrote it because I knew I could make improvements. It feels much better now.

There is one small block of code that occurs twice in this macro. Generally I avoid that approach, but in this case I wanted to keep everything in one macro.

It's possible that this code contains a bug, so I may add updated versions here in the future.

Feel free to try it out, and see if it works for you. Please report any problems here.

At the end, your mouse is moved to the location of the word, and a small blue box is displayed for one second so that you can see how accurate the algorithm was. If you want to save that location or immediately click the mouse, that's up to you to do afterwards. If it does not find your word, it returns an error "2", or if it thinks it was close but wasn't sure the result was good enough, it will return a "1".

This macro can be used to see how the following features of KM can be used: Dictionaries, Screen Rectangles, OCR, Bisection algorithms, Subroutines with Return values.

I've removed v1.0 from this message and replaced it with v1.1, see next post for an explanation and for the new, updated macro.

This contains v1.1. Please update! I'll remove v1.0.

A user who tried my macro found a bug, which I have fixed, and I will upload the new v1.1 here shortly. The bug was that the macro contained an action very near the top which said "Cancel Just This Macro" (you can see it in the screenshot above), and I needed to replace that with "Return 0". I have fixed it and will upload v1.1 in a few minutes. I thought I should mention here why it's getting updated.

When you rewrite a macro from scratch, things like this can happen. My old version didn't have this issue.

MoveMouseToWord Macro (v1.1)

MoveMouseToWord.kmmacros (55 KB)

Great subroutine!

I was able to use it to click one of the buttons that was difficult to select because it belongs to a Java multiplatform application. Cheers for your efforts!!!

Thanks. Sorry about the bugged v1.0. This macro is really amazing. I've used it to solve really tough problems. And even though it's callable on demand, I've even put it in an infinite loop to keep checking for things. I thought I'd burn my CPU out by running this in an infinite loop, but no, Apple's M1 CPU does not burn or catch fire even when this runs forever.

If you're the curious type, you can re-enable two "Highlight" actions in the macro and it will show you a blue rectangle showing you where it thinks it found the word. I used this feature a lot when I was debugging it, but in its final version I'm not sure people care to see that information. If you're extremely curious, I can also show you how to modify the macro so that you can see it doing its binary search to find the word. That's a really cool thing to see, although it slows the macro down a lot.

Thanks for sharing this macro and the idea behind it.

I was looking for something like this for a long time and tried to implement it in my environment. But unfortunately I'm not able to get reliable or even reproducible results. A few days ago, I tested for at least two hours just on my MacBook Pro and got such varied results, from good to not working at all, that it wasn't even possible to reproduce and describe the incident.

I tried to give it another chance and recorded the session in a small screencast.

For testing purposes, I created a macro that positions the mouse at a specific location and then highlights the cursor to also indicate that the macro has started. Then I use the subroutine to find words. I used Safari and scaled the font size to see which might work. I also changed the font size within the subroutine, but that is shown in the video.

I don't know if this might only be an issue on my side. This video was shot on an external 4K monitor. I don't know if this can have an impact on the OCR routine.

Here is the link to the video:

I'm wondering if there's a way to achieve reliable text recognition to move the mouse there, which obviously doesn't depend so much on font size and I have no idea how to get it done.

If I'm on the completely wrong path, I'm grateful for any clarification.

Thanks, HaPe

I'm watching your video now. I'll add comments here as I see things. I do appreciate, very much, your attempt to use it and the "issues" you have raised. I think I have perfectly valid explanations for your issues. I think it's mostly a misunderstanding of how my program works, which is my fault for not being more clear.

Firstly, in your first search you were searching for the word "Suche " (with a space at the end). You can't add spaces to the end, because Apple's OCR doesn't return spaces in OCR scans except for the spaces between two words.

Secondly, you then erased the space from the end of the word and did another search, and it did find the word "Suche" at the top right of your screen. Yes, that word was "Suchen" rather than "Suche", but that's not a bug, that's how my algorithm works. This is caused by a deficiency in all OCR software, which is that the location of a word doesn't get returned with the word, so I have to subdivide the screen into smaller and smaller chunks to find your search word. You have to make sure your word is unique to the screen, and I see at least three occurrences of "suche" on the screen. Which one is the correct one? How is my program supposed to know which one you want? It's impossible. That's why your word must occur only once on the screen, because my program has no way of knowing which one you want.
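If you want to check a word before using it, here is a small Python sketch of the uniqueness rule. It assumes you already have the OCR text of the screen as a string, and it assumes matching is a case-insensitive substring match (my reading of the "Suche" / "Suchen" example above, not something confirmed about the macro's internals). The function names are hypothetical.

```python
def count_occurrences(ocr_text, word):
    """Count case-insensitive substring matches of word in the OCR text.

    Leading and trailing spaces are dropped from the word first, since a
    trailing space will never appear in the OCR result.
    """
    needle = word.strip().casefold()
    if not needle:
        return 0
    return ocr_text.casefold().count(needle)


def is_unique_on_screen(ocr_text, word):
    """True only when the word appears exactly once in the OCR text."""
    return count_occurrences(ocr_text, word) == 1
```

For example, `count_occurrences("Suchen Suche suche", "Suche ")` counts three matches, so that word is not a safe target on that screen.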

Thirdly, for the same reason as above, you can't have the KM Editor running while the macro runs, because in the KM Editor is the word you are searching for. I can see it on the screen.

Fourthly, you have not shown on your screen what your trigger is. So I can't be certain how you are triggering it, but my macro assumes that you aren't triggering it twice in a row before the first call finishes. It may fail if you don't wait for the first call to return a result. I have a feeling that that's one of your other problems here. I didn't think anyone would ever try to do that, but from what I can see visually, that may be what you are doing. So as a temporary fix, I think you should add a line to the end of my macro saying "Finished" which should prevent you from running two copies at the same time. I have never tested what happens when someone runs my macro twice before the first one finishes.

Fifthly, some of your searches include accented letters. I speak only English, and the thought never occurred to me that someone might try this. At this time, I have no idea if that will work. I would have to do some testing.

Sixthly, in at least one case I saw (around 4:23) the program probably was missing the word because of something you did. You used the number "100" which causes my program to end the search when it's "roughly no further than 100 points away from the target word." It is your responsibility to fine tune the FontSize value to be small enough for my program to get close to your word but large enough that it won't be looking for a word so tiny that it cannot find it. I guess I need to explain this more carefully in my macro. Basically the number you include for "FontSize" should be roughly the size of the font you are searching for. You can't just make the words bigger and bigger on the screen and expect the search to always work with the same FontSize value.

Here's what I mean about the example at 4:23: (notice the red arrow I added)

In the above example the arrow that my macro "thought" was your word was less than 100 points away from the actual word. You used the number "100", so it allows for 100 points of error. (Something like that; it's tricky to explain in a short sentence.) If you had used a FontSize more accurately depicting the word, maybe 30, it would have almost certainly found the word. (The word on screen is NOT 100 points high!) If you had used a FontSize of, say "5", it would likely NEVER find the word because it's expecting to see a small word, not a large one.

The main issue, which is hard to explain, is that no OCR software returns the location of any found word or phrase. So I have to use some magical trickery to subdivide the screen into smaller windows and keep asking the OCR detection software "is the word in this block of the screen?" I keep this magical trickery under the hood as much as possible so you don't have to know all the technical details.

If my macro is too hard to use, perhaps nobody will use it. But with people like you giving me feedback, maybe I can make it easier to use.

First time I’ve come across this thread, and I haven’t actually tried the macro. Can it be upgraded to allow specifying the window that the word is expected to appear in?

I suppose it could be added, but I doubt that it would shave more than a second off the search time, so I'm not yet sure if it would be worth adding.

I suspect I threw you off with my mistaken use of the word "expected". I'd rather it be an option to "limit" the app / window that the word is to be found in, in order to make it easier to use this when the same word may inadvertently appear in multiple applications simultaneously. This would (among other things) conveniently work around the issue of forgetting to close the editor window if this option gets used. Does that make sense?

Thank you so much for all the explanations.
I'm pretty sure I was unsure how exactly it was supposed to work. :wink:
So I just tried with my (limited) options to figure out how it works. I was hoping that my humble trial and error method would produce a result that would allow me to use the macro.

The values I used regarding the font size were therefore chosen purely at random. I don't know (yet?) a method to determine the size of a font on the screen. Possibly still in relation to the resolution of the screen.

And I had no idea how the routine would handle having a word on the screen more than once. I had already taken this circumstance into account, but - as I said - I just wanted to see whether I could find out by experimenting. And I really took a lot of time for it because I find the approach very interesting and there are many possible uses for it.

Keyboard Maestro offers a few options for images to deal with multiple matches.

There is certainly a possibility that I triggered the macro too quickly. But I mentioned that the rings around the mouse pointer visualize the start of the macro (via a shortcut).

I apologize again for not being able to foresee these details in advance. I just had the "newbie" hope that I could somehow use it for situations in which a button is not accessible and the search for images does not work or only works unreliably.

I'll try again with the information given as soon as I have some more time.

If anyone has a tip on how to roughly determine the size of the font, I would be very grateful for it.

Thank you!

HaPe

Your testing is much appreciated. No worries.

I'm thinking hard about what it would mean if multiple targets were found, but I can't come up with a meaningful definition for that. In theory, I could return the coordinates of all occurrences, but what then? You can't move the mouse to three occurrences. The macro is called MoveMouseToWord.

KM has a "Fuzz" factor on its Find Image search, and KM doesn't define exactly what "Fuzz" means. Perhaps I should call it that. But by calling it "FontSize" I think it helps the user.

Yes, but I couldn't tell when your macro ended. So I'm unsure if you ran it twice (overlapping). In my next version I will add semaphores to prevent this error. Maybe my macro should behave differently when the caller action is "Trying" or "Hotkey," perhaps I can show the progress of the search as it progresses in those cases.
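As an illustration of what the semaphore idea would do, here is a small Python sketch using a lock that refuses a second run while the first is still going. This is only an analogy written in Python, not Keyboard Maestro actions; `run_exclusively` and `BUSY` are hypothetical names.

```python
import threading

BUSY = "busy"                      # hypothetical marker for "a search is already running"
_search_lock = threading.Lock()


def run_exclusively(search):
    """Run search() only if no other search is currently running."""
    if not _search_lock.acquire(blocking=False):
        return BUSY                # second caller is turned away instead of overlapping
    try:
        return search()
    finally:
        _search_lock.release()     # always release, even if the search raises
```

An alternative design would make the second caller wait for the first to finish rather than turning it away; which is better depends on whether a queued search is still useful by the time it runs.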

If you are willing to have some fun, I think I left some "Highlight Rectangle" actions greyed out which you could re-enable. It will show you the progress of the screen search. But the overall time to find the word will be increased from about 3 seconds to a longer value.

One last thing you didn't notice is that the result returned indicates if the algorithm was confident it had a match (0), not too confident (1), or failed (2). You could use those results to say "Got It", "Maybe got it", or "Missed it."
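Written out as a lookup, the three results look like this in a short Python sketch. The codes 0, 1, and 2 and the three messages come from the paragraph above; the function name and the fallback message are my own.

```python
RESULT_MESSAGES = {
    0: "Got It",         # confident match
    1: "Maybe got it",   # close, but not confident
    2: "Missed it",      # word not found
}


def describe_result(status):
    """Turn the subroutine's return value into a short message."""
    return RESULT_MESSAGES.get(int(status), f"Unexpected result: {status}")
```

The `int()` call is there because a value read back from a macro variable would typically arrive as text such as "1".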

Now that I'm thinking about it, the performance of the Mac (M1, M2, M2 Ultra, etc.) will affect how well it works. I didn't take that into consideration. I could compensate by taking more time on the slower machines.

Thanks again. Unfortunately, my workload changed from one day to the next, so I have to take the time to get back to testing while taking all of your information into account.

I wanted to test this macro too. Since I didn't know how to call it, I used HaPe's example:


The word "immediately" on the second line in Ms Word isn't found:

There are at least four possible reasons why the macro didn't find your word. First, your video shows you waited about 6 seconds, while the description above says it should take 5-10 seconds the first time and then be about 1 second for subsequent attempts if the word you are searching for is in the same location. Second, you are using a "FontSize" value of 12, and since I can't tell what your screen resolution is I cannot determine if that number is large enough. You see, the smaller the number the smaller the search box the program attempts to find the word in. And it could be that your word is too large for the value 12, so just try a larger number. If you try a font size of say 50, it should find it but the pointer may end up a few pixels too high or too low. It's up to you to try various values until you get a value that always works and gives sufficiently accurate results. Third, I can't tell if you are using v1.0 or v1.1 of my macro. And fourth, since you didn't indicate what kind of Mac hardware you are using, I can't tell you if the estimate of 5-10 seconds will work for you. If you are using an Intel model, it may take much longer.

Note that the "FontSize" parameter to my macro is just a name that is not actually a true fontsize but approximates the font size. It's up to each user to adjust that value to the smallest possible number that always finds your word on the screen without missing the word by too much. If the number you provide is too big, the mouse may end up pointing above or below the word, but if the number you provide is too small, it will never find the word. The number represents the size of the box that the macro attempts to find the word in.

ANOTHER WAY TO EXPLAIN IT: If you specified a ridiculously low FontSize of 3, it would never find the word in a box that small on your screen. If you set a FontSize of 1000, it would find the word almost immediately because that would be roughly the size of the entire screen, and it would probably place the cursor near the middle of the screen, which wouldn't be anywhere near your word.

If you want to troubleshoot this for your situation, I'm pretty sure I placed a disabled Highlight action in the macro so that if you enable that action, you will see my macro examining rectangles to find the word. It's an excellent debugging tool, but it will make the macro run more slowly. I was always highly entertained seeing the macro running with this Highlight action so that I could see it finding the word.

My macro worked for me in similar cases, so I think the most likely issue is that you either used a bad FontSize value, or you didn't wait long enough. I notice also that you didn't display the resulting value which was returned to your variable named "VarName". If you display that value you might learn more about what my macro did in your situation.

Hi - a quick question - can this macro be modified to look in a certain area of the screen for a particular word instead of the whole screen?

It looks to me like you could just modify the lines below to the area that you want to search. Keep in mind that reducing the area won't necessarily make it much faster, since the time a binary search takes is based on the LOG2() of the range being searched. But maybe you have other reasons for trying to narrow the search, like avoiding a false reading on one side of the screen.
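To put rough numbers on the LOG2() point, here is a small Python sketch that counts how many times a range has to be halved before it is no larger than the stopping size. It is plain arithmetic under the assumption of one clean halving per step, which the real macro may not follow exactly; `halvings_needed` is a hypothetical name.

```python
import math


def halvings_needed(extent, stop_size):
    """Number of halvings until `extent` shrinks to `stop_size` or less."""
    if extent <= stop_size:
        return 0
    return math.ceil(math.log2(extent / stop_size))


full_width = halvings_needed(1920, 30)   # whole 1920-point-wide screen
half_width = halvings_needed(960, 30)    # only half of that width
print(full_width, half_width)
```

Cutting the searched width in half removes just one step (6 versus 5 here), which is why narrowing the area mostly helps with avoiding false matches rather than with speed.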