Is the best way to scan for varying numbers in a desktop app to find an image of the number?

The numbers on the desktop app are displayed according to the example image below.
[Image: Screen Shot 2022-05-03 at 1.03.21 PM]

These numbers will vary.

The workflow I imagine is to input a specific number (say, "215"), then have KM scan for the number and move the mouse and click at a specific point relative to the found number.

Is the best way to scan for a specific number to pre-screenshot every unique number and save each one as its own "find and click image" macro?

That seems like murder. Can you tab your way from number to number?

What app is it and what are you trying to do?

There has been quite a lot of discussion of using OCR to extract information from the screen, and this requirement sounds like a suitable candidate for that approach. It seems that this has really only come into its own since the introduction of Monterey, so unfortunately I can't be of more assistance as I'm still on Mojave :frowning:

But anyway, a search of this forum for "ocr" turns up a number of interesting and possibly relevant discussions that you might want to look at in your research of this problem.

I tried OCR on the provided image and it failed to detect around half the entries. It may do better on the original, and I don't know if it could be quickly trained to do better.

If OCR can be made to work and row heights are constant, it wouldn't be too much work to find the number of the matching paragraph in the OCRed text and calculate a click-point.
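The click-point arithmetic could look something like the sketch below. It assumes the OCR returns one line of text per row, in top-to-bottom order, and that you have measured the row height and the position of the first row yourself. The function name and all parameters are made up for illustration and are not part of KM.

```python
from typing import Optional, Tuple


def click_point_for_number(
    ocr_text: str,
    target: str,
    first_row_center_y: float,  # assumed: measured y of the first row's center
    row_height: float,          # assumed: constant height of every row
    click_x: float,             # assumed: fixed x position to click at
) -> Optional[Tuple[float, float]]:
    """Return the (x, y) click point for the row containing target, or None."""
    # Keep only non-empty lines; each one is assumed to be one on-screen row.
    rows = [line.strip() for line in ocr_text.splitlines() if line.strip()]
    for index, row in enumerate(rows):
        # Compare whole tokens so that "215" does not match "2150".
        if target in row.split():
            return (click_x, first_row_center_y + index * row_height)
    return None


point = click_point_for_number("198\n215\n230", "215", 100.0, 20.0, 400.0)
print(point)
```

Note that this only holds up if the OCR catches every row. If it skips entries, as it did in my test, the row index and therefore the click-point will be off.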

Or can you pass a "variable" image into a Find Image action? The numbers themselves appear to be monospaced, so you might be able to "build" your search image out of the pre-saved "2", "1", and "5" images, then find and click that...
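To illustrate what "building" the search image would mean, here is a rough sketch that joins per-digit glyphs side by side. It treats each glyph as a plain grid of pixel values rather than a real image file, so the `Glyph` type, the function name, and the tiny sample glyphs are all hypothetical. Whether Find Image can then accept a generated image is still the open question.

```python
from typing import Dict, List

# A glyph is a list of pixel rows; real glyphs would come from saved screenshots.
Glyph = List[List[int]]


def build_search_image(number: str, glyphs: Dict[str, Glyph]) -> Glyph:
    """Concatenate the glyphs for each digit of number from left to right."""
    parts = [glyphs[digit] for digit in number]
    heights = {len(glyph) for glyph in parts}
    if len(heights) != 1:
        # Also raised for an empty number, since there is nothing to build.
        raise ValueError("need at least one digit, and all glyphs must share a height")
    height = heights.pop()
    # For each pixel row, join that row from every glyph in order.
    return [[px for glyph in parts for px in glyph[r]] for r in range(height)]


# Hypothetical 2x2 glyphs, just to show the joining step.
sample_glyphs = {
    "2": [[1, 1], [0, 1]],
    "1": [[0, 1], [0, 1]],
    "5": [[1, 1], [1, 0]],
}
search_image = build_search_image("215", sample_glyphs)
print(search_image)
```

This relies on the digits being monospaced with no extra spacing between them. Any gap between characters on screen would have to be added between the glyphs as well.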

I'm wondering if AppleScript might be able to get the values of those rows, depending upon the app and the nature of the UI elements. You could feasibly cycle through each row looking for a match with the specified number. Can't tell if this is possible of course without knowing what the app is.

It's a stock trading application called Thinkorswim. I'm trying to create shortcut keys for tasks on this app which seems to be built with Java.

No, unfortunately there's a lack of shortcut key usage in this app, which is why I'm trying to create my own.

I'm also on Mojave. Do you mean to say that OCR is practically unusable on Mojave, or does it work with a bit more difficulty?

On Mojave, the OCR functionality built into KM is fiddly to use and isn’t very reliable in terms of recognition, whereas on Monterey there is an OS-provided OCR facility that is reportedly very accurate and can be accessed via both KM and Shortcuts. Since I can’t try it out myself, all I’m saying is that it may be worth investigating, but of course only if you’re on Monterey. One user in particular (@Sleepy) spent some time investigating it, IIRC.

If anyone reading this needs to fact-check what I’m saying then a search of the KM forum is a must.

It may be that with a lot of tweaking and experimentation you can do a lot of productive stuff with KM’s OCR on Mojave but I really can’t attest to that. Maybe someone else can…
