Automatic screen capture then OCR

Hi, I am fairly new to Keyboard Maestro so please bare with me.
I have a Windows program, which I run in a virtual machine and connect to from my Mac.
I would like to monitor an area of this remote desktop session (which essentially shows the Windows program) and when it detects a change to this specific area, capture it, OCR it and then write it out to a text file.
If I cannot trigger it when the area changes, then perhaps I could take a screen grab every 15 seconds.
Hope this makes sense.
(I realise KM does not run under Windows; I merely want to monitor an area of my Mac desktop.)
Thanks
Mike

Hi, welcome to the forum.

Sir, we have a dress code.

KM would need to look for the change, yes. You could use a Periodic Trigger and the Find Image on Screen Action.
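
To illustrate the "check periodically, act only on a change" idea outside of KM, here is a minimal Python sketch. It is not how KM does it internally; it simply hashes whatever bytes you capture for the area and compares them with the previous capture. The name `make_change_detector` is made up for the example, and how you obtain the pixel bytes (a screenshot tool, a cropped image file, etc.) is left open as an assumption.

```python
import hashlib
from typing import Callable, Optional


def make_change_detector() -> Callable[[bytes], bool]:
    """Return a function that reports whether a capture differs from the previous one.

    Hypothetical helper: the caller supplies the raw bytes of the monitored area.
    """
    last_digest: Optional[str] = None

    def changed(pixels: bytes) -> bool:
        nonlocal last_digest
        digest = hashlib.sha256(pixels).hexdigest()
        # The first capture only sets the baseline, so it never counts as a change.
        is_new = last_digest is not None and digest != last_digest
        last_digest = digest
        return is_new

    return changed
```

You would call `changed(...)` once per period (say every 15 seconds) and only run the capture, OCR and write-to-file steps when it returns `True`. One caveat: a remote desktop stream can introduce small compression differences between frames, so an exact byte comparison may fire more often than a fuzzy image match would.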


I wonder if another approach is possible. It depends whether you are able to use any automation software on the Windows PC to connect to a Web address. What follows may or may not seem like gobbledygook, depending on your previous experience. If you are comfortable automating PCs and including values in URLs, I hope it will make some sense. Otherwise, don't worry about it for now.

Something causes the "specific area" in the Windows PC's display to change, but I don't know what, so I'll call it "X". When X happens, do you have any way of getting the PC to connect to a URL? For example, I understand that Power Automate can do this.

If you can do that, on the Mac side you could set up KM's Web Server and connect to it from the PC using the Web Server's URL.

A Public Web Trigger could then be used for your macro. The macro would then run whenever the connection is made to the Web Server.

If you have a way to include the changed text as a value in the URL, the Public Web trigger will include that text in the trigger value token.
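
If you do pass text in the URL, it needs to be percent-encoded first, otherwise spaces and characters like `&` will break it. Here is a small Python sketch of that step. The function name `add_trigger_value`, the example host, the macro ID and the `value` parameter name are all assumptions for illustration; copy the real URL from the trigger in your own macro and check the trigger's documentation for the parameter it expects.

```python
from urllib.parse import parse_qsl, quote, urlencode, urlsplit, urlunsplit


def add_trigger_value(trigger_url: str, text: str, param: str = "value") -> str:
    """Append `text` to `trigger_url` as a percent-encoded query parameter.

    `param` defaults to "value", which is an assumption; verify it against your trigger.
    """
    parts = urlsplit(trigger_url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, text))
    # quote_via=quote encodes spaces as %20 rather than "+".
    return urlunsplit(parts._replace(query=urlencode(query, quote_via=quote)))


# Placeholder URL, not a real trigger address.
url = add_trigger_value("http://mac.example:4490/action.html?macro=ABC", "HELLO THERE & more")
```

Whatever tool you use on the Windows side will have its own way of doing this encoding; the point is only that the text must be encoded before it goes into the URL.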

If the Windows PC and the Mac are not on the same LAN, a better alternative to the Public Web trigger might be the Remote trigger. Note the caution on that page, especially if you are passing text values in the URL.

If there's a "base" state that the screen shows, you could use a looping Pause until screen area doesn't contain the base image.

e.g. say the specific area usually shows the text "HELLO THERE". That could be your base image.
Then if the area changes to something else, the rest of the macro could run.

The downside is that, if you have it running all day, it might use enough CPU to make a noticeable difference. But I've found this kind of macro is usually pretty CPU-friendly, especially if the area is small.

Just pay attention to timeouts and such. Sometimes I've accidentally left a Pause Until running longer than I wanted, with unexpected results.
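
For anyone who thinks better in code, the "wait until the area no longer shows the base image, but give up after a while" logic looks roughly like this in Python. It is only a sketch of the pattern, not a KM feature: `wait_until_changed` and its parameters are invented for the example, and `capture` stands in for whatever returns the current contents of the area.

```python
import time
from typing import Callable


def wait_until_changed(
    capture: Callable[[], bytes],
    base: bytes,
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `capture` until it differs from `base`; return False if the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if capture() != base:
            return True  # the area no longer shows the base state
        if time.monotonic() >= deadline:
            return False  # timed out, so the caller can decide what to do
        sleep(interval_s)
```

Returning `False` on timeout, rather than waiting forever, is what keeps a forgotten wait from running all day. A longer `interval_s` is the simplest way to reduce how much work the loop does.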