Use Image Rec or OCR of One UI Element to Click on a Second Identically Named UI Element

Dear Wizards @DanThomas, @ccstone, @JMichaelTX, and @peternlewis,

I'm trying to wrestle control of another UI point in Resolve:

  1. Use a screen-grab of the UI to select a macro starting position. This is done. When Resolve selects a clip in the timeline, it places an orange bounding box around the clip. I screen-grabbed the left edge of that orange bounding box and used it for "Find Image on Screen." It's highlighted green by KM in the screen-grab below (by the arrow pointing from the number 1 in the yellow box).

It works consistently: it always finds the selected clip.

  2. How can I use that image location to:

    A. Recognize the UI name of the clip "A1_G3" (next to the number 2 arrow)?

    B. Use that UI clip name "A1_G3" to click on the identically named group
    "A1_G3" (next to the number 3 arrow)?

    C. "Scroll" and "Continue Until" a group is found. Maybe page up to top, scroll
    down to bottom, until the group is found, and cancel if not found. This part of
    the macro is not as crucial.

Wondering why not click on the group itself? The list of groups can be long in hour-long episodic TV. Also, I would still have to scroll, look, and click. This macro would make group selection instant with the tap of an xKey button, and I wouldn't even have to look at clip names, so I could keep my eyes on the image and scopes.

I found this macro from @JMichaelTX, but I don't want to have to click in an area to determine the boundaries for OCR or image recognition. I think I would rather create an offset position from that left edge of the bounding box as a starting point to create an area within which KM can search to OCR or Find Image.

Questions:

  1. Is OCR to a variable the way to go, or is there a way for KM to use one image to reference another and then click on that second image? If I use image recognition, would it be a temporary screengrab to clipboard or something?

  2. Should my first macro be a "Click at Found Image" at the left side of the orange bounding box, so that I have a cursor location that can be used as an offset to create an area for either image recognition or OCR to a variable?

  3. How do I create an offset area from the left edge of that bounding box and feed that location into an OCR or Image recognition action?

  4. Am I even close conceptually? Should I approach this differently?

Here are the UI elements. There is no way to programmatically access any of these UI elements:

Here is the orange bounding box left edge:

[Image: clip_edge]

Thanks, your humble apprentice.

That is very tricky to do.

You can go from the found image to the area of the clip name, and then you can try OCRing that. But OCR is always going to be somewhat suspect: whether it reliably gets the name right depends heavily on the contrast, font, and characters used. In my quick test it got the name correct for the clip.

However, then you would have to search for the name in the list. The name is not the same color, so an image match probably won't find it. A quick test shows the image match matches all three group names (A1_G1…A1_G3) equally well - the difference in color is sufficient to outweigh any similarity in the last digit.

Similarly, the lack of contrast in the group names means that OCR fails as well.

This is based on the image you posted - your actual screen may have more resolution and so maybe you'll get different results. But it's not promising.

I think the best you're going to do is manual assistance. Use Keyboard Maestro to find the selected clip, then use Keyboard Maestro to highlight where the clip name is. Then use Keyboard Maestro to move the mouse to roughly where the group list is, and then you have to manually click on the desired group.

Anything else is likely to be highly error prone with the available screen image.

Thanks @peternlewis. Is there any way you can post the macros you tried? I have had some success with taking multiple screen grabs (one for each state the text and background color can be in) and using "If any of the following are true" actions. I'm using that currently for a similar action and it defeats the change in background color.

Thanks for testing!
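
The multiple-screen-grab idea above can be sketched in Python as a plain "first match wins" loop. This is only an illustration of the logic, not Keyboard Maestro code: `find_any_variant`, the `find_on_screen` callable, and the file names are hypothetical stand-ins for one Find Image condition per screen grab.

```python
from typing import Callable, Iterable, Optional, Tuple

Point = Tuple[int, int]


def find_any_variant(
    variants: Iterable[str],
    find_on_screen: Callable[[str], Optional[Point]],
) -> Optional[Tuple[str, Point]]:
    """Return the first variant that matches, with its location, or None."""
    for name in variants:
        location = find_on_screen(name)
        if location is not None:
            return name, location
    return None


# Simulated screen (assumption): only the "selected" rendering is visible.
visible = {"A1_G3_selected.png": (40, 310)}
match = find_any_variant(
    ["A1_G3_normal.png", "A1_G3_selected.png", "A1_G3_hover.png"],
    visible.get,  # stand-in for a real image search
)
print(match)
```

Each screen grab covers one text/background color state, so a change in background color only changes which variant matches, not whether something matches.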

I didn't really try any full macros, just a few actions, things like screen grabbing the text and using the OCR Image default macro, and putting it in a Find Image action with the specific area set to limit where it was looking:

[Image]

It was enough to tell me the OCR of the main text might work, the OCR of the group text probably would not work, and the Find Image of the image of the main text probably will not work.

Have you tried UI Browser? I'll bet you can find it that way. You'll probably need help to figure out the code (AppleScript or JXA), but I'm sure it can be done.

Hi @DanThomas, I wish. Resolve's UI is notoriously unavailable programmatically. I've tried Apple Developer, UI Browser, and Script Debugger. It's all just one green blob. I do get access to the scroll bar though. So that could be accessed to scroll up and down to access more groups.

@peternlewis amazed at your ability to respond to so many questions. Thanks. I would still like to give it a try. How would you pipe the position info of, say, "click at image" to create an area for OCR or image recognition? Then how would you use one image to reference another?

Thanks guys.


You would use the Find Image on Screen action to find the image and store the result in a variable (which is x,y,width,height,fuzz). Then you would use OCR Screen action with the Area selection, and use things like var.x+50, var.y+50 in the fields to select the area to OCR. Same for Screen Capture action.

And to use the resulting image in a further Find Image, you would have to write that captured image into a file, and then reference that file from the Find Image on Screen action.
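
The offset arithmetic described above can be sketched in Python. This is not Keyboard Maestro syntax, just the same calculation written out. It assumes the stored result is an "x,y,width,height,fuzz" string whose x,y is the top-left corner of the match; the `Rect` type, the function names, and the 50 px offsets and 120 x 20 px box are hypothetical values to be replaced with ones measured on your own screen.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    left: float
    top: float
    width: float
    height: float


def parse_found_image(result: str) -> Rect:
    """Parse an "x,y,width,height,fuzz" string into a Rect (fuzz is ignored)."""
    x, y, w, h, _fuzz = (float(part) for part in result.split(","))
    return Rect(x, y, w, h)


def offset_area(found: Rect, dx: float, dy: float, width: float, height: float) -> Rect:
    """Return an area whose top-left corner is offset from the found image."""
    return Rect(found.left + dx, found.top + dy, width, height)


# Hypothetical match of the orange edge, and an assumed position for the
# clip-name label relative to it.
found = parse_found_image("812,640,6,48,12")
ocr_area = offset_area(found, dx=50, dy=50, width=120, height=20)
print(ocr_area)
```

The four numbers in `ocr_area` correspond to the four fields of an Area selection, which is where expressions like var.x+50 and var.y+50 would go.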

Hi @peternlewis,

I figured it out. I think the underscore in the name may have thrown off your tests. I changed the naming convention, and it works every time using image capture. I uploaded the macro.

I haven't looked into KM's clipboard feature yet, so I just want to confirm that the image screen captured to the system clipboard is blown away the next time I copy anything to the system clipboard? It's not somehow stored, clogging system memory?

Thanks for all your help!

DR_FIND_GROUP.kmmacros (10.5 KB)


The image is saved in the clipboard history, you can see it if you activate the Clipboard History Switcher.

Most times involving the clipboard you cannot avoid the system clipboard, but this is an exception. Use a Named Clipboard instead. Create a new Named Clipboard in the Keyboard Maestro Clipboard Preferences, and then select it in those last two actions instead of the System Clipboard.

Hi @peternlewis, what’s the best way to delete the captured image from the clipboard after each firing of the macro?

Thanks!

You can use the Delete Past Clipboard action.
https://wiki.keyboardmaestro.com/action/Delete_Past_Clipboard

And if it is the last entry, then you should delete past clipboard 0.

Hi @whitebalance - here's a tip for you if you don't already know:

If you're ever in the KM editor wondering "is there an action for that?" then do the following

  1. Type ⌃⌘A (Insert Action by Name)

which will show a spotlight-like dialog to let you find actions in KM, a bit like this:

From there you can usually find what you're looking for even if you don't know exactly what it is you're looking for!

This and the KM wiki are my two favourite places - apart from this forum!


Hey guys, @tiffle, @JimmyHartington, @peternlewis,

Thanks for the tips. Sorry, I should have searched the wiki first. I had spent a few days searching the wiki and forum for the beginning of the macro and couldn’t figure it out from the wiki and needed @peter’s help.

But I should have started fresh and searched again for the last action.

Thanks again!


If you use a Named Clipboard in this case, and the system clipboard is never touched, then you will not have to delete it.

Thanks @peternlewis. I’ll try named clipboard. Do you think image recognition would work better if I upgrade to UHD (3840x2160) instead of my current display size 1920x1080? That forces the UI to be displayed at twice the resolution I think, so can KM take advantage of that?

I got scroll wheel into the macro, so now Resolve’s UI will page up until it finds an image match!
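
For anyone following along, the "scroll until found, cancel if not found" logic can be sketched in Python as a bounded loop. This is not the uploaded macro: `scroll_until_found`, `find_image`, `scroll_page`, and the `max_pages` limit are hypothetical names, and the screen is simulated so the sketch runs on its own.

```python
from typing import Callable, Optional, Tuple

Point = Tuple[int, int]


def scroll_until_found(
    find_image: Callable[[], Optional[Point]],
    scroll_page: Callable[[], None],
    max_pages: int = 20,
) -> Optional[Point]:
    """Search, scrolling one page between attempts; give up after max_pages scrolls."""
    for attempt in range(max_pages + 1):
        location = find_image()
        if location is not None:
            return location
        if attempt < max_pages:
            scroll_page()
    return None


# Simulated group list (assumption): the wanted group is on the third page.
pages = [["A1_G7", "A1_G8"], ["A1_G4", "A1_G5"], ["A1_G2", "A1_G3"]]
state = {"page": 0}


def fake_find() -> Optional[Point]:
    return (100, 200) if "A1_G3" in pages[state["page"]] else None


def fake_scroll() -> None:
    state["page"] = min(state["page"] + 1, len(pages) - 1)


result = scroll_until_found(fake_find, fake_scroll)
print(result, state["page"])
```

The page limit is the important part: without it, a group that is not in the list would keep the loop scrolling forever.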

I doubt it, but without testing the specific case it is impossible to know.