Is there any way to instruct a macro to click on specific lines and boxes?

Here's an example from one of my macros:

I open Handbrake with a specific file. It takes Handbrake a few seconds to process it, so I Pause until the "Browse" button and file name edit box become enabled, by looking for this image:

Note that this image needs to be unique (well, for my macro it does - there's ways to handle things when the image isn't unique).

You can adjust the "fuzz factor" (that's the "e" to "e" slider) to have the image be more exact (to the left) or less exact (to the right). Normally it works at its default.

Once it finds the image (assuming it doesn't time out), I click on the specific area of the image that I need, which in this case is 35 pixels down from the image's top left corner.

Again, feel free to ask more questions.

I'm sorry. I am completely lost, and thank you for your patience.
Let's take a practical example.
1- You are in Chrome
2- press Cmd-P for print
3- want to click on More settings
Now, let's take the Move and Click action.
We want to click, left button, no modifiers, and let's try to leave the fuzz factor as is
A- How do I determine which numbers I should enter instead of 0 and 35?
B- What window title should I enter: "Print", as shown at the top left of the window?

First, get a screenshot of the place you want to click. Press shift+control+command+4, then drag a box around the place you want to capture, like this:

We're going to use "Click at center", so try and make sure the center of the image you capture is a clickable place. You'll end up with something like this, in the clipboard:

Here's the macro:

The "Pause Until" is just so we can wait until the dialog is displayed. Be sure to click the "Gear" icon and set a timeout, for a couple of seconds. Alternatively, you could just pause for a second or two.

The last action says to find this image, and click at its center.

NOTE: You can click on the image box in the Action and paste in the image. Or, if you have the image in a file, you can drag the file into the box.

Caveats:

There's lots of reasons this might fail, at some point in the future. Your screen colors change. Your resolution changes (not sure if this would fail or not). Another window has the same image in it somewhere. Etc...

Should that happen, there's usually a way to fix it. You just have to figure it out.

Thank you very much for an excellent explanation.
I tested it, and can click on More settings.
The next step is to press Page Down, which is no problem.
Now the problem is being precise enough to click on the checkbox just to the left of Selection only.
Could you suggest a way to do so? I imagine that it is a question of precise coordinates, but have no clue as to how to approach the problem.
thanks again very much.

Grab a screenshot like this:

and use the same method as before. Clicking in the middle of this appears to "check" the checkbox, so it should work just fine.

Same with the Print button.

So basically, you're telling Keyboard Maestro to look for a specific image, and click on the image when it's found. If clicking on the center of the image doesn't work, experiment with things like "Top Left" plus 15, or something like that.
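To make the "center" versus "Top Left plus an offset" idea concrete, here's a minimal sketch in Python of the arithmetic involved. This is not Keyboard Maestro's API: the function name `click_point` and the example rectangle are made up for illustration, and it assumes the found image is described by its left, top, width, and height in screen pixels.

```python
def click_point(left, top, width, height, anchor="center", dx=0, dy=0):
    """Return the (x, y) screen point to click for a found image.

    Hypothetical helper for illustration only, not a Keyboard Maestro function.
    The rectangle is the found image's position on screen, in pixels.
    """
    if anchor == "center":
        # Middle of the found image (integer pixels).
        base_x, base_y = left + width // 2, top + height // 2
    elif anchor == "top_left":
        # Top left corner of the found image.
        base_x, base_y = left, top
    else:
        raise ValueError(f"unknown anchor: {anchor}")
    # Offsets move the click right (dx) and down (dy) from the anchor.
    return (base_x + dx, base_y + dy)


# Assumed example: an image found at left=200, top=300, 120 wide, 40 tall.
found = (200, 300, 120, 40)
print(click_point(*found))                                   # click at center
print(click_point(*found, anchor="top_left", dx=15, dy=15))  # "Top Left" plus 15
```

If the center lands on a spot that isn't clickable, changing the anchor and offsets is the only thing you need to adjust; the captured image stays the same.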

Since you've obviously already got a program for capturing screen images, just use that to get the image you want, then copy it and paste it into a KM action.

@ronald, sometimes using the Found Image Action is the only way to find and click on a specific element on a window. But I try to use it as a last resort, because it is usually NOT as reliable as using other techniques that specifically address the desired element (button, link, etc).

XPath is one great tool for finding elements on a web page. Having found the element, then you can do stuff with it, like, click on it, copy it, etc. For a list of topics about XPath, see Topics tagged with 'xpath'

However, XPath is NOT easy to use for those unfamiliar with it. It will take some study to learn/master it. I'm still learning myself. But, for those that do understand it well, like @ComplexPoint, it becomes a tool easy to use.

Here is one example:

In my opinion, it takes a brave person to venture into the deep dark waters of XPath. It’s not intuitive at all, at least not to me. However, when a Find Image kind of action fails, it’s pretty easy to debug - it’s all right in front of you.

With that said, it took me a long time to break down and learn regular expressions, and now I wonder how I lived without them.

But still, XPath is not for the faint of heart. :open_mouth: Again, my opinion. YMMV. LOL, and all those kinds of acronyms.

Dan,

The biggest issue I have with Find Image is that the "same" image can actually vary Mac to Mac, monitor to monitor. If they are all your Macs and monitors, then you can accept more risk, than if you are distributing the KM macro to others. If the web page you are using makes even a small change in its graphics, then it can break your macro.

It all depends on how well you understand HTML and the DOM. But I agree it is not something that you can learn in 15 minutes. :wink: OTOH, you can't learn RegEx in 15 min either.

Find Image and XPath -- both tools in your tool kit. Some tools require more effort to learn than others. Almost all of us can use a hammer. :grinning:

I like hammers! :stuck_out_tongue:

Yes, I totally agree with this comment. However:

[Devil's Advocate time, just for kicks] Seems to me that if a web page's appearance changes, it's reasonable that most of the time the structure of the HTML page will change also. Not necessarily, of course, but I would think it's likely.

So here I am, with a macro you created, and it breaks because the XPath no longer works. Or, here I am with a macro you created, and it breaks because the color, shape, or text of a button changes. Which do you think would be easier for me to fix?

[End Devil's Advocate]

To be honest, it makes little difference to me, because I'm only creating these things for myself. If I create anything intended for anyone else, I wouldn't necessarily take these chances, except with a big disclaimer up front.

The truth is, I've created dozens, if not hundreds of class libraries and utility packages for use by other developers, and if I find myself in that position again, perhaps it's time to rethink retirement! :open_mouth:

It finally works, thanks to your help. Without you I could never have done it. I am grateful that you took the time to take snapshots to explain your points.

A few comments and questions

Comments
1- I mistakenly thought that one had to click exactly on the checkbox to the left of Selection only, whereas, as you point out, anywhere within the image works.
2- thanks to you I discovered Pause until conditions met + image, a very useful function

Questions:
1- when I click on the hot key, I see the macro action unfolding. Is it possible to send all macro activity to the background, so I just end up with the printed document?
2- between clicking on more settings and clicking on selection only, I entered a Page Down. Is the Page Down necessary? What I am getting at is the essence of the find image function. Even if I cannot see the Selection Only image without pressing Page Down, can the macro action find the image? I would like to better understand the find image action.
3- funny thing: the macro did not work correctly when I simply put the action press Enter at the end. When I substituted Move and Click on the image of the Print button, it worked. Why would this be?

thanks again very much

Generally, no. Not with UI control. Definitely not with Find Image (which is looking at the same screen you see) and Click (which is clicking on the same screen you see).

The Find Image can only see what you can see, so it must be scrolled onto the screen. Same for clicking.

Stop the macro at that point (Cancel This Macro) and see if you can press Enter.

If so, then you probably just need a pause.

If not, then you have your answer.

Sorry I forgot to mention that - I meant to, but it slipped my mind. @JMichaelTX would tell you that this is another advantage of using XPath, and he would be 100% correct. Food for thought...

As Peter mentioned, if you can't see it, neither can KM. Look at it like this: If you did a screen shot, can you see the image you're looking for? If not, then neither can KM. This includes the window being covered by another app.

Consider KM as your personal assistant. You may not be pressing the keys, but something is. Sometimes the action happens so fast, you can't see it. Other times, like looking for Menu items, it actually does happen in a way that's not visible. But for the most part, you see it.

If it bothers you enough to want to do something about it, here's some tips with image finding that can speed things up, that I've discovered over time:

  1. The smaller the area that KM has to search, the quicker it can find the image. So if, for example, you have multiple monitors, then the option to search a specific monitor, or even better, the foreground application, will be quicker, but sometimes this doesn't work. Depends on the application.

  2. You could even restrict the area to be searched to a section of the screen, using x/y coordinates and width/height. On the one hand, this is guaranteed to break if you move things around. But if you always run the app full screen, it's less likely to break. And of course, if it does break, you can just change the area, or change back to search the entire screen(s). (@JMichaelTX is rolling his eyes right now, but seriously, it's an easy fix :slight_smile: )

  3. This one's tricky, but depending on the image being searched for, it can make a difference: Consider this image:

Now compare it to this image (I've added a border to it to make it clear in this post, but pretend the image doesn't have a border):

There's a lot of white space around this image, and that white space is going to match a lot of other white space on the screen, so it will take a while to find the unique part. I don't know if KM looks from the top left down, or the bottom right up, or something else, but the point is the same: Trim the image until the most unique portions are at the edges.

OK, I said this could make a difference, but the difference may not be noticeable. But I've had instances where I noticed the time difference, and even though it was in tenths of a second, perception is everything.
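The trimming tip can be sketched in a few lines of Python. This is only an illustration of the idea, not how KM searches: it assumes the image is a plain grid (list of rows) of pixel values and that the "white space" is a single known background value. The name `trim_border` is made up for this example.

```python
def trim_border(pixels, background):
    """Crop away outer rows and columns that contain only the background value.

    Hypothetical helper for illustration: `pixels` is a list of equal-length
    rows, and `background` is the pixel value treated as empty space.
    """
    # Rows and columns that contain at least one non-background pixel.
    rows = [i for i, row in enumerate(pixels)
            if any(p != background for p in row)]
    if not rows:
        return []  # the whole image is background, nothing unique to keep
    cols = [j for j in range(len(pixels[0]))
            if any(row[j] != background for row in pixels)]
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    # Keep only the box that spans the unique content.
    return [row[left:right + 1] for row in pixels[top:bottom + 1]]


# Assumed toy image: 0 is white space, other numbers are the unique part.
image = [
    [0, 0, 0, 0],
    [0, 1, 2, 0],
    [0, 3, 0, 0],
    [0, 0, 0, 0],
]
print(trim_border(image, 0))
```

In practice you'd just drag a tighter box when capturing the screenshot; the sketch only shows what "unique portions at the edges" means.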

Dan, I assume you meant this in the context of using Found Image.

In other areas, KM macros can operate in the background if the app/action does not require it to be frontmost or visible.

Here's an example:

  • I have a macro that runs every time I wake my Mac. It runs an AppleScript which resets some app prefs that get messed up often. Does NOT require any UI -- only copying files.

I was just referring to macros that manipulate UIs. Obviously things that can happen in the background do.

thank you for your answers.

  • how do you split my comments into multiple snippets which you can address individually? Superb way of interacting.
  • when I create a macro, I enter all the actions, disable all but the first, and look at how the macro functions action by action. I am sure that there must be a more intelligent and efficient way? I would rather ask the grand master to start on the right foot.

thank you for the tips and your example. I will remember small images: makes sense.

  • I could not find an action called xpath?
  • I remain confused. This must be the third time that I ask: with which tool does one find coordinates? Last time, we solved the issue by using the click center option.

@ronald First off, kudos for continuing to ask questions. I’ve found very few people who are as willing to say “I don’t get it yet” as you are - most people just give up. And that’s a shame, because there’s so much you can learn if you’re persistent. And you are, so keep it up!

As for the XPath action, two things. First, “XPath” is the name of a way to reference parts of a web page and XML (it’s more than that, but that’s enough for now). So it’s a generic term, not specific to KM.

Secondly, the action that was mentioned by @JMichaelTX was actually written by a member of this forum, and he says it’s broken. So we can’t use that.

As for finding the screen coordinates: In the Keyboard Maestro editor, there’s a menu item: Window -> Mouse display. You can use that to determine screen coordinates.

Just remember that in KM, when you refer to a screen area, it wants X,Y,W,H, which means Left, Top, Width, Height. So that means if you have the top left at 100,200 and the bottom right at 400,900, the width is 300 (400-100) and the height is 700 (900-200). Assuming I didn’t transpose anything, which I did until I proof-read what I typed. :slight_smile:
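If you'd rather not do that subtraction in your head, here's a small Python sketch of the same conversion. The function name `corners_to_area` is made up for this example; it just turns the two corner readings into the Left, Top, Width, Height order described above.

```python
def corners_to_area(left, top, right, bottom):
    """Convert top-left and bottom-right corners to (left, top, width, height).

    Hypothetical helper for illustration; the inputs are screen coordinates
    such as the ones you read off while hovering over each corner.
    """
    if right < left or bottom < top:
        raise ValueError("bottom-right corner must not be above or left of top-left")
    return (left, top, right - left, bottom - top)


# The example from the post: top left at 100,200 and bottom right at 400,900.
print(corners_to_area(100, 200, 400, 900))  # (100, 200, 300, 700)
```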

Oh - one other thing about finding images in areas. For some reason, I’ve had to specify an area larger than the target image, or KM won’t find it. Not sure why. I need to ask Peter about that. Anyway, if you decide to use areas, play around with the sizes.

ronald,

"XPath" is a web technology, not a KM Action name.

As I mentioned earlier, to see topics here about XPath, click this link:

Topics tagged with 'xpath'

For those not familiar with XPath, like me, you may find these helpful:

  1. Introduction to using XPath in JavaScript @ mozilla.org
  2. XPath Tutorial @ w3schools.com

@JMichaelTX - In case you missed my earlier comment, @ComplexPoint says his macro is broken right now. I was actually giving it a try!

See, an old dog can learn new tricks. Except, well, this one’s broken.

Dan, I never had any doubts about you learning new tricks.
Just take the broken bone back to the owner, and ask him to fix it. :smile:

I'd love to see you tackle XPath, along with me. Even though I recognize its value, I also have not spent that much time learning it.

So many new tools, so little time.