OCR Features

Keyboard Maestro 9 adds support for OCR (Optical Character Recognition) of both the screen and images.

There are new OCR Screen and OCR Image actions, as well as an OCR condition. The OCR actions use Tesseract OCR, and you can select the desired language. The first time you use a language, Keyboard Maestro will download the trained data file (they vary in size, but are typically a few megabytes).

There is a default macro in the Clipboard Filters macro group called OCR Image that will OCR the system clipboard. You can use this from the action (gear) :gear: menu in the Clipboard History Switcher. This can be a great simple way to copy text that is not selectable (such as text from web page images, or from application interfaces or whatever).

The OCR quality tends to be quite good for screen shots, but more varied for scans or pictures.

You can also use this action in your macros to extract text from an application or web page (from web pages, you would often be better using JavaScript to extract the text, but sometimes OCR is easier).

You can also combine this with the Find Image on Screen action to find something near the text you want to OCR, and then OCR the area relative to that matched image.
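As a rough illustration of the "area relative to that matched image" arithmetic, here is a minimal Python sketch. It assumes the match is available as left, top, width, and height in screen coordinates; the `Rect` type, the `area_relative_to` helper, and the sample numbers are all hypothetical and are not part of Keyboard Maestro.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """A rectangle in screen coordinates (hypothetical helper type)."""
    left: int
    top: int
    width: int
    height: int


def area_relative_to(match: Rect, dx: int, dy: int, width: int, height: int) -> Rect:
    """Return an OCR area offset from the top-left corner of a matched image."""
    return Rect(match.left + dx, match.top + dy, width, height)


# Hypothetical match: a 40x20 label found at (500, 300).
label = Rect(500, 300, 40, 20)

# The text of interest is assumed to start 10 pixels to the right of the label.
ocr_area = area_relative_to(label, dx=label.width + 10, dy=0, width=200, height=20)
print(ocr_area)
```

The same four numbers are what you would feed into the area fields of the OCR action, however you compute them in your own macro.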


Thanks Peter, for a really great feature, that I now often use.

While the new OCR features allow you to select the entire screen to OCR, there is no means to select only a region of the screen to OCR. The below macro by @ccstone provides this function.


To be clear, the OCR action can OCR any section of the screen, but Keyboard Maestro has no UI for selecting an area on the screen. There is no “Prompt for Area on Screen” action for example, which is what @ccstone’s macro provides.

I thought that's what I said:

Was that not clear to you?

No, it was not, since you can select an area to OCR, eg:

image|313.5x157

But there is no UI to select an area.

Hence the clarification: while you can OCR a selected area, there is no native Keyboard Maestro UI for selecting an area.

OK, I'm glad I asked, because that was not clear from your post. :wink:
Also, the KM Wiki article on the OCR Screen or Image action is really not clear on this. I had to reread it several times before I found this buried in the text:

When reading from the screen, you can get the image from all screens, or from a specific screen or window or area on the screen.

We need to revise the Wiki to make all of this more clear, and show some example screen shots of doing an OCR from other than the Clipboard.

So, to recap:

  • KM can OCR a region of the screen, as specified in the OCR Screen or Image action.
  • But KM does not provide a UI, like the one in macOS, that lets the user select the area to be OCR'd with the mouse.

Actually, there's a shell script that makes use of the partial screenshot function built into OS X. I don't know where I found it, but it works like a charm. You get the on-screen crosshairs to select your screen grab, and the text is sent to the clipboard. All credit goes to the original poster.
Screenshot to OCR:Clipboard.kmmacros (2.2 KB)
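For anyone curious how such a script can work, here is a minimal Python sketch of the general idea. It is not the script from the attached macro. It assumes the macOS `screencapture` and `pbcopy` tools plus a separately installed `tesseract` command-line binary on the PATH; the function names and the `lang` default are made up for illustration.

```python
import subprocess
import tempfile
from pathlib import Path


def build_commands(image_path: str, lang: str = "eng") -> list[list[str]]:
    """Build the capture and OCR command lines (no commands are run here)."""
    return [
        # -i = interactive mode: shows crosshairs so the user can drag a region.
        ["screencapture", "-i", image_path],
        # "stdout" as the output base makes tesseract print the recognized text.
        ["tesseract", image_path, "stdout", "-l", lang],
    ]


def ocr_selection_to_clipboard(lang: str = "eng") -> str:
    """Let the user pick a screen region, OCR it, and copy the text (macOS only)."""
    with tempfile.TemporaryDirectory() as tmp:
        image_path = str(Path(tmp) / "selection.png")
        capture, ocr = build_commands(image_path, lang)
        subprocess.run(capture)
        if not Path(image_path).exists():
            return ""  # no image was written, e.g. the selection was cancelled
        result = subprocess.run(ocr, check=True, capture_output=True, text=True)
        subprocess.run(["pbcopy"], input=result.stdout, text=True, check=True)
        return result.stdout


# On a Mac with tesseract installed, call: ocr_selection_to_clipboard()
```

Splitting out `build_commands` keeps the command lines easy to inspect or tweak without actually triggering a capture.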

That is what is used in the above Macro by @ccstone.

I've been using the OCR scan of a rectangle, looking for a bit of text before my macro continues, and I've found that it works maybe 70% of the time. The rest of the time it never recognizes the text it's looking for, even though the circumstances should be identical in all cases.
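One way to make a "wait until this text shows up" check less brittle is to poll several times and compare normalized text rather than exact strings. The Python sketch below only illustrates that pattern; `read_text` stands in for whatever performs the OCR, and the function names, timeout, and interval are assumptions rather than anything Keyboard Maestro provides.

```python
import re
import time
from typing import Callable


def normalize(text: str) -> str:
    """Lower-case and collapse whitespace so small OCR differences matter less."""
    return re.sub(r"\s+", " ", text).strip().lower()


def wait_for_text(read_text: Callable[[], str], wanted: str,
                  timeout: float = 10.0, interval: float = 0.5) -> bool:
    """Poll read_text() until it contains wanted, or give up after timeout seconds."""
    deadline = time.monotonic() + timeout
    target = normalize(wanted)
    while True:
        if target in normalize(read_text()):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in for an OCR read: returns nothing twice, then the text with odd spacing.
readings = iter(["", "", "Log  In\n"])
found = wait_for_text(lambda: next(readings, ""), "log in", timeout=2.0, interval=0.01)
print(found)
```

This does not help when the OCR never reads the text correctly at all, but it does cover the case where an early read happens before the window has finished drawing.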

@jwiegley, if you can provide an example image that the KM OCR is not working for, then maybe we can offer some suggestions.

Also, I'm not sure I know what you mean by "scan of a rectangle", but maybe you could try this macro and select only the text you want to OCR:
OCR -- User-Selected Area by @ccstone

Sure, JMichaelTX, here's the window I'm trying to match text within:

And here's the rule I'm using to do the matching:

It works more than half the time; it's mainly the uncertainty of it that I find puzzling.

John

I assume that you want to enter your TOS PW, right?
You can set it so your username is autofilled.

Just so happens I have a macro that does exactly that.
You may need to replace the image I have for the TOS Login button:
image

This macro is triggered when TOS launches, and pauses until that button appears.
It then clicks in the password field and types the PW.

Maybe you have another method of getting your PW, but I decided to use the macOS Keychain. If you use that then you will need to have created a Keychain for it, and use that name in the Action that sets the KM Variable to your password.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

MACRO:   Enter TOS Login When @TOS Launches [Example]

**Requires: KM 8.2.4+   macOS 10.11 (El Capitan)+**
(Macro was written & tested using KM 9.0+ on macOS 10.14.5 (Mojave))

#### DOWNLOAD Macro File:
<a class="attachment" href="/uploads/default/original/3X/d/f/df9b121196f823234ff8204422c5ac09b7e7269a.kmmacros">Enter TOS Login When @TOS Launches [Example].kmmacros</a>
**Note: This Macro was uploaded in a DISABLED state. You must enable before it can be triggered.**


---


<img src="/uploads/default/original/3X/b/5/b5293c3dde916e09ec84594c0554abe7a7ca3a52.png" width="535" height="1126">


This works 99% of the time.  For the other 1%, I have a companion macro triggered by a hotkey that simply calls the above macro.

![image|535x328](upload://3OYdQ1Ug4DyJqRALcyS541CZUOg.png) 

`~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~`

Let us know if this works for you, or if you have questions.

Ah, a fellow TOS user, great. :slight_smile: Your idea of using the button seems much more reliable, so I've switched my macro to do that. Thank you!

To answer the original question, since the text is grey on grey, Tesseract OCR may be having trouble reading it.
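To show why grey-on-grey is hard, and what contrast preprocessing could look like in principle, here is a small pure-Python sketch on a single row of made-up grayscale values. The helper names and the threshold are hypothetical, and whether this kind of preprocessing would actually help Tesseract with this particular window is untested.

```python
def stretch_contrast(pixels: list[int]) -> list[int]:
    """Linearly rescale grayscale values so the darkest is 0 and the brightest is 255."""
    lo, hi = min(pixels), max(pixels)
    if lo == hi:
        return [0] * len(pixels)  # flat image: there is nothing to separate
    return [round((p - lo) * 255 / (hi - lo)) for p in pixels]


def binarize(pixels: list[int], threshold: int = 128) -> list[int]:
    """Turn grayscale values into pure black (0) or white (255)."""
    return [255 if p >= threshold else 0 for p in pixels]


# Made-up row of pixels: background around 120, text around 150 (grey on grey).
row = [120, 122, 150, 151, 121, 149]

stretched = stretch_contrast(row)
print(stretched)
print(binarize(stretched))
```

In the raw row, text and background differ by only about 30 levels out of 255; after stretching, they sit at opposite ends of the range, which is the kind of separation an OCR engine has an easier time with.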

A post was split to a new topic: How Do I OCR a Pre-Determined Area on a Window?