OCR Action and macOS Live Text

I use OCR in KM every day, even though I haven't posted here for years. The OCR action in KM is a little quirky and sometimes inaccurate (and sometimes requires reloading the KM engine to start working again). But today Apple announced a new OCR feature in macOS (presumably coming with Monterey, or maybe earlier) called Live Text, which is built into the OS. Will the KM code that implements OCR switch to the built-in OCR going forward? If not, I'll try to create a KM macro that does this with the KM screen capturing action.

The explanation of Live Text says it "works with screenshots" and that it would "work with macOS." I think they said it would work with 7 languages. I am guessing that Apple will have an API to allow programs like KM to access the text. So will KM users be able to benefit from this new technology?

Yes, video here, starts at 6:28:

I'd say either OCR feature, the KM one or the macOS one, will be used interchangeably, like using KM to open Preview and Select All text.

When you say "will be used interchangeably" are you saying KM will, or will not, use the OCR engine in macOS instead of using its custom engine?

Since this new OCR feature will be native to macOS 12 (Monterey), iPadOS 15 and iOS 15, my guess is that KM will keep its own internal OCR engine to stay compatible with previous operating systems.

Update:

Monterey's Live Text will only work on Macs with M1 chips and will not be available on Intel-based ones:

That's an excellent point. Hmm, are you implying that KM should never support a feature that's available only on M1 Macs? Or do you think eventually KM should be allowed to drop support for Intel Macs? It's quite a shame that great new features of macOS cannot be supported just because the old processor doesn't support them.

Peter will have to make a tough decision whether or not to support M1-exclusive features, especially when the Intel machines are going to be unsupported by Apple at some point in the future. My suggestion is that for this particular feature he create an Action that uses the new M1 features (like the amazing OCR that's built into Monterey), whereas the old action can remain supported for as long as Peter wants to support Intel. This won't be the only feature Apple will ever introduce that is exclusive to M1.

In this case it would be especially helpful because the OCR that comes with KM is very finicky. I don't want to list all the problems with the KM OCR, but it often requires a reboot of the KM engine to start working again, and it often gets into a mode where its accuracy drops from about 97% to about 5% for no apparent reason. However, I have learned that I can usually force the KM OCR to start working again by switching languages.

Agreed. Since KM has a lot of focus on providing backwards compatibility with older operating systems, it'll be very useful to be able to run either of the two OCRs (KM's or Monterey's). I'd prefer to be able to select a default OCR as a setting in KM Preferences, and forget about it.

Great point. Peter recently mentioned in another thread that he's actively developing KM version 10, so we'll see :slight_smile:

I can't wait to switch to an M1 Mac. And you?

I've just started using my M1 Mini today. I bought it last year but hadn't used it until now because my iMac was still surviving. I bought a monitor for the Mini yesterday and did the OS updates, and it's now working with my new 4K monitor.

The first thing I want to say is that it physically runs very cold, unlike my Intel Mini, which I could use as a plate warmer for my meals even when just watching a YouTube video. The second thing is that the M1's GUI feels really quick at everything compared to my iMac. Starting Safari takes a small fraction of a second. Starting Pages takes one second, plus another second once you choose your template. The only thing that feels slow is rebooting (15 seconds to shut down, 30 to reboot), although maybe it was installing some software when I measured that.

So far I'm very happy with my M1. My LG 4K monitor is almost as good as the 5K iMac monitor.

It's a tough decision for Peter if/how/when to roll out M1-exclusive features. I can see why he might not do it immediately, but I'm sure he'll come around to it eventually. It is inevitable.

I currently don't know if the Live Text facility is available as an API or not, so I don't know if Keyboard Maestro can support it.

I have no problem supporting M1-only or OS version specific features in Keyboard Maestro (for example, Keyboard Maestro supports Dark Mode but that is not available in OS X 10.11 which Keyboard Maestro supports). Obviously, restrictions on where a feature is available factor in to its utility and thus my decision as to whether to support it or not, but there is no principle against supporting a feature that is only available to some Macs.

That's a helpful answer and a good philosophy. Thanks, Peter. If I have not said it recently, you are awesome. I hope and pray that there's an API for Live Text, and that you decide to support it. It seems like a very useful new feature and seems to fit with KM's raison d'être. I'm still in awe of how you implemented OCR in KM a few years ago. My feeling is that you could provide a new/distinct Action to support Live Text, but I'm not sure if you should replace, update or supplement the existing OCR condition. You always make great decisions.

I just found this thread and would also like to express my interest in macOS Monterey's very good OCR. The quality is really amazing - unfortunately I have to figure out a workaround with simulated mouse movements to use the feature with KM.
