OCR problems

After switching to a new macOS system, I have a problem with OCR. After starting the program, OCR works only once and then stops working entirely. With Apple Text Recognition selected, however, it works without any problem.

Do you also have this problem?

I'm not sure what you mean by "then it does not function at all." Are you using it on the exact same image every time, or is the image changing? I can't give you an answer without more information. However, if Apple Text Recognition works for you, then what is the problem? I can't think of any case in which I would not want to use Apple Text Recognition. Since that works fine for you, why not use it?

Thank you for your message! I want to clarify that I am using OCR on the same image each time. However, the issue I’m facing is that Apple Text Recognition only supports English, and I need it to work with Polish text. This limitation is why I’m exploring other options, as I require OCR functionality that can handle the Polish language effectively. If you have any suggestions or alternatives that support Polish, I would greatly appreciate your input!

That's a valid concern. Apple's OCR does lack "polish" and it also lacks "Polish." (No applause required for that pun.)

It may be the case that Apple's OCR does support Polish, but KM certainly hasn't incorporated the Polish option in its interface. (@peternlewis? Here's a user that needs a different language.)

One thing about the non-Apple OCR is that sometimes the KM Engine needs to be rebooted in order for it to work reliably. When was the last time you rebooted the KM Engine? Try that and see if it helps.

See this post from KM's developer.

Yes, there's a problem using anything other than Apple Text Recognition, and that will be a problem if you are trying to OCR Polish text.

But if it is the same image each time, do you actually need OCR? Can you not just use the image itself for whatever you are doing?

I think he was using the “same image” just for testing and debugging purposes. He needs OCR to be able to read different Polish images containing text.

I fear you're right, but there's a slim chance OP is waiting for a button labelled "Składać" (thank you, Google Translate) or similar.

Yes, perhaps. I hadn't thought of that. But I'd say the odds are 2-1 in favour of my interpretation.

Is this in Sequoia or an earlier OS?

It appears the Tesseract code is frequently crashing with a memory corruption in Sequoia. I have no idea if this is a Tesseract bug or a Sequoia bug (the fact that it has worked fine in all previous versions of macOS does not necessarily mean the bug isn't in Tesseract).

I'm not sure that I am going to be able to resolve it, though. Resolving a memory corruption bug would be hard enough in my own code; it is virtually impossible in the system or in Tesseract's code. I will look at what options there might be and whether it might be possible to update the Tesseract code, but I can't reasonably do that until my development Mac is also running Sequoia, which will not be for some time. So Apple Text Recognition remains the only answer. And as you note, it is English only, at least as far as Keyboard Maestro supports. I'm not sure whether it is possible to expand it to support other languages. That is another thing I will endeavour to look into.

The error showed up after installing macOS 15; before that, everything worked without a problem.
Apple's built-in OCR still works now, but it doesn't handle Polish characters.

To solve this problem, I wrote a Python script that captures screen content and performs OCR on it using the pytesseract module. The script works without a problem. It took some work to get it running in a Python environment, but it worked out and I learned a new skill :)

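Before running it, install the pytesseract module in the Python environment. The script also imports Pillow (PIL) and calls the Tesseract executable, so those need to be installed too.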
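One way to confirm that Tesseract and its Polish data are actually in place before wiring the script into KM is sketched below. It is only a sketch under a few assumptions: that `tesseract --list-langs` prints one language code per line after a header line, and that the Polish data is named `pol` (the same code the script passes as `lang`). The helper names `parse_langs`, `has_language`, and `installed_langs` are made up for this illustration and are not part of pytesseract.

```python
import shutil
import subprocess


def parse_langs(output: str) -> list[str]:
    """Pull language codes out of the text printed by `tesseract --list-langs`."""
    langs = []
    for line in output.splitlines():
        tokens = line.split()
        # Language codes are single tokens (e.g. "pol"); the header line and
        # any warning lines contain several words, so they are skipped.
        if len(tokens) == 1:
            langs.append(tokens[0])
    return langs


def has_language(output: str, lang: str) -> bool:
    """Return True if `lang` appears among the listed language codes."""
    return lang in parse_langs(output)


def installed_langs(tesseract_cmd: str = "tesseract") -> list[str]:
    """Ask the Tesseract executable which languages it has installed.

    Returns an empty list if the executable cannot be found.
    """
    if shutil.which(tesseract_cmd) is None:
        return []
    result = subprocess.run(
        [tesseract_cmd, "--list-langs"],
        capture_output=True,
        text=True,
    )
    # Some Tesseract versions print the list to stderr, so check both streams.
    return parse_langs(result.stdout + "\n" + result.stderr)
```

Calling `installed_langs('/opt/homebrew/bin/tesseract')` and checking whether `'pol'` is in the result should show whether the Polish data is available. If it is missing, the OCR call in the script would be expected to fail rather than fall back to English.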

Script placed in location
In KM I launch it with an "Execute Shell Script" action:
source /bin/activate
python3 /ScreenOCRtoClipboard.py

Script:

#!/usr/bin/python3

import subprocess
import pytesseract
from PIL import Image
import os


def main():
    # Path where the screenshot is saved
    screenshot_path = '/temp_screenshot.png'

    # Path to the Tesseract OCR executable
    pytesseract.pytesseract.tesseract_cmd = '/opt/homebrew/bin/tesseract'  # Update this path if necessary

    try:
        # Capture a user-selected area of the screen
        subprocess.run(['screencapture', '-i', screenshot_path])

        # If the capture was cancelled, no file was written; nothing to OCR
        if not os.path.exists(screenshot_path):
            return

        # Run OCR on the image using the Polish language data
        text = pytesseract.image_to_string(Image.open(screenshot_path), lang='pol')

        # Copy the text to the clipboard
        process = subprocess.Popen(['pbcopy'], stdin=subprocess.PIPE)
        process.communicate(text.encode('utf-8'))

    finally:
        # Remove the temporary screenshot file
        if os.path.exists(screenshot_path):
            os.remove(screenshot_path)


if __name__ == '__main__':
    main()
