How to insert spaces in a "string without spaces between words"

I write KM macros to help me improve my experience (or my performance, hehe) playing Apple Arcade games. Most of the threads on this site are about solving "business problems" so I'm a little embarrassed that I'm trying to solve a problem with "game playing." It makes me look immature.

But here's my problem today which I'm stuck on. My macro obtains the names of the current players from the screen, using OCR, and when one player leaves or dies, my macro reads the names of those players aloud, so that I can hear who died. This works fine, and it helps me play the game. My question is more of a text-handling issue. Many of the players have names like "Mary" or words like "Winner", which are easy for the Speak Action in KM to pronounce, but about 50% of all players have multiple-word names without spaces between them, like "MrRobot" or "AlienDNA". What I want is ideas for how I can write code in a KM macro that will insert spaces into the name where the word breaks are probably intended. In the two cases I just mentioned (MrRobot and AlienDNA), I could probably insert a space just before a capital letter, and that would probably address those cases ("Mr Robot", "Alien D N A.") But that won't solve all the cases. What about names like "youbelongtous", for example? That's an example where there might be multiple valid ways to break up the string.
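The "insert a space just before a capital letter" idea can be sketched in a few lines of Python. This is only an illustration, not a finished KM macro: the function name `split_on_capitals` and the `spell_acronyms` option are made up for this example, and the rule for where a run of capitals ends is an assumption about what sounds best when spoken.

```python
import re

# Zero-width boundaries where a space should go:
#   1. between a lowercase letter or digit and a capital ("Mr|Robot")
#   2. between a run of capitals and a capitalized word ("DNA|Test")
_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])")


def split_on_capitals(name, spell_acronyms=False):
    """Insert spaces at capital-letter boundaries in a player name.

    With spell_acronyms=True, all-caps chunks such as "DNA" are
    additionally spread out letter by letter ("D N A").
    """
    words = _BOUNDARY.sub(" ", name).split()
    if spell_acronyms:
        words = [" ".join(w) if len(w) > 1 and w.isupper() else w
                 for w in words]
    return " ".join(words)


print(split_on_capitals("MrRobot"))
print(split_on_capitals("AlienDNA", spell_acronyms=True))
```

Names with no internal capitals, such as "youbelongtous", pass through unchanged, so this only covers the mixed-case portion of the problem.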

I don't really care about perfection or ambiguities in any solution. I know there will be unsolvable or ambiguous situations. But I'm looking for tips that will address the majority of the problems. I do have some ideas for solving this, but I have a feeling that there may be better ideas than mine, so I'm asking here today. For example, is there a macro that can perform spell checking on a string variable? If so, that might help to create a great approach. Or maybe I could paste the string into a Google search box, which has the interesting ability to correct your spelling? Has anyone done this? Or what about trying to trick macOS into using its auto-correction feature?

P.S. There isn't even a "tag" for "games" on this website, which shows how few people are using KM to solve problems with games. Maybe we could create a topic category called "Games" for this site so we don't have to suffer the embarrassment of asking in such a business-oriented environment.

I've never even thought about a task like this, and had zero idea how to proceed. And here's where I'd argue tools like ChatGPT shine: I don't want an answer (necessarily), I just want some direction. Sure, it may not actually be artificially intelligent, but it can find references much faster than I can using web search tools.

So I asked it for some direction:

Me:
Are there any Unix libraries that can help split strings into words, where the string is a series of connected common words, like youdonotlivehere?

ChatGPT:
Yes, there are several libraries and tools available on Unix-based systems that can help split strings into individual words, particularly when the string is a concatenation of common words. Here are a few options you can consider…

And it offered quite a few solutions; here's the full exchange. I tested the wordninja solution, as it looked the simplest, and it worked great with your test name:

>>> import wordninja
>>> text="youbelongtous"
>>> words = wordninja.split(text)
>>> print(" ".join(words))
you belong to us

This wouldn't solve all possible names, especially as it's keyed around real words. But maybe it could be the first pass, and you get fancier if that doesn't help?

-rob.

Not clear, alas, that it's responsible to recommend trying libraries, real or imagined, "recommended" by LLMs.

From the article:

"Once you know about the threat posed by package hallucinations, there’s one simple fix: Double-check everything that ChatGPT tells you before you actually believe it."

Which is exactly what I did, and found that all of the things it told me about were legit:

  • Natural Language Toolkit has been around for over 19 years.
  • wordninja has been around for seven years.
  • The third solution it recommended uses the built-in dictionary, which has been around…I have no idea, as long as Unix has?

In any event, I'm in complete agreement with the article: Check everything ChatGPT…strike that…check everything anyone tells you about packages, apps, etc. It's just common sense.

And with that said, I stand by what I said: LLMs can help us get stuff done. ChatGPT just saved me probably 15 to 20 minutes of complicated web searching and trial and error to find the answer to the question I wanted answered. You seemingly despise LLMs, but they have a role to play, especially for those of us who aren't professional coders.

And to prevent this from going offtrack, let's move any further discussion to the existing lounge thread on this very topic.

-rob.

If there are no better ideas, I was planning to write my own solution using the approach in the example shell script solution you cited. It looks like the idea I had in my mind.
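For anyone following along, here is a minimal sketch of what a dictionary-based split might look like in Python. It is not the shell script from the linked exchange, just one plausible version of the same idea: try every way of cutting the string into known words and keep the cut that uses the fewest words. The function name `segment`, the tiny inline word list, and the "fewest words wins" rule are all assumptions made for illustration.

```python
def segment(text, words, max_len=20):
    """Split text into known words, preferring the fewest words.

    Returns the spaced-out string, or None if the text cannot be
    built entirely from entries in `words` (a set of lowercase words).
    """
    lowered = text.lower()
    n = len(lowered)
    # best[i] holds (word_count, start_of_last_word) for text[:i],
    # or None when text[:i] cannot be split into known words.
    best = [None] * (n + 1)
    best[0] = (0, 0)
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            if best[start] is None or lowered[start:end] not in words:
                continue
            count = best[start][0] + 1
            if best[end] is None or count < best[end][0]:
                best[end] = (count, start)
    if best[n] is None:
        return None
    # Walk the recorded split points backwards to recover the words.
    parts = []
    end = n
    while end > 0:
        start = best[end][1]
        parts.append(text[start:end])
        end = start
    return " ".join(reversed(parts))


# Tiny hand-picked word list, for illustration only.
WORDS = {"you", "be", "long", "belong", "to", "us", "do", "not", "live", "here"}

print(segment("youbelongtous", WORDS))
```

A real word list could be loaded from the system dictionary instead (on many Unix systems that is `/usr/share/dict/words`, though the path is an assumption). Such lists tend to include single letters and obscure words, so the splits can come out differently from what this small list produces.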

It's almost convincing me to use ChatGPT to solve my problems. It was so fast creating that shell script. It would have taken me many hours to write that script. But I enjoy writing code. So I'll stick to brain power for a while yet.

I still think my "capital letter hack" might solve about 20% of my problem with a single KM macro. I'm still pondering other simple ideas that may solve another 20%.

Yeah, if there are capital letters, that'll work great. I didn't bother answering that one, as I figured you had it handled. I was more intrigued by the concept of splitting merged words, and figured someone must have written a library to help with that at some point in time. And I was right! :slight_smile:

So now I have two tools bookmarked for possible future use, which is great.

-rob.

I’m not a professional coder, and I have no objection to anyone experimenting with LLMs.

But I don’t think it’s relevant or responsible to promote them here.


If you have found a library which seems relevant to a problem just tell us.

(I don't think that help with the use of Keyboard Maestro needs a look at the details of my LLM session!)

I provide such references so that people don't assume that I'm the source of the knowledge. If I am, obviously, I don't provide a link or mention an LLM. But in this case, I felt it important to point out where I got the knowledge I was providing, so that others can do their own due diligence on the results—because as you point out, there are pitfalls to using LLMs.

On the other bit, I posted a reply over on the Lounge thread.

-rob.

A large language model has many interesting properties,
but it is never, in any sense, a "source of knowledge".

It's just a source of random language.

In addition

  1. There's no need for us to show our working (look – here are the details of all the cool google searches I tried !!), and
  2. if we want to cite an authority, then we can cite some institution or publication which seems likely to inspire a degree of confidence or reassurance.

That can never be a tame stochastic parrot, or random cliché generator.

I'm not going to reply here any longer, because it's off topic, something we've been asked to avoid. I will reply in the Lounge.

-rob.

Yes – proselytising for OpenAI was the point at which it went off topic.

I'm sorry to have caused such a horrible fight between two leaders of this website.

If it's any consolation to one of you, I think one of you is on firm moral ground. But I won't say which one.

Ironically, splitting up a concatenated string is something ChatGPT is good at, so rather than get it to code for us, we could get it to do the task itself.

If only I could get that confounded API call macro going...

I think it's less than three weeks until Apple's AI announcement. I'm not investing any time in AI before then, in case Apple adds support for AI, perhaps even at the API level.

You are very polite to ChatGPT. In all the prompts that I have seen, I never saw the word “please”.

I'm hoping it will remember and spare me on Judgment Day.

It may not be a deity, but it could be a dAIty.

It was a poorly executed Skynet reference.