The built-in title case filter in Keyboard Maestro distinguishes between "small words" that should not be capitalized and words that should, but it doesn't recognize acronyms.
Take, for example, these fake headlines:
The sub-report of the ACI committee
C.I.A. Defeats F.B.I. in Smart Softball match-up
Lightroom CC Includes Significant ACR Revisions
OpticsPro takes on lightroom cc and phase one capture one pro
This is how the built-in filter applies capitalization:
The Sub-Report of the Aci Committee
c.i.a. Defeats f.b.i. In Smart Softball Match-Up
Lightroom Cc Includes Significant Acr Revisions
Opticspro Takes on Lightroom Cc and Phase One Capture One Pro
John Gruber wrote a Perl routine in 2008 to handle title case, which was refined that same year by Aristotle Pagaltzis and put in the public domain (https://gist.github.com/gruber/9f9e8650d68b13ce4d78). It applies capitalization to those same headlines like this:
The Sub-Report of the ACI Committee
C.I.A. Defeats F.B.I. In Smart Softball Match-Up
Lightroom CC Includes Significant ACR Revisions
OpticsPro takes on lightroom cc and phase one capture one pro
The main difference is that when the Gruber/Pagaltzis routine detects an acronym, it leaves it alone, although it won't presume one if it's lowercase.
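For anyone curious how that "leave it alone" rule plays out, here is a minimal Python sketch of the idea. It is not the Gruber/Pagaltzis Perl code: the function names and the short small-word list are assumptions made for illustration, and it ignores hyphens, punctuation and the other cases the real routine handles.

```python
import re

# Assumed small-word list for illustration only; the real script's list differs.
SMALL_WORDS = {"a", "an", "and", "as", "at", "but", "by", "for",
               "in", "of", "on", "or", "the", "to"}

def title_case_word(word, is_first):
    # A capital anywhere after the first character (ACI, C.I.A., OpticsPro)
    # is taken as intentional, so the word is returned untouched.
    if re.search(r"[A-Z]", word[1:]):
        return word
    # Small words stay lowercase unless they open the headline.
    if word.lower() in SMALL_WORDS and not is_first:
        return word.lower()
    # Everything else gets an initial capital.
    return word[:1].upper() + word[1:]

def simple_title_case(text):
    words = text.split(" ")
    return " ".join(title_case_word(w, i == 0) for i, w in enumerate(words))

print(simple_title_case("the report of the ACI committee"))
print(simple_title_case("lightroom cc includes acr revisions"))
```

The second call shows the limitation mentioned above: a lowercase "cc" or "acr" carries no hint that it is an acronym, so it is capitalized like any other word.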
USAGE
Select the text you want to convert and type the Hot Key, which has been set to Shift-Option-T. Your selection will be replaced with the capitalized string.
Nice to know (about Gruber). Maybe the next version of Keyboard Maestro can drop in the improved version.
The hashtag and @-sign tags are not something I’ve seen in standardizing headlines, which is what I’ve been using it for. You’d think you’d just have to add a [@#]\w+? to the small words array but that doesn’t fly.
I’m happy to update it, but as far as I can tell, the current Title Case script used by the Title Case filter does not actively lowercase anything, so if you start with “C.I.A.” you’ll end with “C.I.A.” unless you lowercase it first.
I can’t see how any non-sentient filter could figure out how to uppercase “aci” as an acronym if it was not originally “ACI”, in which case the current filter will leave it uppercase.
I completely forgot that I did in fact run the lowercase filter on the string before the title case filter when using the built-in filter. That would have killed the acronyms, as you point out.
I think the reason for that was to knock down an all caps string so it could be properly capitalized. Which suggests no one approach handles everything. If you have a string with acronyms, you don’t want to lowercase the string. But if you have all caps, you do.
Maybe, though, the code could see if there are any lowercase characters in the string to begin with. The improved version does do that, I see.
You can’t do Title Case on the first because it’s already all caps. So you would lowercase it first. But when you do that, you lose the acronym.
You can do Title Case on the second without losing the acronym, though.
The improved Title Case does look at the text to see if it has any lowercase letters. If not, it lowercases the text:
$_ = lc $_ if not /[[:lower:]]/;
That still loses the acronym but at least it edits the text. You just have to restore the acronym. And if the string is not all caps, you’re fine.
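To make the behaviour of that Perl line concrete, here is a rough Python equivalent. The function name is made up for illustration, and Python's `str.islower` is only an approximation of Perl's `[[:lower:]]` class.

```python
def lowercase_if_all_caps(text):
    # Roughly mirrors: $_ = lc $_ if not /[[:lower:]]/;
    # Lowercase the whole string only when it has no lowercase letter at all.
    if not any(ch.islower() for ch in text):
        return text.lower()
    return text

# All caps: lowercased so it can be title-cased, but the acronym is lost.
print(lowercase_if_all_caps("ACI COMMITTEE REPORT"))
# Mixed case: left untouched, so the acronym survives.
print(lowercase_if_all_caps("ACI Committee Report"))
```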
(I’d been using the built-in Title Case after a Lowercase and forgot about that Lowercase when I said the improved macro handled acronyms and Title Case didn’t. I was killing the acronyms that Title Case would have left alone.)
The only way I can think of to deal with acronyms in all caps is to have a lookup list. I don't know if that is worth the effort or not, or if it would slow down the process too much. It's just an idea.
Maybe it could work like spell check. The system provides a standard set of acronyms (available on the Internet), and then the user could have a custom list he/she could add to.
Well, it’s only an issue with a string that’s all caps so you could deviously avoid selecting any acronym in such a string (since it’s already correct) and process the rest.
I don't think that would work because pretty much every short sequence of letters is an acronym for something. The best you could do would probably be the reverse: if it is not an English word, then assume it's an acronym. But that would fail for all sorts of things too (e.g. made-up words, truncated words, slang, …).
Yep, you're right. That's why it was just an idea, an untested idea at that.
What about if the user provided a list when he wanted to check for acronyms?
Maybe that's a separate KM Macro, to use after the Titlecase function/filter.
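As a rough sketch of what such a follow-up step might look like, here it is in Python rather than as a Keyboard Maestro macro. The function name and the idea of passing the list as an argument are assumptions for illustration.

```python
import re

def restore_acronyms(text, acronyms):
    # For each user-supplied acronym, replace any whole-word,
    # case-insensitive match with the spelling given in the list.
    for acronym in acronyms:
        pattern = r"\b" + re.escape(acronym) + r"\b"
        text = re.sub(pattern, lambda _m, a=acronym: a, text, flags=re.IGNORECASE)
    return text

print(restore_acronyms("Lightroom Cc Includes Significant Acr Revisions",
                       ["CC", "ACR"]))
```

Keeping the list short and supplied by the user sidesteps the problem raised above: a large general list would also hit ordinary words (an entry like "IT" would uppercase every "it").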
I did this years ago for someone on the BBEdit-Talk list using the Satimage.osax and a very large list of acronyms (about 1400). The thread on BBEdit-Talk is (here).
The downloadable file has the entire list of acronyms included.
The bare AppleScript has ONLY a sample of the large number of acronyms in the downloadable file.
The variable fixedCaseWordList contains words that are to be formatted as they are IN the list and is very easy to add new items to.
As is, the script works on the selection in BBEdit, but that's easy enough to change.
------------------------------------------------------------------------------
# Auth: Christopher Stone <scriptmeister@thestoneforge.com>
# dCre: 2015/10/21 17:20
# dMod: 2017/05/18 20:01
# Appl: BBEdit & the Satimage.osax
# Task: Change case of selected text to title-case.
# Libs: None
# Osax: Satimage.osax – http://tinyurl.com/satimage-osaxen
# Tags: @Applescript, @Script, @BBEdit, @Change, @Title, @Case, @Title_Case, @Selected, @Text, @Acronym
------------------------------------------------------------------------------
# SCRIPT REQUIRES INSTALLATION OF THE SATIMAGE.OSAX APPLESCRIPT EXTENSION!
------------------------------------------------------------------------------
set fixedCaseWordList to paragraphs 2 thru -2 of "
a
AFL-CIO
an
and
as
at
but
by
for
from
in
into
it
NAACP
nor
of
on
onto
or
so
the
to
with
"
set lowerCaseWordRegEx to change "(.+)" into "\\\\b\\1\\\\b" in fixedCaseWordList with regexp without case sensitive
set _text to getSelectionOfNamedBBEditWindow(1)
set _text to titlecase _text
set newText to change lowerCaseWordRegEx into fixedCaseWordList in _text with regexp without case sensitive
set newText to change "^(\\w)" into "\\u\\1" in newText with regexp without case sensitive
set newText to change "(\\w)(\\w*)$" into "\\u\\1\\2" in newText with regexp without case sensitive
setBBEditTextSelectionTo(newText)
------------------------------------------------------------------------------
--» HANDLERS
------------------------------------------------------------------------------
on getSelectionOfNamedBBEditWindow(_window)
    tell application "BBEdit"
        if _window = "front" then
            set _window to window (name of front text window)
        else if class of _window = integer or class of _window = text then
            set _window to text window _window
        end if
        tell _window to return contents of selection
    end tell
end getSelectionOfNamedBBEditWindow
------------------------------------------------------------------------------
on setBBEditTextSelectionTo(_text)
    tell application "BBEdit" to set contents of selection's text to _text
end setBBEditTextSelectionTo
------------------------------------------------------------------------------
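For readers without BBEdit or the Satimage.osax, the three steps of the script above (title-case everything, force the listed words back to their fixed spelling, then re-capitalize the first and last word) can be approximated in Python. This is only a sketch under assumptions: the function name is hypothetical, the word list is a short sample, and `str.capitalize` stands in for the osax's titlecase command, which may behave differently.

```python
import re

# Short sample of a fixed-case list; each entry is spelled exactly
# as it should appear in the output.
FIXED_CASE_WORDS = ["a", "AFL-CIO", "an", "and", "for", "in",
                    "NAACP", "of", "the", "to", "with"]

def title_case_with_fixed_words(text):
    # Step 1: capitalize every word (stand-in for the titlecase command).
    result = re.sub(r"\w+", lambda m: m.group(0).capitalize(), text)
    # Step 2: force each listed word back to its fixed spelling.
    for word in FIXED_CASE_WORDS:
        pattern = r"\b" + re.escape(word) + r"\b"
        result = re.sub(pattern, lambda _m, w=word: w, result, flags=re.IGNORECASE)
    # Step 3: re-capitalize the first character of the text and of the last word.
    result = re.sub(r"^(\w)", lambda m: m.group(1).upper(), result)
    result = re.sub(r"(\w)(\w*)$", lambda m: m.group(1).upper() + m.group(2), result)
    return result

print(title_case_with_fixed_words("the naacp and the afl-cio meet in"))
```

Because the small words and the acronyms live in the same list, adding a new acronym is a one-line change, which is the appeal of the approach.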
I'm trying to adapt your macro to not replace whatever was on the clipboard previously. Copying to a named clipboard doesn't do it. Deleting past clipboard works only if followed by Display Clipboard. With Display Clipboard disabled or removed, the text processed by the Improved Title Case shell script remains on the clipboard, not the previous clipboard contents.
The same is true if I set the shell script to save results to a named clipboard and then paste from there.