Select a sentence in Pages

Hey Folks,

Apple in their infinite wisdom has decided that the user NEVER needs to work with a selection while scripting – at least in Pages.

If you have time please complain and disabuse them of this idiotic notion that completely disrespects and neglects the needs of their users.

To solve this problem you have to think fairly far out-of-the-box, but it's doable.

This macro is a proof-of-concept and will require adjustments to work perfectly with all possible punctuation.

-Chris


Pages -- Select Sentence.kmmacros (7.8 KB)

Well done, Chris. Using the macOS Find Pasteboard never occurred to me, but it is clearly a great solution here, well out-of-the-box. :+1:


Chris, one follow-up just to make sure I'm understanding your setup.
It appears that the user should first click in the Pages text just before the start of the sentence to be selected, correct?

Good point. Perhaps one obvious limitation of your RegEx:
\b(\w[^.?!]*ˆ[^.?!]+[.?!])

is that it will fail to properly identify the sentence IF the sentence includes a period anywhere within the actual sentence, as in:

Here is a sentence with abbrev. and acronyms like U.S. within the actual sentence. This starts a new sentence.
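To make the failure mode concrete, here is a small Python sketch. It is not the macro's RegEx; `NAIVE_SENTENCE` is a simplified pattern I am assuming for illustration, which treats any `.`, `?`, or `!` as the end of a sentence.

```python
import re

# Simplified pattern (an assumption for illustration, not the macro's RegEx):
# a run of non-terminators followed by a single terminator.
NAIVE_SENTENCE = re.compile(r"[^.?!]+[.?!]")

text = ("Here is a sentence with abbrev. and acronyms like U.S. "
        "within the actual sentence. This starts a new sentence.")

# The first "sentence" stops at the period in "abbrev."
first = NAIVE_SENTENCE.match(text).group(0)

# Every period counts as a boundary, so two real sentences become five pieces.
pieces = NAIVE_SENTENCE.findall(text)

print(first)        # Here is a sentence with abbrev.
print(len(pieces))  # 5
```

Any pattern that relies on terminator characters alone will cut this text in the same places.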

@ccstone and I seem to be thinking along the same lines, but I'm stuck on the RegEx. I've tried his solution and several other syntaxes, and I always get the error "search regular expression failed to match" at this point. I have checked my clipboard (or the variables, when trying them) by doing a "display" on it just before running the regex, and they do contain the correct text. Do you know what I might be doing wrong here?

Side note: You can completely cut out the AppleScript bit by selecting from the cursor to the end of the paragraph with Option-Shift-Down Arrow. If we really want to be fancy and make it so we don't have to move the cursor to the start of the sentence, we can do an Option-Shift-Up Arrow, use regex to move to the start of the sentence, and enter your ^. Alternatively, save that segment as its own variable to be concatenated later (this was the path I was going down, but the ^ solution does seem like it might be more straightforward).

Edit: The regex I'm using to get the second half of the sentence ^[^.?!]*+[.?!] is working. But the regex I'm trying to use to get the first half of the sentence [^.?!]*[.?!]$ is not, despite working on regex101, regexr, and regextester. More complex solutions such as the one given are also not working for me.
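For anyone following along, here is a rough Python sketch of the two-halves idea, outside Keyboard Maestro. The function name `sentence_around` and the sample text are hypothetical, and the patterns are simplified (no abbreviation handling). One thing it illustrates: if the text before the cursor ends mid-sentence, a pattern that requires a terminator immediately before `$` has nothing to match there, so the sketch anchors the first half with `[^.?!]*$` instead. Whether that explains the error in the macro above is only a guess.

```python
import re

def sentence_around(text, cursor):
    """Return (start, end) offsets of the sentence containing `cursor`.

    Hypothetical helper for illustration; treats every . ? ! as a terminator.
    """
    before, after = text[:cursor], text[cursor:]

    # First half: everything after the last terminator before the cursor.
    # Note there is no terminator right before the end of `before`.
    first_half = re.search(r"[^.?!]*$", before).group(0)

    # Second half: up to and including the next terminator, if there is one.
    second_half = re.match(r"[^.?!]*[.?!]?", after).group(0)

    # Skip the whitespace that separates this sentence from the previous one.
    leading_space = len(first_half) - len(first_half.lstrip())
    start = cursor - len(first_half) + leading_space
    end = cursor + len(second_half)
    return start, end

text = "First one. Second sentence here! Third?"
cursor = text.index("sentence")      # cursor in the middle of sentence two
start, end = sentence_around(text, cursor)
print(text[start:end])               # Second sentence here!
```

The returned offsets are the kind of range you would then hand to whatever sets the selection.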

A well-known problem for translators. There’s even a standard to deal with it: https://en.m.wikipedia.org/wiki/Segmentation_Rules_eXchange

Luckily, the number of exclusions is limited. Sets of examples are widely available for many languages.

One could enhance the macro to include a list of abbreviations and acronyms. This list can be maintained in a text file with one item per line.

The most elegant solution would be to let the user select an abbreviation or acronym in Pages and let the macro add it to the (alphabetically sorted) text file :).
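A minimal Python sketch of that idea, under some assumptions of mine: the list is passed in as lines (in practice they would be read from the text file), matching is by whole whitespace-delimited word, and the function names are hypothetical.

```python
import bisect

TERMINATORS = ".?!"

def load_abbreviations(lines):
    """Build a sorted, de-duplicated list from one-item-per-line input."""
    items = {line.strip() for line in lines if line.strip()}
    return sorted(items, key=str.lower)

def add_abbreviation(abbrevs, new_item):
    """Return a new list with `new_item` inserted in alphabetical order."""
    keys = [a.lower() for a in abbrevs]
    i = bisect.bisect_left(keys, new_item.lower())
    if i < len(keys) and keys[i] == new_item.lower():
        return list(abbrevs)          # already present, nothing to add
    return abbrevs[:i] + [new_item] + abbrevs[i:]

def first_sentence(text, abbrevs):
    """Return text up to the first word that ends in a terminator
    and is not a known abbreviation or acronym."""
    known = set(abbrevs)
    pos = 0
    for word in text.split():
        pos = text.index(word, pos) + len(word)
        if word[-1] in TERMINATORS and word not in known:
            return text[:pos]
    return text

abbrevs = load_abbreviations(["U.S.\n", "abbrev.\n", "\n"])
text = ("Here is a sentence with abbrev. and acronyms like U.S. "
        "within the actual sentence. This starts a new sentence.")
print(first_sentence(text, abbrevs))
```

With both items in the list, the sketch returns the whole first sentence instead of stopping at "abbrev.". It still cannot tell when an abbreviation really does end a sentence.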

Once you know the range of the sentence you found above, you can set the selection via AppleScript.

I have just posted a Macro that should do the trick:

MACRO: Pages -- Select Sentence [Example]

Please let us know if this works for you, or if you have further questions about your OP.

This macro does address/solve the above issue I raised:

There seems to be something wrong with my RegEx? I am using your new RegEx suggestion in the macro and am still getting the same error. As it works for you and I have the latest version I'm assuming I'm doing something wrong somewhere along the line.

Are you using the Macro I just uploaded? If not, please do so.

Nice regex, but I’m not sure whether it’ll work for languages that use initial uppercase for nouns, like German:

Die Firma Müller und Co. KG stellt Miederwaren u. Ä. her.

Well, English also uses initial upper case for nouns, so that is not an issue, but I can't speak to use with other languages that may have different punctuation.

However, your German sentence worked fine, EXCEPT for the upper case immediately following an abbreviation, like this:
und Co. KG

Assuming that "KG" is NOT the start of a new sentence, then my RegEx matched a false positive, and thought this was the sentence:
"Die Firma Müller und Co."

If I simply added a normal word between "Co." and "KG", it works fine:

Die Firma Müller und Co. und KG stellt Miederwaren u. Ä. her. Here is a sentence nouns like Tom with abbrev. and acronyms like U.S. within the actual sentence. Start of a new sentence.

Result:

Die Firma Müller und Co. und KG stellt Miederwaren u. Ä. her.

Does that look right?

But that same issue would/does exist in English as well.

The bottom line is that there is no perfect solution that will identify sentences in all cases. I have done extensive research over the last few days, and while I found many people asking how to do this with RegEx, I did NOT find any perfect solutions. AFAIK, my RegEx, which I developed, is the best I have seen. But I would be thrilled if anyone can improve on it! :smile:

Very clever solution, kudos to you both!

This looks great, but ideally I would like to be able to select the sentence from anywhere in the sentence, not just the beginning, as navigating to the start could potentially be more time-consuming than just highlighting it with the cursor. So I've been working on this solution, which takes a similar approach to yours, but basically follows my outlined workflow above.

However, as I stated earlier, the sticking point so far is that the RegEx is not working for the initial SentenceFirstHalf selection, though it does work for the second half. This RegEx passes on three of the major RegEx checkers, so I'm confused as to what I'm doing wrong here. Since your new RegEx solution is also not working for me, I suspect I'm doing something else wrong that is the root problem, but I can't see what it is.

Note:

  • I appreciate this is a very primitive RegEx solution; I'm new. But if I can troubleshoot the issue, hopefully I can expand it using your template.
  • The display boxes are just temporary to test that the script is working up to that point.

I am not getting a RegEx error in your macro, so I assume my issue is local to my solution. But I'm unsure what I've done wrong? Can you spot the error?

Would love to use yours, but having to move to the start of the sentence to do the selection would take as much effort as, if not more than, just skipping the macro; i.e., if I've already used the mouse to move my cursor, I may as well highlight while there. And while I do have a hotkey set up to move the cursor back word by word, if I'm in the middle of the sentence that's several keystrokes before I can use this hotkey solution.

Is there a way to finesse yours or fix the RegEx on mine? In the case of yours, perhaps a preceding RegEx command to read backward for a punctuation mark (using the $ anchor?), then a mouse click?

OK, that is good to confirm that my Macro works for you.

So, if you want to initiate the Macro when you are in the MIDDLE of a sentence, that is much more complex. I will have to take some time to investigate this use case.

Let's start with a real-world example of the text where you want to select the sentence you are in. Please post in a code block to preserve all characters, OR, even better, zip your pages file and post it, along with an indication of where the text cursor would be when you want to select the sentence.

Macro has been updated to support text cursor anywhere within target sentence.

It works! You, sir, are a genius.
