Find and replace in Pages App

Hi,
I have a friend with a task that I'm sure KM can help with, but I'm not entirely sure how to tackle it.

It is a music PhD thesis written in the Pages app. It contains numerous note names written as 'Ab', 'C#', 'Db', etc.
He needs to set every 'b' or '#' in those note names in a different font, a downloaded music font that makes the flat and sharp signs look correct.

I believe the regex is
([A-G])([#b])?
The second group would be the text that needs to be set in the new font. The formatting is all done already in Pages, so copying the file to the clipboard and pasting it back isn't an option, and the 'search file using regular expression' action doesn't accept Pages files.
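
One thing worth checking before relying on that pattern: because the second group is optional, it also matches every bare capital A to G. Below is a minimal Python sketch comparing the pattern as written with a stricter variant. Python is used only because it is easy to run outside Pages. The stricter pattern, the names `ORIGINAL` and `STRICT`, and the sample sentence are assumptions made for illustration, not part of the original question.

```python
import re

# The pattern from the post: the accidental is optional.
ORIGINAL = re.compile(r"([A-G])([#b])?")

# Hypothetical stricter variant: the accidental is required, and the
# note name must not be glued to other letters on either side.
STRICT = re.compile(r"(?<![A-Za-z])([A-G])([#b])(?![A-Za-z])")

sample = "Ab major, C# minor, Db, Abbey, Cab, G"

# Full text of every match, in order of appearance.
original_hits = [m.group(0) for m in ORIGINAL.finditer(sample)]
strict_hits = [m.group(0) for m in STRICT.finditer(sample)]

print(original_hits)
print(strict_hits)
```

With this sample, the original pattern also picks up the "Ab" in "Abbey", the "C" in "Cab" and the lone "G", while the stricter variant keeps only the three real note names.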

Could somebody help please? My friend has downloaded the demo and has already got KM turning his office lights on and off when his Mac wakes and goes to sleep, so he will be a very easy convert to the KM way of life!

I would use AppleScript for this job, as Pages integrates well with it. It can edit the text of an open document and make the replacements you specified.

(P.S. He doesn't need a new font. The Unicode character set contains musical notation, including ♯ and ♭.)

The script looks like this:

tell application "Pages" to tell the front document to tell the body text
	
	set (every character where it is "#") to "♯" -- replace hash with sharp
	set (the last character of ¬
		(every word where ¬
			it is "Ab" or ¬
			it is "Bb" or ¬
			it is "Cb" or ¬
			it is "Db" or ¬
			it is "Eb" or ¬
			it is "Fb" or ¬
			it is "Gb")) to "♭" -- replace b with flat
end tell

You can insert that code into an Execute an AppleScript action, then open the file that needs to be edited (making sure it's the only one open, or editing the script so that front document is replaced with a reference to a specific document), then run the script.

Test it first on a small snippet to make sure it performs the changes you desire (although, even if you did it on the full document and it wasn't to your liking, you can ⌘Z it).

Hello! I'm the person with the thesis, and thanks for replying to this. AppleScript seems like the way forward, but unfortunately this particular script fails. KM runs the macro, but the script throws this error:

error "Pages got an error: Can’t set document 1 to "♯"." number -10006 from document 1

Is this an error with the script not being able to locate the file, or an error with the script?

Thanks for your help! :slight_smile:

Hi Scott and all,

That worked for me. I simply had a Pages document with some random Ab's and C#'s thrown in, and it replaced them with the correct Unicode characters.
Unfortunately the spacing looks so wide that I thought there was an extra space, but I suppose it's fine.
B♭
That's an example. There's no extra space there. Perhaps there's another setting that can fix that, or maybe your alternative font would. Perhaps the issue is that the file is so large. Try it on a sample of just a few pages, Scott, and report back; I'm sure someone will be able to help.

Update: the one thing I didn't do was a restart. After restarting my machine everything is working great. Thanks @Damoeire and @CJK :slight_smile:

Actually, offline, both @ScottF and I are having issues with beachballing if we run it on the whole file, which is 364 pages and 3.4 MB. We could of course cut and paste out 50 pages or so at a time, but are there any other suggestions?

You could try this:

tell application "Pages" to tell the front document
	
	set _P to a reference to every page
	
	repeat with P in _P
		set _T to (a reference to body text of P)
		-- replace hash with sharp
		set (every character of _T where it is "#") to "♯"
		-- replace b with flat
		set (the last character of ¬
			(every word of _T where ¬
				it is "Ab" or ¬
				it is "Bb" or ¬
				it is "Cb" or ¬
				it is "Db" or ¬
				it is "Eb" or ¬
				it is "Fb" or ¬
				it is "Gb")) to "♭"
	end repeat
end tell

This processes the text page by page rather than trying to manage the entire document as one huge mass. But without a similarly large document to test this on myself, I can't give you definitive guarantees that this will perform better than the other.

Thanks @CJK
That does a better job but seems to stop at page 90 of 364.

In addition, it misses some, like C♯-F# and [B - A#]. I could understand that if they were flats, as it would be a regex issue, but this AppleScript says to replace every hash with a sharp.

It sometimes only does the first 4 or 5 # on each page then misses the rest.

It also misses all of the tables. I can see from the dictionary you can specify the tables, but I'm afraid I don't know how, would you mind adding that for us?

And finally, while the Unicode flats and sharps are going to be an invaluable backup if all else fails, the spacing and looks of the external fonts are much, much nicer. Do you think there is a way to incorporate a separate font? @ScottF tells me AppleScript has a 'Set Font' command. Might this help?

Thanks for your help with this.

I did try this with the font without success:

tell application "Pages" to tell the front document
	
	set _P to a reference to every page
	
	repeat with P in _P
		set _T to (a reference to body text of P)
		-- replace hash with sharp
		set the font to "Opus Chords Std"
		set (every character of _T where it is "#") to "#"
		-- replace b with flat
		set the font to "Opus Chords Std"
		set (the last character of ¬
			(every word of _T where ¬
				it is "Ab" or ¬
				it is "Bb" or ¬
				it is "Cb" or ¬
				it is "Db" or ¬
				it is "Eb" or ¬
				it is "Fb" or ¬
				it is "Gb")) to "¨"
	end repeat
	
end tell

The keys for sharp and flat are alt-3 and alt-8 - which is what I've typed into the above.

Without further digging into your issue, I’ve just seen that you are replacing a # by a # (i.e., by the same character):

set (every character of _T where it is "#") to "#"

Probably I’m missing something, but if not, then this doesn’t make much sense.

Thanks @Tom. The idea there is that if the font has changed, then the second # would be displayed as the sharp glyph. Maybe the 'set the font' instruction has to be on the same line, I don't know. Regardless, the issues from my previous post still apply even when the # is set to be replaced by the Unicode ♯.

Using a different font would be ideal, but Unicode would be better than nothing!

Sorry, it totally slipped my mind to reply to this post regarding your latest issues.

Yes, there is a way to use a separate font. You can do it like this:

set fnt to "Opus Chords Std"

tell application "Pages" to tell the front document
	
	repeat with P in (a reference to every page)
		set _T to (a reference to P's body text)
		-- replace hash with sharp
		-- set (every character of _T where it is "#") to "♯"
		set (the font of every character of _T where it is "#") to fnt
		-- replace b with flat
		set (the font of the last character of ¬
			(every word of _T where ¬
				it is "Ab" or ¬
				it is "Bb" or ¬
				it is "Cb" or ¬
				it is "Db" or ¬
				it is "Eb" or ¬
				it is "Fb" or ¬
				it is "Gb")) to fnt
	end repeat
end tell

That has to be a bug in Pages or AppleScript. When I tested the script above, in which the font of specific characters is changed, I did so on a slightly longer piece of sample text and encountered the same phenomenon. There's nothing wrong with the code, but in one instance the script failed with error code -10000 and a message saying the Apple event handler had failed, which always means AppleScript is not doing what it's supposed to do.

If this happens to you in any future scripts (it happens with System Events a fair bit), your two choices are to change nothing and just run the script again, and it'll probably work; or, if it persistently throws that error, find another means to achieve the same objective (which can be hard).

I ran the script again, and it was fine. I also noticed that in running the script again, some of the characters that weren't replaced in the first or second run were then replaced successfully in the third; but not all, and I had to run the script four times to change one page of text.

It got there in the end, but it's pretty ridiculous that this is occurring. I don't have a fix for that right now, so I guess you might have to run the script a couple of hundred times until it's done (which Keyboard Maestro can help you do).

Ah, yes, it would do. I didn't know there'd be tables in the document.

This is where we really hit a bit of a wall with what your options are here.

There are two issues with tables in Pages documents. The first is another bug in AppleScript, which is unable to retrieve the object references of any table that was inserted manually into the document unless you first select each table manually and change its Object Placement setting (found under Arrange) from Move with Text to Stay on Page.

Again, it's ridiculous. But there's no workaround to this.

After that is done, the next irritation is that the text contained in the cells of a table isn't retrieved as rich text by AppleScript. Instead, the content is stored in the value property of each cell of the table, which is just a plain string value with no way to manipulate the characters in the way we can with the rest of the document.

You can iterate through each cell in the table to get these string values like this:

tell application "Pages" to tell the front document
	
	set _P to a reference to every page
	set _T to a reference to every table of _P
	set _C to a reference to every cell in _T
	
	repeat with C in (_C whose value is not missing value)
		set v to C's value
		.
		.
		.
		set C's value to ...
	end repeat
end tell

However, what you do at that point is beyond me. You could do a search and replace manually using the menu in Pages. Whilst this will successfully find words in tables, it will only let you substitute words and letters with words and letters; you couldn't use it to change the font.

I’ve seen the discussion has gone forward, but judging by your answer to my post some things seem to be unclear to you. So, maybe it’s worth clarifying them:

When we speak about Unicode or UTF-8 characters:

The code points are completely unrelated to the font. For example, a # is U+0023 and a ♯ is U+266F. Two completely different things, and they won't change with the font; they are constant.

If a font doesn’t offer a glyph for a certain Unicode char, then you just get nothing or an arbitrary replacement glyph.

If you have a special “graphics” font (e.g. specialized in musical symbols), things can be different:

Those fonts usually use common keys, like “A” or “T”, and just replace the glyphs (A, T) with different graphics, for example some other symbol or 🈚︎.
That is: for the computer, the first symbol is still an “A” and the 🈚︎ is still a “T”.

So, when you are going to search/replace those chars, you'll have to look for “A” or “T”, not for the symbols, because the latter are not visible to the computer. They are just representations (made up within the font itself).
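
The point about constant code points can be checked with a few lines of Python. This is only an illustrative sketch using the standard `unicodedata` module; the helper name `describe` is made up for this example.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# The typed characters and the dedicated music symbols are different
# characters, whatever font is used to display them.
for ch in ["#", "\u266f", "b", "\u266d"]:
    print(ch, describe(ch))
```

Setting a "#" in a music font changes only how it is drawn; a search of the text still sees U+0023.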

@Tom Thanks for that Unicode 101. I was probably being reckless with terminology in addition to not fully understanding it, though I did grasp the basics, even if I wasn't demonstrating that.

I'm going to abandon any further work on this. The third party (not @ScottF) whom this was aiming to help has spent an hour and manually changed everything. I can't understand why they didn't just wait two weeks for me to spend hours trying to automate it, but each to their own! On the plus side, this thread may help others, and we have converted @ScottF to the KM side, which is always a good thing. Thanks also to @CJK for your help.

I'm going to abandon any further work on this

I guess the horse is dead, but it might have been interesting to convert the Pages document to a Word document. Take the Word document and use Advanced Find and Replace with Use Wildcards. This is sort of a limited Regex search and replace that Word is capable of.

Find what: ([A-G])#

Replace with: \1♯

and

Find what: ([A-G])b

Replace with: \1♭

This does the job on the Word document.
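
For readers more used to regular expressions than Word wildcards, here is a rough Python equivalent of those two replacements. It is a sketch of the same idea in a different tool, not a description of how Word works internally; the function name `convert_accidentals` and the sample strings are invented for this example.

```python
import re

SHARP = "\u266f"  # ♯ MUSIC SHARP SIGN
FLAT = "\u266d"   # ♭ MUSIC FLAT SIGN


def convert_accidentals(text: str) -> str:
    """Replace '#' or 'b' directly after a capital A-G with a Unicode accidental."""
    text = re.sub(r"([A-G])#", r"\1" + SHARP, text)
    return re.sub(r"([A-G])b", r"\1" + FLAT, text)


print(convert_accidentals("Ab to C#, then Bb and F#"))
# Patterns this loose also fire inside ordinary words:
print(convert_accidentals("Abbey"))
```

In this sketch the second call turns "Abbey" into "A♭bey", so with patterns this simple it is worth checking the document for ordinary words that begin with a capital A to G followed by "b" before replacing everything.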

Then convert the Word version back to Pages.

This might work in the sense that Pages has some ability to convert to and from Word with the formatting preserved. In small tests, I could get this to work.

But I admit that in a document as large as a PhD thesis with heavy formatting, it seems likely that there would be something that would not survive the back-and-forth translation.


Doing it manually in Pages doesn't really seem that it should take even an hour. :thinking:

Wouldn't you just do the 14 search-and-replace operations in Pages without using regex?

Settings: Whole Words; Match case

A# -> A♯
B# -> B♯
...
G# -> G♯

Ab -> A♭
Bb -> B♭
...
Gb -> G♭

Yes, as you guessed, the problem with a thesis in Pages with complicated footnotes, formatting and tables was the probability of small changes occurring during the two conversions, and the time then needed to recheck the thesis with an eye for tiny things that may have changed. It's possible it could have worked, but the thesis owner decided against it in the end.

Doing it manually in Pages doesn't really seem that it should take even an hour. :thinking:

Wouldn't you just do the 14 search-and-replace operations in Pages without using regex?

The problem there was that the user wanted to use a different font for the flat signs rather than Unicode. As far as I can see, Pages couldn't do that. I had a super quick play: if you searched for 'Ab ' using Find and then pasted the correct-font 'b' over the 'b', it would take some time, which, again, is something I would have used a macro for. Also, does Find go through footnotes and tables? It doesn't matter now; it's all done regardless, and in much less time than it would have taken me to work out an 'efficient' way of doing it!

Anyway, as you say, for now the horse is dead, and Elvis has left the building upon it.

A post was split to a new topic: Choosing the Best Tools for Writing Long-Form Documents (e.g. Thesis)

@Tom, I thought your post on choosing the tool for writing long, text-centric documents was a great one, and worthy of its own topic under "Tips and Tutorials". So I moved it there.

Yeah, I agree, thanks. It’s a bit out of context now, but fine :wink: