New user, request for help with text manipulation in Bruji's BookPedia

What does this mean? The database is supposed to be up to date at save time, no?

Tom, I think this rules out using CSV, unless he wants to export the entire book, process the data, and create a new book from the corrected data.

@MarkSealey, what do you think?

I'm calling it a day guys. If you haven't solved it by tomorrow, I'll be glad to help if I can.

Yes, according to what he says, it rules it out. But what he’s saying is he wants to go thru each row (record) manually, because he doesn’t know if there are maybe two colons in the first row.

This complicates things, obviously. But it is not undoable. And even if we can’t manage to solve it for titles with two or more colons, he could always post-process the results in relatively little time. Well, that depends on the number of titles with double colons in them.
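
To get a feel for how much post-treatment that would be, one could count the colons per title in the CSV export first. This is a rough Python sketch, not part of the macro: the column name `Title` and the sample rows are assumptions for illustration, and the real export may be laid out differently.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for a BookPedia CSV export (column names are assumed).
SAMPLE_EXPORT = (
    "Title,Author\n"
    "Alpha: Beta,Author One\n"
    "Gamma,Author Two\n"
    "Delta: Epsilon: Zeta,Author Three\n"
)


def colon_histogram(csv_text, title_column="Title"):
    """Map number-of-colons to the number of titles with that many colons."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[title_column].count(":") for row in reader)


histogram = colon_histogram(SAMPLE_EXPORT)
# Titles with two or more colons are the ones that would need a manual look.
needs_manual_check = sum(n for colons, n in histogram.items() if colons >= 2)
print(dict(histogram), needs_manual_check)
```

This only reads the export, so it would not touch the library or its (Smart) Collections.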

JMichaelTX,

Attached a zip of two dumps - txt and csv - from a much cut down BookPedia library with all (Smart) Collections removed - except the one that returns which records do have a colon in them somewhere.

BookPedia exports.zip (5.0 KB)

The ideal KM macro would be one which looks at a single record. That's because not all records with a colon in their title field need splitting… I want to retain control.

And one which allows me to start iterating through the 8,000 records at any point in the database - partly because I've already fixed a couple of hundred (such) records manually; and partly because I don't anticipate being able to fix all those records which need fixing at one go/in one sitting.

The interrelatedness of (Smart) collections and possible resultant losses of data integrity in BookPedia is another reason why I've been reluctant to pursue the approach whereby a script just works its way through either an export or the entire database.

IOW, exporting-fixing-importing would work very well for a single table; but re-importing would lose all sorts of data-relationships in BookPedia.

Going record by record - thereby updating all BookPedia's (Smart) Collections etc, and giving me the final say over which record is amended and which not - really is the best way for me, I think.

I doubt that would be possible because of all the live-updated (Smart) Collections etc. I don't believe they're exported with the dumps we've been talking about.

I have almost 100 of these (Smart) Collections with carefully-defined criteria. It would take longer for me to record and recreate them all than to clean up the colons manually. But thanks for advising that there is so much merit to that approach; I can see your point!

I really appreciate all your help. I'm learning a lot. Bottom line: I can live with having to edit records with more than one colon. But I do believe it's vital to go record-by-record within BookPedia.

Thanks for your other guidance :slight_smile:

@MarkSealey, I have updated the macro and the script, please re-download it from here. (If you are still interested.)

I have also set the maxEntries to 1 so it now processes only one row at each run.

I’ve tested it with my mini dataset and it seems to run fine, if you order the rows by modification date, so that the oldest one is on top:

(Otherwise it would always process the same row.)

Tom,

Thanks a million!

Still very much interested :slight_smile:

I've downloaded and tried your macro. For some reason it's not quite working as I think it should.

I believe it's failing in the loop:

on error
	set originalTitle to ""
end try

because the Original Title field is remaining blank.

Would you prefer to take this offline and email directly, as it probably isn't of great interest to anyone not using BookPedia? Shall we exchange email addresses?

One quick question, first, though, please: is your macro designed to work when the records are displayed one-by-one, in the single Edit window; or as in your screenshot (above) displayed in BookPedia's spreadsheet-like view?

If in the latter, I understand the importance of moving BookPedia's columns so that at the left come: Title, Date Modified (on which the whole thing must be sorted, oldest first) then Original Title.

But I wonder whether it's not working for me the way it is for you because I have a different number of columns further to the right and different actual custom columns?

Or because of some other way in which my fields are arranged, ordered, (named even?) differently from yours?

This shouldn’t be the reason. That block is meant to avoid an undefined variable in case the old title doesn’t have a colon (= no text item 2).
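
For anyone following along without the script open, the fallback works roughly like this. It is a Python sketch of the logic, not the AppleScript itself: the function name `split_title` and the whitespace trimming are assumptions, and it assumes the first piece stays in Title while the second piece (the equivalent of text item 2) goes to Original Title.

```python
def split_title(old_title):
    """Split a title at colons and return (main_title, original_title)."""
    parts = old_title.split(":")
    main_title = parts[0].strip()
    try:
        # Equivalent of "text item 2": only the piece between the first
        # and the second colon, if there is a second colon.
        original_title = parts[1].strip()
    except IndexError:
        # No colon means there is no second piece; fall back to an empty
        # string so the variable is always defined.
        original_title = ""
    return main_title, original_title


print(split_title("Alpha: Beta"))
print(split_title("Gamma"))
print(split_title("Delta: Epsilon: Zeta"))
```

Note that under this assumption a title with two colons loses its third piece, which is why such records would still need a manual look.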

No. This can still be of use for somebody with a similar database GUI.

Starting point is the window as shown in the screenshot.

In the main window (is that what it’s called?) the order of the displayed columns shouldn’t matter (well, try to have the Title column at the left, since that’s how I tested it). What does matter are the positions of the Title and Original Title fields in the edit window, as shown in the last image here.


PS:

Another (unmentioned) assumption of my script is that the Original Title is empty for all records. (Otherwise it will just get overwritten.)

I just imported your exports from above and noticed that the Title field remains empty for all records.

Is this normal, or what am I doing wrong? (In the CSV there seems to be a Title…)

Tom,

(I do want to find a more concrete way to thank you for all this. I know it's the spirit of our community; but when we (= you) get it working, it will save me so much time - and I'll have learnt so much. And you've put in so much time too!)

I'm starting with the main BookPedia window as here:

Then triggering the macro (I put a KM Display Text action in to make sure it is running - although I do see BookPedia's Edit window briefly appear, so I am sure it is running, though somewhat slowly).

But I didn't have the fields in the same order as you did.

Now I do:

Yes, the Original Title is always empty. But if it weren't, overwriting would be what I want.

When I run the macro, the Original Title field still remains empty; I also checked - and the candidate string for that field isn't getting put into any other field.

I do notice - as will you - that the Date Modified field always contains the same date (e.g. November 5 2018): this is because of hours and hours I've spent tidying manually. I don't think it's really relevant since - as you explain - it's a way to prevent the records re-ordering themselves on processing complete.

Which does bring me to another question, please, Tom: is your macro designed to work on all records - or just those with a colon? I'm happy with the latter. It will be quicker, and I do still need not to split certain records even though they have a colon :slight_smile: .

Tom,

I've always been wary of working to and out of such exports. I don't think they're really designed to act as ways to transfer data between actual BookPedia files.

I don't know. Yes, I just looked again at the exports. They're as I would expect. With titles. It's partly why I wanted to do everything inside BookPedia.

I agree, odd…

Well, forget about it. I just downloaded it to have a more “real world” dataset. But we should also get by without it.

> I'm starting with the main BookPedia window as here:

Looks fine.

> Now I do:

Seems fine, too.

> When I run the macro, the Original Title field still remains empty; I also checked - and the candidate string for that field isn't getting put into any other field.

When you run the script/macro, does the edit window get opened at all?

> I do notice - as will you - that the Date Modified field always contains the same date (e.g. November 5 2018): this is because of hours and hours I've spent tidying manually.

Although the app only displays the modification date/time with a resolution of days, it still is aware of the precise modification time.

> I don't think it's really relevant since - as you explain - it's a way to prevent the records re-ordering themselves on processing complete.

The point is to make it reorder the records: the most recently modified record (= the one just modified by the script) must go to the bottom. Otherwise we would always run the script on the same record. And the app reorders live, at least here. (No need to exit the window or the like.)
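
Why the sort order matters can be shown with a small simulation. This is a Python sketch under assumptions, not the actual script: the `Record` class, the integer timestamps and `process_top` are made up for illustration, and it assumes every processed record really is edited, so that its modification time gets bumped.

```python
from dataclasses import dataclass


@dataclass
class Record:
    title: str
    modified: int  # hypothetical timestamp; larger means more recent


def process_top(records, now):
    """Process the top-most record of an oldest-first list, then re-sort."""
    records.sort(key=lambda r: r.modified)  # oldest on top
    top = records[0]
    top.modified = now  # editing the record bumps its modification time
    records.sort(key=lambda r: r.modified)  # the app re-orders the list live
    return top.title


library = [Record("A: one", 1), Record("B: two", 2), Record("C: three", 3)]
processed = [process_top(library, now=10 + i) for i in range(3)]
print(processed)
```

Each run picks a different record because the one just handled sinks to the bottom. With any other sort column the top row would stay put and be processed again and again.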

> is your macro designed to work on all records - or just those with a colon?

It should be fine with either. See first paragraph here. If there isn’t any colon nothing will (should) be changed.

What little experience I have of using SQL with BookPedia tells me that it is extremely well and robustly constructed (their developer is one of the very best).

Perhaps there are INSERTions into temporary/buffer tables.

Looking at the structure with something like Datum Lite might be revealing?

Yes, the Edit Window is opened and the menus called for the changes the macro makes.

It also puts the target record elsewhere - presumably to today's modification date.

But when I then look at it again - e.g. by searching for its title or author - the Original Title field is still blank.

> Yes, the Edit Window is opened

and closed again?

> and the menus called for the changes the macro makes.

What do you mean by this?

> It also puts the target record elsewhere - presumably to today's modification date.

If the script has changed a record it should go to the bottom of the list, yes.

> But when I then look at it again - e.g. by searching for its title or author -

You could probably also bring the last modified record to the top by – temporarily – changing the sort direction of the Modified column. (Click the column header.)

> the Original Title field is still blank.

Even if there was a colon in the old title? And the Title field, was it changed by the script (if there was a colon)?

When you run the AppleScript from Script Editor (/Applications/Utilities/), does it behave the same as when run via KM? (The script in its current form is completely stand-alone, i.e. you don’t need KM to run it.)


PS:

When running it from Script Editor, make sure the app has Accessibility permission:

[Screenshot]

And also KM Engine should have that:

[Screenshot]

Tom,

Yes; it is very quick. But it does - I think - what it's supposed to do by opening the Edit window.

I see the menu appearing and the script making the same changes as I would do if I were editing manually. Quick, though.

Running it again, I see that it actually doesn't change the position of the record.

That's right, I'm afraid. The main title string of records with one colon remains the way it was and nothing gets pasted into the 'Original Title' field :-(.

No.

Tom,

Running from the Script Editor (Accessibility permissions all set as they should be), I think reveals what's going on. I think:

  1. I have BookPedia open in the Columns view (as confirmed above)
  2. ordered/sorted by Date Modified
  3. I single-click to select the first record
  4. the standalone script, or the KM macro, executes - on that first record successfully; if the first record contains a colon in the title, the split and the strings are handled correctly

But when:

  1. I have BookPedia open in the Columns view (again as you confirmed above)
  2. ordered/sorted by Date Modified
  3. I single-click on - say - the third record to select it because it does contain a colon, it's the first visible record to contain a colon,

then:

  1. the standalone script still runs but it starts on the first record, and processes it correctly

IOW it always processes the first record.

Is that what you'd expect?

If it is, maybe that's OK because I can run the script/macro on the Smart Group I created where (virtually) all the records do need to be processed :slight_smile: .

Progress!

> Running from the Script Editor

OK, since this works for you, let’s run it from Script Editor from now on. Just to exclude any additional complications by KM (not saying that KM tends to add complications :wink: , just to make it as simple as possible – during the debugging phase).

> I single-click to select the first record

You don’t have to click any record. The script always works on the top-most. That’s why I emphasized that it’s important to sort by Modification Time, so that the processed record goes down. Sorry, if this wasn’t clear enough in my explanations.[1]

> IOW it always processes the first record.
>
> Is that what you'd expect?

Exactly. See above.

> the standalone script still runs but it starts on the first record, and processes it correctly

Great :slight_smile:


1: Since you have stated that you only ever want to process one record at a time, I will remove the loop entirely from the script. This will make the script maybe a bit more transparent. I’ll upload it shortly.

Edit: However, I think processing the records in chunks of 5 or 10 wouldn’t be too bad either, no? Let me know what you think. That way you wouldn’t have to temporarily change the sort order after each record to check the result. (You could check the last bunch of 5 or 10.)
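
The chunked variant could look roughly like this. Again a Python sketch rather than the script itself: `run_batch` and the record names are made up, `max_entries` merely mirrors the script's `maxEntries` setting, and the queue stands in for the oldest-first list in which handled records move to the bottom.

```python
from collections import deque


def run_batch(queue, max_entries=5):
    """Handle up to max_entries records from the front of the queue.

    Handled records move to the back (as the live re-sort would do) and are
    returned so that the whole batch can be reviewed in one go.
    """
    handled = []
    for _ in range(min(max_entries, len(queue))):
        record = queue.popleft()
        handled.append(record)
        queue.append(record)
    return handled


queue = deque(f"Record {n}" for n in range(1, 13))  # 12 hypothetical records
first = run_batch(queue, max_entries=5)
second = run_batch(queue, max_entries=5)
print(first)
print(second)
```

With `max_entries=1` this reduces to the one-record-per-run behaviour; a larger value only changes how many records pile up for review before the next check.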