Renaming Data Files with Alternate IDs from a Spreadsheet

Hi everyone,

I am setting up a research project with student responses (in PDF file) to assessment tasks. The files are generated with a course number, semester, student name, and additional trailing details to complete the file names. (ex.: chem999 Sp19 - Fname Lname/Project/Assignment/etc/etc).

Because of the kind of research I do, I need to (reversibly/retrievably) anonymize these filenames, which I have done in excel by generating random numbers for each student who has consented to research, sorting by the random number, and then sequentially assigning participant ID numbers starting at 0001.

So I have a spreadsheet that contains student participant information (first name, last name, etc) paired with the newly created Participant ID numbers (and new file names based on these IDs). I also have files containing student names. What I need to do is rename each file by matching student names (which are in both the filename and the spreadsheet) and retrieving/using the correct filename from the spreadsheet to rename the data file (a PDF).

I did this manually for last semester's data (about 400 files). But I would LOVE to automate this if feasible (I have roughly twice the data this time around). But my KM skills are super rudimentary. I think this might be a regex task, but I have zero experience with regex. At all.

Thanks in advance for any suggestions/assistance, and please let me know if I have not included enough information.

No Ideas?? :sob::sob:

If you can provide us with some data to work with, then it will be more likely you will get a response.

  1. Excel file (in a zip file) with make up student names
  2. Zip file with one file for each student
  3. Step-by-step workflow of how you do the task manually.
  4. Any other data that you (and us) need to do the job.
1 Like

Here is a solution. It wont be fast, but much faster than doing it by hand:

Extract the entries from the spreadsheet - convert all the tabs to spaces (use a Search and Replace action for that if necessary) and then put them in the macro in the Mapping variable as shown.

Then Keyboard Maestro will process the folder - I have assumed that you have a folder with folder names, each with the student names, so I have assumed you only have to process the folder (not its contents), and that the “Fname LName” is always at the end of the folder name.

For each file, it will search through the lines looking for a matching name. If it finds one, it will stop looking and then it will rename the folder.

Convert Files.kmmacros (6.1 KB)

2 Likes

thanks! I'll give it a shot

Thanks, I'm putting together some dummy files.

I should be very clear too - make sure you have solid up to date, verified, backups before attempting this!

If anything goes wrong, you could end up with a complete mess.

1 Like

Oh absolutely :slight_smile:

Test.zip (483.7 KB)

So here's the folder I put together.

Dummy participant info (name, global ID number, course section, study ID, and filenames for two different types of files) is in ParticipantIDs.xlsx
Dummy files are in the 'SamplePDFs' folder; these files have student names in the filename

When renaming files I do the following:

  1. Navigate to folder containing PDFs from a specific section for the specific submission type (referred to as DataConcl and PlanDoc in the excel file)
  2. Open the excel file, select all data, filter by section (to match the folder in step 1), and the sort alphabetically by first name (my pdf export includes student name in the filename in 'First Last' format)
  3. When renaming files, I start at the first file in the folder, command-tab over to excel, find the correct row, use arrows to move up or down to the right student/submission filename combination, command-c to copy, command-tab back to finder, type return to enter the file renaming mode, command-v to paste, hit return to accept the new filename
  4. use arrows up or down to highlight the next file that needs to be renamed
  5. goto 3

I don't think the example @peternlewis posted will work without some tweaking, so I'll try to make it work with my current setup in the morning. Again, thanks for the suggestions and example macro to try out!

This is why you need to post real (redacted) data when you ask a question.

Your folder looks nothing like your original example:

chem999 Sp19 - Fname Lname/Project/Assignment/etc/etc

Now your names have multiple names, not just First/Last, and they appear in the PDFs together with random numbers.

987654308 Lloyd Jeffry Andy.pdf

You can use the same technique, but you will need to process the PDF folder. The macro I wrote should work fine as long as you are replacing a name (eg “Lloyd Jeffry Andy”) with a number, then the lines in the Mapping variable will be something like:

0001 Lloyd Jeffry Andy

If so, the search & replace should still work.

Sorry. In my rush to whip together some dummy files I just copied and pasted from the makeshift spreadsheet and the filenames were not as accurate as they should have been.

The general form of filenames is:
University Course Semester Year - Fname Lname/Project number/Project Name/Project week/ Submission type - OriginalPlatformContainingDataNowStoredInPDF
Because that is the title of the webpage that is generated in my export to PDF macro (yes, including the /'s)

It was my intention to make sure I included the First and Last names in my dummy filenames since that is what I look for when manually renaming files. Sorry that I made things more difficult. And again, thanks for the help, I really appreciate having a place to start

I'm confused on what exactly is your workflow.
Your descriptions are conflicting, and don't match your data files.
Here's what I see:

It looks like to me that the first number in the PDF file name matches the ID column in the Excel file:

If so, then why not just use that ID to lookup the data in the Excel file?

So, how do you want to rename the files?
Using the "ParticipantID" column in the excel file?
Something else?

We need to see a real-word example of:

  1. The PDF files BEFORE any changes
  2. The PDF file AFTER renaming
  3. Exactly what the renaming rules are.

You have provided a lot of other info that I do not see the relevance of.

As I see it, your workflow would be:

  1. Open the Excel file to the workbook and worksheet that contain the student ID, and the renaming data.
  2. Select the folder that has all of the student PDFs
  3. Trigger the KM Macro to rename these PDFs

If that's not correct, then please correct it.

Well, based on my best guess of your workflow:

I have built the following Macro as an example of how you might do this:

MACRO: Rename Files Based on Excel File [Example]

It is probably not exactly what you want/need, but I think it should help you get started. If you have any specific questions about the macro, just ask.

yeah, I realized that my example data files were not properly named. I was tired and constructed the filenames extremely lazily/poorly. The attached zip file contains more representative filenames; the filenames do not contain the ID numbers, so they unfortunately cannot be used to lookup data.

Thanks for the example macro. It looks like I should be able to wrangle it to work with my weird setup.

Test_CorrectedFilenames.zip (488.6 KB)

OK, so we still need to know:

Please correct the workflow as needed.