I have a great number of duplicates in Evernote, and I would like to find and delete the extra notes. Evernote's advice is to use View → All Notes, sort by title, and then review the list of notes. I have 30,824 notes, so scanning the list by hand would take a lot of time. I have KM activating Evernote, showing All Notes, and sorting by title. But I do not see how to have KM match the titles and then tag the duplicates, so that I can see a list of duplicates only.
It will require AppleScript to accomplish your task.
Even so, it will take a while (10-60+ minutes) to go through every note and compare titles.
Are the dup titles exactly the same? If there is even one character difference, then an exact comparison will fail.
If we can do an exact compare, then the process is simple.
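To illustrate what an exact compare looks like once the titles are sorted: equal titles end up next to each other, so a single pass over adjacent items finds them. This is only a minimal sketch. The handler name `findAdjacentDups` and the sample titles are made up for the example, and it is not the script posted later in this thread. AppleScript text comparison ignores case by default, so the sketch wraps the test in `considering case` to make it strict.

```applescript
-- Hypothetical helper: returns each title that occurs more than once
-- in a list that is ALREADY sorted by title.
on findAdjacentDups(sortedTitles)
	set dupTitles to {}
	considering case -- make "Statement.pdf" and "statement.pdf" different
		repeat with i from 2 to (count of sortedTitles)
			set thisTitle to item i of sortedTitles
			if thisTitle = (item (i - 1) of sortedTitles) then
				-- record each duplicated title only once
				if dupTitles does not contain thisTitle then set end of dupTitles to thisTitle
			end if
		end repeat
	end considering
	return dupTitles
end findAdjacentDups

findAdjacentDups({"Invoice.pdf", "Invoice.pdf", "Receipt.pdf", "Statement.pdf", "Statement.pdf", "Statement.pdf", "statement.pdf"})
--> {"Invoice.pdf", "Statement.pdf"}
```

Indexing a plain AppleScript list like this gets slow with tens of thousands of items, so treat it as an explanation of the idea rather than something to run on 30,000 notes.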
The titles would be exactly the same. My thinking is that if I can get the duplicates tagged, then I would have the duplicates that I could review without reviewing all notes. I don't know enough AppleScript to accomplish this.
OK, I've been working on some generic EN Mac AppleScript code that will help us here.
Some info I need for the script:
For the initial sort of Notes, we will sort on Note Title, ascending. What 2nd level sort do you want?
Date Created, Date Updated?
This will determine which Note we assign as "original", and which as "dup".
What tags do you want assigned to the dup Notes?
You could use something like:
"dup.1" for the original
"dup.2" for the actual dup
This would let you search/filter to get either, or both (using tag:dup.*)
But it's up to you and how you want your workflow to go.
Once the dups have been identified by Title, you could also compare checksums on the body/contents. If the checksums are the same, then the Notes are truly identical, and the script could just delete the dup, if you want.
I have found a very fast sort engine, which took only ~3 sec to sort 18,000 Notes, so that's one possible bottleneck avoided. But I'm still not sure how long it will take to do the actual title comparison. I'm working on it.
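As a rough sketch of the checksum idea, the two handlers below hash a piece of text with the `md5` command-line tool that ships with macOS and compare the digests. The handler names `md5OfText` and `sameContent` are made up for this example. In a real script the text would have to come from Evernote, for example from a note's `HTML content` property, which is an assumption here and is not tested in this sketch. Attachments such as PDFs are not covered by hashing the note text alone.

```applescript
-- Hypothetical helper: MD5 digest of a piece of text,
-- using the md5 tool included with macOS (-q prints only the digest).
on md5OfText(theText)
	return do shell script "/sbin/md5 -q -s " & quoted form of theText
end md5OfText

-- Two pieces of note content count as identical only if their digests match.
on sameContent(textA, textB)
	return (md5OfText(textA) = md5OfText(textB))
end sameContent

-- Sample strings standing in for the content of two notes
sameContent("<div>Scan 001</div>", "<div>Scan 001</div>")
```

Passing the text on the command line is fine for short samples, but very large note bodies could exceed the shell's argument length limit, so writing the content to a temporary file and hashing the file may be the safer route.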
Date Updated would probably be best. I agree with item 2. #3 would be very helpful. I don't worry too much about the time it may take. I can let it run overnight if you think it's OK to do so.
Since they would be exact duplicates, would it really matter? I can go with either. I thought that if I had updated the note, then I would want that one. But now I realize that if I have updated the note, it would not be a true duplicate. So the original should be the original, if I understand correctly.
How are the dup notes created? Manually, or from some import like from email or web clipper?
Obviously, if you start out with two exact dups, and then you update one or both, then they are no longer true dups, even if their Titles are the same.
So, I'm now thinking that the "original" should be the note with the oldest Date Created. But if you prefer different, that is obviously your choice. I'll setup the script so that the date to use is set at the top as a property, making it easy to change.
Good news: An AppleScript colleague has come up with a method for very, very quickly determining the dup items in a list. If this works out as indicated by early testing, it should reduce the time to just a few minutes. I should have something within a day or two.
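The post above does not spell out the method, but the script posted later in this thread builds its dup list with Foundation's `NSCountedSet`. A minimal sketch of that idea, assuming nothing beyond the Foundation framework, might look like this. The handler name `dupValuesIn` and the sample titles are made up for the example.

```applescript
use AppleScript version "2.4"
use scripting additions
use framework "Foundation"

-- Hypothetical helper: returns the values that occur more than once in theList.
on dupValuesIn(theList)
	-- A counted set stores each distinct value once, plus how many times it was added.
	set countedSet to current application's NSCountedSet's setWithArray:theList
	set dupValues to {}
	repeat with oValue in countedSet's allObjects()
		if (countedSet's countForObject:oValue) > 1 then set end of dupValues to (oValue as text)
	end repeat
	return dupValues
end dupValuesIn

dupValuesIn({"Invoice.pdf", "Receipt.pdf", "Invoice.pdf", "Statement.pdf"})
--> {"Invoice.pdf"}
```

The counting happens inside Foundation rather than in an AppleScript loop over every pair of titles, which is why it scales well. A set has no defined order, so with several duplicated values the result may come back in any order.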
The originals would be imported. They would likely be PDFs. Also, there may be more than one copy of the same PDF. My workflow takes a downloaded or scanned PDF, runs it through PDFpen for OCR, and then adds it to Evernote. I have found that the workflow doesn't always work correctly, and multiple copies are imported.
@1_Hominid, the above is still true, but I found a bug in Evernote when creating tags, and I'm working on making sure it does not affect this script. Barring more bugs, it should be ready tomorrow . . . (developer's famous last words, LOL)
OK, I think the script is finally ready for you to use.
Please let us know if this script/macro works for you.
It actually ran very fast on my iMac-27, with 18K+ Evernote Notes, as you can see from the AppleScript Log, taking only ~39 sec:
I'm sure it will be slower for you with 30,000 notes, but if you have a reasonably recent/fast Mac, it should take no more than ~80 sec. Still, it is best to be prepared for several minutes. I'd make sure that Evernote is the only app running, plus of course the KM Engine.
Do make sure Evernote Mac is running, and do a sync, and let it complete, before you trigger this macro.
##Example Output
You will get two script prompts to confirm continuing with the script:
####Script Dialog Showing Results
(It will close automatically after 5 sec, but the results have also been placed on the clipboard.)
####Open New Evernote Query Window with Results
This shows a Note list filtered by the dup tags: any: tag:Dup.orig tag:Dup.dup
Your window may appear different. I have manually changed my window to show the Note List on Top, and sorted by Title. Of course you can change the filter (Search actually) anytime, now or later.
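If you want to filter on a different set of tags later, the search string follows a simple pattern: `any:` followed by one `tag:` term per tag. Here is a small sketch of building it. The handler name `anyTagQuery` is made up for this example, and it assumes tag names without spaces (a name containing spaces would need to be quoted in the search).

```applescript
-- Hypothetical helper: build an Evernote search string that matches
-- Notes having ANY of the given tags. Assumes tag names contain no spaces.
on anyTagQuery(tagNames)
	set queryStr to "any:"
	repeat with oTag in tagNames
		set queryStr to queryStr & " tag:" & (contents of oTag)
	end repeat
	return queryStr
end anyTagQuery

anyTagQuery({"Dup.orig", "Dup.dup"})
--> "any: tag:Dup.orig tag:Dup.dup"
```

Leaving out `any:` would make Evernote require all of the tags at once, which would match nothing here because each Note gets only one of the two tags.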
Install the file BridgePlus.scptd (from the zip file) into your ~/Library/Script Libraries folder (create the folder if need be)
This is a very safe and reliable script library written by the well-known AppleScript guru Shane Stanley
###Script Properties You Can Change
You can find these near the top of the script.
--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
--- PROPERTIES CHANGEABLE BY USER ---
property dupSortBy : "creation date" -- "creation date" OR "modification date"
property dupSortDir : "ASC" -- "ASC" OR "DESC"
--- These Tags will be DELETED At the Start of the Script ---
-- (thus any Notes with these tags will no longer have these tags)
property ptyTagOrig : "Dup.orig"
property ptyTagDup : "Dup.dup"
property ptyMaxDupSets : -1 -- limit the tagging of Dup Sets for testing. Set to -1 for ALL
property ptyLogDupSets : false
--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The KM Macro is very simple -- it has one Action: Execute AppleScript
You can also run this script from the Script Editor app.
##Macro Library Get Dup Evernote Note List & Assign Tags
####DOWNLOAD:
<a class="attachment" href="/uploads/default/original/2X/6/6ae690296aee6af9024c73aeb93f1cf22acb27cd.kmmacros">Get Dup Evernote Note List & Assign Tags.kmmacros</a> (16 KB)
**Note: This Macro was uploaded in a DISABLED state. You must enable before it can be triggered.**
---
###Use Case
* Identify and tag Duplicate Evernote Notes
---
###ReleaseNotes
* See above
* **Make sure Evernote is running and fully sync'd BEFORE triggering this macro.**
REQUIRES:
* [BridgePlus (BPLib) Script Library](https://www.macosxautomation.com/applescript/apps/BridgePlus.html)
* KM 7.3.1+
* macOS 10.11.6+
* Evernote 6.11.1+ (do NOT run using any Evernote BETA).
---
<img src="/uploads/default/original/2X/4/4fab5d97d677ce71e61d9c69c3ce1c31cc4f293f.png" width="619" height="708">
---
###AppleScript
```applescript
property ptyScriptName : "EN Get Dup Note List & Tag"
property ptyScriptVer : "1.2"
property ptyScriptDate : "2017-08-21"
property ptyScriptAuthor : "JMichaelTX"
(*
PURPOSE: Search All EN Notes to Identify Duplicate Notes by Title,
and assign dup tags to those Notes.
Dup Note Sets are logged as tags are assigned.
Upon completion, a new Evernote window is shown,
filtered to "any:" of the dup tags.
REQUIRED:
macOS El Capitan 10.11.6+
(may work on Yosemite 10.10.5, but no guarantees)
*)
use AppleScript version "2.5" -- El Capitan 10.11.6+
use scripting additions
use framework "Foundation"
use BPLib : script "BridgePlus"
property LF : linefeed
--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
--- PROPERTIES CHANGEABLE BY USER ---
property dupSortBy : "creation date" -- "creation date" OR "modification date"
property dupSortDir : "ASC" -- "ASC" OR "DESC"
--- These Tags will be DELETED At the Start of the Script ---
-- (thus any Notes with these tags will no longer have these tags)
property ptyTagOrig : "Dup.orig"
property ptyTagDup : "Dup.dup"
property ptyMaxDupSets : -1 -- limit the tagging of Dup Sets for testing. Set to -1 for ALL
property ptyLogDupSets : false
--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
set scriptResults to "TBD"
try
--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
set frontApp to path to frontmost application as text -- use for dialogs
--- GET NOTE COUNT and CONFIRM PROCESSING ---
tell application "Evernote"
set nbList to every notebook
set numNotes to 0
repeat with oNB in nbList
set numNotes to numNotes + (count of notes in oNB)
end repeat
end tell
log "Num of Notes: " & numNotes
set msgStr to "Number of Notes to Process: " & numNotes & LF & LF & ¬
"Script will scan ALL Notes to determine duplicate Notes based on Note Title," & LF & ¬
"Then you will be asked to confirm assigning tags to these Notes as follows:" & LF & ¬
tab & "• Original Note: " & tab & tab & ptyTagOrig & LF & ¬
tab & "• Duplicate Notes: " & tab & ptyTagDup & LF & LF & ¬
"This could take between 5 sec and 10 minutes, depending on the number of Notes and the speed of your Mac" & LF & LF & ¬
"Click \"Continue\" to Process ALL " & numNotes & " Notes."
if not my continueScript(msgStr) then error "User Cancelled"
set startTime to current application's NSDate's |date|()
--- GET PROPERTIES OF ALL NOTES ---
(*
β’ this is much faster than getting a Note Object list, and
using a repeat loop to get properties.
β’ The noteLinkList will be used to get the actual Note object
when we need to process the dup Note list.
*)
tell application "Evernote"
set {noteLinkList, creDateList, modDateList, titleList} to {note link, creation date, modification date, title} of every note of every notebook
end tell
--- CONVERT LIST of LISTS to SINGLE, FLAT LIST ---
-- (one item per Note) (Requires BridgePlus)
set noteLinkList to my flattenList(noteLinkList)
set creDateList to my flattenList(creDateList)
set modDateList to my flattenList(modDateList)
set titleList to my flattenList(titleList)
--- SORT BY Title, Date, Note Link ---
if (dupSortBy = "creation date") then
set dateList to creDateList
else
set dateList to modDateList
end if
set {titleList, dateList, noteLinkList} to my sortMultiLists({titleList, dateList, noteLinkList}, {"ASC", dupSortDir, "ASC"})
------------------------------------
-- GET DUP NOTE LIST --
------------------------------------
set {dupNoteList, dupNoteCount} to my getDupItemList(titleList)
set elapTime to (-(round ((startTime's timeIntervalSinceNow()) * 100)) / 100.0)
log ("Time to Get Dup Note List: " & elapTime)
------------------------------------
-- ASSIGN TAGS TO DUP NOTES --
------------------------------------
log "Num of Dup Sets: " & dupNoteCount
set msgStr to "Number of Duplicate Note Sets to Assign Tags to: " & dupNoteCount & LF & ¬
"This may take between 10 sec and 10 minutes to complete."
if not my continueScript(msgStr) then error "User Cancelled"
set startTime to current application's NSDate's |date|()
tell application "Evernote"
--- CREATE TAGS IF NEED BE, OR DELETE if They EXIST ---
-- MUST sync before/After due to EN Mac BUG
my sync()
--- DELETE TAGS IF THEY EXIST ---
set syncNeeded to false
if ((tag named ptyTagOrig exists)) then
delete tag ptyTagOrig
log "Tag Deleted: " & ptyTagOrig
set syncNeeded to true
end if
if ((tag named ptyTagDup exists)) then
delete tag ptyTagDup
log "Tag Deleted: " & ptyTagDup
set syncNeeded to true
end if
if (syncNeeded) then my sync()
--- CREATE TAGS ---
if (not (tag named ptyTagOrig exists)) then
make tag with properties {name:ptyTagOrig}
log "Tag CREATED: " & ptyTagOrig
end if
if (not (tag named ptyTagDup exists)) then
make tag with properties {name:ptyTagDup}
log "Tag CREATED: " & ptyTagDup
end if
my sync()
set iDupSet to 0
----------------------------------
repeat with oDup in dupNoteList
--------------------------------
--- EXIT Repeat IF NOT All DupSets AND Max DupSets Have Been Processed ---
if ((ptyMaxDupSets ≠ -1) and (iDupSet ≥ ptyMaxDupSets)) then exit repeat
set iDupSet to iDupSet + 1
if (iDupSet mod 10 = 0) then -- display notify every 10 dup sets
set msgStr to "Processing Dup Set #" & iDupSet
set msgTitleStr to ptyScriptName
display notification msgStr with title msgTitleStr sound name "Tink.aiff"
end if
set noteTitle to item 1 of oDup
if (ptyLogDupSets) then log "DupSet: " & iDupSet & tab & noteTitle
------------------------------------------
repeat with iNL from 2 to (count of oDup)
----------------------------------------
set noteLink to item (item iNL in oDup) in noteLinkList
if (iNL = 2) then
set tagStr to ptyTagOrig
else
set tagStr to ptyTagDup
end if
set oNote to find note noteLink
assign tag tagStr to oNote
end repeat
end repeat
end tell
set elapTime to (-(round ((startTime's timeIntervalSinceNow()) * 100)) / 100.0)
log ("Time to Assign Tags to Notes: " & elapTime)
set scriptResults to "OK" & LF & "SUCCESS!" & LF & numNotes & " Notes were Processed and Checked for Dups" & LF & ¬
dupNoteCount & " Dup Note Sets were found." & LF & ¬
"Tags were assigned as follows:" & LF & tab & "• Original Note: " & ptyTagOrig & LF & ¬
tab & "• Dup Notes: " & ptyTagDup
set the clipboard to scriptResults
display dialog scriptResults & LF & ¬
"(copied to clipboard)" with title ptyScriptName ¬
buttons {"OK"} ¬
default button ¬
"OK" with icon note ¬
giving up after 5
tell application "Evernote"
activate
set enQuery to "any: tag:" & ptyTagOrig & " tag:" & ptyTagDup
set oWin to open collection window
set query string of oWin to enQuery
end tell
my sync()
--~~~~~~~~~~~~~ END TRY ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
on error errMsg number errNum
if errNum = -128 then ## User Canceled
set errMsg to "[USER_CANCELED]"
end if
set scriptResults to "[ERROR]" & return & errMsg & return & return ¬
& "SCRIPT: " & ptyScriptName & " Ver: " & ptyScriptVer & return ¬
& "Error Number: " & errNum
end try
--~~~~~~~~~~~~~~~~END ON ERROR ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
--- RETURN THE RESULTS TO THE KM EXECUTE SCRIPT ACTION ---
return scriptResults
--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-- HANDLERS (functions)
--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
on sync()
local msgStr, msgTitleStr, isSync
tell application "Evernote"
set msgStr to "Waiting on EN Mac SYNC to Complete"
set msgTitleStr to "Synchronize EN Mac"
display notification msgStr with title msgTitleStr sound name "Tink.aiff"
synchronize
set isSync to isSynchronizing
repeat while isSync
delay 0.1
set isSync to isSynchronizing
end repeat
end tell
set msgStr to "Sync COMPLETE!"
display notification msgStr with title msgTitleStr sound name "Tink.aiff"
end sync
--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
on getDupItemList(pSourceList)
(* VER: 1.2 2017-08-14
---------------------------------------------------------------------------------
PURPOSE: Get a List of Dup Items, with indexes, found in Source List
PARAMETERS:
• pSourceList | list of text | Source List to search for duplicate items (exact match)
RETURNS: List of Lists | Each Item in main list is a list with these items:
• text | Source item which had dups
• integer | Index of first item in Source List
• integer | Index of second item in Source List
• [integer | additional items for each dup found, one item per dup]
EXAMPLE:
{ Item in Source List,
Index in Source List to first dup
Index in Source List to 2nd dup
. . .
Index in Source List to nth dup }
{ "11.16.2011 [WED] Daily Notes",
239,
240,
241 },
{ "15-Minute Retirement Plan | Fisher Investments | Jul 12, 2012.pdf",
279,
280 },
. . .
nth Dup Set
AUTHOR: JMichaelTX refactored script by Shane Stanley
REQUIRES:
β’ macOS 10.11.6+
β’ use framework "Foundation"
REF:
1. Shane Stanley, 2017-08-13
http://lists.apple.com/archives/applescript-users/2017/Aug/msg00053.html
---------------------------------------------------------------------------------
*)
local time1, theCount, countedDupes, duplicatedValues, dupItemList, thisValue, thisIndex, thisInfo, startTime, elapTime, dupItemCount, msgStr, msgTitleStr
set startTime to current application's NSDate's |date|()
set pSourceList to current application's NSArray's arrayWithArray:pSourceList
set theCount to pSourceList's |count|()
-- get a counted set of the duplicate instances of any duplicated values
set countedDupes to current application's NSCountedSet's setWithArray:pSourceList
countedDupes's minusSet:(current application's NSSet's setWithSet:countedDupes)
-- get the indices of the duplicated values' first and dupe instances
-- USE THIS for NO SORT --
set duplicatedValues to countedDupes's allObjects()
--- USE THIS to SORT on Source Item ---
### NOW REPLACED by BPLib Sort at Bottom
### set duplicatedValues to countedDupes's allObjects()'s sortedArrayUsingSelector:"compare:"
set dupItemList to {}
repeat with thisValue in duplicatedValues
-- Value and first index.
set thisIndex to (pSourceList's indexOfObject:(thisValue)) + 1
set thisInfo to {thisValue as text, thisIndex}
-- Indices of dupes.
repeat (countedDupes's countForObject:(thisValue)) times
set thisIndex to (pSourceList's indexOfObject:(thisValue) inRange:({thisIndex, theCount - thisIndex})) + 1
set end of thisInfo to thisIndex
end repeat
set end of dupItemList to thisInfo
end repeat
### ADD BPLib SORT of RESULTS on First Index (Item 2) ###
-- This sorts the results in the same order as the Source List
set dupItemList to BPLib's sublistsIn:dupItemList sortedByIndexes:{2} ascending:{true} sortTypes:{}
set elapTime to (-(round ((startTime's timeIntervalSinceNow()) * 100)) / 100.0)
set dupItemCount to count of dupItemList
set msgStr to ((dupItemCount as text) & " Dup Items found in " & elapTime as text) & " sec"
set msgTitleStr to "getDupItems() Handler"
display notification msgStr with title msgTitleStr sound name "Tink.aiff"
return {dupItemList, dupItemCount}
end getDupItemList
--~~~~~~~~~~~~~~~ END OF handler getDupItemList ~~~~~~~~~~~~~~~~~~~~~~~~~
on continueScript(pMsgStr)
beep
display dialog pMsgStr ¬
with title ptyScriptName ¬
buttons {"Stop", "Continue"} ¬
default button ¬
"Continue" with icon caution
set buttonStr to button returned of result
if (buttonStr = "Continue") then
set continueBol to true
else
set continueBol to false
end if
return continueBol
end continueScript
on flattenList(pList)
set flatList to BPLib's listByFullyFlattening:pList
return flatList
end flattenList
on sortMultiLists(pListOfLists, pSortDirList)
(*
REQUIRES:
use framework "Foundation"
use BPLib : script "BridgePlus"
*)
local rowsList, listCount, sortByList, iL, oSort
--- Setup the Sort ---
set rowsList to BPLib's colsToRowsIn:pListOfLists
set listCount to count of pListOfLists
set sortByList to {}
--- Sort Order by List as Passed in pListOfLists ---
repeat with iL from 1 to listCount
set end of sortByList to iL
end repeat
--- Convert Text Sort Direction to Boolean ---
-- (true means ascending)
repeat with oSort in pSortDirList
set contents of oSort to ((oSort as text) starts with "ASC")
end repeat
--- Do the Sort
set rowsList to BPLib's sublistsIn:rowsList sortedByIndexes:sortByList ascending:pSortDirList sortTypes:{}
--- Get Sort Results ---
set pListOfLists to BPLib's colsToRowsIn:rowsList
return pListOfLists
end sortMultiLists
```
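To make the tagging loop in the script easier to follow, here is a small stand-alone sketch of how one dup set, in the `{title, index, index, ...}` shape documented for `getDupItemList`, maps to tags. It does not talk to Evernote. The handler name `tagPlanFor` and the sample note links are made up for the example.

```applescript
-- Hypothetical helper: for one dup set, pair each note link with the tag it would get.
-- dupSet is {title, index1, index2, ...}; the indexes point into noteLinkList.
on tagPlanFor(dupSet, noteLinkList)
	set tagPlan to {}
	repeat with i from 2 to (count of dupSet)
		set noteLink to item (item i of dupSet) of noteLinkList
		if i = 2 then
			set tagStr to "Dup.orig" -- first index: the Note that sorted first
		else
			set tagStr to "Dup.dup" -- every later index is treated as a dup
		end if
		set end of tagPlan to {noteLink, tagStr}
	end repeat
	return tagPlan
end tagPlanFor

-- Made-up stand-ins for note links, in the same sorted order as the titles
tagPlanFor({"Invoice.pdf", 2, 3, 4}, {"link-A", "link-B", "link-C", "link-D", "link-E"})
--> {{"link-B", "Dup.orig"}, {"link-C", "Dup.dup"}, {"link-D", "Dup.dup"}}
```

Because the lists are sorted by title and then by date before the dup search, "first" here means the oldest Note when `dupSortDir` is "ASC".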
OK
SUCCESS!
29285 Notes were Processed and Checked for Dups
1233 Dup Note Sets were found.
Tags were assigned as follows:
• Original Note: Dup.orig
• Dup Notes: Dup.dup