Comparing Data Sets With Multi-Line Regex

I'm writing a macro to compare two DVD queues and produce a list of duplicate entries. I already have several macros that compare data in a similar way.

I'm stymied by the regex, and by how to apply it: I can't use my usual "for each line in" method, because the regex I (think I) need has to match across newlines. Any help appreciated.

Here is an example of how the information is formatted. I've marked the titles in bold for readability. Please note that "rated" always follows the title, and a number always precedes the title; however, the number can be any number of digits long (though I doubt I'll ever have a thousand items in a queue). Additionally, some titles contain numbers, symbols etc.

Rate 5 starsRate 4 starsRate 3 starsRate 2 starsRate 1 stars
Blu-ray
MOVE
REMOVE
5
Sherlock: Series 2: Disc 1
rated 4.8 stars
4.8
Rate 5 starsRate 4 starsRate 3 starsRate 2 starsRate 1 stars
Blu-ray
Short wait
MOVE
REMOVE
6
Sherlock: Series 2: Disc 2
rated 4.8 stars
4.8
Rate 5 starsRate 4 starsRate 3 starsRate 2 starsRate 1 stars
Blu-ray
MOVE
REMOVE
42
Divergent
rated 3.9 stars
3.9
Rate 5 starsRate 4 starsRate 3 starsRate 2 starsRate 1 stars
Blu-ray
MOVE
REMOVE
43
Fast & Furious 6
rated 4.3 stars
4.3
Rate 5 starsRate 4 starsRate 3 starsRate 2 starsRate 1 stars
Blu-ray
MOVE
REMOVE
107
Marvel's Daredevil: Season 1: Disc 1
rated 4.6 stars
4.6
Rate 5 starsRate 4 starsRate 3 starsRate 2 starsRate 1 stars
Blu-ray
MOVE
REMOVE
108
Marvel's Daredevil: Season 1: Disc 2
rated 4.6 stars
4.6

Based on that, this RegEx will match the Titles, putting them in Capture Group #1.
(?m)^(.+?)[ \t]*(?=\nrated)
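
For anyone who wants to try the same pattern in JavaScript rather than in a KM action, here is a minimal sketch. JavaScript does not accept a leading `(?m)` inline modifier, so multi-line mode is set with the `m` flag instead, and `g` is added so that every title is returned. The function name `extractTitles` and the shortened sample text are made up for illustration, and the sketch assumes plain `\n` line endings.

```javascript
// Sketch only: extractTitles is a hypothetical helper name, not part of KM.
// Same pattern as above; (?m) becomes the "m" flag, "g" finds every match.
function extractTitles(queueText) {
  const titleRegex = /^(.+?)[ \t]*(?=\nrated)/gm;
  const titles = [];
  let match;
  // exec() with the "g" flag resumes after the previous match on each call.
  while ((match = titleRegex.exec(queueText)) !== null) {
    titles.push(match[1]); // Capture Group #1 is the title
  }
  return titles;
}

// Shortened sample in the same layout as the queue listing (assumes \n endings).
const sample = [
  "REMOVE",
  "5",
  "Sherlock: Series 2: Disc 1",
  "rated 4.8 stars",
  "4.8",
  "REMOVE",
  "42",
  "Divergent",
  "rated 3.9 stars",
  "3.9",
].join("\n");

console.log(extractTitles(sample).join("\n"));
```

With the sample above this prints the two titles, one per line. Only lines that are immediately followed by a line starting with "rated" are captured, so the queue numbers and the MOVE/REMOVE lines are skipped.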

See this regex101.com page:
Extract Title from List

Using that RegEx, I would first build two KM Variables that contain ONLY the Titles: one for Set #1 and one for Set #2.
Then compare the two.

I’ll have to check, but I think that JavaScript can do this quite easily.

I’ll leave you with this for now, and check back later.

@cfriend, here's an example macro to build a list of Titles ONLY:

###MACRO:   @RegEx Extract CD Titles from Detailed List @Example

~~~ VER: 1.0    2017-04-05 ~~~

####DOWNLOAD:
@RegEx Extract CD Titles from Detailed List @Example.kmmacros (4.1 KB)


###ReleaseNotes

TBD



I'll check into the JavaScript to ID dups now.

@cfriend, as I thought, there is an easy JavaScript solution.
From: http://stackoverflow.com/a/26343472/915019

EDIT: 2017-04-05 5:37 PM CT
###JavaScript to Get and Remove Dups from Two Lists

/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  
@Lists @Arrays Get List of @Duplicate Items in Two Lists @JS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
DATE:    2014-10-13
AUTHOR: Nick Russler
REF:
  • compare two arrays and return duplicate values
    • Stack Overflow, 
  • http://stackoverflow.com/a/26343472/915019

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/

var x = ['IdA', 'idB', 'IdC', 'IdD', 'IdE'];
var y = ['idB', 'IdE', 'IdF'];

//--- GET LIST OF DUPS ---
var z = x.filter(function(val) {
  return y.indexOf(val) != -1;
});

console.log("z: " + z);

//--- REMOVE DUPS FROM ONE ARRAY (LIST) ---

x = x.filter(function(val) {
  return y.indexOf(val) == -1;
});

console.log("x: " + x);

//=== RESULTS ===
/* z: idB,IdE */
/* x: IdA,IdC,IdD */

I’ve given you all the parts.
Let me know if you need help in putting them together.

BTW, I would probably do the initial RegEx to build the two lists in JavaScript.

  • In KM macro, set two KM Vars to the two CD lists
  • Read the KM Vars in JXA.

See Execute a JavaScript For Automation action (KM Wiki)
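
Putting the parts together might look something like the sketch below: extract the titles from each queue, then keep the ones that appear in both. The helper names (`titlesFrom`, `duplicateTitles`), the KM variable names in the comment, and the two short sample queues are all assumptions for illustration. A `Set` is used for the lookup instead of `indexOf`, which is only a stylistic swap for lists this small.

```javascript
// Sketch only: helper names and sample data are hypothetical.
// Titles are the lines immediately followed by a line starting with "rated".
function titlesFrom(queueText) {
  const re = /^(.+?)[ \t]*(?=\nrated)/gm; // "m" flag replaces the (?m) modifier
  const titles = [];
  let m;
  while ((m = re.exec(queueText)) !== null) {
    titles.push(m[1]);
  }
  return titles;
}

// Titles from queueA that also appear in queueB (exact, case-sensitive match).
function duplicateTitles(queueA, queueB) {
  const inB = new Set(titlesFrom(queueB));
  return titlesFrom(queueA).filter((title) => inB.has(title));
}

// In an "Execute a JavaScript For Automation" action, the two strings would be
// read from KM Variables rather than literals, along these lines
// (variable names "Queue1" and "Queue2" are assumed):
//   const kme = Application("Keyboard Maestro Engine");
//   const queueA = kme.getvariable("Queue1");
//   const queueB = kme.getvariable("Queue2");
const queueA =
  "1\nDivergent\nrated 3.9 stars\n3.9\n2\nFast & Furious 6\nrated 4.3 stars\n4.3";
const queueB =
  "7\nFast & Furious 6\nrated 4.3 stars\n4.3\n8\nSherlock: Series 2: Disc 1\nrated 4.8 stars\n4.8";

console.log(duplicateTitles(queueA, queueB).join("\n"));
```

For these two sample queues the only shared title is "Fast & Furious 6". In a JXA action, returning the joined string (rather than logging it) would let the macro save the result to a KM Variable.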

With that regex I’ve got a functional macro. I’m going to go back and examine the JXA methods tomorrow. Thanks!
