Search Variable using Regular Expression

Hi

I am using the Search Variable using Regular Expression action to split a CSV line into individual variables.
There are 31 cells, and I use this RegExp:
^"(.*)","(.*)","(.*)","(.*)","(.*)","(.*)","(.*)","(.*)","(.*)","(.*)","(.*)","(.*)","(.*)","(.*)","(.*)","(.*)","(.*)","(.*)","(.*)","(.*)","(.*)","(.*)","(.*)","(.*)","(.*)","(.*)","(.*)","(.*)","(.*)","(.*)","(.*)"

When I paste it into the text field in Keyboard Maestro, the program just hangs.
If I try a shorter RegExp it works, but then I do not get all my cells into variables.

Is there a better way to do this?

I have two thoughts:

  1. Use a KM “For Each” Action, setting one variable at a time.
     • First replace each "," with "\n"
     • Dynamically name the KM Variable in the loop:
       myVar%Variable%Counter%
  2. Use JXA, with the powerful RegEx engine
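
As a rough sketch of the first idea in plain JavaScript (not an actual Keyboard Maestro action): split the line on the "," separators and number the pieces. The function name `splitQuotedCsvLine` and the `myVar` prefix are made up for illustration, and it assumes every field is wrapped in double quotes and no field itself contains ",".

```javascript
// Hypothetical helper: split one fully quoted CSV line into numbered fields.
// Assumes every field is wrapped in double quotes and no field contains ",".
function splitQuotedCsvLine(line, prefix) {
    var trimmed = line.trim();
    // Drop the opening and closing quote, then split on the "," separators.
    var fields = trimmed.slice(1, -1).split('","');
    var named = {};
    fields.forEach(function (value, i) {
        named[prefix + (i + 1)] = value; // myVar1, myVar2, ...
    });
    return named;
}

var example = splitQuotedCsvLine('"a","b","c","d"', 'myVar');
console.log(JSON.stringify(example));
// {"myVar1":"a","myVar2":"b","myVar3":"c","myVar4":"d"}
```

In a real macro, each name/value pair would be written to a KM Variable instead of being collected in an object.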

Try this. It's a first stab, but I'm fairly sure it will work.

Parse Quoted CSV Line.kmmacros (5.8 KB)

Pass it one line.

It will return each field on a separate line.

You can also, FWIW, write a general allMatches() function which uses a generic unfoldr

(The inverse of foldr / Array.reduceRight())

e.g. something like:

  // Example
  (function () {
      'use strict';



      // Get all matches (at specified subgroup indices) for a regex

      // Regex -> String -> Maybe [Integer] -> [[String]]
      function allMatches(rgx, str, lstIndex) {
          return unfoldr(
              // String -> [Int] -> (Regex -> (Bool valid, Maybe value, Maybe new))
              function (s, lstIndex) {
                  // Regex -> (Bool valid, Maybe value, Maybe new)
                  return function (rgx) {
                      var m = (rgx ? rgx.exec(s) : void 0),
                          blnMatch = !!m;
                      return {
                          valid: blnMatch,
                          value: blnMatch ? lstIndex
                              .map(function (i) {
                                  return m[i];
                              }) : [],
                          new: blnMatch && (0 < rgx.lastIndex) ?
                              rgx : void 0
                      };
                  };
              }(
                  str,
                  lstIndex ? (
                      lstIndex instanceof Array ? lstIndex : [lstIndex]
                  ) : [0]
              ),
              rgx
          );
      }


      // General inverse of foldr / .reduceRight
      // (derive a list by repeatedly applying a function 
      // to the remaining residue of a simple start value)

      // (b -> Maybe (a, b)) -> b -> [a]
      function unfoldr(mf, v) {
          var lst = [],
              a = v,
              m;

          while ((m = mf(a)) && m.valid) {
              lst.push(m.value);
              a = m.new;
          }
          return lst;
      }

      var str = '"one", "two", "three", "four"';

      return allMatches(/"([^"]*)"/g, str, [1]);

  })();

Output:


[["one"], ["two"], ["three"], ["four"]]

I know you really like this pattern, and that’s cool.

But if you’re posting these kinds of examples specifically for me, you’re wasting your time. I find this paradigm/pattern complex and unintelligible.

If you think others will benefit from it, then by all means, continue to post these examples. Just don’t do it on my account. :slight_smile:

:slight_smile:

I think the main virtues of function composition are simply that:

  • it lets you build up a library of reusable functions,
  • pure functions work like black boxes - you don’t particularly need to follow their contents, and you can rely on their not having any effect on the global name space

I personally find it useful to have an allMatches() function to reach for, and I hope maybe some others will too.

Great macro and script, Dan.
Like you, I much prefer to keep things as simple as possible, so that they can be easily understood and modified by most.

####I enhanced it to set the KM Variables in the script.

@JimmyHartington: You just need to change two actions in the Macro:

  • Set the Source Data
  • Set KM Variable Prefix

BTW, if you would prefer to pass the script a list of the KM Variable names to be set, that would be easy to do.


###MACRO: Parse Quoted CSV Line (Ver 2)

Parse Quoted CSV Line (Ver 2).kmmacros (8.6 KB)


###The Script

/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
PURPOSE:  Extract One Line of CSV Data, & Set KM Variables
          RETURN number of matches found

VER: 2.0    2016-08-17

AUTHOR:
  • @DanThomas   -- Ver 1 which extracted and returned the data, one line per item
  • @JMichaelTX -- Ver 2 which ADDED Set of KM Variables
  
KM VARIABLES REQUIRED:
  • csvLine
  • csvVarPrefix
  
KM VARIABLES SET:
  • One Variable for each match found of CSV data
  • Variable Name:  csvVarPrefix + sequence#
    (Example:  TEST_myVar2)
  
REF:
  • Search Variable using Regular Expression
    https://forum.keyboardmaestro.com/t/search-variable-using-regular-expression/4704
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

*/

(function() {
  'use strict';

  try {
    var kme = Application("Keyboard Maestro Engine");
    
    var input       = kme.getvariable("csvLine") || '"TEST one","TEST two","TEST three"';
    var kmVarPrefix = kme.getvariable("csvVarPrefix") || 'TEST_myVar';
    
    if (!input)
      throw new Error("Variable 'csvLine' is empty");  // was referencing an undefined identifier

    var regexp = /"([^"]*)"/g;
    var matches;
    var matchList = [];
    while ((matches = regexp.exec(input)) !== null) {
      matchList.push(matches[1]);
    }
    
    var kmVar = "";
    
    var numMatches = matchList.length;
    for (var iMatch = 0; iMatch < numMatches; iMatch++) {
    
      kmVar = kmVarPrefix + (iMatch+1).toString();  // ADD a numeric suffix to variable name
      
      //console.log(kmVar + " = " + matchList[iMatch])
      
      //--- SET THE KM VARIABLE ---
      kme.setvariable(kmVar, { to: matchList[iMatch] });
    
    } // END for matchList


    return numMatches    // matchList.join("\n")
    
  } catch (e) {
    return "[**ERROR**] " + e.message;
  }
})();

###Example Results

###These KM Variables Were Set

I agree with your goals.

Without talking about implementation, those goals are some of the driving forces in my development life. I build toolboxes and tools. It’s what I live for, and what makes me happy.

I just prefer to write tools that are a little more obvious in their usage, at least to us normal mortals.

Hmmm. I just realized something I should have thought of before. I’ve been through these kinds of discussions with people before, so you’d think I’d remember.

I’m going to generalize here, and it may tick you off. So be it.

People who have your level of intelligence can rarely be convinced that something that seems clear to you is not clear to most of us.

And the thing that amazes me, is that for all the intelligence that these people have, and it probably includes you, they can’t for the life of them, even for a moment, consider that “normal” people just can’t grasp the things these highly intelligent people find so easy to understand.

( Flattering description, tho I wish I could think of someone I know who would agree :slight_smile: )

In the meanwhile and more practically, given:

  1. a regex,
  2. a string to search, and optionally,
  3. the indices of any particular regex groups,

(where, by JS regex convention, 0 is the entire match, and 1 is the first parenthesised group in the regex pattern)

allMatches(rgx, str, optionalIndices)

just returns all matches for the regex (or all matches for any optionally indexed subGroups)

If we leave out the indices parameter:

     var str = '"one", "two", "three", "four"';

      return allMatches(/"([^"]*)"/g, str);

Then the whole of each match will be returned, including the double quotes:

[["\"one\""], ["\"two\""], ["\"three\""], ["\"four\""]]

but if we only want all matches for group 1, then

     allMatches(/"([^"]*)"/g, str, [1]);

just gives us the set of group 1 matches:

[["one"], ["two"], ["three"], ["four"]]

While this would give us both the 0 matches and the group 1 matches:

allMatches(/"([^"]*)"/g, str, [0, 1])
[["\"one\"", "one"], ["\"two\"", "two"], ["\"three\"", "three"], ["\"four\"", "four"]]

The regular expression is pathologically bad; this is why you are getting the hang.

Replace all the . characters with [^"] and that will likely resolve the issue.
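
As a hedged illustration of that replacement, here is a minimal sketch using JavaScript's RegExp rather than Keyboard Maestro's own search action. The helper name `quotedCsvPattern` is made up for this example; it just builds the same pattern with `[^"]*` in place of `.*` for any number of fields.

```javascript
// Hypothetical helper: build ^"([^"]*)","([^"]*)",... for a given field count.
// [^"]* cannot run past a closing quote, so each group has only one way to match.
function quotedCsvPattern(fieldCount) {
    var groups = [];
    for (var i = 0; i < fieldCount; i++) {
        groups.push('"([^"]*)"');
    }
    return new RegExp('^' + groups.join(','));
}

var line = '"a","b","c","d"';
var m = quotedCsvPattern(4).exec(line);
var cells = m ? m.slice(1) : []; // drop index 0, the whole match
console.log(cells); // [ 'a', 'b', 'c', 'd' ]
```

For the original problem the call would be `quotedCsvPattern(31)`. Like the pattern in the question, this assumes every cell is quoted and no cell contains a double quote.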

To give an example, let's assume that you are searching for just four of these

^"(.*)","(.*)","(.*)","(.*)"

and let's assume you were searching this text

"a","b","c","d"

First, the pattern matcher will match the first capture group to all of "a","b","c","d" and then see that there is no trailing comma. So it backtracks and matches the first capture group to "a","b","c"," and again sees that there is no trailing comma. Then it matches the first capture group to "a","b","c", sees the comma, matches the second capture group to "d", and then sees there is no comma after it. So it backtracks further, matching the first capture group to "a","b"," and then to "a","b", where it sees the comma. Here is where it starts getting really pathological: it now matches the second capture group to "c","d".

So basically what the regex ends up doing is breaking the 31 items up into any possible sequential subset. I’m not sure of the maths, but it is probably something like 2^31 different combinations that it will look at.

There are discussions of this issue around, see:

http://algs4.cs.princeton.edu/54regexp/

for example.

JXA’s regular expression engine may or may not suffer from the same pathological issues.

Thanks for all the suggestions.

I ended up using @peternlewis’ suggestion since this solved my problem.
But the other macros have been saved for other uses when working with data.

Which expression would we require to only keep the last 10 digits/characters of a variable?

Use the Substring action:

Or, as RegEx, this should do the same: .{10}$
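
For anyone doing this in a script instead, here is a small sketch of the same regex in JavaScript. The function name `lastTen` is made up, and returning the whole string when it is shorter than 10 characters is my assumption about what you would want. Note that `.` does not match a line break, so this expects single-line text.

```javascript
// Hypothetical helper: return the last 10 characters of a single-line string.
function lastTen(text) {
    var m = /.{10}$/.exec(text);
    // No match means fewer than 10 characters; fall back to the whole string.
    return m ? m[0] : text;
}

console.log(lastTen('ABC1234567890')); // 1234567890
```

Plain `text.slice(-10)` gives the same result without a regex.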

Hey Ali,

Tom's answer is simplest, but here are a couple more options:

Last 10 Digits via RegEx – Search Version.kmmacros (2.8 KB)

Last 10 Digits via RegEx – Search & Replace Version.kmmacros (2.7 KB)

-Chris

Thank you
