Find specific line in document

Hi there!

I looked around the forum but could not find the specific action(s) I need for this. Thanks to everyone helping on this forum; this is the first time I've had to ask because the information here is so thorough!

I'm trying to find a way to search a document for the text “OUT:” and then copy the next line to a variable. My document looks like this…

YOU002
IN:
OUT:
01:11:32:13
01:11:36:01
Hello how are you, please make yourself at home
YOU003
IN:
OUT:
01:11:38:00
01:11:40:09
Oh my that was frightening

So the macro would set variable1 = 01:11:32:13, variable2 = 01:11:38:00, etc. Any advice, even if it's just pointing me in a direction, would be greatly appreciated. Thanks!

Using your example data, I found that there was a SPACE after "OUT:".
So the RegEx Look Behind requires this space:
(?<=OUT: \n)(^.+?) *$
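
To see why the trailing space matters, here is a minimal JavaScript sketch of the same look-behind pattern. It assumes a JavaScript engine that supports look-behind and `matchAll` (recent Node.js does); the function name `firstLinesAfterOut` and the sample strings are made up for illustration, and this is not the macro itself.

```javascript
// Sketch only: the look-behind pattern from the post, run with the
// global (g) and multiline (m) flags so ^ and $ match at line boundaries.
// Hypothetical helper: returns the line that follows each "OUT: " line.
function firstLinesAfterOut(text) {
  const pattern = /(?<=OUT: \n)(^.+?) *$/gm;
  return Array.from(text.matchAll(pattern), (m) => m[1]);
}

// Same data twice: once with a space after "OUT:", once without.
const withSpace = "YOU002\nIN: \nOUT: \n01:11:32:13\n01:11:36:01\nHello\n";
const withoutSpace = "YOU002\nIN:\nOUT:\n01:11:32:13\n01:11:36:01\nHello\n";

console.log(firstLinesAfterOut(withSpace));    // one value: 01:11:32:13
console.log(firstLinesAfterOut(withoutSpace)); // empty: the look-behind needs the space
```

The second call finding nothing is the point: a look-behind that spells out the space only matches data that has it.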

2017-04-13 21:52 CT

See final macro posted at:
[MACRO:   Parse String from File into Multiple Variables @RegEx @Example](https://forum.keyboardmaestro.com/t/parse-string-from-file-into-multiple-variables-regex-example/6810)

2017-04-12 19:33 CT

  • I figured out another RegEx pattern that does NOT depend on having a SPACE after the "OUT:". It works with and without a SPACE. Ver 2.0 uploaded with this fix.
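
The Ver 2.0 pattern itself is only in the download, so the following is not it. As a rough sketch of one way to make the match independent of trailing spaces, a capture group can stand in for the look-behind. The pattern, the function names, and the `variableN` naming (borrowed from the original question) are assumptions for illustration.

```javascript
// Sketch: tolerate optional spaces/tabs (and a CR) after "OUT:".
// Group 1 is the next line, without its own trailing spaces or tabs.
function outValues(text) {
  const pattern = /^OUT:[ \t]*\r?\n(.+?)[ \t]*$/gm;
  return Array.from(text.matchAll(pattern), (m) => m[1]);
}

// Map the values to numbered names, as asked in the original question.
function toNumberedVariables(values) {
  const vars = {};
  values.forEach((value, i) => {
    vars["variable" + (i + 1)] = value;
  });
  return vars;
}

// One "OUT:" with a trailing space, one without.
const mixedSample =
  "YOU002\nIN:\nOUT: \n01:11:32:13\n01:11:36:01\nHello\n" +
  "YOU003\nIN:\nOUT:\n01:11:38:00\n01:11:40:09\nOh my\n";

console.log(toNumberedVariables(outValues(mixedSample)));
```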

See if this will help get you started. Please let us know if this works for you.

###Example Results

NOTICE: This is just an example.
It has had very limited testing and most likely will require further testing and changes by you BEFORE you use it in a production environment.

###MACRO:   @RegEx Parse String into Multiple Variables @Example

~~~ VER: 2.0    2017-04-12 ~~~

####DOWNLOAD:
@RegEx Parse String into Multiple Variables @Example.kmmacros (6.6 KB)


###ReleaseNotes

TBD



Thanks for the help, Michael! I've gone through tons of your posts helping others. When I import and run the macro as is, I do get the proper result. However, when I replace the contents of the first variable, TEST__SourceStr, with new text, it doesn't display any results. The display window pops up but no variables are listed.

Also, I'm not sure why those spaces were there after “OUT:” when I initially posted. I pasted that original text directly from my PDF (the source of the provided text) into KM and the spaces were not there. I did try the text both with and without spaces, and it worked fine, as you mentioned. Here is the new text I tried that did not display results…

YOU013
IN:
OUT:
02:07:57:14
02:08:00:02
Me, I’ve never been happier
EdiCue v2.7.0 23.976 Frame Sort: Character, Time
Oct 25, 2016 10:07 AM Page 1 of 2
Movie
Character: Sam
Reel: 1-6 PicVers: 2 Actor’s List Actor: John Doe
CUE#: LINE:
YOU014
IN:
OUT:
02:09:04:18
02:09:05:20
Higher
YOU015
IN:
OUT:
02:09:07:17
02:09:08:09
Higher please
YOU016
IN:
OUT:
02:09:10:05
02:09:11:04
Higher!

Hey Billy,

I'm going to take a different tack at this than JM and use the shell.

One of the handy features of the shell's version of the grep tool is the ability to return lines before or after found text.

In this case you need the line after “OUT:”, and that's what the first line of the script returns for each instance of “OUT:” that is found.

At this point it looks like this:

OUT:
02:07:57:14
--
OUT:
02:09:04:18
--
OUT:
02:09:07:17
--
OUT:
02:09:10:05

The second line returns only those lines that have a digit followed by a colon, which strips out the other text grep returns along with the found lines.
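
The script itself is in the download below, so as a rough illustration only, here is the same two-step idea re-expressed in JavaScript rather than shell. The function names are hypothetical, and the mention of `grep -A 1` (grep's after-context option) is my assumption about what the first line of the script does, based on the output shown above.

```javascript
// Step 1 (roughly what `grep -A 1 "OUT:"` prints): each matching line
// plus the one line after it, with "--" between separate groups.
function matchWithNextLine(text, needle) {
  const lines = text.split(/\r?\n/);
  const groups = [];
  lines.forEach((line, i) => {
    if (line.includes(needle)) {
      groups.push(lines.slice(i, i + 2).join("\n"));
    }
  });
  return groups.join("\n--\n");
}

// Step 2: keep only lines containing a digit followed by a colon,
// which drops the "OUT:" lines and the "--" separators.
function keepTimecodes(text) {
  return text.split("\n").filter((line) => /\d:/.test(line)).join("\n");
}

const cueSample =
  "YOU013\nIN:\nOUT:\n02:07:57:14\n02:08:00:02\nMe\n" +
  "YOU014\nIN:\nOUT:\n02:09:04:18\n02:09:05:20\nHigher";

console.log(keepTimecodes(matchWithNextLine(cueSample, "OUT:")));
```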

Acquire Values for “OUT-” entries in Text.kmmacros (3.4 KB)

The output of the macro given your last set of data is:

02:07:57:14
02:09:04:18
02:09:07:17
02:09:10:05

Instead of searching this to place in a variable it might be easier to use a For Each action with The lines in a variable.

Something like this:

Now you can deal with one or more found OUT values serially in a loop.
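
In script form, looping over the lines in a variable amounts to something like this sketch. The variable and function names are made up for the example; inside Keyboard Maestro the For Each action does this for you.

```javascript
// Sketch: visit each non-empty line of a multi-line result in order.
function forEachLine(text, handleLine) {
  text
    .split(/\r?\n/)
    .filter((line) => line.trim() !== "")
    .forEach((line, index) => handleLine(line, index + 1));
}

const foundOutValues = "02:07:57:14\n02:09:04:18\n02:09:07:17\n02:09:10:05\n";

forEachLine(foundOutValues, (value, n) => {
  console.log("OUT value " + n + ": " + value);
});
```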


Here's how I'd do the job with AppleScript and the Satimage.osax AppleScript Extension. (You MUST have the Satimage.osax INSTALLED for this script to work!)

------------------------------------------------------------------------------
# REQUIRES the Satimage.osax AppleScript Extension to be installed.
# Find the newest version here: http://tinyurl.com/smile-beta-page
------------------------------------------------------------------------------

set myText to "YOU013
IN:
OUT:
02:07:57:14
02:08:00:02
Me, I've never been happier
EdiCue v2.7.0 23.976 Frame Sort: Character, Time
Oct 25, 2016 10:07 AM Page 1 of 2
Movie
Character: Sam
Reel: 1-6 PicVers: 2 Actor's List Actor: John Doe
CUE#: LINE:
YOU014
IN:
OUT:
02:09:04:18
02:09:05:20
Higher
YOU015
IN:
OUT:
02:09:07:17
02:09:08:09
Higher please
YOU016
IN:
OUT:
02:09:10:05
02:09:11:04
Higher!"

set outValueList to fndUsing("^OUT:$\\n(^\\d{2}:.+)", "\\1", myText, true, true) of me

------------------------------------------------------------------------------
--» HANDLERS
------------------------------------------------------------------------------
on fndUsing(_find, _capture, _data, _all, strRslt)
   try
      set findResult to find text _find in _data using _capture all occurrences _all ¬
         string result strRslt with regexp without case sensitive
   on error
      false
   end try
end fndUsing
------------------------------------------------------------------------------

This produces an AppleScript list object like so:

{"02:07:57:14", "02:09:04:18", "02:09:07:17", "02:09:10:05"}

This can easily be turned into a string, or you can iterate through the list.

** In my scripts the handlers are hidden in a library, so I only see one line of code other than the assignment statement for the data.

-Chris

Billy, are you using the latest version of my Macro?

When I use your new example data, it works fine for me:

Maybe something is happening when you copy the example data and paste into the KM Forum. Try copying the example data FROM the above Forum post, and paste that into the Macro.

Let us know if you're still having trouble.


JM, yes, I am using your newest macro, and you are correct about the KM Forum formatting. When I copy from the original PDF directly to KM it doesn't work. However, when I copy from the forum to KM it does work. So I tried copying from the original PDF to TextEdit first, then from TextEdit to KM, and it worked.

So wow, thanks! One last thing to polish it off: is there an actual command in KM that lets you read the contents of a text file and extract those numbers (with the actions you provided) instead of pasting directly into KM? So maybe you trigger the macro and a box pops up that asks which text file to examine?

Not sure if this helps but I was able to accomplish something similar on a PC scripting program. I would copy/paste the pdf lines to a text file specifically called ADR.txt in a specific hard drive location and the script would run as…

Loop, read, C:\PT\ADR.txt ;;;;;;;;;;;specifies to search this specific file
{

IfInstring, A_LoopReadLine, OUT: ;;;;;;;;;;;;search the above file specifically for the word “OUT:”
{

BAM! you the man!

Yes, there is: Read a File action (KM Wiki).

Unfortunately, there is not a native KM Action to choose a file, so we have to use a script.

###Replace this Action:

###With these two Actions:


2017-04-13 21:25 CT

  • Revised below script to Ver 2.1, to fix conversion of KM Variable SCPT__FileType from string to list.
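
For reference, the string-to-list conversion mentioned in that note comes down to a single split, as in this small sketch. The function name and sample values are illustrative; the split expression is the one used in the script.

```javascript
// Keyboard Maestro variables are plain strings, so a comma-separated
// value has to be split into an array before it can be used as a list.
// The pattern accepts "a, b" as well as "a,b".
function fileTypeStringToList(fileTypeStr) {
  return fileTypeStr.split(/, |,/g);
}

console.log(fileTypeStringToList("public.jpeg, public.png"));
console.log(fileTypeStringToList("public.item"));
```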

###JXA Script to Choose File

'use strict';
//~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

(function chooseFileKM() {    // ~~~ automatically executed when this script is executed ~~~
  
var ptyScriptName   = "Choose File Return POSIX Path"
var ptyScriptVer     = "2.1"
var ptyScriptDate   = "2017-04-13"
var ptyScriptAuthor = "JMichaelTX"
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
PURPOSE:  Allow User to Choose File from Popup Window

RETURNS:  One of these, as text:
  • Actual Results of script if all goes well
    • POSIX path to selected file
    
  • "[USER_CANCELED]" at start of results if the user canceled something
  • "[ERROR]" at start of results if a script error occurred.
  
AUTHOR:  @JMichaelTX

KM VARIABLES REQUIRED:
  • SCPT__ParentFolder       [optional]: "~/Documents"
  • SCPT__ChooseFilePrompt  [optional]: "Choose FILE for KM Macro"
  • SCPT__FileType        [optional]: ['public.item']
  
  
TAGS:  @File @Prompt @Choose @Script @KM @JXA

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/

var scriptResults = "TBD"  // Set your results to this var


try {
  //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  
  // --- SET CURRENT APP VARIABLE NEEDED FOR DIALOGS & StandardAdditions.osax ---
  var app = Application.currentApplication()
  app.includeStandardAdditions = true
  
  //--- SET SYSTEM UI SERVER FOR USE WITH DIALOGS IN KM ---
  var susApp = Application('SystemUIServer');
  susApp.includeStandardAdditions = true;

  
  // --- SET KME APP VARIABLE NEEDED TO GET/SET KM VARIABLES ---
  //      (remove if not needed)
  var kme = Application("Keyboard Maestro Engine");
  
  //--- GET KM VARIABLES ---  
  var defaultFolderPath   = kme.getvariable("SCPT__ParentFolder")       || "~/Documents";
  var choosePrompt         = kme.getvariable("SCPT__ChooseFilePrompt")   || "Choose FILE for KM Macro";
  var fileTypeStr         = kme.getvariable("SCPT__FileType")           || "public.item";

  defaultFolderPath = defaultFolderPath.replace("~", app.pathTo("home folder").toString())
  var  fileTypeList = fileTypeStr.split(/, |,/g)
    
  susApp.activate();
  
  var myFile = susApp.chooseFile({ 
    withPrompt:       choosePrompt,
    defaultLocation:   defaultFolderPath,
    ofType:           fileTypeList
    })
    
    /*
      Other File Types:
        • All File Types: ['public.item']
        • File Extension:  Use ext without period, like ["aup"]
        • Images:   ['public.jpeg', 'public.png']
        • Text:    ["public.text", "text", "public.html", "public.xml", "public.script"]
        • MS Word Documents:
            ["com.microsoft.word.doc", "com.microsoft.word.docx", "org.openxmlformats.wordprocessingml.document", "org.openxmlformats.wordprocessingml.document.macroenabled"]
        • see Apple System-Declared Uniform Type Identifiers for other "type" values
          https://developer.apple.com/library/ios/documentation/Miscellaneous/Reference/UTIRef/Articles/System-DeclaredUniformTypeIdentifiers.html
    */
    
  scriptResults = myFile.toString();
  
  //~~~~ END TRY ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

} catch (oError) {
  
  var msgLog;
  
  if (oError.errorNumber === -128) {  // User Canceled
  
    scriptResults =  "[USER_CANCELED]\n\n"
      + "SCRIPT: " + ptyScriptName + "   Ver: " + ptyScriptVer;
      
    msgLog = "User Canceled";
  }
  
  else {
    scriptResults = "[ERROR]\n\n"
      + "SCRIPT: " + ptyScriptName + "   Ver: " + ptyScriptVer + "\n"
      + "Error Number: " + oError.errorNumber + "\n"
      + oError.message
      
    msgLog = oError.message;
    
  } // END if/else
  
  
} // END catch
//~~~~ END TRY/CATCH BLOCK ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

return scriptResults

//======================== END OF MAIN SCRIPT =============================================



})();  // ~~~ function is automatically executed when this script is executed ~~~

2017-04-13 21:52 CT

See final macro posted at:
[MACRO:   Parse String from File into Multiple Variables @RegEx @Example](https://forum.keyboardmaestro.com/t/parse-string-from-file-into-multiple-variables-regex-example/6810)

This includes the Choose File and Read from File Actions mentioned above.

Cool, thank you again and again for the help. However, I'm getting a blank variable list again.
I put a display text action for the variable "TEST__SourceStr" right after the Read File to Variable ‘TEST__SourceStr’ action. It seems to be reading the document into the variable correctly, as seen here...

Actually, there is one big obvious difference: the backslash at the end of every line.

If the key string "OUT:" is now always followed by a "\", then change the RegEx to this:
(?<=OUT:\\) *?\n+(^.+?)\\ *$

Do you see where I made the changes?

And, the RegEx for the second search also needs to be changed to:
*\n(.+?)\\ *$
(note there should be a SPACE at the beginning of the string)
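
As a quick check outside Keyboard Maestro, the first revised pattern can be tried in JavaScript like this. It is a sketch that assumes an engine with look-behind support and uses the global and multiline flags; the sample text and the function name are invented for the test.

```javascript
// Sketch: the revised pattern for text where every line ends in a backslash.
// In a JavaScript regex literal, \\ matches one literal backslash.
function outValuesWithBackslashes(text) {
  const pattern = /(?<=OUT:\\) *?\n+(^.+?)\\ *$/gm;
  return Array.from(text.matchAll(pattern), (m) => m[1]);
}

// Each "\\" in these string literals is a single backslash in the data.
const backslashSample = [
  "YOU013\\",
  "IN:\\",
  "OUT:\\",
  "02:07:57:14\\",
  "02:08:00:02\\",
  "Higher\\",
].join("\n");

console.log(outValuesWithBackslashes(backslashSample));
```

The captured value excludes the trailing backslash, because the `\\` sits outside the capture group.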


YUP! awesome, flawless now.

Hey Billy,

You started out with plain text and then switched to RTF.

For us to be able to help without wasting a lot of time (ours and yours) it’s vital that you tell us what data-types you’re working with.

Regular Expressions require considerable precision, and as you can see RTF adds all kinds of artifacts to plain text. Patterns written for one WON’T work with the other, although with foreknowledge it is frequently possible to write a pattern to accommodate both.

It’s also quite easy to test for RTF, if you know it’s a possible data-type.
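
For example, RTF data begins with the control word `{\rtf`, so a check along these lines is usually enough. This is a sketch in JavaScript with a made-up function name, not something taken from the macros in this thread.

```javascript
// Sketch: RTF documents start with "{\rtf" (normally "{\rtf1").
// Leading whitespace is tolerated in case the clipboard added some.
function looksLikeRtf(text) {
  return /^\s*\{\\rtf/.test(text);
}

console.log(looksLikeRtf("{\\rtf1\\ansi\\deff0 OUT:\\\n02:07:57:14\\\n}"));
console.log(looksLikeRtf("OUT:\n02:07:57:14\n"));
```

A macro could branch on a test like this and pick the plain-text pattern or the backslash-aware one accordingly.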

-Chris