Processing default file names for Evernote

I want to store emails from my travel agent in Evernote. I use a script to copy each email with flight details and hotel / car pickup into Evernote. The subject naming from the agent is a mess, but it is at least structured:

Travel Plan Update - Flight - PEK to SIN - Feb 2015
Travel Plan Update - Hotel - Singapore CBD - Feb 2015

When the email is in Evernote, I want to be able to copy to clipboard the title of the note, figure out the structure and paste the rearranged version, dropping anything not useful:

201502 - Flight - PEK to SIN
201502 - Hotel - Singapore CBD

I want to drop “Travel Plan update” text, read the year and put it to the front, parse Jan-Dec and replace with 01-12, paste in the travel category, then the destination.

As far as I can see, KM lets you pull out subsets from the clipboard selection but it looks like it’s not possible to use variables to determine the from and to counts.

Is there a way to pull out info from a line of text if I know the order between delimiters?

Thanks

David

Hey David,

RegEx is your friend.

Keyboard Maestro’s regex is up to this task, but since you’re using AppleScript anyway it’ll be easier if you install the Satimage.osax AppleScript Extension.

----------------------------------------------------------------------
# REQUIRES installation of Satimage.osax { http://tinyurl.com/dc3soh }
----------------------------------------------------------------------

set _text to "
Travel Plan Update - Flight - PEK to SIN - Feb 2015 
Travel Plan Update - Hotel - Singapore CBD - Feb 2015
"

try
  set _text to find text "Travel Plan Update *- *(.+)" in _text using "\\1" with regexp, all occurrences and string result
  set _text to change "^(.+) *- *(.+)" into "\\2 - \\1" in _text with regexp without case sensitive
  set _text to change "^(.+?) (\\d{4})" into "\\2\\1" in _text with regexp without case sensitive
  set _text to change {"Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"} ¬
    into {"01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12"} in _text without case sensitive
  set _text to change " {2,}" into " " in _text with regexp without case sensitive
  set _text to change "\\s+$" into "" in _text with regexp without case sensitive
  set _text to change " +- +" into "\\t-\\t" in _text with regexp without case sensitive
  set _text to join _text using linefeed
  set _text to do shell script "column -t -s" & quoted form of tab & " <<< " & quoted form of _text
on error e
  beep 2
end try

----------------------------------------------------------------------

Result:

201502  -  Flight  -  PEK to SIN
201502  -  Hotel   -  Singapore CBD

If you use a monospaced font the columns will line up.
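
For anyone without the shell at hand, the column alignment step can be sketched in plain JavaScript. This is only an illustration of the padding idea, not what column -t does internally; alignColumns is a hypothetical helper name, and it assumes the fields are already separated by " - ".

```javascript
// Pad each " - " separated field so the columns line up in a monospaced font.
// alignColumns is a hypothetical helper, not part of any library.
function alignColumns(text) {
  const rows = text.split('\n').map(line => line.split(' - '));

  // Record the widest entry seen at each column position.
  const widths = [];
  rows.forEach(row => row.forEach((cell, i) => {
    widths[i] = Math.max(widths[i] || 0, cell.length);
  }));

  return rows
    .map(row => row
      // Pad every cell except the last one in the row.
      .map((cell, i) => (i < row.length - 1 ? cell.padEnd(widths[i]) : cell))
      .join('  -  '))
    .join('\n');
}

const aligned = alignColumns(
  '201502 - Flight - PEK to SIN\n201502 - Hotel - Singapore CBD'
);
console.log(aligned);
```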

Simple.


Best Regards,
Chris

Thanks Chris, that looks very powerful. I was only vaguely aware of Regex before now, and didn’t know there was an extension for KM, I’ll give it a shot.

Thanks again.

Hey David,

Keyboard Maestro has its own regular expressions which are quite useful.

But it's also useful to have them organically available to AppleScript — hence the Satimage.osax.

OSAX == Open Scripting Architecture Extension.

I've used this one since 2003.

If you have problems send me your script and a proper test email.

-Chris { scriptmeister@thestoneforge.com }

And, of course, Yosemite JavaScript for Automation (JXA) also has regex built in (no need to call an external library)

Perhaps something like this:
(tho we could also simply split on space, in this case, if the input is fairly consistent)


// YOSEMITE JAVASCRIPT VERSION (regexes built in)

//RETURNS:
//	201502 - Flight - PEK to SIN
//	201502 - Hotel - Singapore CBD

reWrite(
	"Travel Plan Update - Flight - PEK to SIN - Feb 2015\n\
	Travel Plan Update - Hotel - Singapore CBD - Feb 2015"
);


function reWrite(strTxt) {
	return strTxt.split('\n').map(function (strLine) {
		var lst = strLine.split(' - '),
			strDate = lst[3];
			
		// Find year and month by regex
		var	matchYear = strDate.match(/\d{4}/),
			matchMonth = strDate.match(/[^\d\s]{3}/),
			strType = lst[1],
			strFromTo = lst[2];

		return [
			(matchYear ? matchYear[0] : '') +
			(matchMonth ? monthDigits(matchMonth[0]) : ''),
			strType,
			strFromTo
		].join(' - ')
	}).join('\n');
}


// January|JAN|jan --> "01" 
function monthDigits(strMonth) {
	return zeroPad([
		'jan', 'feb', 'mar',
		'apr', 'may', 'jun',
		'jul', 'aug', 'sep',
		'oct', 'nov', 'dec'
	].indexOf(
		strMonth.substring(0, 3).toLowerCase()
	) + 1, 2)
}


// Numbers string padded to left with zeros to get fixed width
// intNumber --> intDigits --> strDigits
function zeroPad(intNumber, intDigits) {
	var strUnpadded = intNumber.toString(),
		intUnpadded = strUnpadded.length;

	return Array((intDigits - intUnpadded) + 1).join('0') + strUnpadded;
}

Exactly right. For Yosemite — an "upgrade" kept far away from my system due to its plethora of foibles.

I have to install it soon but will only do so under duress.

Although I do look forward to JXA when the day comes.

-Chris

David, I think what you want to do is possible, but it may take a significant amount of time with trial and error to achieve. I have built several KM macros that process Outlook email and prepare it to be forwarded to my Evernote account.

I was already doing some cleanup and documentation on one of these macros, and I should be ready to publish it here in the next day or so. Of course they don't do exactly what you want, but they may help you get started.

Meanwhile, may I suggest you take a look at TripIT.com. It does a fantastic job of capturing all the info it needs from emails to present you with a very nicely formatted itinerary. You simply forward all emails of your travel confirmations and it pulls the data it needs. Of course you can also make manual entries/adjustments as needed. Works with a very wide variety of formats, and has never failed me. You could then do a web capture of the TripIT itinerary to Evernote.

Thanks all for the suggestions. I’m a bit out of my depth on the RegEx, but getting stuck in.
Thanks again.
David

You can perform most of your clipboard transformation with one Keyboard Maestro search and replace action.

The action, with the Search and Replace patterns given below, will reorganise clipboard content from:

Travel Plan Update - Flight - PEK to SIN - Feb 2015 
Travel Plan Update - Hotel - Singapore CBD - Feb 2015

to

2015-Feb - Flight - PEK to SIN 
2015-Feb - Hotel - Singapore CBD

(The Feb --> 02 rewrite is a bit more tricky, but also doable with a regex)

For the detail of what is going on in that action:

Search:

Travel Plan Update - (\w+) - (.+) - (\w{3}) (\d{4})

Replace:

$4-$3 - $1 - $2

(Each $n represents one of the bracketed groups in the search pattern)
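
The same search and replace pair can be tried outside Keyboard Maestro. Here is a minimal sketch in plain JavaScript, assuming JavaScript's regex engine treats this particular pattern the same way as KM's ICU engine (it uses the same $n group references in the replacement string). The reorder function name is made up for the example.

```javascript
// The Search pattern from the KM action, with the g flag so every line is rewritten.
const pattern = /Travel Plan Update - (\w+) - (.+) - (\w{3}) (\d{4})/g;

// reorder is a hypothetical helper name used only for this sketch.
function reorder(text) {
  // $4 = year, $3 = month abbreviation, $1 = category, $2 = destination
  return text.replace(pattern, '$4-$3 - $1 - $2');
}

const reordered = reorder(
  'Travel Plan Update - Flight - PEK to SIN - Feb 2015\n' +
  'Travel Plan Update - Hotel - Singapore CBD - Feb 2015'
);
console.log(reordered);
```

Because . does not match a line break by default, the greedy (.+) group stays within its own line.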

You can play with an interactive regex explorer like:

and look at the documentation of the regex conventions which KM uses at:
http://userguide.icu-project.org/strings/regexp

PS - there are, as you can see from all the various previous posts, a number of ways of dealing with the Feb --> "02" rewrite:

Here's one which uses a scripted search and replace:

Rewrite clipboard contents.kmmacros (3.1 KB)
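
For readers who cannot open the macro, the general idea of a scripted search and replace can be sketched as follows. This is not the content of the attached macro, just one way to do it in plain JavaScript under the assumption that month names are English; rewriteTitles and MONTHS are names invented for the example. The replacement is computed by a function instead of a fixed string, so the month name can be looked up and turned into two digits.

```javascript
// English month abbreviations, in calendar order (an assumption of this sketch).
const MONTHS = ['jan', 'feb', 'mar', 'apr', 'may', 'jun',
                'jul', 'aug', 'sep', 'oct', 'nov', 'dec'];

// rewriteTitles is a hypothetical helper name.
function rewriteTitles(text) {
  return text.replace(
    // Group 3 captures the first three letters of the month; \w* skips the rest
    // of a full month name such as "February".
    /Travel Plan Update - (\w+) - (.+) - (\w{3})\w* (\d{4})/g,
    (whole, kind, detail, mon, year) => {
      const idx = MONTHS.indexOf(mon.toLowerCase());
      if (idx === -1) return whole; // leave lines with an unknown month untouched
      return year + String(idx + 1).padStart(2, '0') + ' - ' + kind + ' - ' + detail;
    }
  );
}

const rewritten = rewriteTitles(
  'Travel Plan Update - Flight - PEK to SIN - Feb 2015\n' +
  'Travel Plan Update - Hotel - Singapore CBD - Feb 2015'
);
console.log(rewritten);
```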

Here FWIW is an example of experimenting interactively with regex patterns in RegexRX:

Hey David,

Regular Expressions are just a way to define text. Like any language you have to learn some vocabulary to use them.

^ == beginning of line
. == any character
* == zero or more of the preceding character or pattern
+ == one or more of the preceding character or pattern

What's really nice about regex is the ability to define patterns and not have to account for every possibility of literal text.

So I can use:

(?i)^Flight[[:blank:]]+\d+.*

To find lines that begin with 'Flight' (case insensitive), followed by one or more horizontal blank spaces, followed by one or more digits, followed by more text if there is any.
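
A quick way to see what that pattern accepts is to run it against a few sample lines. The sketch below is a rough JavaScript translation, not the ICU original: JavaScript has no inline (?i) and no [[:blank:]] class, so it uses the i flag and [ \t] (space or tab) instead, which is an approximation. The sample strings are made up.

```javascript
// Approximate JavaScript equivalent of (?i)^Flight[[:blank:]]+\d+.*
const flightLine = /^Flight[ \t]+\d+.*/i;

// Made-up sample lines to test against.
const samples = [
  'Flight 123 to SIN',   // matches: "Flight", a space, digits, more text
  'flight\t42',          // matches: case-insensitive, tab counts as blank
  'Flight to SIN',       // no match: no digits after the blank
  'My Flight 123'        // no match: ^ requires "Flight" at the start
];

const results = samples.map(s => flightLine.test(s));
console.log(results);
```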

This kind of basic regex is really easy to learn once you've wrapped your head around a few concepts.

Advanced regex on the other hand is something you can keep learning for a lifetime.   :smile:

I've been using regex for over 20 years and still learn new things fairly often.

-Chris

Oh. I’m with @ComplexPoint on the usefulness of pattern analyzers.

I’ve owned RegExRX for years and am also partial to Patterns.

Here’s a good (and free) online tester complete with example:

https://regex101.com/r/oA9mZ4/1

Your task from what I’ve seen of the text is extremely easy. You just need some help getting started.

-Chris

On the conversion of dates, and thinking about doing it in the shell, as far as I can see OS X date -j -f doesn't parse month names, though perhaps that depends on the locale ?

Otherwise, I guess it's again a case of finding the position of a lower case match in a list, and attending to the leading zero for jan-sep.

monthDigits() {
	local year="jan feb mar apr may jun jul aug sep oct nov dec";
	local month=`echo $1 | tr '[:upper:]' '[:lower:]'`;
	local index="${year%$month*}"
	printf "%02d" "$((${#index}/4 + 1))"
}

echo `monthDigits "SEP"` `monthDigits "oct"` `monthDigits "Nov"`

The date format change seems to be the tricky part; it would be easy if the “Search and replace variable” action had an additional “mode” in which it can accept ICU date-time syntax for search and replace fields.

In the present case, to find and replace all dates like “2015-Feb” with the new format “2015-02”, it would be enough to use the search pattern “%ICUDateTime%yyyy-MMM%” along with the replace pattern “%ICUDateTime%yyyy-MM%”.

Even better, at least for less experienced users, if KM had a graphical token composition tool like the one offered by Hazel.

Replacing the month with an index is a tricky problem.

Sometimes the easiest way is the best: just do twelve trivial search and replaces. Here is a way of doing it which is relatively straightforward. Basically, go through the 12 indexes, use ICUDateTimeFor to find the month name, and then replace it with the index (expanded to two digits).
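
The twelve-replacements idea can also be sketched outside KM. The version below is plain JavaScript, not the macro itself: it assumes a hard-coded list of English month abbreviations in place of ICUDateTimeFor, and it assumes the dates already look like "2015-Feb". MONTH_NAMES and replaceMonths are names invented for the example.

```javascript
// Hard-coded English abbreviations stand in for the month names KM would supply.
const MONTH_NAMES = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                     'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'];

// replaceMonths is a hypothetical helper name.
function replaceMonths(text) {
  // One search and replace per month, applied one after another.
  return MONTH_NAMES.reduce((acc, name, i) => {
    const index = String(i + 1).padStart(2, '0'); // 1 --> "01", 12 --> "12"
    // Only touch a month name that directly follows "yyyy-" and ends at a word boundary.
    const search = new RegExp('(\\d{4})-' + name + '\\b', 'gi');
    return acc.replace(search, (match, year) => year + '-' + index);
  }, text);
}

const converted = replaceMonths(
  '2015-Feb - Flight - PEK to SIN\n2015-Dec - Hotel - Singapore CBD'
);
console.log(converted);
```

A replacer function is used instead of a "$1" replacement string so that the digits of the index cannot be read as part of the group number.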

Often...    :sunglasses:

------------------------------------------------------------
# REQUIRES Satimage.osax { http://tinyurl.com/dc3soh }
------------------------------------------------------------
set _data to "
Some other text...

Travel Plan Update - Flight - PEK to SIN - Feb 2015 
Travel Plan Update - Hotel - Singapore CBD - Feb 2015

More other text...
"
------------------------------------------------------------
set dateFormat to "%Y%m"
# NOTE: fndUsing (called on the next line) is a custom handler that is not included in this post.
set _updates to join (fndUsing("travel plan update - (\\w.+) - ([[:alpha:]]{3} \\d{4})", "\\2 - \\1", _data, true, true) of me) using return
set foundDateList to find text "[[:alpha:]]{3} 20\\d{2}" in _updates with regexp, all occurrences and string result
copy foundDateList to newDateFmtList
repeat with i in newDateFmtList
  set contents of i to (strftime (date (change " " into " 1, " in i)) into dateFormat)
end repeat
set newData to change foundDateList into newDateFmtList in _updates without case sensitive
------------------------------------------------------------

OUTPUT:

201502 - Flight - PEK to SIN 
201502 - Hotel - Singapore CBD

-Chris

UPDATE

FWIW, when we need to do this kind of thing within a bash script, date -j -f does, in fact, deliver – I had missed the strptime man page …

(%b parses the locale’s date names in short or long form, independent of case)

# OS X Bash
# strMonthName --> \d{2}
mmmDigits(){
	echo `date -j -f "%b" "$1" +"%m"`
}

echo `mmmDigits "sep"` `mmmDigits "OCTober"` `mmmDigits "Nov"`

–>

09 10 11

( but %ICUDateTimeFor% – see above – is clearly more Maestronic )