Find email address?

Is it possible to have an action that finds an email address on a web page and copies it to the clipboard?

I think that’s only really possible with AppleScript integration with Keyboard Maestro…

Take a look at this.

I have successfully set a variable to the page source and filtered the variable.

I have also had success saving .webarchive files, extracting them with the program File Juicer, and then searching the extracted text file with regular expressions.

Each is cumbersome to set up and run. Hopefully someone will join in with an easy, direct solution. But I can provide a nearly full answer to your question: yes, it is possible to write a KMacro that extracts email addresses from Web pages and makes them available.

Yes.   :sunglasses:

-Chris

Extract Email Addresses from Safari.kmmacros (2.4 KB)

#!/usr/bin/env perl
use strict; use warnings;
#-----------------------------------------------------------
# Find Email Addresses in Source of Front Safari Page – Sort and Remove Duplicates.
#-----------------------------------------------------------

# Shell command that asks Safari (via AppleScript) for the front page's source.
my $html = 'osascript -e "
   tell application \"Safari\"
      tell front document
         return source
      end tell
   end tell
"';

# Run the command with backticks; $html now holds the page source.
$html = `$html`;

# Collect every match, de-duplicate via hash keys, then sort.
my @array = $html =~ m!\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b!ig;
my %hash = map { $_ => 1 } @array;
my @unique = sort(keys %hash);
$, = "\n";   # output field separator: print list items one per line
print @unique;

ok thank you

Sweet!

Can you help me parse the regex?

~ m!\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b!ig

From "\b" to "\b" is clear to me. What do "~m!" and "!ig" do? (I'm guessing these are Perl-specific instances of the modes multi-line, case-insensitive, and ... group?)

Will it work if I substitute any regex I want for "\b {...} \b" (as long as the matches are found within a line of text)?

Thanks.

—Kirby.

Hey Kirby,

my @array = $html =~ m!\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b!ig;
my @array = 

Creates a variable of type array and sets it equal to the result of the rest of the statement.

$html

Is an existing scalar variable.

$html =~ m!\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b!ig;

=~ is a binding operator: it causes the expression on the left to be matched against the pattern on the right.

!~ is the negated form: it is true when the pattern does not match.

http://perldoc.perl.org/perlop.html#Binding-Operators
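A minimal sketch of the two binding operators, in case it helps to see them side by side. The sample string and the variable names are made up for illustration.

```perl
use strict; use warnings;

my $text = 'Contact: someone@example.com';   # made-up sample string

# =~ binds the string on the left to the pattern on the right.
my $has_example = ($text =~ m/example/) ? 1 : 0;

# !~ is the negated form: true when the pattern does NOT match.
my $no_digits = ($text !~ m/\d/) ? 1 : 0;

print "has_example=$has_example no_digits=$no_digits\n";
```

Neither operator changes $text here; a plain match only reports what it found.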

This line by itself would just perform the match and throw the result away (it does not change $html), but above I'm assigning the matches to the @array variable instead.

The ig flags at the end mean case-insensitive (i) and global (g: find every match, not just the first).
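Here is a small sketch of what the flags do when the match is assigned to an array. The sample text and its addresses are placeholders, not real data.

```perl
use strict; use warnings;

# Made-up sample text; single quotes keep Perl from interpolating the @ signs.
my $text = 'Write to Bob@Example.COM or alice@example.org, or bob@example.com again.';

# Without /g, a list-context match with no capture groups just returns (1) on success.
# With /g it returns every matched string; /i lets A-Z also match lowercase letters.
my @found = $text =~ m!\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b!ig;

print scalar(@found), " matches\n";
print "$_\n" for @found;
```

Note that the later de-duplication by hash key is case-sensitive, so Bob@Example.COM and bob@example.com would be kept as two entries.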

The canonical method of writing a match expression is

 m/<pattern>/

Search & Replace is:

 s/<pattern>/<replacement>/

However, you can substitute many different characters for the forward slash, and I happen to like the exclamation point.

So the contents of $html are matched against the pattern, and the matches are placed into the @array variable.

Not quite.

m!<pattern>!

But you could use:

m/<pattern>/
m#<pattern>#
m@<pattern>@
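A quick sketch of why the choice of delimiter matters, using a made-up path string. With the slash as delimiter, any slash inside the pattern has to be escaped; with ! or # it does not.

```perl
use strict; use warnings;

my $text = 'path/to/file.txt';   # made-up sample string

# The same pattern written with three different delimiters.
my $slash = $text =~ m/to\/file/ ? 1 : 0;   # slash inside the pattern must be escaped
my $bang  = $text =~ m!to/file!  ? 1 : 0;
my $hash  = $text =~ m#to/file#  ? 1 : 0;

# Search & replace accepts alternate delimiters too.
# Copy first, so that $text itself is left untouched.
(my $copy = $text) =~ s!/!:!g;

print "$slash $bang $hash $copy\n";
```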

The email pattern is a pretty good one I borrowed from Jan Goyvaerts:

http://www.regular-expressions.info/email.html

You can use the same general script to find other things by substituting a different pattern.
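As a rough sketch of swapping in another pattern: unique_matches below is a hypothetical helper (not part of the macro), the sample input is made up, and the URL pattern is a deliberately simple assumption rather than a robust URL matcher. The match runs against the whole string, so nothing limits it to a single line, although a pattern that uses . will not cross newlines unless you add the s flag.

```perl
use strict; use warnings;

# Hypothetical helper: sorted, de-duplicated matches of $pattern in $text.
sub unique_matches {
    my ($text, $pattern) = @_;
    my %seen = map { $_ => 1 } $text =~ m/$pattern/g;
    return sort keys %seen;
}

# Made-up sample input; a multi-line string stands in for the page source.
my $source = "See http://example.com/a and\nhttps://example.org/b, then http://example.com/a again.";

# Simple illustrative URL pattern, compiled with qr so it can be passed around.
my @urls = unique_matches($source, qr!https?://[^\s,"<>]+!i);

print "$_\n" for @urls;
```

In the real macro you would pass the Safari page source instead of $source.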

Be aware that getting used to Perl's syntax can make your head spin for a while...  :smile:

-Chris