I have HTML files exported from Scrivener. They contain CSS styles in <head>...</head> (not in a separate CSS file). I need to convert these styles to inline styles. How can I accomplish this in KM?
What I would like to do is this:
1. When a FileName.html file is added to a folder, KM reads the file into a variable.
2. Convert the CSS styles into inline styles.
3. Write the converted content back to that file.
Steps 1 and 3 may be done easily. But I don't know how to do Step 2.
Below is a sample of an HTML file exported from Scrivener.
Preferably, I would like to remove some unnecessary styles (such as margin: 0.0px 0.0px 0.0px 0.0px and -webkit-text-stroke: #000000) to make the file look cleaner.
HTML content sample:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta http-equiv="Content-Style-Type" content="text/css">
<title></title>
<meta name="Generator" content="Cocoa HTML Writer">
<meta name="CocoaVersion" content="2022.6">
<style type="text/css">
p.p1 {margin: 0.0px 0.0px 12.0px 0.0px; font: 14.0px Times; color: #000000; -webkit-text-stroke: #000000}
li.li3 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Times; color: #000000; -webkit-text-stroke: #000000}
span.s1 {font-kerning: none}
span.s2 {-webkit-text-stroke: 0px #000000}
ul.ul1 {list-style-type: disc}
ul.ul2 {list-style-type: circle}
</style>
</head>
<body>
<p class="p1"><span class="s1">Hello,</span></p>
<p class="p1"><span class="s1">This is a reminder that we will have a zoom meeting today at <b>7 PM Eastern Time</b>. (If you live in another time zone, please adjust accordingly).</span></p>
<h2 style="margin: 0.0px 0.0px 14.9px 0.0px; font: 14.0px Times; color: #000000; -webkit-text-stroke: #000000"><span class="s1"><b>Attendance Quiz</b></span></h2>
<ul class="ul1">
<li class="li3"><span class="s2"></span><span class="s1">After the meeting, you must complete the attendance quiz within a week. I strongly recommend that you do the quiz as soon as each meeting ends.</span></li>
</ul>
<h2 style="margin: 0.0px 0.0px 14.9px 0.0px; font: 14.0px Times; color: #000000; -webkit-text-stroke: #000000"><span class="s1"><b>Requirement for the Zoom Meetings (LiveSyncs)</b></span></h2>
<ul class="ul1">
<li class="li3"><span class="s2"></span><span class="s1">All Zoom meetings are mandatory. If you can’t attend, you must let you OTA know in advance.</span></li>
</ul>
<h2 style="margin: 0.0px 0.0px 14.9px 0.0px; font: 14.0px Times; color: #000000; -webkit-text-stroke: #000000"><span class="s1"><b>To join the meetings</b></span></h2>
<ul class="ul1">
<li class="li3"><span class="s2"></span><span class="s1">Please go to the Zoom page on Canvas.</span></li>
<li class="li3"><span class="s2"></span><span class="s1">You will need to log in to Zoom in order to join the meetings.</span>
<ul class="ul2">
<li class="li3"><span class="s2"></span><span class="s1">Your Zoom account does not need to be the school email.</span></li>
<li class="li3"><span class="s2"></span><span class="s1">A couple of students used to have trouble joining the meetings with an organizational account. If you happened to encounter this issue, try a personal Zoom account.</span></li>
</ul></li>
</ul>
<h2 style="margin: 0.0px 0.0px 14.9px 0.0px; font: 14.0px Times; color: #000000; -webkit-text-stroke: #000000"><span class="s1"><b>Zoom Recordings</b></span></h2>
<ul class="ul1">
<li class="li3"><span class="s2"></span><span class="s1">All Zoom meetings are recorded. To access the recordings, go to the Zoom page on Canvas and click the “Cloud Recordings” tab.</span></li>
<li class="li3"><span class="s2"></span><span class="s1">Note: Cloud recordings will be deleted automatically after they have been stored for 30 days.</span></li>
</ul>
<h2 style="margin: 0.0px 0.0px 14.9px 0.0px; font: 14.0px Times; color: #000000; -webkit-text-stroke: #000000"><span class="s1"><b>Due Date for the Meeting Reports</b></span></h2>
<ul class="ul1">
<li class="li3"><span class="s2"></span><span class="s1">The Zoom meeting report quiz is due the 7th day after the day of meeting.<br>
</span></li>
<li class="li3"><span class="s2"></span><span class="s1">NOTE: the due dates for the survey quizzes are the same as the meeting start time. They will not be closed until a week after each meeting. This is to remind you of the meeting start time. Since you can only start taking these quizzes after the meeting, these quizzes will always be “late submissions.” But you will not be penalized for that. These are the ONLY 6 quizzes that allow “late” submission.</span></li>
</ul>
<h2 style="margin: 0.0px 0.0px 14.9px 0.0px; font: 14.0px Times; color: #000000; -webkit-text-stroke: #000000"><span class="s1"><b>Late Work Policy</b></span></h2>
<ul class="ul1">
<li class="li3"><span class="s2"></span><span class="s1">No late work submission is accepted for any of the assignments. I strongly recommend you submit the attendance report as soon as the meeting ends.</span></li>
</ul>
</body>
</html>
A brutal RegEx search might be the last resort. What I'm hoping for is a way to do the job with JavaScript.
I have found a JavaScript snippet that I can run in a KM HTML prompt window.
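// Copy every element's computed style into its inline style attribute.
transferAll();

function transferComputedStyle(node) {
  // getComputedStyle returns every resolved property, not only those set in <style>.
  var cs = getComputedStyle(node, null);
  var i;
  for (i = 0; i < cs.length; i++) {
    var s = cs[i] + "";
    node.style[s] = cs[s];
  }
}

function transferAll() {
  // "*" selects every element in the document.
  var all = document.getElementsByTagName("*");
  var i;
  for (i = 0; i < all.length; i++) {
    transferComputedStyle(all[i]);
  }
}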
But it adds a lot more styles to the HTML file; most did not exist in the CSS head, because getComputedStyle returns every resolved property rather than only the declared ones.
For a simple Hello everyone paragraph, it becomes:
The approach I suggested may be brutal but it isn't a regular expression. It's simply a literal search and replace for each HTML tag in the header that has been styled.
You can easily do that in your language of choice, of course. And I'd use an Execute ... action rather than a bunch of Keyboard Maestro S&R actions.
You could, for extra credit, parse the header's style section to build a list of tags to search and what their replacements are before feeding each of them to a function to do the work.
Then it wouldn't matter what deviations Scrivener produces on compile and it won't bloat your output file either.
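The "extra credit" idea above can be sketched as follows. This is only an illustration in JavaScript (not the Perl used elsewhere in this thread), and the function names parseClassRules and inlineClasses are made up for the example. It assumes each rule in the header has the simple form tag.class {declarations}, as in the Scrivener sample.

```javascript
// Hypothetical helper: map each class name in the <style> text to its declarations.
// Assumes simple "tag.class {...}" rules like those in the Scrivener export.
function parseClassRules(css) {
  const rules = new Map();
  const rulePattern = /[\w-]*\.([\w-]+)\s*\{([^}]*)\}/g;
  for (const match of css.matchAll(rulePattern)) {
    rules.set(match[1], match[2].trim());
  }
  return rules;
}

// Hypothetical helper: literal search and replace of class="name" with style="...".
function inlineClasses(html, rules) {
  let result = html;
  for (const [className, declarations] of rules) {
    // split/join replaces every literal occurrence without any pattern matching.
    result = result.split(`class="${className}"`).join(`style="${declarations}"`);
  }
  return result;
}

const css = `p.p1 {margin: 0.0px 0.0px 12.0px 0.0px; font: 14.0px Times}
span.s1 {font-kerning: none}`;
const html = '<p class="p1"><span class="s1">Hello,</span></p>';
const inlined = inlineClasses(html, parseClassRules(css));
console.log(inlined);
```

Because the list of classes is read from the header itself, nothing is hard-coded, and only the declared styles end up in the output. An element that already carries its own style attribute would need extra handling that this sketch leaves out.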
I will need to get these styles inside {style} one by one. This has to be a RegEx search, right?
Then, I'll need to replace class="p1" with the information in {}, something like style="style". This is another RegEx search and replace.
Sorry, I should have put the pieces together in Keyboard Maestro. Here's a macro that reads the contents of the original file into a local variable which the Perl script modifies before writing the revisions.
You do have to change the path from /Users/mrpasini/Desktop/ in both actions. And put hello.html on your Desktop. You'll find styled.html on your Desktop after you run the macro. The results are also displayed in a window.
In something I'm working on (to be released in the next day or so) I take the in-line style and split at semicolons. I then filter to honour only the style pieces I actually want. (In my case it's starting with color and background-color.)
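The split-and-filter idea might look roughly like this. It is a sketch in JavaScript, not the Python code mentioned in the post, and the name filterStyle is invented for the example. It assumes no semicolons appear inside values (for instance inside url(...) or quoted strings).

```javascript
// Hypothetical helper: keep only the declarations whose property is on the allow list.
function filterStyle(style, allowed) {
  return style
    .split(';')                                   // one piece per declaration
    .map((piece) => piece.trim())
    .filter((piece) => piece !== '')              // drop the empty piece after a trailing ";"
    .filter((piece) => allowed.includes(piece.split(':')[0].trim()))
    .join('; ');
}

const original = 'margin: 0.0px 0.0px 12.0px 0.0px; font: 14.0px Times; color: #000000; -webkit-text-stroke: #000000';
const kept = filterStyle(original, ['color', 'background-color']);
console.log(kept);
```

Comparing the whole property name, rather than testing whether the piece merely starts with "color", avoids accidentally keeping properties such as color-scheme.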
My code is Python (which gives my followers some idea which project this is for).
That’s great! Please keep me updated!
I’ve been playing with the Perl script to remove the unwanted styles. I once thought I would never be able to understand Perl, but I'm starting to appreciate it more. I might start using it for RegEx search/replace b/c it’s so terse. But your Python script will be most welcome!
Ah, missed that yesterday. But it's bone simple to add to the Perl script right after the foreach loop. The thing is you have to know precisely what it is you want to delete.
No problem!
I was able to study your code and understood enough to make changes to suit my needs.
Instead of removing the unwanted styles from $data after the replaces, I removed them from $styles before doing the replaces.
I may need to make more changes depending on my future needs, but this is what my code looks like now:
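#!/usr/bin/perl
use strict;
use warnings;

# Path of the HTML file, taken from the Keyboard Maestro variable FilePath.
my $file = $ENV{KMVAR_FilePath};

# Read the whole file into $data.
open(my $in, '<', $file) or die "Could not open HTML file for reading: $!\n";
my $data = do { local $/; <$in> };
close($in);

# Remove non-breaking spaces (the UTF-8 byte pair C2 A0).
$data =~ s/\x{C2}\x{A0}//g;

# Cut the <style> block out of the document and keep its rules.
my $styles = '';
$styles = $1 if $data =~ s/<style .+?>\s*(.+?)\s*<\/style>\n//ms;

# Remove the unwanted declarations from the rules, then drop rules left empty.
$styles =~ s/(margin: .*?; ?)|(color: #000000;? ?)|(color: #222d35;? ?)|( -webkit-text-stroke: #000000)|(text-decoration: underline\s?;? ?)|(font-kerning: none;? ?)|(font: \d+\.0px 'Times New Roman'; ?)|(min-height: \d+\.0px)//g;
$styles =~ s/.*?\{\s*\}\s?//g;

# Replace everything up to <body> with a styled <body> tag and drop </html>.
$data =~ s/.+<body>\n/<body style="font: 16px 'Times New Roman';">\n/ms;
$data =~ s/\s<\/html>//ms;
$data =~ s/ ?class="Apple-converted-space"//g;

# For each remaining rule, swap class="name" for style="declarations".
foreach my $tag (split /\n/, $styles) {
    next unless $tag =~ /.+?\.(.+?)\s/;
    my $t = $1;
    next unless $tag =~ /\{(.+?)\}/;
    my $s = $1;
    $data =~ s/class="\Q$t\E"/style="$s"/g;
}

# Strip leftover class attributes and empty spans.
$data =~ s/ class="\w+"//g;
$data =~ s/<span>\s*<\/span>//g;

#print $styles;
#print $data;

# Write the result back to the same file.
open(my $out, '>', $file) or die "Could not open HTML file for writing: $!\n";
print $out $data;
close($out);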
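The order described above (prune the style rules first, then inline whatever is left) can also be sketched without one large regular expression. The JavaScript below is only an illustration. The names pruneRules, DROP_PROPERTIES and DROP_EXACT are hypothetical, and the lists hold a few of the unwanted styles from this thread, not a complete set.

```javascript
// Hypothetical lists: properties to drop entirely, and exact declarations to drop.
const DROP_PROPERTIES = new Set(['margin', 'min-height']);
const DROP_EXACT = new Set(['color: #000000', '-webkit-text-stroke: #000000', 'font-kerning: none']);

// Hypothetical helper: remove unwanted declarations from each rule and
// discard rules that end up empty, before any class is inlined.
function pruneRules(rules) {
  const kept = new Map();
  for (const [className, declarations] of rules) {
    const pruned = declarations
      .split(';')
      .map((d) => d.trim())
      .filter((d) => d !== '')
      .filter((d) => !DROP_EXACT.has(d) && !DROP_PROPERTIES.has(d.split(':')[0].trim()))
      .join('; ');
    if (pruned !== '') {
      kept.set(className, pruned);
    }
  }
  return kept;
}

const rules = new Map([
  ['p1', 'margin: 0.0px 0.0px 12.0px 0.0px; font: 14.0px Times; color: #000000; -webkit-text-stroke: #000000'],
  ['s1', 'font-kerning: none'],
]);
const prunedRules = pruneRules(rules);
console.log([...prunedRules]);
```

Classes whose rules disappear entirely (s1 here) are then left for a final clean-up step that strips the remaining class attributes, as the script above does.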