How to display markdown as human readable format?

Hello, after some time trying different things with the Read xpath matches in Safari action and JSONVALUE, I have been able to extract the following table information from a website:

        <thead>
            <tr>
                <th data-field="orden">Orden</th>
                <th data-field="tarea">Tarea</th>
                <th data-field="fecha_ini">Fecha Inicio</th>
                <th data-field="fecha_fin">Fecha Finalización</th>
            </tr>
        </thead>
        <tbody>
                   
            <tr>
                <td> 1 </td>
                <td> Recepción de Documentos </td>
                <td> 30-08-2019 07:08:46 </td>
                <td> 30-08-2019 07:08:43 </td>
            </tr>
                   
            <tr>
                <td> 2 </td>
                <td> Entrevista </td>
                <td> 30-08-2019 07:08:43 </td>
                <td> 27-12-2019 08:12:19 </td>
            </tr>
                   
            <tr>
                <td> 3 </td>
                <td> Reparto de Expedientes </td>
                <td> 27-12-2019 08:12:19 </td>
                <td>  </td>
            </tr>
        </tbody>

How can I turn it into a human-readable table that can be displayed in a window or sent by email?

Thank you!

Well, that's a partial HTML snippet.

Are you asking:

  • how to convert it to markdown,
  • or how to display it as a rendered table?

If you restore the missing outer <table> ... </table> opening and closing tags so that it looks like:

<table>
    <thead>
        <tr>
            <th data-field="orden">Orden</th>
            <th data-field="tarea">Tarea</th>
            <th data-field="fecha_ini">Fecha Inicio</th>
            <th data-field="fecha_fin">Fecha Finalización</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td> 1 </td>
            <td> Recepción de Documentos </td>
            <td> 30-08-2019 07:08:46 </td>
            <td> 30-08-2019 07:08:43 </td>
        </tr>
        <tr>
            <td> 2 </td>
            <td> Entrevista </td>
            <td> 30-08-2019 07:08:43 </td>
            <td> 27-12-2019 08:12:19 </td>
        </tr>
        <tr>
            <td> 3 </td>
            <td> Reparto de Expedientes </td>
            <td> 27-12-2019 08:12:19 </td>
            <td> </td>
        </tr>
    </tbody>
</table>

Then you are almost there, because if you save that as a file with an .html extension, any browser (even this wiki software) will display the wrapped strings in a tabular layout. I've pasted it directly below:

| Orden | Tarea | Fecha Inicio | Fecha Finalización |
|-------|-------|--------------|--------------------|
| 1 | Recepción de Documentos | 30-08-2019 07:08:46 | 30-08-2019 07:08:43 |
| 2 | Entrevista | 30-08-2019 07:08:43 | 27-12-2019 08:12:19 |
| 3 | Reparto de Expedientes | 27-12-2019 08:12:19 | |

I say almost because any non-ASCII accented characters, such as the ó in Recepción, may need to be properly HTML-encoded, or UTF-8 encoding may need to be specified explicitly, though that may already be taken care of on the website from which you take that material.
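
As a rough illustration of the wrapping and encoding points above, here is a minimal Python sketch. It assumes the extracted fragment is already available as a string; the names `FRAGMENT`, `wrap_table_fragment`, `save_page` and the file name `tabla.html` are made up for the example, and the fragment is a shortened copy of the real one.

```python
from pathlib import Path
import tempfile

# Hypothetical input: a shortened copy of the extracted <thead>/<tbody> markup.
FRAGMENT = """\
<thead>
    <tr><th>Orden</th><th>Tarea</th></tr>
</thead>
<tbody>
    <tr><td> 1 </td><td> Recepción de Documentos </td></tr>
</tbody>
"""


def wrap_table_fragment(fragment: str, title: str = "Tabla") -> str:
    """Wrap a bare <thead>/<tbody> fragment in a minimal, complete HTML page."""
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "<head>\n"
        '<meta charset="utf-8">\n'  # tells the browser how the file is encoded
        f"<title>{title}</title>\n"
        "</head>\n"
        "<body>\n"
        "<table>\n"
        f"{fragment.strip()}\n"
        "</table>\n"
        "</body>\n"
        "</html>\n"
    )


def save_page(fragment: str, destination: Path) -> Path:
    """Write the wrapped page as UTF-8 so accented characters survive."""
    destination.write_text(wrap_table_fragment(fragment), encoding="utf-8")
    return destination


if __name__ == "__main__":
    # Writes to the system temp folder; any writable path would do.
    print(save_page(FRAGMENT, Path(tempfile.gettempdir()) / "tabla.html"))
```

Declaring the charset in the page and writing the file with the same encoding is one way to avoid having to HTML-encode each accented character by hand.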

If you want to experiment with different CSS stylings, you could copy the HTML to the clipboard and use Preview > Clipboard Preview in Brett Terpstra's:

[Marked 2 - Smarter tools for smarter writers](https://marked2app.com/)

And Marked 2 will also allow you to export the rendered result in other formats, like RTF.

(But if your question really was about Markdown after all, then HTML -> Markdown conversion is a different matter ... )

If your question was about HTML -> Markdown conversion, then an obvious first step might be to install:

[Pandoc - About pandoc](https://pandoc.org/)

If, for example, you had tidied up the HTML, adding the flanking <table> ... </table> tags, as above, and saved it to ~/Desktop/index.html,

and if the pandoc installation has placed the executable at, for example, the path:

/usr/local/bin/pandoc

Then a shell command line like:

/usr/local/bin/pandoc -f html -t markdown ~/Desktop/index.html

(Possibly run from a Keyboard Maestro Execute Shell Script action)

would yield the plain text (markdown) output:

  Orden   Tarea                     Fecha Inicio          Fecha Finalización
  ------- ------------------------- --------------------- ---------------------
  1       Recepción de Documentos   30-08-2019 07:08:46   30-08-2019 07:08:43
  2       Entrevista                30-08-2019 07:08:43   27-12-2019 08:12:19
  3       Reparto de Expedientes    27-12-2019 08:12:19 
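
If installing pandoc is not an option, a broadly similar plain-text layout can be approximated with Python's standard library alone. This is only a sketch, not what pandoc does internally: the class `TableRows` and the function `table_to_text` are names invented here, and it assumes a single simple table with no nested tables or spanning cells.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None and self._row is not None:
            # Collapse runs of whitespace and trim the padding around the text.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_to_text(html: str) -> str:
    """Return the table as aligned plain-text columns with a dashed rule."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    rows = parser.rows
    if not rows:
        return ""
    ncols = max(len(r) for r in rows)
    rows = [r + [""] * (ncols - len(r)) for r in rows]  # pad short rows
    widths = [max(len(r[i]) for r in rows) for i in range(ncols)]

    def fmt(cells):
        return "  ".join(c.ljust(w) for c, w in zip(cells, widths)).rstrip()

    lines = [fmt(rows[0]), fmt(["-" * w for w in widths])]
    lines += [fmt(r) for r in rows[1:]]
    return "\n".join(lines)
```

The first row is treated as the header, whether or not it came from a <thead>. The resulting string could be pasted into a plain-text email, where it lines up only in a monospaced font.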

Example:

The following assumes that:

  • pandoc has been installed,
  • and the installation has placed it at: /usr/local/bin/pandoc
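
Under those same two assumptions, one way to script the conversion is a small Python wrapper around the command line shown earlier. This is a sketch rather than the original example from this post: the function names `pandoc_command` and `html_to_markdown` are invented here, and only the pandoc options `-f html -t markdown` already used above are relied on.

```python
import subprocess
from pathlib import Path

# Assumed install location, as stated in the list above; adjust if yours differs.
PANDOC = "/usr/local/bin/pandoc"


def pandoc_command(html_path, pandoc=PANDOC):
    """Build the argument list for an HTML -> Markdown conversion."""
    # expanduser() turns a leading ~ into the home folder, as the shell would.
    return [pandoc, "-f", "html", "-t", "markdown",
            str(Path(html_path).expanduser())]


def html_to_markdown(html_path, pandoc=PANDOC):
    """Run pandoc on the file and return its standard output as text."""
    result = subprocess.run(
        pandoc_command(html_path, pandoc),
        capture_output=True,
        text=True,
        encoding="utf-8",
        check=True,  # raise CalledProcessError if pandoc reports an error
    )
    return result.stdout
```

Calling `html_to_markdown("~/Desktop/index.html")` should then return the same Markdown text that the shell command prints, provided the file exists and pandoc really is at that path.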

Thank you very much, very useful and kind explanation.

I will use it a lot.
