Scraping webpages like Allmusic for data

Hello everyone

I created my first advanced macro, which scrapes Allmusic for certain data and then adds it to a markdown file saved in Obsidian. I am trying to create a database in Obsidian and want to easily grab data from music and movie webpages. First up is Allmusic, and this macro was made with the help and inspiration of JMichaelTX's example "Extract Data from Web Page and Display".

I post it here:
Extract Allmusic.kmmacros (8.5 KB)

If we take an example like: Ghost Song - Cécile McLorin Salvant | Songs, Reviews, Credits | AllMusic, the macro will scrape the site for album title, year, artist, genre, styles, duration and finally the cover (and this is where I need some guidance). Unfortunately, I can't seem to grab the image and store it in a variable. Any help would be appreciated!

This is where the image file comes in:

Hi @Kullenej - welcome to the KM forum!

I can’t help much, but I can say that KM variables are able to hold only text, so your attempt to store an image in a variable will never work.

From the KM wiki: “Variables contain only plain, un-styled, text”

I think you’ll find that named clipboards can store images.

I’ve never used named clipboards but if you search this forum I’m sure you’ll find something relevant.

Thank you so much! Makes sense - I will look into that.

Thank you again @tiffle! I managed to use named clipboards instead, but what I need help with now is how to grab an image from the HTML. I tried copying the path in the console and also tried XPath, but nothing works. Does anyone know something about this?

Well @Kullenej I've done a fair bit of web scraping with KM in my time, so maybe this will help.

Once you've got the path of the cover image into a variable - your example would be this:

https://rovimusic.rovicorp.com/image.jpg?c=2J_XvcH2by9tIU46jQCxmCpQg_7iAU1wjqLgK_xGXts=&f=4

You need to use the KM Get a URL action and load the image into a clipboard. Here's the example for you:

[Screenshot: the KM Get a URL action]

Notice that I've split the image path into the URL and the Parameter for this KM action. You can try putting the whole path into the URL instead, as that might work too and would make the KM action slightly easier to use. I've specified the system clipboard, but you can use a named clipboard too.
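To make the split concrete outside of KM, here is a minimal Python sketch using only the standard library. It just shows how the path above divides into a base URL and a parameter string; the helper name `split_image_url` is made up for illustration, and this is not a description of what the Get a URL action does internally.

```python
from urllib.parse import urlsplit, urlunsplit

def split_image_url(full_url):
    """Return (base_url, query_string) for an image URL.

    Hypothetical helper, for illustration only.
    """
    parts = urlsplit(full_url)
    # Rebuild the address without the query string or fragment.
    base_url = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return base_url, parts.query

cover = "https://rovimusic.rovicorp.com/image.jpg?c=2J_XvcH2by9tIU46jQCxmCpQg_7iAU1wjqLgK_xGXts=&f=4"
base, params = split_image_url(cover)
print(base)    # the part that would go in the URL field
print(params)  # the part that would go in the Parameter field
```

If the whole path works in the URL field, this split is not needed at all.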

Let me know how you get on!

Thank you again for taking the time to teach me this!

I understand, but how can I make it dynamic, so that it always grabs the cover no matter which album I'm looking at? Your example is based on one specific album.

Oh - sorry, I thought you were already getting the URL of the cover image somehow.

I'm an old-school developer, so I'm not familiar with XPath or other "modern" stuff like that. This is how I would do it:

This macro assumes you already have the album page loaded into the web browser.

Test Get Cover Image.kmmacros (2.6 KB)

I've done limited testing on this but it seems to work. To run it you'll need to enable it first once you've installed it.

Given the macro you posted at the top of this thread I would hope you would be able to understand how mine works.

If you or anyone else reading this knows a more efficient way of getting the cover image I would be interested to learn how!
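For comparison, one possible approach outside KM is to parse the page HTML and pick out the cover's `<img>` tag. The Python sketch below is only an illustration under assumptions: it guesses that the cover is the first image whose `src` contains `rovimusic.rovicorp.com` (the host seen in the example path earlier in this thread), and the sample HTML is invented, not real Allmusic markup. The names `CoverFinder` and `find_cover_url` are hypothetical.

```python
from html.parser import HTMLParser

class CoverFinder(HTMLParser):
    """Remember the src of the first <img> whose URL contains a marker."""

    def __init__(self, marker):
        super().__init__()
        self.marker = marker
        self.cover_url = None

    def handle_starttag(self, tag, attrs):
        # Skip non-image tags, and stop looking once a cover is found.
        if tag != "img" or self.cover_url is not None:
            return
        src = dict(attrs).get("src") or ""
        if self.marker in src:
            self.cover_url = src

def find_cover_url(page_html, marker="rovimusic.rovicorp.com"):
    """Return the first matching image URL, or None if there is none."""
    finder = CoverFinder(marker)
    finder.feed(page_html)
    return finder.cover_url

# Invented page fragment for demonstration; the real page will differ.
sample = (
    '<div><img src="/logo.png">'
    '<img src="https://rovimusic.rovicorp.com/image.jpg?c=abc&amp;f=4"></div>'
)
print(find_cover_url(sample))
```

The parser decodes `&amp;` in attribute values, so the returned URL can be used as-is. If the site changes its image host, the marker would need to change with it.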

Awesome! :smiley:

Your macro works by itself, meaning that the cover is saved to the clipboard.
Now the weird thing: when I try to incorporate it into my macro, the .md file I create only says "No Text in Clipboard". Is there a certain action I need to use in order to add an image URL to the text file?

I'm sorry but I don't understand what you're now trying to do.

Our last exchange was about an actual image but here you're talking about image-url.

You need to upload your new/updated macro so I can see what you’re actually doing.

Yeah sorry - here is the macro:
Extract Allmusic.kmmacros (9.6 KB)

What I want is for the macro to create a .md file for each music album I run it on, containing the metadata plus the image. This is for Obsidian, which uses markdown files.

This line that you put into the system clipboard

Cover:: ![%NamedClipboard%Allmusic%]

is incorrect markdown. As I understand MD, you need to link to a local file or a URL whereas you're trying to use the image itself.

If you want to keep the image locally, then save the image (as obtained through the use of my example macro) to a file on your Mac and then use that file path to link to in the MD. Otherwise, just link to the URL as also obtained through my example macro.

If you're not familiar with what I'm talking about, then have a look at Basic Syntax | Markdown Guide and scroll down to the Images section.
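As a rough illustration of the difference, here is a small Python sketch that builds the note text with a standard Markdown image link, `![alt](target)`, where the target is a file path or a URL rather than the image itself. The function names and the path `covers/ghost-song.jpg` are made up for the example; the `Key:: value` layout simply mirrors the `Cover::` line quoted above.

```python
def cover_field(target, alt_text="Cover"):
    """Build the Cover:: line using Markdown image syntax.

    target is a URL or a path to an image file, never image data.
    """
    return f"Cover:: ![{alt_text}]({target})"

def build_album_note(fields, cover_target):
    """Return the note text: one 'Key:: value' line per field, then the cover."""
    lines = [f"{key}:: {value}" for key, value in fields.items()]
    lines.append(cover_field(cover_target))
    return "\n".join(lines) + "\n"

# Hypothetical local path; a cover URL would work in the same position.
note = build_album_note(
    {"Title": "Ghost Song", "Artist": "Cécile McLorin Salvant"},
    "covers/ghost-song.jpg",
)
print(note)
```

A target containing spaces may need extra handling (for example URL-encoding), which this sketch leaves out.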

Time to rest...

Thank you so much for the help! Appreciate it.
