Get data from HTML attributes?

Hey gang, new user to KM, but really seeing the power of it so far.

I'm working on some automation, and I want to get the toggle state of a radio button pair on a site (not my own, and it's behind a password), so that I can then perform some other actions elsewhere on my desktop.

Here is the HTML of a "Sell" state; you can see that the 'data-checked' state is being stored as an attribute on the div, label, and span elements... that's what tells me which button is active or inactive.

How can I extract this data from the HTML?

<div class="chakra-text css-4zelcx" role="radiogroup">
    <label class="css-0" data-checked>
        <input id="radio-:rdf:" type="radio" value="sell" name="side" style="border: Opx; clip: rectopx, 0p x, 0px, 0px); height: 1px; width: 1px; margin: -1px; padding: 0px; overflow: hidden; white-space: no wrap; position: absolute;">
        <div aria-hidden="true" class="css-u5dp7n" data-checked>
            <span class="css-0" data-checked>Sell</span>
        </div>
    </label>

    <label class="css-0">
        <input id="radio-:rdg:" type="radio" value="buy" checked name="side" style="border: 0px; clip: rect (0px, 0px, 0px, 0px); height: 1px; width: 1px; mar gin: -1px; padding: 0px; overflow: hidden; white-s pace: nowrap; position: absolute;">
        <div aria-hidden="true" class="css-14g0qmu">
            <span class="css-0">Buy</span>
        </div>
    </label>
</div>

See Custom_HTML_Prompt for how Keyboard Maestro handles form variables transparently using the name attribute.

Is this HTML you're writing, or HTML on a site that you'd like to extract the data from? I think you mean the latter? If so, exactly what are you trying to extract? That is, in the example you gave, what data do you want out?

-rob.

Sorry, I left out that detail at first; I've updated my OP.

It's on another site, not my own, and it's password protected.

I'm interested in the 'data-checked' attribute so I can perform some actions in KM. You can see in the code for the radio buttons that one has that attribute and the other doesn't. That 'data-checked' state toggles between one button and the other when I click them, and as far as I can tell, that seems to be the only way I can gather their states, aside from some kind of OCR/visual method.

Thanks for the pointer. I left out a detail in my OP, which is that this HTML is on an existing site on the web, and it's behind a password.

It looks like the solution you linked to applies if I were using my own HTML, right?

I apologize, but I'm still confused: What is it you're trying to find out about that state? Are you just trying to determine which button is active? Or are you wanting to use it in an if/then kind of thing, where "if Buy button is active, do this; otherwise (assumes Sell is active), do this" ?

-rob.

Yes to both. First I need to gather the state of the buttons and then yes, perform some if/then logic after I've got that info.

Sorry for the poor initial delivery of information; I wrote this post at the end of the day, after hours of working on this! :dizzy_face:

This seems to work in my testing...

Get button state.kmmacros (2.6 KB)

Macro screenshot

This looks for a span with the data-checked attribute as the identifier for the active button, then grabs the text content of that span and returns it. The macro stores the result in the local_theResult variable, and you can then do whatever you like with it.
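
In case it helps to see it outside the macro, this is roughly the kind of query the macro's Execute JavaScript action runs (a minimal sketch only; the exact script is in the attached macro, and the variable name here is just illustrative):

// Find the span carrying the bare data-checked attribute (the active button)
var checkedSpan = document.querySelector('span[data-checked]');

// Return its text ("Sell" or "Buy"); the macro stores the returned value in local_theResult
var theResult = checkedSpan ? checkedSpan.textContent : "data-checked attribute not found.";

return theResult;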

And now, the secret sauce: ChatGPT is awful at a lot of stuff. But one thing it's quite good at, even the free version (which is all I use), is writing JavaScript to pull data out of pages. My entire query to ChatGPT was this:

Given this HTML:

[your code]

How would I use Javascript to set a variable to the value in the tag that contains the term 'data-checked'?

I've used ChatGPT often for stuff like this, and it's taught me a lot about querying for data. So I basically knew what I had to do, but I'd never seen a bare attribute like that (data-checked, with no value) in an HTML tag, so I wasn't sure exactly how to reference it. ChatGPT was :).

-rob.

Yes, right. I see Rob has you covered.

Thanks so much for the input.

I did spend a couple hours with ChatGPT last night and it sent me down a python rabbit hole, trying to download the entire HTML and parse through it. :dizzy_face:

I haven't worked with js before, so I'm getting a bit of a sense for how useful it can be.

Regarding your input, adding the attribute in brackets [data-checked] was the key I needed.

In terms of current implementation, I realized that the original HTML snippet I posted wouldn't work as there are no static unique identifiers for this radio group vs. another radio group on the page.

But I was able to do what I needed with the Execute button at the bottom of the image. Either the Buy or Sell option is hidden depending on the buy/sell state. Looking for the hidden button gave me an adequate combination of unique static identifiers, so whatever result the code delivers I'll just take the inverse of it for my subsequent actions in KM.

// Find whichever submit button is currently hidden (the inactive side)
var hiddenButton = document.querySelector('button[type="submit"][hidden]');
var buttonState;

if (hiddenButton) {
    buttonState = hiddenButton.textContent;
} else {
    buttonState = "Hidden submit button not found.";
}

return buttonState;
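
Since the hidden submit button is the inactive side, the inversion could also be done in the script itself rather than in later KM actions. The final return could be replaced with something like this (assuming the button text is literally "Buy" or "Sell"):

// Hypothetical variation: the hidden button is the INACTIVE side, so invert it
// to return the currently selected side; fall back to the raw result otherwise.
if (buttonState === "Buy") {
    return "Sell";
} else if (buttonState === "Sell") {
    return "Buy";
}
return buttonState;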

LLMs are just bubble-marketed rabbit holes. Generators of plausible-sounding language, but not sources of information.

Look, for example, at the degradation of quality which they were inflicting on Stack Overflow (a much deeper and more useful resource), and at the steps it has had to take to prevent contagion by vacant auto-babble:


I personally feel that's an overly-broad generality—it's not true of everything they return. They often generate nothing but junk, but they also can answer some questions quite well—at least well enough to help those of us not skilled in some areas do things we couldn't otherwise do.

Is it the best way to do something? Doubtful. But for me, short of deciding to try to learn some new language at my advancing age, they provide solutions I would never otherwise find on my own. It took me 30 seconds with ChatGPT to get the JavaScript I posted earlier. Searching the web, even somewhere as good as Stack Overflow, would've taken much more time and probably not led to an answer that was exactly what I was looking for.

Like any tool, it needs to be used in the right way, and you should never accept what it says as the best way—or even perhaps a plausible way—of getting something done. But for me, it's often the only way I get something working. And because of that, I find them quite useful in the right situations.

-rob.

they also can answer some questions quite well

Only, when it happens, by accident.

They contain no modelling of knowledge, including no modelling of their own capacity.

They consist of distributionally profiled sets of linguistic tokens, and a capacity to re-compose those tokens in distributionally plausible sequences. That is an entirely linguistic computation, with no formation of, and no reference to, anything but token distribution statistics.

Vector slurry.

Stack Overflow's observation, based on large volumes of usage, is, to quote:

Overall, because the average rate of getting correct answers from ChatGPT and other generative AI technologies is too low, the posting of content created by ChatGPT and other generative AI technologies is substantially harmful to the site and to users who are asking questions and looking for correct answers.

The primary problem is that while the answers which ChatGPT and other generative AI technologies produce have a high rate of being incorrect, they typically look like the answers might be good and the answers are very easy to produce. There are also many people trying out ChatGPT and other generative AI technologies to create answers, without the expertise or willingness to verify that the answer is correct prior to posting. Because such answers are so easy to produce, a large number of people are posting a lot of answers. The volume of these answers (thousands) and the fact that the answers often require a detailed read by someone with significant subject matter expertise in order to determine that the answer is actually bad has effectively swamped our volunteer-based quality curation infrastructure.

i.e. it's a poisoning of the commons, in the interests of investors, but at the expense of the public.

As these purely linguistic recompositions (designed only to sound plausible and occasionally strike lucky, like any bluffer or poseur) begin to flood the public space, and themselves become part of the material on which language models are trained, progressive degradation will ensue.

Not a good investment of our time, and not responsible to encourage the public to dive down into plausible-looking, but actually time-wasting, rabbit holes.


There's a big difference between using an LLM to post an answer as authoritative on a forum designed to spread programming knowledge, and using one to help you solve a problem that you personally want to solve.

As a generally non-programmer user, ChatGPT and the like have enabled me to do things that I never would have been able to do before: I don't have the time, drive, or inclination to try to learn a programming language at the depths I'd need to learn it in order to be able to use it to solve any problem I might face. I especially don't have the time to do that across multiple languages that might be required to solve my problem.

So if I can personally spend a few minutes working with an LLM to come up with an answer that I would never get to—either on my own or with traditional Internet searching—then I'm going to do so.

When I use ChatGPT to come up with an answer, and use that answer in a public place, I disclose the fact that I did so, as I did here.

But I also use it regularly in macros I never post here, and for mundane stuff that would take me a long time to do on my own: building lists of data (states and abbreviations), building tables out of raw table data in HTML/CSS form (which I can do, but it does it faster and better than I could), etc.

They can also be useful even when they're wrong. I was trying to solve a complicated AppleScript issue related to AppleScripting Automator. I got sort of close on my own, asked ChatGPT for its help, but none of its scripts worked. However, it revealed some key words that I hadn't seen before, even in the Dictionary. Using those words, I was able to use standard internet searches to find the answer I was looking for.

LLMs are not the greatest thing since sliced bread. They're not the answer to humanity's numerous issues. They're tools, like any other tool, and can be used in ways both good and bad. I'm all for Stack Overflow banning posts from LLMs—I would never think to post anything I found there on Stack Overflow, because I am not a programmer and try never to pretend to be a programmer.

But I'm fully for using LLMs where they can help people do things they wouldn't otherwise be able to do, or that would take hours and hours to do by hand. Is the world a worse place because I was able to use an LLM to write a Perl script to parse a huge text block in a macro I use at home to sort through various piles of data I have?

-rob.

Big ditto to that.

I've taught myself a good few things (various types of scripting, C#, Excel formulas, HTML/CSS), but there's a limit to my abilities, as well as to my desire to learn new languages.

Like all tools, they have big upsides and big potential downsides.

And regardless of how anyone feels about it, LLMs are here, and they are for sure not going anywhere.