Is There a Way to Grab a Number From HTML Code Into a Variable?

Hello all,

I want to grab a number from the HTML code of a product page.

Example page: https://www.vanishingincmagic.com/card-magic/tumi-magic-presents-triple-helix/

Example number that I want to get: 66593

The number appears to be regularly surrounded by: "mpn": "66593",

Some numbers will be lower, so I can't rely on the length of the number.

Ideas?

This is a fairly complex requirement, since the data you want is inside a script in the web page. So we need to use both querySelector and RegEx in an Execute a JavaScript in Front Browser action:

// --- Get Script Block That Contains the "mpn" Data ---
var scriptElem = document.querySelector('script[type="application/ld+json"]');
var mpnStr;

if (scriptElem) {
  var scriptStr = scriptElem.innerText;
  
  //--- Extract the Value of mpn Using RegEx ---
  var matchArr = scriptStr.match(/"mpn": "(\d+)/i);
  if (matchArr) {
    mpnStr = matchArr[1];
  } else { mpnStr = "[ERROR] Match for 'mpn' NOT Found"; }
	
} else { mpnStr = "[ERROR] Script Element NOT Found.";}

mpnStr;

Let us know if this works for you.


Thank you J Michael! You are a bleeping genius! Much appreciated!


Recently, this script stopped working. I have no idea why.

How do I track down the issue and fix it?

The pertinent code appears to be the same on the website:

"mpn": "66593",

The regex seems to be fine:

[Screenshot: Screen Shot 2021-07-28 at 3.50.50pm]

But I am getting the first error (Match for 'mpn' NOT Found).

Any ideas?

It would be best to post the actual HTML code from the current web page in a Forum Code Block.

It looks like the forum prevents me from posting the entire code.

Here is the code with the desired MPN part (same page from the previous post):

<!doctype html>
<html lang="en">
<head><meta charset="utf-8" /><meta http-equiv="X-UA-Compatible" content="IE=edge" /><meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no, minimum-scale=1.0" />

    <script>
        var viBrandID = 1;

        function kmtrack(method) {
            if (typeof _kmq !== 'undefined' && window.jQuery) {
                method();
            }
            else
                setTimeout(function () { kmtrack(method) }, 50);
        }
    </script>
  
    <meta name="p:domain_verify" content="9655037a6880c83a3741db8c49267ef7" />
        <link rel="stylesheet" type="text/css" href="/compressed/vi.css?d=19July21" />
        <link rel="apple-touch-icon" sizes="120x120" href="/apple-touch-icon.png" /><link rel="icon" type="image/png" sizes="32x32" href="/favicon-32x32.png" /><link rel="icon" type="image/png" sizes="16x16" href="/favicon-16x16.png" /><link rel="mask-icon" href="/safari-pinned-tab.svg" color="#5bbad5" /><meta name="msapplication-TileColor" content="#da532c" /><meta name="theme-color" content="#ffffff" />

        <script type="application/ld+json">
        {
            "@context": "https://schema.org",
            "@type": "Organization",
            "url": "http://www.vanishingincmagic.com",
            "logo": "https://vinc.gumlet.io/pics/vanishing-inc-logo.png"
        }
        </script>
    <link rel="canonical" href="/card-magic/tumi-magic-presents-triple-helix/" /><meta name="Description" content="Triple Helix: Amazing. Visual. Impossible. 
This is arguably one of the best deck productions ever made. Make a deck of cards visually appear out of thin air with..." /><link rel="preconnect" href="https://vinc.gumlet.io/" crossorigin="" /><link rel="dns-prefetch" href="https://vinc.gumlet.io/" />
    
            <meta property="og:image" content="https://vinc.gumlet.io/gallery/photos/tumi-magic-presents-triple-helix.jpg" />
        
        <meta property="og:title" content="Triple Helix" />
        <meta property="og:description" content="Amazing. Visual. Impossible. 
This is arguably one of the best deck productions ever made. Make a deck of cards visually appear out of thin air with..." />
        <meta property="og:url" content="https://www.vanishingincmagic.com/card-magic/tumi-magic-presents-triple-helix/" />
        <meta property="og:type" content="product" />
        <meta property="product:category" content="Card magic and card tricks" />
        <meta property="product:brand" content="Snake, Tumi Magic and John Byng" />
        <meta property="product:availability" content="instock" />
        <meta property="product:condition" content="new" />
        <meta property="product:type" content="Trick" />
        <meta property="product:price:amount" content="39.95"/>
        <meta property="product:price:currency" content="USD"/>
        <meta property="product:price:amount" content="29.00"/>
        <meta property="product:price:currency" content="GBP"/>
        <meta property="product:id" content="20973" />
        <meta property="product:retailer_item_id" content="20973" />
        
        <meta property="product:google_product_category" content="Arts & Entertainment > Hobbies & Creative Arts > Magic & Novelties" />

        <script type="application/ld+json">
        {
        "@context": "https://schema.org/",
        "@type": "Product",
        "name": "Triple Helix",
        "image": [
            "https://www.vanishingincmagic.com/gallery/photos/tumi-magic-presents-triple-helix.jpg","https://www.vanishingincmagic.com/gallery/photos/tumi-magic-presents-triple-helix-1.jpg","https://www.vanishingincmagic.com/gallery/photos/tumi-magic-presents-triple-helix-2.jpg"
        ],
        "description": "Amazing. Visual. Impossible. This is arguably one of the best deck productions ever made. Make a deck of cards visually appear out of thin air with...",
        "sku": "20973",
        "mpn": "66593",
        "brand": {
            "@type": "Thing",
            "name": "Snake, Tumi Magic and John Byng"
        },
        
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": "4",
            "reviewCount": "10"
        },
        
        "offers": [
            {
                "@type": "Offer",
                "url": "https://www.vanishingincmagic.com/card-magic/tumi-magic-presents-triple-helix/",
                "price": "39.95",
                "priceCurrency": "USD",
                "priceValidUntil": "8/28/2021 12:06:54 AM",
                "itemCondition": "https://schema.org/NewCondition",
                "availability": "InStock",
                "seller": {
                    "@type": "Organization",
                    "name": "Vanishing Inc. Magic"
                }
            }]
        }

        
        {
            "@context": "https://schema.org",
            "@type": "QAPage",
            "mainEntity": [
                
            {
                "@type": "Question",
                "name": "So, the trailer shows two productions, but the description hints at only one.  Which is it?",
                "acceptedAnswer": {
                "@type": "Answer",
                    "text": "It comes with 2 gimmicks, ready to use."
                }
            },    
            {
                "@type": "Question",
                "name": "Do both card boxes have cards in them?  Could I ask the spectator choose red or blue?",
                "acceptedAnswer": {
                "@type": "Answer",
                    "text": "Nope, unless you use equivoque."
                }
            },    
            {
                "@type": "Question",
                "name": "How durable are the gimmicks?  ",
                "acceptedAnswer": {
                "@type": "Answer",
                    "text": "The trick has not been released yet, so it\u0027s impossible to tell I\u0027m afraid. But, as ever, with Vanishing Inc. if you are not happy with a purchase you can return it. We have a 100% Satisfaction Guarantee. "
                }
            }]
        }
        
        </script>

Still trying to track this down. The results in RegEx101 look great:

[Screenshot: Screen Shot 2021-07-29 at 4.47.16pm]

(I think there may now be an earlier match on that page for your querySelector pattern)

I doubt that the issue is a RegEx issue, since the text looks the same.
Most likely, it is a problem in selecting the HTML element using querySelector.
You can easily test this in the Chrome Dev Tools window: select the target text on the web page, right-click, and choose "Inspect".
Then you can enter various querySelector commands in the console until you find one that works.
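
One way to run that check from the console is to list every JSON-LD script block and see which of them contains the "mpn" text. This is only a minimal sketch: the helper name summarizeBlocks is made up for illustration, and it takes plain strings so that it can also be tried outside a browser.

```javascript
// Hypothetical helper (not part of the original macro): for each script
// block's text, report its position, the first "@type" value it contains,
// and whether it mentions an "mpn" followed by digits.
function summarizeBlocks(scriptTexts) {
  return scriptTexts.map(function (text, index) {
    var typeMatch = text.match(/"@type":\s*"([^"]+)"/);
    return {
      index: index,
      firstType: typeMatch ? typeMatch[1] : null,
      hasMpn: /"mpn":\s*"\d+/i.test(text)
    };
  });
}

// In the browser console, feed it the text of every JSON-LD block.
// The guard only skips this part when no DOM is available.
if (typeof document !== "undefined") {
  var ldBlocks = document.querySelectorAll('script[type="application/ld+json"]');
  console.table(summarizeBlocks(Array.from(ldBlocks, function (el) {
    return el.textContent;
  })));
}
```

If the block that contains "mpn" is not at index 0, then querySelector, which returns only the first matching element, is handing the regex the wrong script.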



I suppose one option, if you are happy to reach for XPath (as an alternative to querySelector), might be to specify the first script node which contains the string which you are looking for.

Perhaps something like:

//script[contains(text(),'"mpn": ')]

JS source:
(() => {
    "use strict";

    const
        nodes = document.evaluate(
            `//script[contains(text(),'"mpn": ')]`,
            document,
            null,
            XPathResult.ANY_TYPE,
            null
        ),

        scriptText = null !== nodes ? (
            nodes.iterateNext()
        ) : false;

    return scriptText ? (() => {
        const
            matches = scriptText.textContent.match(
                /"mpn": "(\d+)/iu
            );

        return matches ? (
            matches[1]
        ) : "No regex match";
    })() : "Script node not found";
})();
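
The same idea can probably also be written without XPath, by looping over every JSON-LD script block and keeping the first one in which the regex matches. This is only a sketch under a couple of assumptions: extractMpn is an illustrative name, and the final expression assumes the code runs in an Execute a JavaScript in Front Browser action, like the original script.

```javascript
// Illustrative helper: return the digits of the first "mpn" value found
// in a list of script texts, or an error string if none of them match.
function extractMpn(scriptTexts) {
  for (var i = 0; i < scriptTexts.length; i++) {
    var matchArr = scriptTexts[i].match(/"mpn":\s*"(\d+)/i);
    if (matchArr) {
      return matchArr[1];
    }
  }
  return "[ERROR] Match for 'mpn' NOT Found";
}

// In the browser, collect the text of ALL matching script elements,
// not just the first one that querySelector would return.
// Outside a browser (no DOM) this falls back to an empty string.
var mpnStr = (typeof document !== "undefined")
  ? extractMpn(Array.from(
      document.querySelectorAll('script[type="application/ld+json"]'),
      function (el) { return el.textContent; }
    ))
  : "";

mpnStr;
```

Because the loop skips blocks without a match, an extra JSON-LD block added earlier in the page should not break it the way it broke the single querySelector call.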

Thanks to both @JMichaelTX and @ComplexPoint — Very helpful and much appreciated!

In which case save it as a text file, zip it, and post the zip file.

-Chris