How Do I Extract Pieces of Text from a Larger Block Using Regex with a Loop

Try this for a starting point:

  • Your Alt Texts are every line that begins with "- Alt Text: "
  • Your Title Attribs are every line that begins with "- Title Attribute: "
  • Your Descriptions are every line that begins with "- Description: "

You can use a "For Each: substrings" for each of those, then use the same method as the previous macro to get them into individual variables.
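For readers who think more easily in code, the "every line that begins with..." idea can be sketched in JavaScript. This is only an illustration of the matching, not the KM action itself; `extractField` and the `sample` text are made-up names and data, and the sketch assumes each field sits on its own line.

```javascript
// Collect every line that starts with "- <label>: ", in document order.
// `extractField` and `sample` are hypothetical, for illustration only.
const extractField = (text, label) => {
    // Escape regex metacharacters so the label is matched literally.
    const escaped = label.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    // "g" finds every match, "m" lets ^ and $ work on each line.
    const pattern = new RegExp(`^- ${escaped}: (.*)$`, "gm");

    return [...text.matchAll(pattern)].map(m => m[1].trim());
};

const sample = [
    "- Alt Text: First alt",
    "- Title Attribute: First title",
    "- Description: First description",
    "---",
    "- Alt Text: Second alt",
    "- Title Attribute: Second title",
    "- Description: Second description"
].join("\n");

const altTexts = extractField(sample, "Alt Text");
const titles = extractField(sample, "Title Attribute");
const descriptions = extractField(sample, "Description");

console.log(altTexts, titles, descriptions);
```

Each call plays the role of one "For Each: substrings" loop: one pass over the text per field, giving one list per field.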

Have a go and see how you get on.

The For Each thing is still hanging me up... I guess I just have a mental block about it.

Then take it back a step...

Doing as described above will only work if every "record" has a line for every "field". If some might not, you may be better off going line by line, testing and assigning as you go.

That's also easier to get your head round than a "Collection of substrings" using some weird regex. So give it a go using "For Each: Lines in Collection".
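A rough JavaScript sketch of the line-by-line approach, showing how it copes with a record that is missing a field. It assumes records are separated by a line containing only "---" and that the three labels are the only fields of interest; `parseRecords`, `FIELDS` and the `incomplete` sample are hypothetical.

```javascript
// Walk the text one line at a time, filling in a record as fields turn up.
// Assumption: a line containing only "---" separates one record from the next.
const FIELDS = ["Alt Text", "Title Attribute", "Description"];

const parseRecords = text => {
    const records = [];
    let current = {};

    for (const rawLine of text.split(/\r\n|\n|\r/)) {
        const line = rawLine.trim();

        if (line === "---") {
            // Separator: close the current record if it holds anything.
            if (Object.keys(current).length > 0) records.push(current);
            current = {};
            continue;
        }

        // Test the line against each label and assign on the first match.
        const field = FIELDS.find(f => line.startsWith(`- ${f}: `));
        if (field !== undefined) {
            current[field] = line.slice(`- ${field}: `.length);
        }
    }

    // Keep the last record, which has no separator after it.
    if (Object.keys(current).length > 0) records.push(current);
    return records;
};

// Invented sample: the first record has no "Title Attribute" line.
const incomplete = [
    "- Alt Text: First alt",
    "- Description: First description",
    "---",
    "- Alt Text: Second alt",
    "- Title Attribute: Second title",
    "- Description: Second description"
].join("\n");

const parsed = parseRecords(incomplete);
console.log(parsed);
```

Because each value is stored under its own label inside its own record, a missing line leaves a gap in that one record instead of shifting every later value up by one.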

Crude I know.... work in progress..... more tomorrow

Image 01-05 Information Regex K0334 TESTING Macro (v11.0.3)

Image 01-05 Information Regex K0334 TESTING.kmmacros (59 KB)

FWIW a single-variable JSON version, so that you can write things like:

File name of image 4:
	%JSONValue%local_JSON[4].Filename%

Image Title Attribute of image 4:
	%JSONValue%local_JSON[4].Image_Title_Attribute%

JSON Array of Image Details.kmmacros (7.9 KB)
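If it helps to see the same lookup outside KM, here is a small JavaScript sketch. The tokens above use [4] for image 4, which suggests the KM path counts from 1; JavaScript arrays count from 0, so the sketch subtracts one. The `localJSON` string is a cut-down stand-in for the macro's output (two keys only), and `imageNumber` is a hypothetical name.

```javascript
// A cut-down stand-in for the JSON held in the single KM variable.
const localJSON = JSON.stringify([
    { Filename: "1 bridal-attendant-01.jpg", Image_Title_Attribute: "Bridal Attendant Helping Bride with Earrings" },
    { Filename: "2 bridal-attendant-02.jpg", Image_Title_Attribute: "Bridal Attendant Helping Bride with Wedding Dress" },
    { Filename: "3 bridal-attendant-03.jpg", Image_Title_Attribute: "Elegant Wedding Venue with Pink and White Decor" },
    { Filename: "4 bridal-attendant-04.jpg", Image_Title_Attribute: "Bridal Attendant Adjusting Bride's Earrings" }
]);

// Turn the text back into an array of objects.
const images = JSON.parse(localJSON);

// JavaScript arrays are zero-based, so "image 4" lives at index 3.
const imageNumber = 4;
const fourth = images[imageNumber - 1];

console.log(fourth.Filename);
console.log(fourth.Image_Title_Attribute);
```

The point is the same either way: one variable holds everything, and a path made of an index plus a key name picks out a single value.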


Full JSON:
[
  {
    "Filename": "1 bridal-attendant-01.jpg",
    "Alt_Text": "Muffetta Household Staffing Agency bridal attendant assisting during wedding preparation. A bridal attendant helping the bride put on earrings.",
    "Title_Attribute": "Bridal attendant services by Muffetta Household Staffing Agency",
    "Description": "A bridal attendant helping the bride put on earrings. 🏡👰 Muffetta Household Staffing Agency provides professional bridal attendant services for a perfect wedding day. #muffettastaffing",
    "Image_Title_Attribute": "Bridal Attendant Helping Bride with Earrings"
  },
  {
    "Filename": "2 bridal-attendant-02.jpg",
    "Alt_Text": "Muffetta Household Staffing Agency bridal attendant assisting during wedding preparation. A bridal attendant assisting the bride with her wedding dress.",
    "Title_Attribute": "Bridal attendant services by Muffetta Household Staffing Agency",
    "Description": "A bridal attendant assisting the bride with her wedding dress. 🏡👰 Muffetta Household Staffing Agency provides professional bridal attendant services for a perfect wedding day. #muffettastaffing",
    "Image_Title_Attribute": "Bridal Attendant Helping Bride with Wedding Dress"
  },
  {
    "Filename": "3 bridal-attendant-03.jpg",
    "Alt_Text": "Muffetta Household Staffing Agency bridal attendant assisting during wedding preparation. A luxury wedding venue with elegant pink and white decor.",
    "Title_Attribute": "Bridal attendant services by Muffetta Household Staffing Agency",
    "Description": "A luxury wedding venue with elegant pink and white decor. 🏡👰 Muffetta Household Staffing Agency provides professional bridal attendant services for a perfect wedding day. #muffettastaffing",
    "Image_Title_Attribute": "Elegant Wedding Venue with Pink and White Decor"
  },
  {
    "Filename": "4 bridal-attendant-04.jpg",
    "Alt_Text": "Muffetta Household Staffing Agency bridal attendant assisting during wedding preparation. A bridal attendant helping a bride adjust her earrings.",
    "Title_Attribute": "Bridal attendant services by Muffetta Household Staffing Agency",
    "Description": "A bridal attendant helping a bride adjust her earrings. 🏡👰 Muffetta Household Staffing Agency provides professional bridal attendant services for a perfect wedding day. #muffettastaffing",
    "Image_Title_Attribute": "Bridal Attendant Adjusting Bride's Earrings"
  },
  {
    "Filename": "5 bridal-attendant-05.jpg",
    "Alt_Text": "Muffetta Household Staffing Agency bridal attendant assisting during wedding preparation. A wedding coordinator preparing decorations for a luxury wedding event.",
    "Title_Attribute": "Bridal attendant services by Muffetta Household Staffing Agency",
    "Description": "A wedding coordinator preparing decorations for a luxury wedding event. 🏡👰 Muffetta Household Staffing Agency provides professional bridal attendant services for a perfect wedding day. #muffettastaffing",
    "Image_Title_Attribute": "Wedding Coordinator Preparing Luxury Event Decor"
  }
]
JS source:
return (() => {
    "use strict";

    const main = () =>
        parts(/\s*---\s*/u)(
            kmvar.local_Source
        )
            .map(
                x => dictionary(
                    lines(x).map(keyValue)
                )
            );

    // --------------------- GENERIC ---------------------

    const dictionary = kvs =>
        kvs.reduce(
            (a, [k, v], i) =>
                0 < i
                    ? ({ ...a, [k]: v })
                    : a,
            {}
        );

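    const keyValue = s => {
        // Split "- Key: value" at the colons: the first piece is the key,
        // the remaining pieces make up the value.
        const [k, ...v] = s.split(":");

        return [
            noBullet(k).replace(/ /g, "_"),
            // Note: joining with "" drops any colons inside the value;
            // the simplified version later in the thread joins with ":".
            v.join("").trim()
        ];
    };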

    const lines = s =>
        // A list of strings derived from a single string
        // which is delimited by \n or by \r\n or \r.
        0 < s.length
            ? s.split(/\r\n|\n|\r/u)
            : [];

    const noBullet = s =>
        s.startsWith("- ")
            ? s.slice(2)
            : s;

    const parts = delimiter =>
        s => s.split(delimiter).filter(
            x => 0 < x.trim().length
        );


    return JSON.stringify(main(), null, 2);
})();

Again (and you'll hate me for this), back up a step.

Consider:

That could mean

  • I want to create variables Alt_Text_1 through Alt_Text_5, Title_Attributes_1, through Title_Attributes_5, and Descriptions_1 through Descriptions_5, or
  • I want to create variables myVar_1 through myVar_15

Which will make life easier for present you, and especially future you: naming each variable for what it holds and having easy access to "related" values, or having one big, undifferentiated set of variables and remembering that "each label is in every third variable, but you start counting at different positions"?

Even if you go with the "one big group" approach (again, you may be constrained by the macro/process the results of this will feed into), you don't need to literally reference zz_cb01, zz_cb02, etc. See the macro in this post for how you can use the counter to do the numbers for you.
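The "let the counter do the numbers" idea, sketched in JavaScript instead of KM actions. The prefix and the two-digit padding are assumptions for illustration; `numberedName` and `assignNumbered` are hypothetical helpers, and a plain object stands in for KM's variables.

```javascript
// Build a numbered name such as "zz_cb01" from a prefix and a counter.
// Prefix and padding width are assumptions; use whatever your macro expects.
const numberedName = (prefix, n, width = 2) =>
    `${prefix}${String(n).padStart(width, "0")}`;

// Store each value under the next numbered name.
const assignNumbered = (prefix, values) => {
    const vars = {};
    let counter = 1;

    for (const value of values) {
        vars[numberedName(prefix, counter)] = value;
        counter += 1;
    }
    return vars;
};

const numbered = assignNumbered("zz_cb", ["first", "second", "third"]);
console.log(numbered);
```

The loop never mentions an individual variable name; the name is assembled from the prefix and the current counter value on every pass.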

Thank you for your help. Greatly appreciated.
There is a reason for the variable to be named in the current fashion so in this case it is not beneficial to use different names.

Can you check the link to 'this post'? It seems to go to this current post.
I'd like to take a look at it, although I've been here before with a counter number being added to a variable to create a 'numbered variable'.

Again, I am grateful @Nige_S and @ComplexPoint

@ComplexPoint - woof! wow, very quick, and powerful.
I'll have to ask an LLM what the JSON is doing so I can at least start to understand it.
Thank you.

It links about 10 posts up (works fine for me) -- the "QandAs into Many Vars" macro.

ah, thank you

It isn't really "doing" – just defining the pattern of an output.

return (() => {
    "use strict";

    const 
        sectionDividers = /\s*---\s*/u,
        lineDividers = /\s*\n\s*/u;

    const main = () =>
        // An array of sections,
        kmvar.local_Source.split(sectionDividers)
        .flatMap( 
            // with non-empty sections represented as key:value dictionaries,
            section => 0 < section.length
                ? [
                    // A key:value Object (dictionary) of key-value pairs,
                    Object.fromEntries(
                        // obtained from an array of lines in the section,
                        section.split(lineDividers)
                        
                        // with each line sub-divided into a key:value entry pair.
                        .map(keyValue)
                    )
                ]
                // and empty sections discarded.
                : []
        );

    // --------------------- GENERIC ---------------------

    const keyValue = s => {
        const [k, ...v] = s.split(":");

        return [
            noBullet(k).replaceAll(" ", "_"), 
            v.join(":").trim()
        ];
    };

    const noBullet = s =>
        s.startsWith("- ")
            ? s.slice(2)
            : s;

    return JSON.stringify(main(), null, 2);
})();

JSON Array of Image Details (JS simplified- commented).kmmacros (8.0 KB)
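To see what the keyValue step does to a single line, here is a small standalone sketch that can be run in Node rather than inside KM. The two helpers are copied from the source above; the input lines are invented, and the second one deliberately has a colon inside its value.

```javascript
// Helpers as in the macro's JS source, lifted out so they run on their own.
const noBullet = s =>
    s.startsWith("- ")
        ? s.slice(2)
        : s;

const keyValue = s => {
    const [k, ...v] = s.split(":");

    return [
        noBullet(k).replaceAll(" ", "_"),
        // Re-joining with ":" restores colons that belong to the value.
        v.join(":").trim()
    ];
};

// Invented sample line with a colon inside the value.
const pair = keyValue("- Image Title Attribute: Venue: Pink and White Decor");

// Object.fromEntries turns a list of [key, value] pairs into one object.
const record = Object.fromEntries([
    keyValue("- Filename: 3 bridal-attendant-03.jpg"),
    pair
]);

console.log(pair);
console.log(JSON.stringify(record, null, 2));
```

Only the first colon acts as the divider: the key loses its bullet and has its spaces replaced with underscores, and everything after that first colon is kept as the value.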

I'm going to argue this one -- it is always beneficial for variables to have meaningful names, and more obvious information is nearly always better. When you get an error in a couple of months' time and need to work out what's missing, which helps more?

  • zz_cb07
  • Alt_Text_3
  • Alt_Text[3]
  • ImageDetails[3].Alt_Text

For the first you'll need to work your way through your macro, maybe do some finger counting. For the rest (single variables, array, and JSON) you can see at a glance that image 3's alt text is the problem.

But that's something to consider for future macros. For this one, treat it just like a printed sheet that you are extracting info from to a bunch of individual Post-Its. Most people would go down the sheet one line at a time, see if the start of the line matched anything they were looking for, and if so they'd write that info on a newly-labelled Post-It.

In pseudo-code:

set my post-it counter to 1
for each line in the text
   if the start of the line matches something I'm looking for
      copy the line to a new post-it labelled with the post-it counter number
      increase the post-it counter by 1
   end if
end for

Now consider the "if..." bit of that. In this case you don't care what it matches, just that it does match -- your "fields" are in the same order throughout, so working line-by-line means they'll be written to your Post-Its in the right order.

So, re-writing the above in a more KM way:

set counter to 1
for each line in text
   -- use enough for a single match AND enough that you understand the meaning at a glance
   if start of line is "- Alt" or "- Title" or "- Description"
      then
         set variable zz_cb<counter> to line
         set counter to counter + 1
      otherwise
         -- there is no otherwise! Saying that shows you didn't forget...
   end if
end for

At which point you can build the macro almost line-for-line from the pseudo-code:

Image Attributes (Multi Var).kmmacros (10.0 KB)

(As before, I've used local variables for testing -- delete "Set Variable 'Local_zz_...' " and enable "Set Variable 'zz_...' " to use globals.)

And you can make it more concise by replacing the three "If..." conditions with one:
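As a sketch of that idea in JavaScript (not the KM action itself), a single regular expression can stand in for the three separate "start of line" tests. The variable names follow the pseudo-code; `toNumberedVars`, `WANTED` and the sample text are hypothetical, and a plain object stands in for KM's variables.

```javascript
// One pattern covers all three "start of line is ..." conditions.
const WANTED = /^- (Alt|Title|Description)/;

const toNumberedVars = text => {
    const vars = {};
    let counter = 1;

    for (const line of text.split(/\r\n|\n|\r/)) {
        if (WANTED.test(line)) {
            // Same as "set variable zz_cb<counter> to line".
            vars[`zz_cb${counter}`] = line;
            counter += 1;
        }
        // No "otherwise": lines that do not match are simply skipped.
    }
    return vars;
};

// Invented sample: the Filename and separator lines should be ignored.
const text = [
    "- Filename: one.jpg",
    "- Alt Text: First alt",
    "- Title Attribute: First title",
    "- Description: First description",
    "---",
    "- Filename: two.jpg",
    "- Alt Text: Second alt",
    "- Title Attribute: Second title",
    "- Description: Second description"
].join("\n");

const result = toNumberedVars(text);
console.log(result);
```

Because the fields appear in the same order in every record, the counter alone keeps the results in the right sequence.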

Morning, that's a great solution and I appreciate all the explanations!
I'm closer to understanding the for each method.
Question for learning, would the 'substring' 'mode' usually be a regex?
Thank you, Troy

"It depends."

Pick the simplest solution that meets your needs. If you wanted to extract each number from 1,2,3,4,5 it would make sense to use a "separated by a simple string" match:

But if it was from Every1good2boy3deserves4fish5 you'd ignore the separators and use a regex to "get each number in":

It is all just text matching. The difference is that a "simple match" looks for the same literal character(s), a regex looks for anything that matches a pattern you define, and the action lets you choose whether to extract "the things that match" or "the text separated by the things that match".
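The three cases can be compared side by side in JavaScript. This only illustrates the matching itself, using the two example strings from above; the KM action's own options are not shown, and the constant names are made up.

```javascript
// 1. Simple separator: split on the same literal character every time.
const bySeparator = "1,2,3,4,5".split(",");

// 2. Regex, keeping "the things that match": every run of digits.
const byPattern = "Every1good2boy3deserves4fish5".match(/\d+/g);

// 3. Regex, keeping "the text separated by the things that match":
//    split on the digits and drop the empty piece left after the final "5".
const betweenMatches = "Every1good2boy3deserves4fish5"
    .split(/\d+/)
    .filter(s => s.length > 0);

console.log(bySeparator);
console.log(byPattern);
console.log(betweenMatches);
```

The first two give the same five numbers by different routes; the third uses the same pattern as the second but keeps the opposite half of the text.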
