How to convert XML to HTML?

I have had a look at Pandoc, but it does not seem to support this conversion. I'm trying to convert the attached XML file to HTML so that I have a preview file for translation.

The best I can think of is a series of find-and-replace actions: first delete everything between < and > (the XML wrapper tags), and then replace the expressions between &lt; and &gt; with their HTML equivalents.
Test.zip (3.5 KB)

An XML prettifier in Visual Studio Code would at least give you a slightly clearer layout:
<?xml version="1.0" encoding="UTF-16"?>
<Transit version="4.0">
    <Header>
        <FFD GUID="{3A80FEB1-2B0D-11D4-A5DF-00104B026E96}" Name=""/>
        <SecureSeg level="no"/>
        <HistoryOpt active="no"/>
    </Header>
    <Body>
        <Tag pos="Point" type="Inline">&lt;!!! id=1&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="UnknownStruct">&lt;html&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="UnknownStruct">&lt;head&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="UnknownStruct">&lt;meta http-equiv=&quot;content-type&quot; content=&quot;text/html; charset=UTF-8&quot;&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="Paragraph">&lt;title&gt;</Tag>
        <Seg SegID="1" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Test</Seg>
        <Tag pos="End" type="Paragraph">&lt;/title&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="UnknownStruct">&lt;/head&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="UnknownStruct">&lt;body&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="Chapter1">&lt;h1&gt;</Tag>
        <Seg SegID="2" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Heading One</Seg>
        <Tag pos="End" type="Chapter1">&lt;/h1&gt;</Tag>
        <Seg SegID="3" Data="๎ค‚๎จ„๎ธ†๏˜๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€๎ฟ ไ€๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">
            <WS type="n"/>
Zwรถlf Boxkรคmpfer jagen Viktor zum SpaรŸ quer รผber den Sylter Deich. </Seg>
        <Seg SegID="4" Data="๎ค‚๎จ„๎ธ†๏˜๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€‚๎ฟ ไ€‚๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Zwรถlf Boxkรคmpfer jagen Viktor zum SpaรŸ quer รผber den Sylter Deich. </Seg>
        <Seg SegID="5" Data="๎ค‚๎จ„๎ธ†๏˜๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€ƒ๎ฟ ไ€ƒ๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Zwรถlf Boxkรคmpfer jagen Viktor zum SpaรŸ quer รผber den Sylter Deich. </Seg>
        <Seg SegID="6" Data="๎ค‚๎จ„๎ธ†๏˜๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€„๎ฟ ไ€„๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Zwรถlf Boxkรคmpfer jagen Viktor zum SpaรŸ quer รผber den Sylter Deich.<NL x="1">&lt;br&gt;</NL>
            <WS type="n"/>
        </Seg>
        <Tag pos="Begin" i="1" type="ListBullet">&lt;ul&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="ListItem">&lt;li&gt;</Tag>
        <Seg SegID="7" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€๎ฟ ไ€๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Bullet list</Seg>
        <Tag pos="End" type="ListItem">&lt;/li&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="ListItem">&lt;li&gt;</Tag>
        <Seg SegID="8" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€‚๎ฟ ไ€‚๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Bullet list</Seg>
        <Tag pos="End" type="ListItem">&lt;/li&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="ListItem">&lt;li&gt;</Tag>
        <Seg SegID="9" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€ƒ๎ฟ ไ€ƒ๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Bullet list</Seg>
        <Tag pos="End" type="ListItem">&lt;/li&gt;</Tag>
        <WS type="n"/>
        <Tag pos="End" type="ListBullet">&lt;/ul&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="Paragraph">&lt;p&gt;</Tag>
        <Seg SegID="10" Data="๎ค‚๎จ„๎ธ†๏˜‚๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€…๎ฟ ไ€…๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Zwรถlf Boxkรคmpfer jagen Viktor zum SpaรŸ quer รผber den Sylter Deich. </Seg>
        <Seg SegID="11" Data="๎ค‚๎จ„๎ธ†๏˜‚๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€†๎ฟ ไ€†๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Zwรถlf Boxkรคmpfer jagen Viktor zum SpaรŸ quer รผber den Sylter Deich.<NL x="1">&lt;br&gt;</NL>
            <WS type="n"/>
            <NL x="2">&lt;br&gt;</NL>
            <WS type="n"/>
        </Seg>
        <Tag pos="End" type="Paragraph">&lt;/p&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="ListNumber">&lt;ol&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="ListItem">&lt;li&gt;</Tag>
        <Seg SegID="12" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">First item</Seg>
        <Tag pos="End" type="ListItem">&lt;/li&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="ListItem">&lt;li&gt;</Tag>
        <Seg SegID="13" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Second item</Seg>
        <Tag pos="End" type="ListItem">&lt;/li&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="ListItem">&lt;li&gt;</Tag>
        <Seg SegID="14" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Third item</Seg>
        <Tag pos="End" type="ListItem">&lt;/li&gt;</Tag>
        <WS type="n"/>
        <Tag pos="End" type="ListNumber">&lt;/ol&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="Chapter2">&lt;h2&gt;</Tag>
        <Seg SegID="15" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Heading two</Seg>
        <Tag pos="End" type="Chapter2">&lt;/h2&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="Paragraph">&lt;p&gt;</Tag>
        <Seg SegID="16" Data="๎ค‚๎จ„๎ธ†๏˜๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€‡๎ฟ ไ€‡๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Zwรถlf Boxkรคmpfer jagen Viktor zum SpaรŸ quer รผber den Sylter Deich. </Seg>
        <Seg SegID="17" Data="๎ค‚๎จ„๎ธ†๏˜๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€ˆ๎ฟ ไ€ˆ๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Zwรถlf Boxkรคmpfer jagen Viktor zum SpaรŸ quer รผber den Sylter Deich. </Seg>
        <Seg SegID="18" Data="๎ค‚๎จ„๎ธ†๏˜๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€‰๎ฟ ไ€‰๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Zwรถlf Boxkรคmpfer jagen Viktor zum SpaรŸ quer รผber den Sylter Deich.<NL x="1">&lt;br&gt;</NL>
            <WS type="n"/>
        </Seg>
        <Tag pos="End" type="Paragraph">&lt;/p&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="Table">&lt;table width=&quot;100%&quot; cellspacing=&quot;2&quot; cellpadding=&quot;2&quot; border=&quot;1&quot;&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="UnknownStruct">&lt;tbody&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableRow">&lt;tr&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableCell">&lt;td valign=&quot;top&quot;&gt;</Tag>
        <Seg SegID="19" Data="๎ค‚๎จ„๎ธ†๏˜๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">One<NL x="1">&lt;br&gt;</NL>
            <WS type="n"/>
        </Seg>
        <Tag pos="End" type="TableCell">&lt;/td&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableCell">&lt;td valign=&quot;top&quot;&gt;</Tag>
        <Seg SegID="20" Data="๎ค‚๎จ„๎ธ†๏˜๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Two<NL x="1">&lt;br&gt;</NL>
            <WS type="n"/>
        </Seg>
        <Tag pos="End" type="TableCell">&lt;/td&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableCell">&lt;td valign=&quot;top&quot;&gt;</Tag>
        <Seg SegID="21" Data="๎ค‚๎จ„๎ธ†๏˜๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Three<NL x="1">&lt;br&gt;</NL>
            <WS type="n"/>
        </Seg>
        <Tag pos="End" type="TableCell">&lt;/td&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableCell">&lt;td valign=&quot;top&quot;&gt;</Tag>
        <Seg SegID="22" Data="๎ค‚๎จ„๎ธ†๏˜๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Four<NL x="1">&lt;br&gt;</NL>
            <WS type="n"/>
        </Seg>
        <Tag pos="End" type="TableCell">&lt;/td&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableCell">&lt;td valign=&quot;top&quot;&gt;</Tag>
        <Seg SegID="23" Data="๎ค‚๎จ„๎ธ†๏˜๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Five<NL x="1">&lt;br&gt;</NL>
            <WS type="n"/>
        </Seg>
        <Tag pos="End" type="TableCell">&lt;/td&gt;</Tag>
        <WS type="n"/>
        <Tag pos="End" type="TableRow">&lt;/tr&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableRow">&lt;tr&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableCell">&lt;td valign=&quot;top&quot;&gt;</Tag>
        <Seg SegID="24" Data="๎ค‚๎จ„๎ธ†๏˜๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€๎ฟ ไ€๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Item<NL x="1">&lt;br&gt;</NL>
            <WS type="n"/>
        </Seg>
        <Tag pos="End" type="TableCell">&lt;/td&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableCell">&lt;td valign=&quot;top&quot;&gt;</Tag>
        <Seg SegID="25" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€‚๎ฟ ไ€‚๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Item</Seg>
        <Tag pos="End" type="TableCell">&lt;/td&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableCell">&lt;td valign=&quot;top&quot;&gt;</Tag>
        <Seg SegID="26" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€ƒ๎ฟ ไ€ƒ๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Item</Seg>
        <Tag pos="End" type="TableCell">&lt;/td&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableCell">&lt;td valign=&quot;top&quot;&gt;</Tag>
        <Seg SegID="27" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€„๎ฟ ไ€„๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Item</Seg>
        <Tag pos="End" type="TableCell">&lt;/td&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableCell">&lt;td valign=&quot;top&quot;&gt;</Tag>
        <Seg SegID="28" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€…๎ฟ ไ€…๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Item</Seg>
        <Tag pos="End" type="TableCell">&lt;/td&gt;</Tag>
        <WS type="n"/>
        <Tag pos="End" type="TableRow">&lt;/tr&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableRow">&lt;tr&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableCell">&lt;td valign=&quot;top&quot;&gt;</Tag>
        <Seg SegID="29" Data="๎ค‚๎จ„๎ธ†๏˜๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€†๎ฟ ไ€†๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Item<NL x="1">&lt;br&gt;</NL>
            <WS type="n"/>
        </Seg>
        <Tag pos="End" type="TableCell">&lt;/td&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableCell">&lt;td valign=&quot;top&quot;&gt;</Tag>
        <Seg SegID="30" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€‡๎ฟ ไ€‡๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Item</Seg>
        <Tag pos="End" type="TableCell">&lt;/td&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableCell">&lt;td valign=&quot;top&quot;&gt;</Tag>
        <Seg SegID="31" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€ˆ๎ฟ ไ€ˆ๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Item</Seg>
        <Tag pos="End" type="TableCell">&lt;/td&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableCell">&lt;td valign=&quot;top&quot;&gt;</Tag>
        <Seg SegID="32" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€‰๎ฟ ไ€‰๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Item</Seg>
        <Tag pos="End" type="TableCell">&lt;/td&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableCell">&lt;td valign=&quot;top&quot;&gt;</Tag>
        <Seg SegID="33" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€Š๎ฟ ไ€Š๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Item</Seg>
        <Tag pos="End" type="TableCell">&lt;/td&gt;</Tag>
        <WS type="n"/>
        <Tag pos="End" type="TableRow">&lt;/tr&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableRow">&lt;tr&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableCell">&lt;td valign=&quot;top&quot;&gt;</Tag>
        <Seg SegID="34" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€‹๎ฟ ไ€‹๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Item</Seg>
        <Tag pos="End" type="TableCell">&lt;/td&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableCell">&lt;td valign=&quot;top&quot;&gt;</Tag>
        <Seg SegID="35" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€Œ๎ฟ ไ€Œ๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Item</Seg>
        <Tag pos="End" type="TableCell">&lt;/td&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableCell">&lt;td valign=&quot;top&quot;&gt;</Tag>
        <Seg SegID="36" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€๎ฟ ไ€๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Item</Seg>
        <Tag pos="End" type="TableCell">&lt;/td&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableCell">&lt;td valign=&quot;top&quot;&gt;</Tag>
        <Seg SegID="37" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€Ž๎ฟ ไ€Ž๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Item</Seg>
        <Tag pos="End" type="TableCell">&lt;/td&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableCell">&lt;td valign=&quot;top&quot;&gt;</Tag>
        <Seg SegID="38" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€๎ฟ ไ€๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Item</Seg>
        <Tag pos="End" type="TableCell">&lt;/td&gt;</Tag>
        <WS type="n"/>
        <Tag pos="End" type="TableRow">&lt;/tr&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableRow">&lt;tr&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableCell">&lt;td valign=&quot;top&quot;&gt;</Tag>
        <Seg SegID="39" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€๎ฟ ไ€๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Item</Seg>
        <Tag pos="End" type="TableCell">&lt;/td&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableCell">&lt;td valign=&quot;top&quot;&gt;</Tag>
        <Seg SegID="40" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€‘๎ฟ ไ€‘๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Item</Seg>
        <Tag pos="End" type="TableCell">&lt;/td&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableCell">&lt;td valign=&quot;top&quot;&gt;</Tag>
        <Seg SegID="41" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€’๎ฟ ไ€’๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Item</Seg>
        <Tag pos="End" type="TableCell">&lt;/td&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableCell">&lt;td valign=&quot;top&quot;&gt;</Tag>
        <Seg SegID="42" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€“๎ฟ ไ€“๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Item</Seg>
        <Tag pos="End" type="TableCell">&lt;/td&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="TableCell">&lt;td valign=&quot;top&quot;&gt;</Tag>
        <Seg SegID="43" Data="๎ค‚๎จ„๎ธ†๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พ ไ€”๎ฟ ไ€”๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Item</Seg>
        <Tag pos="End" type="TableCell">&lt;/td&gt;</Tag>
        <WS type="n"/>
        <Tag pos="End" type="TableRow">&lt;/tr&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="UnknownStruct">&lt;/tbody&gt;</Tag>
        <WS type="n"/>
        <Tag pos="End" type="Table">&lt;/table&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="Paragraph">&lt;p&gt;</Tag>
        <Seg SegID="44" Data="๎ค‚๎จ„๎ธ†๏˜„๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">
            <FontTag pos="Begin" i="1" x="1" type="Bold">&lt;b&gt;</FontTag>Bold text<FontTag pos="End" i="1" type="Bold">&lt;/b&gt;</FontTag> and <FontTag pos="Begin" i="2" x="2" type="Italic">&lt;i&gt;</FontTag>italics<FontTag pos="End" i="2" type="Italic">&lt;/i&gt;</FontTag> here. </Seg>
        <Seg SegID="45" Data="๎ค‚๎จ„๎ธ†๏˜„๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">And this is a <Tag pos="Begin" i="3" x="3" type="Inline">&lt;a moz-do-not-send=&quot;true&quot; href=&quot;http://www.niederlaendisch.nl&quot;&gt;</Tag>link<Tag pos="End" i="3" type="Inline">&lt;/a&gt;</Tag>.            <NL x="4">&lt;br&gt;</NL>
            <WS type="n"/>
        </Seg>
        <Tag pos="End" type="Paragraph">&lt;/p&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="Paragraph">&lt;p&gt;</Tag>
        <Seg SegID="46" Data="๎ค‚๎จ„๎ธ†๏˜ƒ๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">Last but not least: </Seg>
        <Seg SegID="47" Data="๎ค‚๎จ„๎ธ†๏˜ƒ๏”‚๎ผƒ0๎ผ„20230407๎ผ…060026๎ผ†HL๎ผไ€€๎พฐไ€ไ€ไ€ไ€ไ€ไ€€ไ€€๎ฟ–ไ€„">an image.<NL x="1">&lt;br&gt;</NL>
            <WS type="n"/>
            <Tag pos="Point" x="2" type="InlineImage">&lt;img moz-do-not-send=&quot;true&quot; src=&quot;<UT type="Attr">file:///Users/hl/Dropbox/CT/Img/alinea%20building..jpg</UT>&quot; title=&quot;<SubSeg>Tooltip</SubSeg>&quot; alt=&quot;<SubSeg>Alternate text</SubSeg>&quot; width=&quot;3264&quot; height=&quot;1836&quot;&gt;</Tag>
            <NL x="3">&lt;br&gt;</NL>
            <WS type="n"/>
        </Seg>
        <Tag pos="End" type="Paragraph">&lt;/p&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="UnknownStruct">&lt;/body&gt;</Tag>
        <WS type="n"/>
        <Tag pos="Begin" i="1" type="UnknownStruct">&lt;/html&gt;</Tag>
        <WS type="n"/>
    </Body>
</Transit>

and you could derive various transforms into HTML using XQuery, but ...

  1. There is no naturally defined mapping from arbitrary XML structures to HTML (only the XHTML flavor of HTML is itself an XML vocabulary), and
  2. Without knowing the encoding, there's not much that you could do with those Data attributes in the Seg elements.

Thank you, Rob!

Since I only need the HTML to be generated for reference (preview), the Data attributes can be removed. As a matter of fact, that will be the first step my macro performs.
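As a rough illustration of that first step, here is a minimal Python sketch that strips the Data attributes with a regular expression. The function name `strip_data_attributes` is made up for this example, and it assumes every Data attribute is double-quoted and contains no double quote of its own.

```python
import re

# Assumption: Data attributes are always written as Data="..." with no
# embedded double quotes. The leading \s+ removes the separating space too.
DATA_ATTR = re.compile(r'\s+Data="[^"]*"')


def strip_data_attributes(xml_text: str) -> str:
    """Remove every Data="..." attribute and leave the rest of the tag intact."""
    return DATA_ATTR.sub("", xml_text)


sample = '<Seg SegID="1" Data="opaque-payload">Test</Seg>'
print(strip_data_attributes(sample))  # <Seg SegID="1">Test</Seg>
```

The same pattern should carry over to a grep-style search in an editor, although the exact escaping may differ there.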

I'll try to convert this xml to html in BBEdit first, via multiple steps. If this works, I'll try to group the steps in a macro.

The proper way to do this is to create an XSLT stylesheet that transforms your XML to HTML. Who/whatever is generating the XML may already have such a thing, so start there.

Otherwise, you should (comparing your XML to the HTML file) be able to delete most of the non-HTML content, because your HTML tags are already in the document -- no "replace" required. Zeroing in on those non-HTML tags could be tricky to generalise completely but, at a quick glance, you should be able to do the majority by deleting "anything that starts with < and is followed by a capital letter, up to and including the next >" -- because all the HTML tags in the sample are lowercase while most of the XML tags start with an uppercase letter.
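A small Python sketch of that rule, for anyone who wants to try it outside an editor. It assumes no attribute value contains a literal > character; the names `WRAPPER_TAG` and `unwrap` are invented for this example.

```python
import html
import re

# "<", an optional "/", a capital letter, then everything up to the next ">".
# This covers opening, closing and self-closing wrapper tags such as
# <Tag ...>, </Seg> and <WS type="n"/>.
WRAPPER_TAG = re.compile(r"</?[A-Z][^>]*>")


def unwrap(fragment: str) -> str:
    """Delete the capitalised XML wrapper tags, then decode &lt; &gt; &quot;."""
    without_wrappers = WRAPPER_TAG.sub("", fragment)
    return html.unescape(without_wrappers)


line = (
    '<Tag pos="Begin" i="1" type="Chapter1">&lt;h1&gt;</Tag>'
    '<Seg SegID="2">Heading One</Seg>'
    '<Tag pos="End" type="Chapter1">&lt;/h1&gt;</Tag>'
)
print(unwrap(line))  # <h1>Heading One</h1>
```

The order matters: the wrapper tags have to go first, because once the entities are decoded the HTML tags are indistinguishable from real markup, apart from their case.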


That's not really a pre-defined morphism / transform. XML is not a format, just a syntax for building a family of formats.

Certainly feasible, but in practice I think most people would find XQuery syntax a bit less daunting.

Hello Nige,

The XML is created by Transit NXT, a tool for computer-assisted translation. I now realize that the example I posted must be quite unclear. What I did was: I created an HTML file with SeaMonkey and imported it into Transit. Because I'm a little familiar with HTML and the preview file should be in that format, I used HTML as a starting point. But Transit can import docx, odt, IDML etc. too, so HTML was only an example.

I'm afraid it's a horrible example -- all it's done is wrap your HTML tags in XML tags, making it look like you can simply ignore the XML.

But I'm confused as to why you have XML in the first place? I would have thought that if Transit could import e.g. .docx it could also export the translation as .docx -- and it will be a lot easier to convert that to an HTML preview!

I'm obviously missing something here. Perhaps you could explain what you are trying to do, from beginning to end?

And have you asked other users of Transit NXT whether they have developed analogous preview tools?

(KM itself seems slightly marginal to this task.)

Yes Rob, I have asked, but I got no reply.

Sorry for the confusion, Nige.

The strange xml file is the file format that Transit creates from all file formats that it can import. They separate the text content from binary content (images etc.) and store it in the intermediate xml format. (Once the translation is finished, they merge the xml with the binary content to recreate the original files.)

Inside Transit, these InDesign, FrameMaker, MS Word etc. files are displayed in a kind of WYSIWYG view, via this XML:


The thing is that I often don't receive the imported files or that I cannot open them because I don't have the app that created the imported files.

Another thing is that I don't want to translate these XML files (generated by Transit) in Transit itself but in a Java app (CafeTran) on macOS. The handling of the XML is already perfect in CafeTran, but there is often no preview file (unless the client sends a PDF that I can open in Skim).

So what I am trying to do here is:

  • Unpack the translation package that the client sends to me.
  • Open the target language xml in CafeTran to translate it.
  • Convert the source language xml to html to preview it via the html viewer in CafeTran (which is synced per paragraph/segment).

I ran a series of replacement actions in BBEdit just to test the concept:

Etc. etc.

It seems to work for this simple file in html format that I have imported in Transit as a first test. It is very likely that more complex file formats, like docx, create more complex xml files in Transit format. That would be the next test.

EDIT: I have created an Ms Word document with the same structure and content and imported that in Transit too. The Transit xml file for this docx is here:

Test Ms Word.ENG.xml.zip (3.4 KB)

A lot of extra font info, but the structure looks similar to the xml created from the html.

To respond to Rob's remark about Keyboard Maestro only being slightly relevant here: that is correct. However, I can try to do the replacements in Keyboard Maestro too, at a later stage.

I'll attach to this post the BBEdit'ed version of the html I created. I added a header manually and didn't care (at this moment) for the image and link.

Test.xml.html.zip (1.0 KB)

I have decided to create a BBEdit textfactory:

Create Transit preview file.textfactory.zip (1.5 KB)

This is tested with xml files created by importing docx files in Transit. It is a first draft.

Nige and Rob: thank you for your input!

What you actually have there is an entity-encoded XML representation of your Word doc, wrapped up in Transit's XML. For example:

<Tag pos="Begin" i="1" type="ListItem">&lt;li&gt;</Tag>

...where the XML <Tag...> and </Tag> enclose &lt;li&gt;, which is the encoded entity <li> -- originally a list item in the Word doc.
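To see that wrapping in action, here is a minimal Python sketch using the standard-library parser, which decodes the entities while reading. The sample is a trimmed, hand-made fragment in the shape of the file above (Data attributes left out), and `body_to_html` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-made fragment modelled on the posted file; not the real export.
sample = """<Transit version="4.0">
  <Body>
    <Tag pos="Begin" i="1" type="ListBullet">&lt;ul&gt;</Tag>
    <Tag pos="Begin" i="1" type="ListItem">&lt;li&gt;</Tag>
    <Seg SegID="7">Bullet list</Seg>
    <Tag pos="End" type="ListItem">&lt;/li&gt;</Tag>
    <Tag pos="End" type="ListBullet">&lt;/ul&gt;</Tag>
  </Body>
</Transit>"""


def body_to_html(xml_text: str) -> str:
    """Concatenate all text inside <Body>, in document order."""
    root = ET.fromstring(xml_text)
    body = root.find("Body")
    # itertext() walks every descendant; the parser has already turned
    # &lt;li&gt; into <li>, so no separate unescape step is needed.
    return "".join(body.itertext())


preview = body_to_html(sample)
print(preview)
```

The output keeps the indentation whitespace between elements, which should be harmless in an HTML preview. For a real file, `ET.parse(path)` reads the bytes directly and should honour the encoding declared in the XML header.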

This may not be true for other file formats, so anything you do here may only work for Office docs...

Without the appropriate XSLTs it is going to be tricky to get all possible data out of these files. You're going to lose styles (unless you put in a huge amount of work translating the definitions) and, using the search-and-replace steps you list above, you are going to lose element identifiers. If that isn't an issue then it's probably the quickest and easiest way to get to where you want.

On a side note: I just read this:

By default, BBEdit will remember the 16 most recent search and replace strings/patterns for the popup menu in the Find and Multi-File Search windows. The size of this history is adjustable:

defaults write com.barebones.bbedit FindAndReplaceHistorySize -int [some number]

A minimum history size of 2 is enforced. Setting the history to zero makes it infinite.


Don't set the history to infinite; if you do you'll eventually regret it...

AppleScripted find/replace is faster in BBEdit than text factories, and I think it's much easier to maintain.

tell application "BBEdit"
   tell front text window's text
      replace "MATCH" using "REPLACE" options {search mode:grep, case sensitive:false, starting at top:true}
      replace "MATCH" using "REPLACE" options {search mode:grep, case sensitive:false, starting at top:true}
   end tell
end tell

Thank you, Chris! I agree that this is easier to maintain ... and also available in the free mode.

What would be the syntax for a literal replacement?

Edit: guess what: search mode:literal
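For comparison outside BBEdit, the same idea (an ordered list of replacements, some grep-style and some literal) could be sketched in Python roughly as follows. The rules are placeholders picked for illustration, not the contents of the text factory, and `apply_rules` is a made-up name.

```python
import re

# Each rule is (pattern, replacement, is_literal). Illustrative rules only.
RULES = [
    (r'\s+Data="[^"]*"', "", False),  # grep-style: drop Data attributes
    ("&lt;", "<", True),              # literal: decode the opening bracket
    ("&gt;", ">", True),              # literal: decode the closing bracket
]


def apply_rules(text: str, rules=RULES) -> str:
    """Apply the replacements in order, like a sequence of replace commands."""
    for pattern, replacement, is_literal in rules:
        if is_literal:
            # Plain substring replacement; no regex metacharacters involved.
            text = text.replace(pattern, replacement)
        else:
            # Case-insensitive, mirroring "case sensitive:false" above.
            text = re.sub(pattern, replacement, text, flags=re.IGNORECASE)
    return text


print(apply_rules('<Seg SegID="1" Data="x">a</Seg>&lt;br&gt;'))
# <Seg SegID="1">a</Seg><br>
```

Keeping the rules in one list makes the order explicit, which matters whenever a later replacement depends on an earlier one.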
