How do I turn poorly formatted HTML into an XML Object that I can use with an XPath in Power Automate?

I have a Power Automate flow that ingests emails and writes information to SharePoint. When I convert the email body, which is HTML, into XML and then use XPath to search for information, Power Automate fails with an error saying that the XML is malformed.

Error:
Action ‘Initialize_email_header’ failed: Unable to process template language expressions in action ‘Initialize_email_header’ inputs at line ‘0’ and column ‘0’: ‘The template language function ‘xml’ parameter is not valid. The provided value cannot be converted to XML: ‘The ‘meta’ start tag on line 2 position 2 does not match the end tag of ‘head’. Line 2, position 70.’. Please see https://aka.ms/logicexpressions#xml for usage details.’.

Function I am using:
xpath(xml(triggerOutputs()?['body/body']), '//h2[contains(@id, "client_name")]')

Here is the de-identified beginning of the HTML that it is choking on.

<html>
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
            <title>Receipt 87612836123 from Redacted</title>
        </head>
        <body>



            <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
            <html>
                <head>

                    <meta http-equiv="X-UA-Compatible" content="IE=edge">
                        <meta name="viewport" content="width=device-width, initial-scale=1">
                            <title>Transaction #87612836123</title>
                        </head>
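
To illustrate what I think is happening, here is a minimal Python sketch run outside Power Automate. It assumes that a strict XML parser (Python's `xml.etree.ElementTree` here) rejects the markup for the same reason the `xml()` function does; the fragment is a trimmed version of the `<head>` above, and `is_well_formed` is just a helper name I made up.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the email's <head>: the <meta> tag is never closed,
# which is legal HTML but not well-formed XML.
html_head = (
    '<head>'
    '<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">'
    '<title>Receipt</title>'
    '</head>'
)


def is_well_formed(markup):
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Self-closing the <meta> element turns the same fragment into valid XML.
xml_head = html_head.replace('us-ascii">', 'us-ascii"/>')

print(is_well_formed(html_head))  # False: </head> arrives while <meta> is still open
print(is_well_formed(xml_head))   # True
```

So the parser is not objecting to the email as a whole, only to HTML tags that are never closed.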

I have tried pulling out the body as a variable in a Compose action and operating on that. As expected, it made no difference.

I have tried converting the body to an HTML file and then ingesting that, but Power Automate says that it is a JSON object with multiple values and cannot turn it into XML.

I have also tried to figure out how to remove the <meta> tags from the body, but haven't figured out how to delete entire lines. I thought maybe I could use a regex for that ('<Meta.*'), but I can't find any documentation of Power Automate supporting regex, only Power Apps.
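
For reference, this is the kind of cleanup I have in mind, sketched in Python rather than Power Automate. The `body` string is a hypothetical, shortened stand-in for the real email (the `h2` id and text are invented), and the function names are my own. It strips `<meta ...>` tags and the stray DOCTYPE, then looks for `h2` elements whose `id` contains `client_name`.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for the email body; the real one is longer.
body = '''<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<title>Receipt</title>
</head>
<body>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<h2 id="client_name_1">Example Client</h2>
</body>
</html>'''


def strip_unclosed_tags(html):
    """Remove <meta ...> tags and DOCTYPE declarations, case-insensitively."""
    html = re.sub(r'<meta\b[^>]*>', '', html, flags=re.IGNORECASE)
    html = re.sub(r'<!DOCTYPE[^>]*>', '', html, flags=re.IGNORECASE)
    return html


def find_client_names(html):
    """Return the text of every <h2> whose id contains 'client_name'."""
    root = ET.fromstring(strip_unclosed_tags(html))
    return [h2.text for h2 in root.iter('h2')
            if 'client_name' in h2.get('id', '')]


print(find_client_names(body))  # ['Example Client']
```

This only handles the two constructs visible in the excerpt above. Other unclosed HTML tags such as `<br>` or `<img>` would presumably trip the parser in the same way, so I am not sure tag-by-tag removal scales.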

I’m open to other ways to find values in an HTML email body whose content is dynamic.
