How to Clean an HTML String in JavaScript?

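To clean an HTML string in JavaScript, you can parse it into a detached document with the DOMParser API, remove the elements you do not want, and read the result back as a string. Parsing by itself does not remove anything: the parser repairs malformed markup, and scripts in the parsed document are not executed, but the <script> elements stay in the tree until you delete them. Here’s an example:

// HTML string to be cleaned
const dirtyHtml = '<p>Some <script>evil()</script> HTML</p>';

// Create a new DOMParser object
const parser = new DOMParser();

// Parse the HTML string into a new, detached document (scripts are not executed)
const doc = parser.parseFromString(dirtyHtml, 'text/html');

// Parsing keeps <script> elements in the tree, so remove them explicitly
doc.querySelectorAll('script').forEach((el) => el.remove());

// Read the cleaned markup back as an HTML string
const cleanHtml = doc.body.innerHTML;

console.log(cleanHtml); // Output: "<p>Some  HTML</p>"

In this example, the DOMParser API parses the dirtyHtml string into a new DOM tree. The script never runs, but the <script> element is still part of that tree, so querySelectorAll finds it and remove() deletes it. Reading doc.body.innerHTML then returns the cleaned string <p>Some  HTML</p>; the two spaces are the text that surrounded the script. XMLSerializer is not used here because serializing the whole document with it produces XML-style output wrapped in <html>, <head>, and <body> elements, with an xmlns attribute on the root, rather than the original fragment. Removing <script> elements covers only one kind of unsafe content: inline event-handler attributes such as onclick and javascript: URLs need separate handling.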
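
Beyond <script> elements, markup can also carry inline event handlers (onclick, onerror) and javascript: URLs. The following is a minimal sketch of how the same parse-and-remove approach might be extended to cover them. The names BLOCKED_TAGS, URL_ATTRIBUTES, isUnsafeAttribute, and cleanHtmlString are hypothetical helpers invented for this illustration, and both lists are deliberately short and incomplete, so treat this as a starting point under those assumptions rather than a complete sanitizer.

```javascript
// Hypothetical helpers for illustration; not part of any library.

// Elements removed entirely in this sketch (an incomplete, illustrative list).
const BLOCKED_TAGS = ['script', 'iframe', 'object', 'embed'];

// Attributes that can carry a URL (also an illustrative list).
const URL_ATTRIBUTES = ['href', 'src', 'action', 'formaction'];

// Decide whether a single attribute should be dropped.
function isUnsafeAttribute(name, value) {
  const lowerName = name.toLowerCase();

  // Inline event handlers such as onclick or onerror.
  if (lowerName.startsWith('on')) {
    return true;
  }

  // javascript: URLs. Whitespace and control characters are stripped first
  // because browsers ignore some of them inside the scheme.
  if (URL_ATTRIBUTES.includes(lowerName)) {
    const compact = value.replace(/[\u0000-\u0020]/g, '').toLowerCase();
    return compact.startsWith('javascript:');
  }

  return false;
}

// Parse the string, remove blocked elements and unsafe attributes,
// and return the markup of the resulting <body>.
// Requires a browser (or another environment that provides DOMParser).
function cleanHtmlString(dirtyHtml) {
  const doc = new DOMParser().parseFromString(dirtyHtml, 'text/html');

  doc.querySelectorAll(BLOCKED_TAGS.join(',')).forEach((el) => el.remove());

  doc.body.querySelectorAll('*').forEach((el) => {
    // Copy the attribute list first because removal mutates el.attributes.
    Array.from(el.attributes).forEach((attr) => {
      if (isUnsafeAttribute(attr.name, attr.value)) {
        el.removeAttribute(attr.name);
      }
    });
  });

  return doc.body.innerHTML;
}
```

In a browser console, cleanHtmlString('<p onclick="evil()">Hi <script>evil()</script></p>') returns "<p>Hi </p>". Keeping the attribute check in a separate function makes the rules easy to test without a DOM. A blocklist like this one is easy to get wrong, so for untrusted input a maintained sanitizer library such as DOMPurify is usually the safer choice.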
