How to Extract All Links from a Website Using JavaScript?


You can extract all links from a website with JavaScript in three steps: request the page's HTML with the XMLHttpRequest object or the fetch() function, parse the response into a document object with DOMParser, then collect every a (anchor) element with getElementsByTagName() and read its href attribute. Here's an example using XMLHttpRequest:

const xhr = new XMLHttpRequest();
xhr.open("GET", "https://www.example.com");
xhr.onload = () => {
  if (xhr.status === 200) {
    // Parse the raw HTML string into a DOM document
    const parser = new DOMParser();
    const htmlDoc = parser.parseFromString(xhr.responseText, "text/html");
    // Collect every anchor element and read its href
    const links = htmlDoc.getElementsByTagName("a");
    const hrefs = Array.from(links).map(link => link.href);
    console.log(hrefs);
  } else {
    console.error("Error loading website:", xhr.statusText);
  }
};
xhr.send();

In the code above, the XMLHttpRequest object makes a GET request to a website (here, "https://www.example.com"). When the request completes, the onload callback runs and reads the page's HTML from the responseText property. DOMParser then parses that HTML into a document object, stored in the htmlDoc variable. The getElementsByTagName() method collects all the a elements in that document, map() extracts each element's href attribute value into the hrefs array, and the array is logged to the console.

Two limitations to keep in mind: in a browser, the request is subject to the same-origin policy, so it will fail unless the target site sends permissive CORS headers; and any links that the target site inserts dynamically with its own JavaScript will not appear in the fetched HTML.
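The same idea can be written with fetch(), the modern alternative mentioned above. One subtlety worth handling: a document produced by DOMParser resolves relative links against the current page's URL, not the fetched site's, so link.href can point at the wrong origin. A sketch of a fetch()-based version that reads the raw href attribute and resolves it explicitly against the fetched URL (the URL and function names here are placeholders, and the same CORS caveat applies):

```javascript
// Placeholder target; assumes the site allows cross-origin reads when run in a browser.
const targetUrl = "https://www.example.com";

// Resolve an href against the fetched page's URL. new URL() with an explicit
// base avoids the DOMParser pitfall described above.
const resolveLink = (href, base) => new URL(href, base).href;

async function extractLinks(url) {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Error loading website: ${response.status}`);
  }
  const html = await response.text();
  const doc = new DOMParser().parseFromString(html, "text/html");
  return Array.from(doc.getElementsByTagName("a"))
    .map(link => link.getAttribute("href")) // raw attribute, not auto-resolved
    .filter(href => href !== null)
    .map(href => resolveLink(href, url));
}

// Usage: extractLinks(targetUrl).then(hrefs => console.log(hrefs));
```

Because getAttribute("href") returns the attribute exactly as written in the markup, relative paths like "/about" come back unresolved, and resolveLink() turns them into absolute URLs on the correct origin.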
