Introduction
I hope you have read my latest web scraping post; in this article we will discuss how to extract links from a web page using HTML Agility Pack. Digital marketing professionals, SEO specialists, researchers, and analysts can all use this technique to collect backlinks or other useful links for a specific purpose. Alongside text extraction, data scraping, favicon extraction, image capture, data mining, and meta-information retrieval, link extraction is another valuable activity for online advertising professionals.
Step #1
Declare a method that accepts the page address, for example ExtractHref(string URL).

Step #2
Create an HtmlWeb object; it is responsible for downloading the page.

Step #3
Declare an HtmlAgilityPack.HtmlDocument to hold the parsed document.

Step #4
Load the target page into the document with web.Load(URL).

Step #5
Select every anchor that carries an href attribute with doc.DocumentNode.SelectNodes("//a[@href]") and loop over the results with foreach. Note that SelectNodes returns null when the page contains no matching nodes, so guard against that before looping.

Step #6
Inside the loop, read each link's HtmlAttribute through link.Attributes["href"] and print its value.

Step #7
Don't forget to pass the URL of the targeted web page when calling the method, for example ExtractHref("http://www.technologycrowds.com").

The complete example:

using System;
using HtmlAgilityPack;

public class Program
{
    public static void Main()
    {
        // calling method
        ExtractHref("https://technologycrowds.com");
    }

    static void ExtractHref(string URL)
    {
        // declaring & loading the DOM (HtmlWeb.Load returns the parsed document)
        HtmlWeb web = new HtmlWeb();
        HtmlDocument doc = web.Load(URL);

        // extracting all links; SelectNodes returns null if no anchor has an href
        HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");
        if (links == null)
            return;

        foreach (HtmlNode link in links)
        {
            HtmlAttribute att = link.Attributes["href"];
            // showing output
            Console.WriteLine(att.Value);
        }
    }
}
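If you prefer a LINQ style, the same extraction can be sketched as below. This is a minimal sketch, assuming the HtmlAgilityPack NuGet package is installed; it parses an in-memory HTML snippet (the sample markup is purely illustrative) instead of downloading a page, so it runs without network access:

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

public class LinqExample
{
    public static void Main()
    {
        // Parse a hard-coded HTML snippet instead of calling HtmlWeb.Load,
        // so the sketch works offline.
        var doc = new HtmlDocument();
        doc.LoadHtml("<html><body>" +
                     "<a href='https://example.com'>one</a>" +
                     "<a href='/about'>two</a>" +
                     "<a>no href</a>" +
                     "</body></html>");

        // Descendants("a") enumerates every <a> element;
        // GetAttributeValue returns the fallback ("") when href is absent.
        var links = doc.DocumentNode
            .Descendants("a")
            .Select(a => a.GetAttributeValue("href", ""))
            .Where(href => href.Length > 0)
            .ToList();

        foreach (var href in links)
            Console.WriteLine(href);
    }
}
```

Compared with SelectNodes, Descendants never returns null, so no null guard is needed; the Where clause filters out anchors without an href instead.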