HAP: Extract Links From Web Page using HTML Agility Pack


Introduction

I hope you have gone through my latest web scraping post. In this post we will discuss in detail how to extract links from a web page using HTML Agility Pack. Digital marketing professionals, SEO professionals, researchers, analysts, and many others can use this technique to collect backlinks or other useful links for a specific purpose. Like text extraction, data scraping, favicon extraction, image capture, data mining, and meta information extraction, link extraction is a valuable activity for online advertising professionals.

Step #1

Declare the function ExtractLinksFromWebPageusingHTMLAgilityPack(string URL), which takes the URL of the target web page.

Step #2

Then create an HtmlWeb object (web) with new HtmlWeb(). HtmlWeb is the HTML Agility Pack class that fetches a page over HTTP.

Step #3

Declare the HTML document (doc) as an HtmlAgilityPack.HtmlDocument, for example with new HtmlAgilityPack.HtmlDocument().

Step #4

Fill doc by assigning it the result of web.Load(URL), which downloads the page and returns it as a parsed document.

Step #5

Now extract all the links available in the web page by looping over every anchor node that has an href attribute: foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a[@href]")).

Step #6

Inside the loop, read the href of each link as an HtmlAttribute through link.Attributes["href"]. Its Value property holds the link target.
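
Steps #5 and #6 can be tried without any network access. The following is only a minimal sketch, assuming the HtmlAgilityPack NuGet package is installed. The class name LinkXPathDemo, the helper HrefsFromHtml, and the sample markup are hypothetical names chosen for illustration. It parses an HTML string with LoadHtml and applies the same //a[@href] XPath, and it also guards against the null that SelectNodes returns when nothing matches.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class LinkXPathDemo
{
    // Hypothetical helper: returns the href value of every <a> element
    // that carries an href attribute in the given HTML string.
    public static List<string> HrefsFromHtml(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parse markup from a string, no download involved

        var hrefs = new List<string>();

        // Step #5: select only the anchors that have an href attribute.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//a[@href]");
        if (nodes == null)
        {
            return hrefs; // SelectNodes returns null when no node matches
        }

        foreach (HtmlNode link in nodes)
        {
            // Step #6: read the href attribute of the current anchor.
            HtmlAttribute att = link.Attributes["href"];
            hrefs.Add(att.Value);
        }
        return hrefs;
    }

    public static void Main()
    {
        // Sample markup (an assumption): one real link, one anchor without href.
        string html = "<a href='/about'>About</a><a name='top'>No link here</a>";
        foreach (string href in HrefsFromHtml(html))
        {
            Console.WriteLine(href);
        }
    }
}
```

The second anchor in the sample markup has no href, so the XPath filter skips it and only the first anchor is returned.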

Step #7

Finally, call the function with the URL of the web page you want to target, for example ExtractLinksFromWebPageusingHTMLAgilityPack("http://www.technologycrowds.com").

I hope the steps above help you extract links from a web page using HTML Agility Pack.

Working Sample :
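
The seven steps above can be put together roughly as follows. This is a sketch under stated assumptions rather than a definitive implementation: it assumes the HtmlAgilityPack NuGet package is installed and that the machine can reach the target site. The wrapper class LinkExtractor and the List<string> return type are choices made for illustration only.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class LinkExtractor
{
    // Step #1: the function takes the URL of the target page.
    // Returning a List<string> is an assumption made for this sketch.
    public static List<string> ExtractLinksFromWebPageusingHTMLAgilityPack(string URL)
    {
        var links = new List<string>();

        // Step #2: create the object that downloads the page.
        HtmlWeb web = new HtmlWeb();

        // Steps #3 and #4: web.Load returns the parsed HtmlDocument.
        HtmlDocument doc = web.Load(URL);

        // Step #5: select every anchor that has an href attribute.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//a[@href]");
        if (nodes == null)
        {
            return links; // no matching anchors on the page
        }

        foreach (HtmlNode link in nodes)
        {
            // Step #6: read the href attribute and keep its value.
            HtmlAttribute att = link.Attributes["href"];
            links.Add(att.Value);
        }
        return links;
    }

    public static void Main()
    {
        // Step #7: call the function with the URL of the target page.
        List<string> links =
            ExtractLinksFromWebPageusingHTMLAgilityPack("http://www.technologycrowds.com");

        foreach (string href in links)
        {
            Console.WriteLine(href);
        }
    }
}
```

The values are returned exactly as they appear in the page, so relative links such as /about are not resolved against the page URL in this sketch.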


Anjan Kant

An outstanding journey in Microsoft technologies (ASP.NET, C#, SQL programming, WPF, Silverlight, WCF, etc.) and client-side technologies such as AngularJS, KnockoutJS, JavaScript, Ajax calls, JSON, and hybrid apps. I love to devote my free time to writing, blogging, social networking, and an adventurous life.
