Learn HAP: Extract Meta-Information from the website using HTML agility pack

Introduction

So far, we have discussed what HTML Agility Pack is and how to install it in HAP: Learn Install HTML agility pack and Load an HTML Document. After that, you put the library to work by loading an HTML document and extracting all the href values present in a web page in Learn HAP: Extract all Href value from HTML Document using HTML agility pack. This tutorial goes one step further: it shows how to extract meta information from a website using HTML Agility Pack, which is also a first step toward scraping websites with it.

Get the Meta Info using HAP

Namespace

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using HtmlAgilityPack;

Load HTML document using HTML Agility Pack

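You have already learnt how to load an HTML document using HTML Agility Pack. Here the page is loaded from its URL and the resulting document is passed to GetMetaInformation, together with the name of the meta tag to read.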

// Download and parse the page in one step; HtmlWeb.Load returns the parsed document.
HtmlWeb web = new HtmlWeb();
HtmlAgilityPack.HtmlDocument doc = web.Load("http://technologyCrowds.com");
// Print the content of <meta name="description">.
GetMetaInformation(doc, "description");

Creating Methods to Extract Meta Information

static void GetMetaInformation(HtmlAgilityPack.HtmlDocument htmldoc, string value)
{
 // Select the first <meta> element whose name attribute equals the requested value.
 HtmlNode tcNode = htmldoc.DocumentNode.SelectSingleNode("//meta[@name='" + value + "']");
 if (tcNode != null)
 {
  // The meta text lives in the content attribute, which may be missing.
  HtmlAttribute desc = tcNode.Attributes["content"];
  if (desc != null)
  {
   Console.ForegroundColor = ConsoleColor.Red;
   Console.Write(desc.Value);
  }
  // Keep the console window open until Enter is pressed.
  Console.ReadLine();
 }
}
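
Printing inside the method makes the result hard to reuse elsewhere. As a minimal sketch, assuming you would rather get the value back as a string, the hypothetical helper GetMetaContent below (it is not part of HTML Agility Pack or of the method above) returns the content attribute, or null when the tag or the attribute is missing. The sample markup is made up for illustration and is parsed with LoadHtml, so the sketch can be tried without a network call.

```csharp
using System;
using HtmlAgilityPack;

class MetaContentDemo
{
    // Hypothetical helper: returns the content of <meta name="..."> or null if absent.
    static string GetMetaContent(HtmlDocument doc, string name)
    {
        HtmlNode node = doc.DocumentNode.SelectSingleNode("//meta[@name='" + name + "']");
        // Both the node and its content attribute can be missing, so guard each step.
        return node?.Attributes["content"]?.Value;
    }

    static void Main()
    {
        // Inline sample markup (an assumption for illustration, not a real page).
        var doc = new HtmlDocument();
        doc.LoadHtml("<html><head><meta name='description' content='Sample page'></head><body></body></html>");

        Console.WriteLine(GetMetaContent(doc, "description") ?? "(not found)");
        Console.WriteLine(GetMetaContent(doc, "keywords") ?? "(not found)");
    }
}
```

Returning null instead of printing leaves it to the caller to decide what to do when a page has no such tag.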

The steps above show how meta information is extracted from the website: load the page, select the meta tag by its name, and read its content attribute.
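
The same idea extends from one named tag to every meta tag on a page. The sketch below is an illustration under stated assumptions rather than a definitive implementation: the inline HTML is invented sample markup, and the fallback to the property attribute assumes the page may use Open Graph style tags such as og:title. It relies on SelectNodes, which returns null rather than an empty collection when nothing matches.

```csharp
using System;
using HtmlAgilityPack;

class ListMetaDemo
{
    static void Main()
    {
        // Inline sample markup (an assumption for illustration, not a real page).
        var doc = new HtmlDocument();
        doc.LoadHtml(
            "<html><head>" +
            "<meta name='description' content='Sample page'>" +
            "<meta name='keywords' content='html, agility, pack'>" +
            "<meta property='og:title' content='Sample title'>" +
            "</head><body></body></html>");

        // Select every <meta> element that carries a content attribute.
        HtmlNodeCollection metaNodes = doc.DocumentNode.SelectNodes("//meta[@content]");
        if (metaNodes == null)
        {
            Console.WriteLine("No meta tags with a content attribute were found.");
            return;
        }

        foreach (HtmlNode meta in metaNodes)
        {
            // Standard tags are keyed by name; Open Graph style tags use property.
            string key = meta.GetAttributeValue("name", "");
            if (key.Length == 0)
            {
                key = meta.GetAttributeValue("property", "(unnamed)");
            }
            Console.WriteLine(key + ": " + meta.GetAttributeValue("content", ""));
        }
    }
}
```

To run it against a live page, replace the LoadHtml call with a document returned by HtmlWeb.Load.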

Conclusion

With the example provided here, you are now in a position to scrape a website using HTML Agility Pack. You can come back to this tutorial whenever you need to extract meta information from a website.

Anjan kant

Outstanding journey in Microsoft technologies (ASP.NET, C#, SQL programming, WPF, Silverlight, WCF, etc.) and client-side technologies such as AngularJS, KnockoutJS, JavaScript, Ajax calls, JSON and hybrid apps. I love to devote my free time to writing, blogging, social networking and an adventurous life.
