Learn HAP: HTML Manipulation using html agility pack

Introduction

The previous sessions covered topics such as selecting nodes with Html Agility Pack. This one covers another important topic: HTML manipulation using Html Agility Pack. Dynamic HTML requirements often make it part of the job to change HTML content to suit the audience and the needs of clients. Four important properties of an HTML node let us read, and in some cases change, its content on the fly. Each of them is described below with an example.

Inner HTML

InnerHtml is a public property, which means it can be accessed from anywhere. Using it, you can either get or set the HTML content that sits between the opening and closing tags of the selected HTML node. When you read it, the content comes back as a string. InnerHtml is a member of HtmlAgilityPack.HtmlNode.
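using System;
using HtmlAgilityPack;

var html =
@"<body>
<h1>.Net Core</h1>
<p>This is <b>C#, ASP.Net</b> paragraph</p>
<h1>.Net Core with Angular</h1>
<p>This is <b>HTML Agility Pack</b> sample</p>
</body>";

var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(html);

// Select every <p> that is a direct child of <body>
var htmlNodes = htmlDoc.DocumentNode.SelectNodes("//body/p");

foreach (var node in htmlNodes)
{
    // InnerHtml keeps the nested <b> markup but not the <p> tags themselves
    Console.WriteLine(node.InnerHtml);
}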

Output

This is <b>C#, ASP.Net</b> paragraph
This is <b>HTML Agility Pack</b> sample
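The section above mentions that InnerHtml can also be set, which is where the actual manipulation happens. The following is a minimal sketch of that, assuming a project that references the HtmlAgilityPack package and a compiler that supports top-level statements (C# 9 or later). The replacement markup is made up for illustration.

```csharp
using System;
using HtmlAgilityPack;

var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml("<body><p>This is <b>C#, ASP.Net</b> paragraph</p></body>");

// SelectSingleNode returns null when nothing matches, so guard before use.
var paragraph = htmlDoc.DocumentNode.SelectSingleNode("//body/p");
if (paragraph != null)
{
    // Assigning InnerHtml parses the string and replaces the node's children.
    paragraph.InnerHtml = "Updated with <i>Html Agility Pack</i>";

    // The <p> element itself stays in place; only its content changes.
    Console.WriteLine(paragraph.OuterHtml);
}
```

The null check matters in practice: SelectSingleNode and SelectNodes both return null rather than an empty result when the XPath expression matches nothing.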

Inner Text

InnerText is also a public property and returns a string. It is the right choice if all you want is the text between the opening and closing tags of the selected HTML node, with the tags of any nested elements left out. That makes it an easy way to read the text of an element dynamically. InnerText is also a member of HtmlAgilityPack.HtmlNode.
using System;
using HtmlAgilityPack;

var html =
@"<body>
<h1>.Net Core</h1>
<p>This is <b>C#, ASP.Net</b> paragraph</p>
<h1>.Net Core with Angular</h1>
<p>This is <b>HTML Agility Pack</b> sample</p>
</body>";

var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(html);

// Select every <p> that is a direct child of <body>
var htmlNodes = htmlDoc.DocumentNode.SelectNodes("//body/p");

foreach (var node in htmlNodes)
{
    // InnerText drops the <b> tags and keeps only the text
    Console.WriteLine(node.InnerText);
}

Output

This is C#, ASP.Net paragraph
This is HTML Agility Pack sample

Outer HTML

OuterHtml lets you get the selected node itself together with the contents inside it. It resembles InnerHtml, but there is an important difference: the string returned by OuterHtml also includes the node's own opening and closing tags. It is a public property that returns a string, and it is a member of HtmlAgilityPack.HtmlNode.
using System;
using HtmlAgilityPack;

var html =
@"<body>
<h1>.Net Core</h1>
<p>This is <b>C#, ASP.Net</b> paragraph</p>
<h1>.Net Core with Angular</h1>
<p>This is <b>HTML Agility Pack</b> sample</p>
</body>";

var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(html);

// Select every <p> that is a direct child of <body>
var htmlNodes = htmlDoc.DocumentNode.SelectNodes("//body/p");

foreach (var node in htmlNodes)
{
    // OuterHtml includes the <p> tags as well as the content
    Console.WriteLine(node.OuterHtml);
}

Output

<p>This is <b>C#, ASP.Net</b> paragraph</p>
<p>This is <b>HTML Agility Pack</b> sample</p>
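To make the difference between the three string properties easier to see, here is a small side-by-side sketch that reads all of them from the same node. It assumes the HtmlAgilityPack package is referenced and that top-level statements are available (C# 9 or later). The one-paragraph input is a shortened version of the sample used above.

```csharp
using System;
using HtmlAgilityPack;

var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml("<body><p>This is <b>HTML Agility Pack</b> sample</p></body>");

var node = htmlDoc.DocumentNode.SelectSingleNode("//body/p");
if (node != null)
{
    // Content between <p> and </p>, nested markup included
    Console.WriteLine(node.InnerHtml); // This is <b>HTML Agility Pack</b> sample

    // Text only, nested tags removed
    Console.WriteLine(node.InnerText); // This is HTML Agility Pack sample

    // The <p> element itself plus its content
    Console.WriteLine(node.OuterHtml); // <p>This is <b>HTML Agility Pack</b> sample</p>
}
```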

Parent Node

Another useful feature is ParentNode, which gives us the parent of the selected HTML node. It is sometimes necessary to know the parent of a node, and this property serves exactly that purpose. Its type is HtmlNode, and like the others it is a member of HtmlAgilityPack.HtmlNode.
using System;
using HtmlAgilityPack;

var html =
@"<body>
<h1>.Net Core</h1>
<p>This is <b>C#, ASP.Net</b> paragraph</p>
<h1>.Net Core with Angular</h1>
<p>This is <b>HTML Agility Pack</b> sample</p>
</body>";

var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(html);

// SelectSingleNode returns the first <h1> directly under <body>
var node = htmlDoc.DocumentNode.SelectSingleNode("//body/h1");

// The parent of that <h1> is the <body> element
HtmlNode parentNode = node.ParentNode;
Console.WriteLine(parentNode.Name);

Output

body
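ParentNode can also be followed repeatedly to climb from a node up to the top of the document. The loop below is a minimal sketch of that idea, assuming the HtmlAgilityPack package is referenced and top-level statements are available (C# 9 or later). It relies on ParentNode being null once the top of the tree is reached.

```csharp
using System;
using HtmlAgilityPack;

var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml("<body><p>This is <b>HTML Agility Pack</b> sample</p></body>");

// Start at the innermost element and follow ParentNode upwards.
var start = htmlDoc.DocumentNode.SelectSingleNode("//b");

for (var current = start; current != null; current = current.ParentNode)
{
    // Prints the name of each node on the way up, starting with "b"
    Console.WriteLine(current.Name);
}
```

Because the loop condition checks for null, the same code is safe when the XPath expression matches nothing: the loop body simply never runs.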

Anjan Kant

Outstanding journey in Microsoft Technologies (ASP.Net, C#, SQL Programming, WPF, Silverlight, WCF etc.), client side technologies AngularJS, KnockoutJS, Javascript, Ajax Calls, Json and Hybrid apps etc. I love to devote free time in writing, blogging, social networking and adventurous life
