C# web scraping

Jan 02, 2017

HtmlAgilityPack to parse HTML in .NET

Introduction

Screen scraping, also known as data scraping or data extraction, is a technique for collecting different kinds of data from a web page, such as meta tag information, titles, images, links, contact information (phone and email), and other useful data like weather forecasts.

To put web scraping into action in .NET, we have a very useful library known as HtmlAgilityPack. It provides the essential methods for navigating, modifying, and searching the DOM (Document Object Model) tree. HtmlAgilityPack is very tolerant: it parses whatever you give it, even malformed HTML with missing closing tags. It supports XPath and XSLT for navigating the web page.
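To give a feel for what this looks like in practice, here is a minimal sketch, assuming the HtmlAgilityPack NuGet package is referenced. The class name `ScrapeSketch`, its two helper methods, and the sample HTML string are made up for illustration; only `HtmlDocument`, `LoadHtml`, `SelectNodes`, `SelectSingleNode`, `GetAttributeValue`, and `InnerText` come from the library. The sample markup deliberately leaves `<p>` and `<li>` unclosed to show the tolerant parsing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper class, for illustration only.
public static class ScrapeSketch
{
    // Returns the href value of every <a> element that has one.
    public static List<string> ExtractLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return new List<string>();
        }

        return anchors.Select(a => a.GetAttributeValue("href", "")).ToList();
    }

    // Returns the trimmed text of the <title> element, or null if there is none.
    public static string ExtractTitle(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var title = doc.DocumentNode.SelectSingleNode("//title");
        return title?.InnerText.Trim();
    }

    public static void Main()
    {
        // Made-up sample markup: the <p> and both <li> tags are never closed.
        const string html =
            "<html><head><title>Demo page</title></head><body>" +
            "<p>Intro<ul>" +
            "<li><a href=\"/a\">A</a>" +
            "<li><a href=\"/b\">B</a>" +
            "</ul></body></html>";

        Console.WriteLine(ExtractTitle(html));
        foreach (var link in ExtractLinks(html))
        {
            Console.WriteLine(link);
        }
    }
}
```

To scrape a live page instead of a string, the library's `HtmlWeb` class can fetch and parse in one step (`new HtmlWeb().Load(url)`); the XPath queries stay the same.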