Web scraping using C#

Jan 02, 2017

HtmlAgilityPack to parse HTML in .NET

Introduction

Screen scraping, also known as data scraping or data extraction, is a technique for collecting different kinds of data from a web page, such as meta tag information, titles, images, links, contact information (phone and email), and other useful data like weather forecasts.

To put web scraping into practice in .NET, there is a very useful library called HtmlAgilityPack. It provides the essential methods for navigating, modifying, and searching the DOM (Document Object Model) tree. HtmlAgilityPack is very tolerant: it parses whatever you give it, even malformed HTML with missing closing tags. It supports XPath and XSLT for navigating the web page.
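As a rough illustration of the XPath navigation described above, here is a minimal sketch that pulls link targets out of a deliberately malformed HTML string. It assumes the HtmlAgilityPack NuGet package is installed; the `LinkExtractor` class, the `ExtractLinks` method, and the sample markup are made up for this example and are not part of the library.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumes the HtmlAgilityPack NuGet package is referenced

public static class LinkExtractor
{
    // Hypothetical helper: returns the href value of every <a> element
    // that has an href attribute in the given HTML string.
    public static List<string> ExtractLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parses the string into a DOM tree, even if tags are unclosed

        var links = new List<string>();

        // SelectNodes takes an XPath expression and returns null
        // (not an empty collection) when nothing matches.
        var anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return links;
        }

        foreach (var anchor in anchors)
        {
            links.Add(anchor.GetAttributeValue("href", string.Empty));
        }
        return links;
    }

    public static void Main()
    {
        // Sample markup with unclosed <p> and <b> tags.
        const string html =
            "<html><body><p>Hello <b>world" +
            "<a href='/one'>One</a><a href='/two'>Two</a></body></html>";

        foreach (var href in ExtractLinks(html))
        {
            Console.WriteLine(href);
        }
    }
}
```

The null check matters in practice: a page without any matching element would otherwise throw a `NullReferenceException` in the `foreach` loop.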

Sep 19, 2016

Custom Scripting in Content Grabber

While Content Grabber is very easy-to-use web scraping software, don't make the mistake of thinking it is not also very flexible and powerful. Part of this flexibility comes from giving developers a sophisticated scripting capability for controlling a user's web scraping agent and managing the data being extracted.

Content Grabber supports scripting in several ways, letting you customize its behavior to your specific needs or extend and enhance its standard functionality. Content Grabber scripts are either .NET functions written in C# or VB.NET, or regular expressions.
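To give a feel for the kind of logic such a script might contain, the sketch below uses a regular expression inside a plain C# function to clean up a scraped price string. It deliberately does not use Content Grabber's own script signatures or argument types, which are not shown here; `PriceCleaner`, `ExtractPrice`, and the sample input are hypothetical names chosen for illustration.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class PriceCleaner
{
    // Hypothetical helper, not part of Content Grabber's API:
    // finds the first number in the scraped text (digits with optional
    // thousands separators and decimals) and returns it as a decimal,
    // or null when no number is present.
    public static decimal? ExtractPrice(string scraped)
    {
        var match = Regex.Match(scraped ?? string.Empty, @"\d[\d,]*(\.\d+)?");
        if (!match.Success)
        {
            return null;
        }

        // Strip thousands separators before parsing.
        var digits = match.Value.Replace(",", string.Empty);
        return decimal.Parse(digits, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        Console.WriteLine(ExtractPrice("Price: $1,299.00 (incl. tax)"));
    }
}
```

Inside a real agent, the same regular expression and parsing steps would sit in whatever function shape the tool expects for that script type.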