Parsing Web Pages

Posted by thewriter | Posted in Web Design | Posted on 17-10-2009

Parsing a web page is also known as screen scraping. The task is to identify certain portions of a web page and extract the data to fit your needs. For example, if you visit a website that shows today’s stock totals, you may want just the totals so that you can export them into an Excel spreadsheet.

If you are a programmer or familiar with programming languages, you could of course hack through and create your own algorithm to parse the HTML tags. Unfortunately, there is no standard way to display information on any given web page. Some webmasters, for example, place row/column information in an HTML tag called table. Others, including most newer web designers, display this type of information in HTML tags known as div. Within the tables or divs, however, there can be other nested tags. So if you really want to parse web pages yourself, you should first have a grasp of HTML.
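As a rough illustration of the table case, here is a minimal sketch in Python that uses the standard library's html.parser module. The sample markup, the class name CellTextParser, and the helper functions extract_rows and rows_to_csv are all made up for this example. It assumes every row and cell tag is properly closed, which real pages do not always guarantee.

```python
import csv
import io
from html.parser import HTMLParser


class CellTextParser(HTMLParser):
    """Collect the text of each table row, ignoring tags nested inside cells."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text pieces of the cell being read, or None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text inside nested tags such as <b> or <span> still arrives here.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


# Hypothetical page fragment (an assumption for this sketch); real pages will differ.
SAMPLE_HTML = """
<table>
  <tr><th>Symbol</th><th>Total</th></tr>
  <tr><td><b>ABC</b></td><td><span>101.50</span></td></tr>
  <tr><td>XYZ</td><td>47.25</td></tr>
</table>
"""


def extract_rows(html):
    """Return the table rows found in html as lists of cell text."""
    parser = CellTextParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def rows_to_csv(rows):
    """Render rows as CSV text, a format Excel can open."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


rows = extract_rows(SAMPLE_HTML)
csv_text = rows_to_csv(rows)
```

The nested b and span tags do not disturb the result, because only the text inside each cell is kept. A div-based layout has no tr or td tags to key on, so a parser for it would most likely have to match on attributes such as class names, and those differ from site to site.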