Parsing Web Pages
Posted by thewriter in Web Design on October 17th, 2009
Parsing a web page is also known as screen scraping. The task is to identify certain portions of a web page and extract the data to fit your needs. For example, if you visit a website that shows today’s stock totals, you may want to pull out just the totals so that you can export them into an Excel spreadsheet.
If you are a programmer or familiar with programming languages, you could of course hack through and write your own algorithm to parse the HTML tags. Unfortunately, there is no standard way to display information on any given web page. Some webmasters, for example, place row/column information in an HTML tag called table. Others, including most newer web designers, now display this type of information in HTML tags known as div. But within the tables or divs there could be other tags. So if you really want to parse web pages yourself, you should first have a grasp of the HTML language.
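To make the idea concrete, here is a minimal sketch in Python of what a hand-rolled table parser might look like. It uses only the standard library (html.parser and csv). The sample page, the class name TableTextParser, and the helper functions are made up for illustration; a real page will almost certainly be messier, and a div-based layout would need different tag checks.

```python
import csv
import io
from html.parser import HTMLParser


class TableTextParser(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text inside nested tags (for example <b> or <span>) still arrives here.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # drop any cell left open by sloppy markup


def extract_table_rows(html):
    """Return the table cells found in html as a list of rows."""
    parser = TableTextParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def rows_to_csv(rows):
    """Render rows as CSV text, which a spreadsheet such as Excel can open."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


# Hypothetical page fragment: stock totals with extra tags inside the cells.
sample_html = """
<table>
  <tr><th>Symbol</th><th>Total</th></tr>
  <tr><td><b>ABC</b></td><td>12.50</td></tr>
  <tr><td>XYZ</td><td><span>7.25</span></td></tr>
</table>
"""

rows = extract_table_rows(sample_html)
csv_text = rows_to_csv(rows)
print(csv_text)
```

The parser ignores which tags sit inside a cell and keeps only the text, which is one simple way to cope with the nested tags mentioned above. It does not handle tables nested inside other tables.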


