Web Page Classification for Web Data Mining

Web page classification is the process of assigning a web page to one or more predefined categories, and it is one of the essential techniques of web mining. It identifies the type of web page that data is being extracted from and can help search engines organize and rank web pages by category. Machine learning and data mining techniques are commonly used for web page classification. This project will review, analyze and compare several existing machine learning and data mining methods and select the one or ones best suited to our goals. For companies like our industrial partner SweetIQ, which provide local analytics and insight for large brands and marketing agencies, web page classification techniques can help build a healthy mix of listings on search engines, large directories, niche directories, blogs, wikis and so on. Ultimately, the technique will provide more insight into how their local business listings are distributed across different types of web pages.

Faculty Supervisor: Dr. Morad Benyoucef

Zhengyang (Steve) Lu

Information and communications technologies

University of Ottawa