Algorithm for Robust Spam Blog Detection

Web notifications are becoming increasingly common on today’s internet. Unlike a traditional search, which is executed once against the entered keywords, a web notification search runs on a regular or continual basis, tracking changes in the available information in real time. The user is notified as soon as an item of interest appears online. To support this, notification services receive a continuous stream of pings from blogs announcing their updates. Unfortunately, many blogs are created solely to artificially inflate the page ranking of certain websites, to entice users to click on advertising banners, or to aggressively advertise (often dubious) goods and services unrelated to the keywords for which the page is shown. Such blogs are usually referred to as 'spam blogs', or 'splogs'. An essential feature of a useful web notification system is the ability to filter out these unwanted results. The goal of this project is to find a set of heuristics that robustly determine whether a particular blog is spam, and thus enable the web notification service to ignore unwanted updates.
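The abstract does not say which heuristics the project arrived at, so the following Python sketch is purely illustrative. It assumes three hypothetical signals (link density, repetition of a single word, and the presence of advertising phrases) and combines them with equal weights. The names `BlogUpdate`, `spam_score`, `is_splog`, the phrase list, and the 0.5 threshold are all assumptions made for the example, not part of the project.

```python
import re
from dataclasses import dataclass

# Hypothetical phrase list; a real system would tune or learn this.
AD_PHRASES = ("buy now", "click here", "free trial")


@dataclass
class BlogUpdate:
    """One update announced by a ping (fields are assumed for illustration)."""
    url: str
    text: str
    link_count: int


def spam_score(update: BlogUpdate) -> float:
    """Return an illustrative score in [0, 1]; higher means more splog-like."""
    lowered = update.text.lower()
    words = re.findall(r"[a-z']+", lowered)
    if not words:
        # Assumption: an update with no readable text is treated as spam.
        return 1.0
    # Signal 1: links per word, capped at 1.
    link_density = min(1.0, update.link_count / len(words))
    # Signal 2: share of the text taken up by the most repeated word.
    top_repeat = max(words.count(w) for w in set(words)) / len(words)
    # Signal 3: whether any advertising phrase occurs.
    has_ad = 1.0 if any(p in lowered for p in AD_PHRASES) else 0.0
    # Equal weights are an arbitrary choice for this sketch.
    return (link_density + top_repeat + has_ad) / 3


def is_splog(update: BlogUpdate, threshold: float = 0.5) -> bool:
    """Flag the update when its score reaches the (assumed) threshold."""
    return spam_score(update) >= threshold
```

A notification service could call `is_splog` on each pinged update and drop the flagged ones before matching them against user queries. Hand-picked signals like these are easy for spammers to evade, which is presumably why the project emphasizes robustness.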

Faculty Supervisor:

Dr. Andrei Bulatov

Student:

Evgeny Skvortsov

Partner:

Something Simpler Systems

Discipline:

Computer science

Sector:

Information and communications technologies

University:

Simon Fraser University

Program:

Accelerate
