High-throughput linguistic content comparison and sentiment analysis

Scrawlr is a platform for unconstrained, global interaction with all internet content and users. Scrawlr allows
for user evaluation and unconstrained classification of any Scrawlr-hosted or non-Scrawlr content. For non-Scrawlr
content, evaluation and classification will first be available at the URL level and subsequently
at the level of individual content components. Scrawlr will require the capacity to identify equivalent and
similar content in multiple languages. This will enable several key internal functions, including
rapid detection of spam, identification of trending topics in multiple languages, and the automatic
identification of plagiarism in multiple languages. Scrawlr also intends to provide automated sentiment
analysis of content. This requires expanded capacity for high-throughput, multi-language
sentiment analysis and classification, which will enable several key internal functions, including determination
of sentiment in multiple languages and comparison of that sentiment against evaluation and
classification metrics. This research project is critical for the company to ensure automated
protection of unique content on its platform, in particular protected content, from
duplication and reproduction.

Faculty Supervisor:

Nick Koudas

Student:

Partner:

Scrawlr Development Inc.

Discipline:

Computer science

Sector:

Agriculture; Information and cultural industries; Professional, scientific and technical services

University:

University of Toronto

Program: