High-throughput linguistic content comparison and sentiment analysis

Scrawlr is a platform for unconstrained, global interaction with all internet content and users. It allows users to evaluate, and classify without constraint, any Scrawlr-hosted or non-Scrawlr content. For non-Scrawlr content, evaluation and classification will first be offered at the URL level and subsequently at the level of individual content components. To support this, Scrawlr needs the capacity to identify equivalent and similar content across multiple languages. This will enable a number of key internal functions, including rapid detection of spam, identification of trending topics, and automatic identification of plagiarism, each in multiple languages. Scrawlr also intends to provide automated sentiment analysis of content, which requires expanded capacity for high-throughput, multi-language sentiment analysis and classification. This will enable several further internal functions, including determining sentiment in multiple languages and comparing that sentiment against evaluation and classification metrics. This research project is critical to the company's ability to automatically protect unique content on its platform, and protected content in particular, from duplication and reproduction.

Faculty Supervisor:

Fatemeh Hendijani Fard; Nick Koudas

Student:

Partner:

Scrawlr Development Inc.

Discipline:

Computer science

Sector:

Agriculture; Information and cultural industries; Professional, scientific and technical services

University:

The University of British Columbia - Okanagan; University of Toronto

Program: