Related projects
Discover more projects across a range of sectors and disciplines, from AI to cleantech to social innovation.
The main goal of this proposed project is to develop CL-BTM, a novel bilingual topic model that explicitly models cross-lingual word co-occurrence in document-aligned comparable data using a novel merging and shuffling strategy. Given a document-aligned multilingual corpus, CL-BTM can be employed to extract latent cross-lingual topics that best describe the observed data and to discover language-specific per-topic word distributions in each language. The model yields shared global topic distributions together with language-specific topic-word distributions. Ideally, these hierarchical representations of text would be well suited to text understanding and classification. As a further application, topic coherence and the correlations between entities in a document can be extracted accurately by jointly modeling and exploiting context compatibility, using both the local information (represented as biterms) and the global knowledge (topic knowledge) in a knowledge base.
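The project description does not spell out the merging and shuffling strategy, so the following is only a minimal sketch of one possible reading, not the project's actual method. It assumes that the two sides of an aligned document pair are concatenated, shuffled so that words from both languages interleave, and then scanned for biterms (unordered word pairs within a short window). The function names, the language tags `l1`/`l2`, and the `window` parameter are all hypothetical.

```python
import random


def merge_and_shuffle(doc_a, doc_b, lang_a="l1", lang_b="l2", seed=0):
    """Merge the tokens of two aligned documents and shuffle them.

    Assumption: each token is tagged with its language so that
    language-specific word distributions can still be recovered later.
    """
    merged = [(lang_a, w) for w in doc_a] + [(lang_b, w) for w in doc_b]
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    rng.shuffle(merged)
    return merged


def extract_biterms(tokens, window=3):
    """Return unordered pairs of distinct tokens whose positions differ
    by less than `window` in the (shuffled) token sequence."""
    biterms = []
    for i in range(len(tokens)):
        for j in range(i + 1, min(i + window, len(tokens))):
            if tokens[i] != tokens[j]:
                # Sort the pair so (a, b) and (b, a) count as the same biterm.
                biterms.append(tuple(sorted((tokens[i], tokens[j]))))
    return biterms


def cross_lingual(biterms):
    """Keep only the biterms whose two words come from different languages."""
    return [b for b in biterms if b[0][0] != b[1][0]]
```

Under these assumptions, calling `extract_biterms(merge_and_shuffle(doc_a, doc_b))` produces a mix of within-language and cross-language biterms. Only the cross-language ones, selected by `cross_lingual`, carry the cross-lingual co-occurrence signal that a shared topic distribution could be fitted to.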
Yong Zeng
Huazhong University of Science and Technology
Computer science
Education
Concordia University
Globalink Research Award
The strong support from governments across Canada, international partners, universities, colleges, companies, and community organizations has enabled Mitacs to focus on the core idea that talent and partnerships power innovation, and innovation creates a better future.