Zihao Li

I am currently pursuing a Master's degree in Data Science at the University of Helsinki. Previously, I received my Bachelor's degree in Computer Science from Northeastern University, China.

Since 2024, I have been conducting research with the Language Technology Research Group at the University of Helsinki, where I'm fortunate to work with Dr. Timothee Mickus, Dr. Shaoxiong Ji, and Prof. Jörg Tiedemann.

I am interested in Multilingual NLP, Large Language Models, and Machine Translation. My current research focuses on expanding the multilingual capabilities of LLMs, especially for low-resource languages.

Email  /  CV  /  GitHub  /  LinkedIn  /  Google Scholar

Publications

EMMA-500: Enhancing Massively Multilingual Adaptation of Large Language Models
Shaoxiong Ji, Zihao Li, Indraneil Paul, Jaakko Paavola, Peiqin Lin, Pinzhen Chen, Dayyán O'Brien, Hengyu Luo, Hinrich Schütze, Jörg Tiedemann, Barry Haddow
arXiv
[Paper] / [GitHub] / [Model] / [Datasets]
A Comparison of Language Modeling and Translation as Multilingual Pretraining Objectives
Zihao Li, Shaoxiong Ji*, Timothee Mickus*, Vincent Segonne, Jörg Tiedemann
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024)
[Paper] / [GitHub]

Education

University of Helsinki

Master of Science - MS, Data Science
2023 – 2025

Northeastern University (CN)

Bachelor of Engineering - BE, Computer Science
2019 – 2023

Experience

University of Helsinki

Research Assistant
2024.7 – 2024.9

Électricité de France

NLP Engineer Intern
2024.4 – 2024.6

BMW Brilliance Automotive

Product Intern
2023.3 – 2023.6

Maxnet Inc.

Software Engineer Intern
2022.7 – 2022.8