👨‍🎓 Biography

I am Dr. Zhe Li (李哲), currently a Postdoctoral Fellow at The University of Hong Kong (HKU).

My research focuses on speech large language models (LLMs) 🧠 and robust speaker representation learning 🔊, with a broader interest in multimodal AI for healthcare 🩺.

💼 Research Experience

🎓 Postdoctoral Fellow, The University of Hong Kong (HKU)
Medical AI • Speech-Language Models • Multimodal Learning
💻 Research Intern, Microsoft Research Asia (MSRA)
Supervised by Dr. Shujie Liu
LLM fine-tuning • Speech reasoning models • Multilingual adaptation
🧮 Visiting PhD Researcher, Stanford University 🇺🇸
Supervised by Prof. Mert Pilanci
Optimization theory • Efficient adaptation • Spectral methods
🎓 PhD in Electrical and Electronic Engineering, The Hong Kong Polytechnic University (PolyU) 🇭🇰
Supervised by Prof. Man-Wai Mak
Speaker representation • Speaker verification
🎓 MSc in Software Engineering, Xinjiang University 🇨🇳
Supervised by Prof. Wushour Silamu, Academician of the Chinese Academy of Engineering
Low-resource NLP • Uyghur language modeling • Harmful content detection

🔬 Research Interests

🔊 Speaker Representation Learning – disentanglement, cross-lingual robustness, and PEFT strategies
🩺 Multimodal AI for Healthcare – speech, text, and imaging fusion for disease prediction
🌏 Low-Resource & Multilingual NLP – Uyghur, morphologically rich languages, and cross-lingual transfer

“You are more than what you have become!”

📰 News

🏆 2026

Jan 2026 — 🎉 Our paper “Towards A Unified Perspective on Parameter-Efficient Fine Tuning for Speaker Verification” accepted by IEEE Transactions on Audio, Speech, and Language Processing (T-ASLP)! Thanks to Prof. Mak!
Jan 2026 — 🎉 Two papers accepted to ICASSP 2026 — see you in Barcelona, Spain 🇪🇸!

🏆 2025

Dec 2025 — 🎉 My First Tutorial! Our tutorial Speech Large Language Models: Architectures, Efficient Adaptation, and Applications has been accepted by IEEE ICME 2026 — see you in Bangkok, Thailand 🇹🇭 (July 5–9, 2026)!
29 Sep 2025 — 🎉 Our paper “WhisMultiNet: Advancing End-to-End Speech Topic Classification with Whisper and MultiGateGNN” has been accepted by IEEE Transactions on Audio, Speech, and Language Processing (T-ASLP)! Thanks to Xiaozhe Qi!
04 Sep 2025 — 🎉 Our paper “Disentangling Speech Representations Learning with Latent Diffusion for Speaker Verification” accepted by IEEE Transactions on Audio, Speech, and Language Processing (T-ASLP)! Thanks to Prof. Mak!
20 Aug 2025 — 🎉 One paper accepted to EMNLP 2025 — see you in Suzhou, China 🇨🇳!
18 Jun 2025 — 🎉 One paper accepted to MICCAI 2025 — see you in Daejeon, South Korea 🇰🇷!
14 Jun 2025 — 🎉 Our paper “Mutual Information-Enhanced Contrastive Learning with Margin for Maximal Speaker Separability” accepted by IEEE/ACM T-ASLP. Thanks to Prof. Mak!
19 May 2025 — 🎉 Two papers accepted to Interspeech 2025 — see you in Rotterdam, Netherlands 🇳🇱!
04 Mar 2025 — 🧑🏻‍🏫 Paper Sharing Session: I gave a talk on Spectral-Aware Low-Rank Adaptation for Speaker Verification (ICASSP 2025).
11 Feb 2025 — 🧑🏻‍💻 Joined Microsoft Research Asia (MSRA) as a Research Intern, focusing on multimodal large models for healthcare.

🏆 2024

21 Dec 2024 — 🎉 Four papers accepted to ICASSP 2025 — see you in Hyderabad, India 🇮🇳!
04 Dec 2024 — 🏅 Our paper “Enhancing Multimodal Rumor Detection with Statistical Image Features and Modal Alignment via Contrastive Learning” received the Best Student Paper Runner-Up Award 🥈 at PRICAI 2024.
17 Jun 2024 — 🧑🏻‍🏫 Paper Sharing Session: Parameter-efficient Fine-tuning of Speaker-Aware Dynamic Prompts for Speaker Verification (Interspeech 2024).
03 Apr 2024 — 🧑🏻‍🏫 Paper Sharing Session: Dual Parameter-Efficient Fine-Tuning for Speaker Representation via Speaker Prompt Tuning and Adapters (ICASSP 2024).

🎤 2023

08 Dec 2023 — Presented Maximal Speaker Separability via Robust Speaker Representation Learning at NCMMSC 2023, Soochow, China 🇨🇳.
03 Dec 2023 — Presented Maximal Speaker Separability via Contrastive Learning with Angular Margin and Class-Aware Attention for Hard Samples at International Doctoral Forum 2023, Hong Kong SAR 🇭🇰.
15 May 2023 — Paper Sharing Session: Discriminative Speaker Representation via Contrastive Learning with Class-Aware Attention in Angular Space (ICASSP 2023).

📚 2022–2020

01 Jul 2022 — Participant Talk: Gave a talk on speaker verification at the Odyssey-CNSRC Workshop 2022.
29 May 2021 — 🎓 Completed Master’s oral examination.
14 Nov 2020 — 🏅 CAAI Award: Received the Excellent Scientific and Technological Achievements Award of the Chinese Association for Artificial Intelligence.
29 Oct 2020 — Video: Uploaded CCL 2020 oral presentation.
11 Oct 2020 — Video: Uploaded CCMT 2020 oral presentation.

Zhe LI