👨‍🎓 Biography

Dr. Zhe Li is a Postdoctoral Fellow at The University of Hong Kong (HKU). His research focuses on speech large language models (Speech LLMs) and robust speaker representation learning, with broader interests in multimodal AI for healthcare. He received his Ph.D. from the Department of Electrical and Electronic Engineering at The Hong Kong Polytechnic University (PolyU). He was a research intern at Microsoft Research Asia (MSRA) and a visiting student scholar in the Department of Electrical Engineering at Stanford University. As a key contributor, he received the 2020 Excellent Science and Technology Achievement Award from the Chinese Association for Artificial Intelligence (CAAI), and his co-authored paper received the Best Student Paper Runner-Up Award at PRICAI 2024.

"You are more than what you have become!"


🔬 Research Interests

  • 🧠 Speech Large Language Models (Speech LLMs) – efficient fine-tuning, post-training alignment, and speech-based healthcare applications
  • 🗣️ Speech Signal Processing – speaker representation learning, accent recognition, and robust speech modeling
  • 🩺 Multimodal and Deep Learning – multimodal representation learning and cross-modal fusion

📰 News

๐Ÿ† 2026

  • Apr 2026 — 🎉 Our paper "Uncertainty-Aware Multi-Head Multi-Mode Knowledge Distillation for Self-Supervised Speaker Verification" accepted by IEEE Transactions on Audio, Speech, and Language Processing (T-ASLP)! Thanks to Dr. Jin!
  • Apr 2026 — 🎉 Our tutorial Speech Large Language Models for Under-Resourced Languages has been accepted by Interspeech 2026 — see you September 27 – October 1 in Sydney, Australia 🇦🇺!
  • Mar 2026 — 🎉 Our paper "Towards A Unified Perspective on Parameter-Efficient Fine Tuning for Speaker Verification" accepted by IEEE Transactions on Audio, Speech, and Language Processing (T-ASLP)! Thanks to Prof. Mak!
  • Jan 2026 — 🎉 Two papers accepted to ICASSP 2026 — see you 4–8 May 2026 in Barcelona, Spain 🇪🇸!

๐Ÿ† 2025

  • Dec 2025 — 🎉 My First Tutorial! Our tutorial Speech Large Language Models: Architectures, Efficient Adaptation, and Applications has been accepted by IEEE ICME 2026 — see you in Bangkok, Thailand 🇹🇭 (July 5–9, 2026)!
  • 29 Sep 2025 — 🎉 Our paper "WhisMultiNet: Advancing End-to-End Speech Topic Classification with Whisper and MultiGateGNN" has been accepted by IEEE Transactions on Audio, Speech, and Language Processing (T-ASLP)! Thanks to Xiaozhe Qi!
  • 04 Sep 2025 — 🎉 Our paper "Disentangling Speech Representations Learning with Latent Diffusion for Speaker Verification" accepted by IEEE Transactions on Audio, Speech, and Language Processing (T-ASLP)! Thanks to Prof. Mak!
  • 20 Aug 2025 — 🎉 One paper accepted to EMNLP 2025 — see you in Suzhou, China 🇨🇳!
  • 18 Jun 2025 — 🎉 One paper accepted to MICCAI 2025 — see you in Daejeon, South Korea 🇰🇷!
  • 14 Jun 2025 — 🎉 Our paper "Mutual Information-Enhanced Contrastive Learning with Margin for Maximal Speaker Separability" accepted by IEEE Transactions on Audio, Speech, and Language Processing (T-ASLP)! Thanks to Prof. Mak!
  • 19 May 2025 — 🎉 Two papers accepted to Interspeech 2025 — see you in Rotterdam, the Netherlands 🇳🇱!
  • 04 Mar 2025 — 🧑🏻‍🏫 Paper Sharing Session: I gave a talk on Spectral-Aware Low-Rank Adaptation for Speaker Verification (ICASSP 2025).
  • 11 Feb 2025 — 🧑🏻‍💻 Joined Microsoft Research Asia (MSRA) as a Research Intern, focusing on multimodal large models for healthcare.

๐Ÿ† 2024

  • 21 Dec 2024 — 🎉 Four papers accepted to ICASSP 2025 — see you in Hyderabad, India 🇮🇳!
  • 04 Dec 2024 — 🏅 Our paper "Enhancing Multimodal Rumor Detection with Statistical Image Features and Modal Alignment via Contrastive Learning" received the Best Student Paper Runner-Up Award 🥈 at PRICAI 2024.
  • 17 Jun 2024 — 🧑🏻‍🏫 Paper Sharing Session: Parameter-efficient Fine-tuning of Speaker-Aware Dynamic Prompts for Speaker Verification (Interspeech 2024).
  • 03 Apr 2024 — 🧑🏻‍🏫 Paper Sharing Session: Dual Parameter-Efficient Fine-Tuning for Speaker Representation via Speaker Prompt Tuning and Adapters (ICASSP 2024).

🎤 2023

  • 08 Dec 2023 — Presented Maximal Speaker Separability via Robust Speaker Representation Learning at NCMMSC 2023, Suzhou, China 🇨🇳.
  • 03 Dec 2023 — Presented Maximal Speaker Separability via Contrastive Learning with Angular Margin and Class-Aware Attention for Hard Samples at the International Doctoral Forum 2023, Hong Kong SAR 🇭🇰.
  • 15 May 2023 — 🧑🏻‍🏫 Paper Sharing Session: Discriminative Speaker Representation via Contrastive Learning with Class-Aware Attention in Angular Space (ICASSP 2023).

📚 2022–2020

  • 01 Jul 2022 — Participant Talk: Gave a talk on speaker verification at the Odyssey-CNSRC Workshop 2022.
  • 29 May 2021 — 🎓 Completed my Master's oral examination.
  • 14 Nov 2020 — 🏅 CAAI Award: Received the Excellent Science and Technology Achievement Award from the Chinese Association for Artificial Intelligence (CAAI).
  • 29 Oct 2020 — Video: Uploaded my CCL 2020 oral presentation.
  • 11 Oct 2020 — Video: Uploaded my CCMT 2020 oral presentation.

💼 Research Experience

