Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.

Pages

About me

Posts

Future Blog Post

less than 1 minute read

Published:

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.

Blog Post number 4

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Portfolio

2022.07 Speaker of Participant Talk

The attachment is the Certificate of Appreciation, awarded in recognition of the contribution as ‘Speaker of Participant Talk’ at the Odyssey-CNSRC workshop 2022.

Publications

Talks

YuQ: A Chinese-Uyghur Medical-domain Neural Machine Translation Dataset Towards Knowledge-driven

Published:

Recent advances in deep learning have delivered state-of-the-art performance in medical analysis. However, deep neural networks (DNNs) require a large amount of training data with high-quality annotation, which is unavailable or expensive to obtain in the medical domain. Research on medical-domain neural machine translation (NMT) is largely limited by the lack of parallel sentences annotated with medical-domain background knowledge. To this end, we propose YuQ, a knowledge-driven Chinese-Uyghur NMT dataset grounded in medical-domain knowledge graphs. Our corpus contains 65K parallel sentences from the medical domain and 130K utterances. By introducing medical-domain glossary knowledge into model training, we can address the low translation accuracy of professional terms in Chinese-Uyghur machine translation. We provide several benchmark models. Ablation study results show that the models can be enhanced by introducing domain knowledge.

Low-Resource Text Classification via Cross-Lingual Language Model Fine-Tuning

Published:

Text classification tends to be difficult when manually labeled text corpora are scarce. In low-resource agglutinative languages including Uyghur, Kazakh, and Kyrgyz (UKK languages), words are formed from stems concatenated with several suffixes, and stems are used as the representation of text content. This feature allows an unbounded vocabulary of derived forms, which leads to high uncertainty in writing forms and a large number of redundant features. The major challenges of low-resource agglutinative text classification are the lack of labeled data in a target domain and the morphological diversity of derivations in language structures. An effective solution is to fine-tune a pre-trained language model so that it provides meaningful and easy-to-use feature extractors for downstream text classification tasks. To this end, we propose a low-resource agglutinative language model fine-tuning approach, AgglutiFiT. Specifically, we build a low-noise fine-tuning dataset through morphological analysis and stem extraction, then fine-tune the cross-lingual pre-training model on this dataset. Moreover, we propose an attention-based fine-tuning strategy that better selects relevant semantic and syntactic information from the pre-trained language model and uses those features on downstream text classification tasks. We evaluate our methods on nine Uyghur, Kazakh, and Kyrgyz classification datasets, where they perform significantly better than several strong baselines.

Teaching

Foundations of Data Science

Postgraduate Course, Hong Kong Polytechnic University, Department of Electronic and Information Engineering, 2024

Laboratory supervision, laboratory exercise development, conducting tutorials, marking tests/homework scripts, guiding project students, examination invigilation, etc.

Artificial Intelligence and Science Fiction

Postgraduate Course, Hong Kong Polytechnic University, Department of Electronic and Information Engineering, 2024

Laboratory supervision, laboratory exercise development, conducting tutorials, marking tests/homework scripts, guiding project students, examination invigilation, etc.

Multimodal Human Computer Interaction Technologies

Undergraduate Course, Hong Kong Polytechnic University, Department of Electronic and Information Engineering, 2024

Laboratory supervision, laboratory exercise development, conducting tutorials, marking tests/homework scripts, guiding project students, examination invigilation, etc.

Deep Learning and Deep Neural Networks

Undergraduate Course, Hong Kong Polytechnic University, Department of Electronic and Information Engineering, 2024

Laboratory supervision, laboratory exercise development, conducting tutorials, marking tests/homework scripts, guiding project students, examination invigilation, etc.

Digital Image Processing

Postgraduate Course, Hong Kong Polytechnic University, Department of Electronic and Information Engineering, 2024

Laboratory supervision, laboratory exercise development, conducting tutorials, marking tests/homework scripts, guiding project students, examination invigilation, etc.

Speech Processing and Recognition

Postgraduate Course, Hong Kong Polytechnic University, Department of Electronic and Information Engineering, 2024

Laboratory supervision, laboratory exercise development, conducting tutorials, marking tests/homework scripts, guiding project students, examination invigilation, etc.

Computer Programming

Undergraduate Course, Hong Kong Polytechnic University, Department of Electronic and Information Engineering, 2024

Laboratory supervision, laboratory exercise development, conducting tutorials, marking tests/homework scripts, guiding project students, examination invigilation, etc.