I am Mingkai Sheng, a postgraduate student at the University of Chinese Academy of Sciences (UCAS), registered at the Institute of Software, Chinese Academy of Sciences (ISCAS), and studying at the Hangzhou Institute for Advanced Study (HIAS), UCAS.

My graduate supervisor is Dr. Lingfang Zeng, Head of the ZJ Lab-Enflame Joint Innovation Research Center. I am currently conducting research at Zhejiang Laboratory.

I currently work on multi-modal deep learning, including visual question answering (VQA), text-image retrieval, and referring expression comprehension. If you are interested in any form of academic collaboration, please feel free to email me at shengmingkai22@mails.ucas.ac.cn.

My research interests include Large Language Models (LLMs) and Multi-modal Deep Learning. I have published ## papers at top international AI conferences such as NeurIPS, ICML, ICLR, and KDD.

🔥 News

  • 2023.09: 🎉🎉 A paper was accepted by the 2023 IEEE International Conference on Knowledge Graph (ICKG) …
  • 2022.08: 🎉🎉 He is doing something …

📝 Publications

👾 Multi-Modal

ICASSP 2024

CLIP Multi-modal Hashing: A new baseline CLIPMH
Jian Zhu, Mingkai Sheng, Mingda Ke, Zhangmin Huang, Jingfei Chang

Project

  • CLIP Multi-modal Hashing: A new baseline.
  • Academic Impact: In experiments, the proposed CLIPMH significantly outperforms state-of-the-art unsupervised and supervised multi-modal hashing methods (maximum increase of 8.38%).
  • Industry Impact: ###.

📚 Visual Question Answering

  • ACL 2024 TODO, Mingkai Sheng |

🎼 Understanding Generation

  • AAAI 2024 TODO, Mingkai Sheng |

🧑‍🎨 Generative Model

  • ICLR 2024 TODO, Mingkai Sheng |

Others

  • ACM-MM 2024 TODO, Mingkai Sheng |

🎖 Honors and Awards

📖 Education

  • 2022.09 - present, Master's student - University of Chinese Academy of Sciences, majoring in Artificial Intelligence.
  • 2013.07 - 2017.09, Undergraduate - Zhengzhou University, majored in Computer Science and Technology.

💬 Invited Talks

  • 2021.06, Mingkai Sheng. | [Topic]
  • 2021.03, Mingkai Sheng. | [Video]

💻 Internships

  • 2022.04 - 2022.08, SeeHi, AI Algorithm Engineer.
  • 2019.05 - 2021.05, Tuya Inc, AI Algorithm Engineer.
  • 2017.05 - 2018.05, Dtstack, Big Data Engineer.

This homepage is under construction. I will update it soon...