About Me
Introduction
Hello! I’m Sicong Leng, a 2nd-year Ph.D. student at Nanyang Technological University. I am currently in the Alibaba-NTU Talent Programme, jointly supervised by Prof. Lu Shijian (Visual-Intelligence Lab/S-lab) and Dr. Bing Lidong (Alibaba DAMO Academy).
I specialize in deep learning with a focus on multi-modal research, especially vision and language. Feel free to reach out to me for collaborations, questions, or just to chat!
News
- [24.10] Inf-CLIP has been released! Check out our project here.
- [24.10] CMM has been released! Check out our project here.
- [24.09] 1 paper accepted by NeurIPS 2024! Congratulations to the co-authors!
- [24.06] VideoLLaMA 2 has been released! Check out our paper and code here.
- [24.04] VCD has been selected as a CVPR 2024 Highlight!
- [24.03] 3 papers accepted by CVPR 2024! Congratulations to the co-authors!
- [23.11] VCD has been released! Check out our paper and code here.
- [23.08] We presented our work at Nvidia Internal Technical Sharing!
- [23.08] We presented our work at the AAAI 2023 Summer Symposium Series!
- [23.07] Tell2Design has received the Area Chair Award and Best Paper Nomination at ACL 2023!
- [23.06] Our paper Tell2Design has been accepted by ACL 2023 as a long oral paper!
Awards
- ACL 2023 Area Chair Award
- ACL 2023 Best Paper Nomination
Selected Publications
- Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss [paper] [code]
- Zesen Cheng, Hang Zhang, Kehan Li, Sicong Leng, Zhiqiang Hu, Fei Wu, Deli Zhao, Xin Li, Lidong Bing
- ArXiv 2024
- The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio [paper] [project] [code]
- Sicong Leng*, Yun Xing*, Zesen Cheng*, Yang Zhou, Hang Zhang, Xin Li, Deli Zhao, Shijian Lu, Chunyan Miao, Lidong Bing
- ArXiv 2024
- VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs [paper] [code]
- Zesen Cheng*, Sicong Leng*, Hang Zhang*, Yifei Xin*, Xin Li*, Guanzheng Chen, Yongxin Zhu, Wenqi Zhang, Ziyang Luo, Deli Zhao, Lidong Bing
- ArXiv 2024
- Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding [paper] [code]
- Sicong Leng*, Hang Zhang*, Guanzheng Chen, Xin Li, Shijian Lu, Chunyan Miao, Lidong Bing
- CVPR 2024 $\color{red}{\text{(Highlight)}}$
- Tell2Design: A Dataset for Language-Guided Floor Plan Generation [paper] [code]
- Sicong Leng*, Yang Zhou*, Mohammed Haroon Dupty, Wee Sun Lee, Sam Conrad Joyce, Wei Lu
- ACL 2023 $\color{red}{\text{(Area Chair Award) (Best Paper Nomination)}}$
Please refer to Google Scholar for the full list of publications.
Service
- Reviewer:
- 2025: NAACL
- 2024: EMNLP, WACV
- 2023: EMNLP, CoNLL, NeurIPS, ACL
- Program Committee:
- EMNLP 2023 Industry Track
Work Experience
- Aug 2021 - Aug 2023: Research Assistant
- StatNLP Lab, Singapore University of Technology and Design
- Research on NLP and Multi-modal Learning
- Supervisor: Professor Lu Wei
Website last updated on 25th October 2024.