I am currently a third-year PhD student in Computer Science in the Computer Vision Group, School of Electronic Engineering and Computer Science, Queen Mary University of London, supervised by Prof. Shaogang Gong.

Before joining QMUL, I earned my Bachelor’s degree from the Communication University of China and my Master’s degree from University College London.

My current research interests revolve around Deep Learning and Computer Vision, with a particular focus on Multimodal Video Understanding and Self-supervised Learning.

🔥 News

  • 2025.09:  🎉🎉 One paper has been accepted to NeurIPS’25!
  • 2025.04:  🎉🎉 Our INT has been accepted to IJCAI’25!
  • 2024.07:  🎉🎉 One paper has been accepted to ECCV’24!

📝 Publications

NeurIPS 2025

Uncertainty-quantified Rollout Policy Adaptation for Unlabelled Cross-domain Video Temporal Grounding, NeurIPS 2025

Jian Hu, Zixu Cheng, Shaogang Gong, Isabel Guan, Jianye Hao, Jun Wang, Kun Shao

Paper | ArXiv | Code | Webpage

  • Unlabelled cross-domain temporal grounding
  • Uncertainty-quantified Rollout Policy Adaptation with uncertainty-weighted rewards

ArXiv 2025

V-STaR: Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning, ArXiv 2025

Zixu Cheng, Jian Hu, Ziquan Liu, Chenyang Si, Wei Li, Shaogang Gong

ArXiv | Code | Webpage | HF Dataset

  • Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning
  • Highlights the weaknesses of contemporary Video-LLMs

ArXiv 2025

CoS: Chain-of-Shot Prompting for Long Video Understanding, ArXiv 2025

Jian Hu, Zixu Cheng, Chenyang Si, Wei Li, Shaogang Gong

ArXiv | Code

  • Long-video understanding by visual prompt learning
  • Training-free mosaicing binary coding with pseudo-temporal grounding for long video understanding

IJCAI 2025

INT: Instance-Specific Negative Mining for Task-Generic Promptable Segmentation, IJCAI 2025

Jian Hu, Zixu Cheng, Shaogang Gong

Paper | ArXiv

  • Training-free test-time adaptation for task-generic promptable segmentation
  • Progressive negative mining identifies hard-to-distinguish error categories

ECCV 2024

SHINE: Saliency-aware HIerarchical NEgative Ranking for Compositional Temporal Grounding, ECCV 2024

Zixu Cheng*, Yujiang Pu*, Shaogang Gong, Parisa Kordjamshidi, Yu Kong (*equal contribution)

Paper | ArXiv | Code

  • LLM-driven methods for hard negative generation
  • Coarse-to-Fine Saliency Ranking for Compositional Temporal Grounding

📖 Education

  • 2023.09 - now, PhD, Queen Mary University of London (QMUL), London.
  • 2021.09 - 2022.09, Postgraduate, University College London (UCL), London.
  • 2016.09 - 2020.06, Undergraduate, Communication University of China (CUC), Beijing.