👋 About Me

I’m currently a first-year PhD student at Tsinghua University Shenzhen International Graduate School, supervised by Prof. Yansong Tang and Prof. Jiwen Lu. I received my bachelor’s degree from the Department of Automation, Tsinghua University, in 2023.

My research interests lie in computer vision, including video understanding, video generation, and embodied visual perception.

Email / Github


✨ News


  • 2024-03: One paper on video understanding (Narrative Action Evaluation) was accepted to CVPR 2024.
  • 2023-03: One paper on video understanding (Action Quality Assessment) was accepted to CVPR 2023.

🔬 Research


ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation
Guanxing Lu, Shiyi Zhang, Ziwei Wang, Changliu Liu, Jiwen Lu, Yansong Tang
Preprint
[PDF] [Project Page]

We propose a dynamic Gaussian Splatting method named ManiGaussian for multi-task robotic manipulation, which mines scene dynamics via future scene reconstruction.

Narrative Action Evaluation with Prompt-Guided Multimodal Interaction
Shiyi Zhang*, Sule Bai*, Guangyi Chen, Lei Chen, Jiwen Lu, Junle Wang, Yansong Tang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
[PDF] [Project Page]

We investigate a new problem called narrative action evaluation (NAE) and propose a prompt-guided multimodal interaction framework.

LOGO: A Long-Form Video Dataset for Group Action Quality Assessment
Shiyi Zhang, Wenxun Dai, Sujia Wang, Xiangwei Shen, Jiwen Lu, Jie Zhou, Yansong Tang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
[PDF] [Project Page]

LOGO is a new multi-person long-form video dataset for action quality assessment.