Zhongkai Xue
Shenzhen, China | email | scholar | github | twitter | blog
PhD Student @ CASIA

Introduction

👋 I am a first-year PhD student at the Institute of Automation, Chinese Academy of Sciences (CASIA), supervised by Prof. Liang Wang. Previously, I obtained my bachelor's degree from The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), where I worked with Prof. Tianshu Yu and Xinjian Zhao, and interned at ByteDance.

🤖 I am currently building agent systems that efficiently structure, retrieve, and utilize knowledge. I am always happy to discuss research ideas or collaborations, so feel free to reach out!

Education

🎓 PhD Student in Computer Science @ Institute of Automation, Chinese Academy of Sciences
Sep 2026 – Present
🎓 Exchange Student in Math & Statistics @ University of Oxford, St Hilda's College
Oct 2024 – Mar 2025
🎓 Bachelor in Financial Engineering @ The Chinese University of Hong Kong, Shenzhen
Sep 2022 – May 2026

Selected Publications

RoboMemory
RoboMemory: A Brain-Inspired Multi-Memory Framework for Interactive Environmental Learning in Physical Embodied Systems
Under Review '26 — Mingcong Lei, Honghao Cai, Yuyuan Yang, Yimou Wu, Jinke Ren, Zezhou Cui, Liangchen Tan, Junkun Hong, Gehan Hu, Shuangyu Zhu, Zhongkai Xue ... Yatong Han et al.
Abstract: We present RoboMemory, a brain-inspired framework that integrates spatial, temporal, episodic, and semantic memory within a parallelized architecture for embodied agents. Equipped with a dynamic spatial knowledge graph and a closed-loop planner with a critic module, RoboMemory enables efficient long-horizon planning and interactive environmental learning.
[arxiv] [code]
Graph-Survey
When Vision Meets Graphs: A Survey on Graph Reasoning and Learning
IJCAI '26 — Xinjian Zhao, Wei Pang, Zhixuan Yu, Xiangru Jian, Xiaozhuang Song, Yaoyao Xu, Zhongkai Xue, Dingshuo Chen, Shu Wu, Philip Torr, Tianshu Yu.
Abstract: We survey recent advances at the intersection of vision and graph learning, highlighting how visual representations can complement symbolic graph reasoning. We organize existing work into three major threads (vision for graph reasoning, vision for graph learning, and scientific graphs), present a unified taxonomy, and outline future directions toward more effective graph understanding.
[techrxiv]
VIS-GNN
The Underappreciated Power of Vision Models for Graph Structural Understanding
NeurIPS '25 — Xinjian Zhao*, Wei Pang*, Zhongkai Xue*, Xiangru Jian, Lei Zhang, Yaoyao Xu, Xiaozhuang Song, Shu Wu, Tianshu Yu.
Abstract: We conduct a systematic analysis that uncovers how visual perception and message-passing offer complementary strengths in graph understanding, and introduce a novel benchmark to showcase these insights. Our findings reveal that vision models can significantly enhance graph structural understanding, outperforming traditional GNNs in various tasks.
[arxiv] [code]
MJ-VIDEO
MJ-VIDEO: Fine-Grained Benchmarking and Rewarding Video Preferences in Video Generation
NeurIPS Spotlight '25 — Haibo Tong, Zhaoyang Wang, Zhaorun Chen, Haonian Ji, Shi Qiu, Siwei Han, Kexin Geng, Zhongkai Xue, Yiyang Zhou, Peng Xia, Mingyu Ding, Rafael Rafailov, Chelsea Finn, Huaxiu Yao.
Abstract: We present MJ-VIDEO, a Mixture-of-Experts reward model for fine-grained video preference evaluation, which is built upon MJ-BENCH-VIDEO, a large-scale benchmark covering alignment, safety, coherence, and bias. Our model achieves significant improvements in preference judgment and enhances alignment in video generation.
[arxiv] [code] [site]
Political-LLM
Political-LLM: Large Language Models in Political Science
Arxiv '25 — Lincan Li, Jiaqi Li, Catherine Chen, Fred Gui, Hongjia Yang, Chenxiao Yu, Zhengguang Wang, Jianing Cai, Junlong Aaron Zhou, Bolin Shen, Alex Qian, Zhongkai Xue ... Yue Zhao, Yushun Dong et al.
Abstract: We propose Political-LLM, a framework that bridges large language models with political science. It provides a dual-perspective taxonomy, covering political tasks and computational methods, and outlines key challenges and future directions to guide ethical and effective AI use in political research.
[arxiv] [code] [site]

* indicates equal contribution.

Research Experience

🔬 Visiting Research Assistant @ Graph and Geometric Learning Lab, Yale University
Apr 2025 – Aug 2025
Jan 2025 – Jun 2026

Industry Experience

💼 Generative AI Research Intern @ ByteDance Shenzhen Office
Nov 2025 – Present
💼 Quantitative Research Intern (Machine Learning) @ Jupiter Investment (深圳巨博华投资)
Jun 2024 – Oct 2024

Miscellaneous

🧐 Service: I served as a reviewer for ACL Rolling Review (ARR) '25.
👨‍🏫 Teaching: I served as a TA for Financial Management, Intro to AI Programming and Intro to C++ at CUHK-Shenzhen.
💰 Interests: I am also interested in financial markets and quantitative investment.