Jie Liu's Homepage

Hello! I'm currently a second-year Ph.D. student at MMLab, The Chinese University of Hong Kong, supervised by Prof. Wanli Ouyang. Throughout my academic journey, I've had the privilege of collaborating closely with Prof. Dong Xu and Prof. Yaodong Yang.
I've also interned at Shanghai AI Lab.
My primary research interests are in the fields of Large Language Models and Reinforcement Learning.

News

🥳2025.2:	We release VideoAlign, a systematic pipeline that harnesses human feedback to improve video generation!
🥳2024.8:	Our paper Emulated Disalignment won the Outstanding Paper Award at ACL 2024!
2024.5:	Four Papers on Large Language Models are accepted at ACL 2024!
2023.12:	One Paper on Offline-to-Online RL (SO2) is accepted at AAAI 2024!
2023.10:	We release MODPO, a multi-objective direct preference optimization algorithm for language models!
2023.10:	We release MaskMA, a masked pretraining framework for multi-agent decision-making!
2023.08:	Became a Ph.D. student at MMLab, The Chinese University of Hong Kong.
2023.04:	One Paper on Autonomous Driving (ASAP-RL) is accepted at RSS 2023!
2022.11:	One Paper on Multi-agent RL (ACE) is accepted at AAAI 2023!
2021.03:	Inception Convolution is accepted at CVPR 2021 as an oral paper!

Selected Publications

Arxiv

Improving Video Generation with Human Feedback

Jie Liu*, Gongye Liu*, Jiajun Liang, Ziyang Yuan, Xiaokun Liu, Mingwu Zheng, Xiele Wu, Qiulin Wang, Wenyu Qin, Menghan Xia, Xintao Wang, Xiaohong Liu, Fei Yang, Pengfei Wan, Di Zhang, Kun Gai, Yujiu Yang, Wanli Ouyang

Arxiv, 2024

PDF CODE MODEL Dataset
Arxiv

Iterative Length-Regularized Direct Preference Optimization: A Case Study on Improving 7B Language Models to GPT-4 Level

Jie Liu*, Zhanhui Zhou*, Jiaheng Liu, Xingyuan Bu, Chao Yang, Han-Sen Zhong, Wanli Ouyang

Arxiv, 2024

PDF
NeurIPS

Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models

Zhanhui Zhou, Zhixuan Liu, Jie Liu, Zhichen Dong, Chao Yang, Yu Qiao

Neural Information Processing Systems (NeurIPS), 2024

PDF CODE
NN

Adaptive pessimism via target Q-value for offline reinforcement learning

Jie Liu*, Yinmin Zhang*, Chuming Li, Yaodong Yang, Yu Liu, Wanli Ouyang

Neural Networks, 2024

PDF
ACL

Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization

Zhanhui Zhou*, Jie Liu*, Chao Yang, Jing Shao, Yu Liu, Xiangyu Yue, Wanli Ouyang, Yu Qiao

Findings of the Association for Computational Linguistics (ACL), 2024

PDF CODE
ACL

MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues

Ge Bai*, Jie Liu*, Xingyuan Bu, Yancheng He, Jiaheng Liu, Zhanhui Zhou, Zhuoran Lin, Wenbo Su, Tiezheng Ge, Bo Zheng, Wanli Ouyang

Association for Computational Linguistics (ACL), 2024

PDF CODE
ACL

ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models

Yanan Wu*, Jie Liu*, Xingyuan Bu, Jiaheng Liu, Zhanhui Zhou, Yuanxing Zhang, Chenchen Zhang, Zhiqi Bai, Haibin Chen, Tiezheng Ge, Wanli Ouyang, Wenbo Su, Bo Zheng

Findings of the Association for Computational Linguistics (ACL), 2024

PDF CODE
ACL

Emulated Disalignment: Safety Alignment for Large Language Models May Backfire!

Zhanhui Zhou, Jie Liu, Zhichen Dong, Jiaheng Liu, Chao Yang, Wanli Ouyang, Yu Qiao

Association for Computational Linguistics (ACL), 2024, 🏆Outstanding Paper Award (< 1% of submissions)

PDF CODE
AAAI

A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning

Yinmin Zhang*, Jie Liu*, Chuming Li*, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang

Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2024

PDF Code
TMLR

Masked Pretraining for Multi-Agent Decision Making

Jie Liu*, Yinmin Zhang*, Chuming Li, Chao Yang, Yaodong Yang, Yu Liu, Wanli Ouyang

Transactions on Machine Learning Research, 2024

PDF
RSS

Efficient Reinforcement Learning for Autonomous Driving with Parameterized Skills and Priors

Letian Wang, Jie Liu, Hao Shao, Wenshuo Wang, Ruobing Chen, Yu Liu, Steven L Waslander

Robotics: Science and Systems (RSS), 2023

PDF Code Video
AAAI

ACE: Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency

Chuming Li*, Jie Liu*, Yinmin Zhang, Yuhong Wei, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang

Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2023

PDF Code
ECAI

Theoretically Guaranteed Policy Improvement Distilled from Model-Based Planning

Chuming Li, Ruonan Jia, Jie Liu, Yinmin Zhang, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang

Proceedings of the European Conference on Artificial Intelligence (ECAI), 2023

PDF
CVPR Oral

Inception convolution with efficient dilation search

Jie Liu*, Chuming Li*, Feng Liang, Chen Lin, Ming Sun, Junjie Yan, Wanli Ouyang, Dong Xu

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, Oral (< 4% of submissions)

PDF Code
Arxiv

Adaptive Gradient Method with Resilience and Momentum

Jie Liu, Chen Lin, Chuming Li, Lu Sheng, Ming Sun, Junjie Yan, Wanli Ouyang

Arxiv, 2020

PDF

Academic Service

Conference Reviewer
NeurIPS 2022, ICML 2023, NeurIPS 2023, ICLR 2024, CVPR 2024, ICML 2024, NeurIPS 2024, ICML 2025