Jie Liu (刘杰)
jieliu [at] link [dot] cuhk [dot] edu [dot] hk

Hello! I'm a third-year Ph.D. student at MMLab, The Chinese University of Hong Kong, supervised by Prof. Wanli Ouyang. My research focuses on reinforcement learning, generative models, and large language models (LLMs).

News

🥳2025.05: Flow-GRPO and VideoAlign are accepted at NeurIPS 2025!
🥳2025.05: We release Flow-GRPO, the first method integrating online RL into flow matching models!
2025.02: We release VideoAlign, a systematic pipeline that harnesses human feedback to improve video generation!
2024.08: Our paper Emulated Disalignment won the Outstanding Paper Award at ACL 2024!
2024.05: Four Papers on Large Language Models are accepted at ACL 2024!
2023.12: One Paper on Offline-to-Online RL (SO2) is accepted at AAAI 2024!
2023.10: We release MODPO, a multi-objective direct preference optimization algorithm for language models!
2023.10: We release MaskMA, a masked pretraining framework for multi-agent decision-making!
2023.08: Became a Ph.D. student at MMLab, The Chinese University of Hong Kong.
2023.04: One Paper on Autonomous Driving (ASAP-RL) is accepted at RSS 2023!
2022.11: One Paper on Multi-agent RL (ACE) is accepted at AAAI 2023!
2021.03: Inception Convolution is accepted at CVPR 2021 as an oral paper!

Selected Publications

  1. NeurIPS
    Flow-GRPO: Training Flow Matching Models via Online RL
    Jie Liu*, Gongye Liu*, Jiajun Liang, Yangguang Li, Jiaheng Liu, Xintao Wang, Pengfei Wan, Di Zhang, Wanli Ouyang
    Neural Information Processing Systems (NeurIPS), 2025
  2. NeurIPS
    Improving Video Generation with Human Feedback
    Jie Liu*, Gongye Liu*, Jiajun Liang, Ziyang Yuan, Xiaokun Liu, Mingwu Zheng, Xiele Wu, Qiulin Wang, Wenyu Qin, Menghan Xia, Xintao Wang, Xiaohong Liu, Fei Yang, Pengfei Wan, Di Zhang, Kun Gai, Yujiu Yang, Wanli Ouyang
    Neural Information Processing Systems (NeurIPS), 2025
  3. ACL
    Emulated Disalignment: Safety Alignment for Large Language Models May Backfire!
    Zhanhui Zhou, Jie Liu, Zhichen Dong, Jiaheng Liu, Chao Yang, Wanli Ouyang, Yu Qiao
    Association for Computational Linguistics (ACL), 2024, 🏆Outstanding Paper Award (< 1% of submissions)
  4. ACL
    Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization
    Zhanhui Zhou*, Jie Liu*, Chao Yang, Jing Shao, Yu Liu, Xiangyu Yue, Wanli Ouyang, Yu Qiao
    Findings of the Association for Computational Linguistics (ACL), 2024
  5. arXiv
    Iterative Length-Regularized Direct Preference Optimization: A Case Study on Improving 7B Language Models to GPT-4 Level
    Jie Liu*, Zhanhui Zhou*, Jiaheng Liu, Xingyuan Bu, Chao Yang, Han-Sen Zhong, Wanli Ouyang
    arXiv, 2024
  6. ACL
    MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues
    Ge Bai*, Jie Liu*, Xingyuan Bu, Yancheng He, Jiaheng Liu, Zhanhui Zhou, Zhuoran Lin, Wenbo Su, Tiezheng Ge, Bo Zheng, Wanli Ouyang
    Association for Computational Linguistics (ACL), 2024
  7. TMLR
    Masked Pretraining for Multi-Agent Decision Making
    Jie Liu*, Yinmin Zhang*, Chuming Li, Chao Yang, Yaodong Yang, Yu Liu, Wanli Ouyang
    Transactions on Machine Learning Research, 2024
  8. AAAI
    ACE: Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency
    Chuming Li*, Jie Liu*, Yinmin Zhang, Yuhong Wei, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang
    Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2023
  9. AAAI
    A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning
    Yinmin Zhang*, Jie Liu*, Chuming Li*, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang
    Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2024
  10. CVPR Oral
    Inception Convolution with Efficient Dilation Search
    Jie Liu*, Chuming Li*, Feng Liang, Chen Lin, Ming Sun, Junjie Yan, Wanli Ouyang, Dong Xu
    Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, Oral (< 4% of submissions)
  11. RSS
    Efficient Reinforcement Learning for Autonomous Driving with Parameterized Skills and Priors
    Letian Wang, Jie Liu, Hao Shao, Wenshuo Wang, Ruobing Chen, Yu Liu, Steven L Waslander
    Robotics: Science and Systems (RSS), 2023
  12. arXiv
    Adaptive Gradient Method with Resilience and Momentum
    Jie Liu, Chen Lin, Chuming Li, Lu Sheng, Ming Sun, Junjie Yan, Wanli Ouyang
    arXiv, 2020

Experience

  • ByteDance Seed Shenzhen, China
    July 2025 – Present
    Topseed Research Intern
    Developing advanced RL algorithms for the next-generation Seedream image generation model
  • Kuaishou Kling AI Shenzhen, China
    September 2024 – June 2025
    Research Intern
    Research Topic: Reinforcement Learning for Image and Video Generation
  • Shanghai AI Lab Shanghai, China
    January 2023 – August 2024
    Research Intern
    Research Topic: LLM Post-training, Multi-agent RL
  • SenseTime Group Limited Beijing, China
    November 2019 – January 2023
    Research Intern
    Research Topic: Offline RL

Academic Service

    • Conference Reviewer
      NeurIPS 2022, ICML 2023, NeurIPS 2023, ICLR 2024, CVPR 2024, ICML 2024, NeurIPS 2024, ICML 2025