Chaoyue Liu

I am currently a postdoc at the Halıcıoğlu Data Science Institute (HDSI), UC San Diego, working with Dr. Misha Belkin. I obtained my Ph.D. in Computer Science from The Ohio State University in 2021, where I was advised by Dr. Misha Belkin. I then spent one year at Meta Platforms, Inc. as a research scientist. I also hold B.S. and M.S. degrees in physics from Tsinghua University.

Research Interests: My research focuses on the theoretical foundations of deep learning optimization and its acceleration. I am enthusiastic about theoretically understanding the dynamics of neural network training and the mechanisms behind them. My recent research has been devoted to identifying fundamental properties of neural networks and training algorithms that are responsible for fast training in practice. Building on these properties, we have been able to establish optimization theories and develop accelerated algorithms for neural networks. I am also interested in the connections between the optimization and generalization performance of neural networks.

News

  • 2024/01: Our paper on quadratic models for understanding neural network catapult dynamics was accepted to ICLR 2024! [Preprint]
  • 2023/09: Our paper on fast convergence of SGD for overparametrized problems was accepted to NeurIPS 2023! arXiv version: arXiv:2306.02601
  • 2023/06: New paper showing that spikes in the SGD training loss are catapult dynamics, with Libin Zhu, Adityanarayanan Radhakrishnan, and Misha Belkin. See arXiv:2306.04815
  • 2023/06: New paper on large learning rates and fast convergence of SGD for wide neural networks, with Dmitriy Drusvyatskiy, Misha Belkin, Damek Davis, and Yi-An Ma. See arXiv:2306.02601
  • 2023/06: New paper studying the mechanism underlying clean-priority learning in noisy-label scenarios, with Amirhesam Abedsoltan and Misha Belkin. See arXiv:2306.02533
  • 2023/05: New paper showing the effect of the ReLU non-linear activation on the NTK condition number, with Like Hui. See arXiv:2305.08813
  • 2022/09: I am now a postdoc at the Halıcıoğlu Data Science Institute at UC San Diego.

Publications

ReLU soothes the NTK condition number and accelerates optimization for wide neural networks [pdf]
Chaoyue Liu, Like Hui
arXiv:2305.08813 (In submission)

On Emergence of Clean-Priority Learning in Early Stopped Neural Networks [pdf]
Chaoyue Liu*, Amirhesam Abedsoltan*, Mikhail Belkin
arXiv:2306.02533 (In submission)

Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning [pdf]
Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, Mikhail Belkin
arXiv:2306.04815 (In submission)

Quadratic models for understanding neural network dynamics [pdf]
Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, Mikhail Belkin
International Conference on Learning Representations (ICLR), 2024.

Aiming towards the minimizers: fast convergence of SGD for overparametrized problems [pdf]
Chaoyue Liu, Dmitriy Drusvyatskiy, Mikhail Belkin, Damek Davis, Yi-An Ma
Neural Information Processing Systems (NeurIPS), 2023.

SGD batch saturation for training wide neural networks [pdf]
Chaoyue Liu, Dmitriy Drusvyatskiy, Mikhail Belkin, Damek Davis, Yi-An Ma
NeurIPS Workshop on Optimization for Machine Learning, 2023.

Loss landscapes and optimization in over-parameterized non-linear systems and neural networks [pdf]
Chaoyue Liu, Libin Zhu, Mikhail Belkin
Applied and Computational Harmonic Analysis (ACHA), 2022.

Transition to Linearity of Wide Neural Networks is an Emerging Property of Assembling Weak Models [pdf]
Chaoyue Liu, Libin Zhu, Mikhail Belkin
International Conference on Learning Representations (ICLR), 2022. (spotlight paper, 5.2% of all submissions)

Transition to linearity of general neural networks with directed acyclic graph architecture [pdf]
Libin Zhu, Chaoyue Liu, Mikhail Belkin
Neural Information Processing Systems (NeurIPS), 2022.

Understanding and Accelerating the Optimization of Modern Machine Learning [pdf]
Chaoyue Liu
Ph.D. dissertation, The Ohio State University, 2021.

Two-Sided Wasserstein Procrustes Analysis [pdf]
Kun Jin, Chaoyue Liu, Cathy Xia
International Joint Conference on Artificial Intelligence (IJCAI), 2021.

On the linearity of large non-linear models: when and why the tangent kernel is constant [pdf]
Chaoyue Liu, Libin Zhu, Mikhail Belkin
Neural Information Processing Systems (NeurIPS), 2020. (spotlight paper, 3.0% of all submissions)

Accelerating SGD with momentum for over-parameterized learning [pdf]
Chaoyue Liu, Mikhail Belkin
International Conference on Learning Representations (ICLR), 2020. (spotlight paper, 4.2% of all submissions)

OTDA: an unsupervised optimal transport framework with discriminant analysis for keystroke inference [pdf]
Kun Jin, Chaoyue Liu, Cathy Xia
IEEE Conference on Communications and Network Security (CNS), 2020.

Parametrized accelerated methods free of condition number [pdf]
Chaoyue Liu, Mikhail Belkin
arXiv:1802.10235

Clustering with Bregman divergences: an asymptotic analysis [pdf]
Chaoyue Liu, Mikhail Belkin
Neural Information Processing Systems (NeurIPS), 2016.

Talks

  • Transition to Linearity of Wide Neural Networks, Math Machine Learning Seminar, Max Planck Institute & UCLA, April 2022.
  • Large Non-linear Models: Transition to Linearity & An Optimization Theory, NSF-Simons Journal Club, January 2021.

Teaching

  • OSU CSE 5523: Machine Learning, Sp'17, Sp'18, Sp'19. (Teaching assistant)
  • OSU CSE 3421: Intro. to Computer Architecture, Au'18. (Teaching assistant)
  • OSU CSE 2111: Modeling and Problem Solving with Spreadsheets and Databases, Sp'16. (Teaching assistant)
  • OSU CSE 2321: Discrete Structures, Au'15. (Teaching assistant)

Services

Reviewer

  • 2023: ICLR, NeurIPS, ICML, TMLR, IEEE TNNLS, IMA, Neurocomputing
  • 2022: ICLR, NeurIPS, ICML, TMLR, AAAI, Swiss NSF grant
  • 2021: ICLR, NeurIPS, ICML, JASA, AAAI
  • 2020: NeurIPS, ICML, AAAI, UAI
  • 2019: NeurIPS, ICML, UAI
  • 2018: NeurIPS