Chaoyue Liu
I am currently a postdoc at the Halıcıoğlu Data Science Institute (HDSI), UC San Diego, working with Dr. Misha Belkin. I obtained my Ph.D. in Computer Science from The Ohio State University in 2021, where I was advised by Dr. Misha Belkin. After that, I spent one year at Meta Platforms, Inc. as a research scientist. I also hold B.S. and M.S. degrees in physics from Tsinghua University.
Research Interests: My research focuses on the theoretical foundations of deep learning optimization and its acceleration. I am enthusiastic about theoretically understanding the dynamics of neural network training and the mechanisms behind them. My recent research has been devoted to identifying fundamental properties of neural networks and algorithms that are responsible for fast training in practice. Building on these properties, we have been able to establish optimization theories and develop accelerated algorithms for neural networks. I am also interested in the connections between the optimization and generalization performance of neural networks.
News
- 2024/01: Our paper on quadratic models for understanding neural network catapult dynamics was accepted by ICLR 2024! [Preprint]
- 2023/09: One paper was accepted by NeurIPS 2023! Arxiv version: arXiv:2306.02601
- 2023/06: New paper showing that spikes in SGD training loss are catapult dynamics, with Libin Zhu, Adityanarayanan Radhakrishnan, Misha Belkin. See arXiv:2306.04815
- 2023/06: New paper on the large learning rate and fast convergence of SGD for wide neural networks, with Dmitriy Drusvyatskiy, Misha Belkin, Damek Davis and Yi-An Ma. See arXiv:2306.02601
- 2023/06: New paper studying the mechanism underlying clean-priority learning in noisy-label scenarios, with Amirhesam Abedsoltan and Misha Belkin. See arXiv:2306.02533
- 2023/05: New paper showing the effect of the ReLU non-linear activation on the NTK condition number, with Like Hui. See arXiv:2305.08813
- 2022/09: I am now a postdoc at the Halıcıoğlu Data Science Institute at UC San Diego.
Publications
ReLU soothes the NTK condition number and accelerates optimization for wide neural networks [pdf]
Chaoyue Liu, Like Hui
arXiv:2305.08813 (In submission)
On Emergence of Clean-Priority Learning in Early Stopped Neural Networks [pdf]
Chaoyue Liu*, Amirhesam Abedsoltan*, Mikhail Belkin
arXiv:2306.02533 (In submission)
Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning [pdf]
Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, Mikhail Belkin
arXiv:2306.04815 (In submission)
Quadratic models for understanding neural network dynamics [pdf]
Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, Mikhail Belkin
International Conference on Learning Representations (ICLR), 2024.
Aiming towards the minimizers: fast convergence of SGD for overparametrized problems [pdf]
Chaoyue Liu, Dmitriy Drusvyatskiy, Mikhail Belkin, Damek Davis, Yi-An Ma
Neural Information Processing Systems (NeurIPS), 2023.
SGD batch saturation for training wide neural networks [pdf]
Chaoyue Liu, Dmitriy Drusvyatskiy, Mikhail Belkin, Damek Davis, Yi-An Ma
NeurIPS Workshop on Optimization for Machine Learning, 2023.
Loss landscapes and optimization in over-parameterized non-linear systems and neural networks [pdf]
Chaoyue Liu, Libin Zhu, Mikhail Belkin
Applied and Computational Harmonic Analysis (ACHA) 2022.
Transition to Linearity of Wide Neural Networks is an Emerging Property of Assembling Weak Models [pdf]
Chaoyue Liu, Libin Zhu, Mikhail Belkin
International Conference on Learning Representations (ICLR), 2022. (spotlight paper, 5.2% of all submissions)
Transition to linearity of general neural networks with directed acyclic graph architecture [pdf]
Libin Zhu, Chaoyue Liu, Mikhail Belkin
Neural Information Processing Systems (NeurIPS), 2022.
Understanding and Accelerating the Optimization of Modern Machine Learning [pdf]
Chaoyue Liu
Ph.D. dissertation, The Ohio State University, 2021.
Two-Sided Wasserstein Procrustes Analysis [pdf]
Kun Jin, Chaoyue Liu, Cathy Xia
International Joint Conference on Artificial Intelligence (IJCAI), 2021.
On the linearity of large non-linear models: when and why the tangent kernel is constant [pdf]
Chaoyue Liu, Libin Zhu, Mikhail Belkin
Neural Information Processing Systems (NeurIPS), 2020. (spotlight paper, 3.0% of all submissions)
Accelerating SGD with momentum for over-parameterized learning [pdf]
Chaoyue Liu, Mikhail Belkin
International Conference on Learning Representations (ICLR), 2020. (spotlight paper, 4.2% of all submissions)
OTDA: an unsupervised optimal transport framework with discriminant analysis for keystroke inference [pdf]
Kun Jin, Chaoyue Liu, Cathy Xia
IEEE Conference on Communications and Network Security (CNS), 2020.
Parametrized accelerated methods free of condition number [pdf]
Chaoyue Liu, Mikhail Belkin
arXiv:1802.10235
Clustering with Bregman divergences: an asymptotic analysis [pdf]
Chaoyue Liu, Mikhail Belkin
Neural Information Processing Systems (NeurIPS), 2016.
Talks
- Transition to Linearity of Wide Neural Networks, Math Machine Learning Seminar, Max Planck Institute & UCLA, April 2022.
- Large Non-linear Models: Transition to Linearity & An Optimization Theory, NSF-Simons Journal Club, January 2021.
Teaching
- OSU CSE 5523: Machine Learning, 17’Sp, 18’Sp, 19’Sp. (Teaching assistant)
- OSU CSE 3421: Intro. to Computer Architecture, 18’Au. (Teaching assistant)
- OSU CSE 2111: Modeling and Problem Solving with Spreadsheets and Databases, 16’Sp. (Teaching assistant)
- OSU CSE 2321: Discrete Structures, 15’Au. (Teaching assistant)
Services
Reviewer
- 2023: ICLR, NeurIPS, ICML, TMLR, IEEE TNNLS, IMA, NeuroComputing
- 2022: ICLR, NeurIPS, ICML, TMLR, AAAI, Swiss NSF grant
- 2021: ICLR, NeurIPS, ICML, JASA, AAAI
- 2020: NeurIPS, ICML, AAAI, UAI
- 2019: NeurIPS, ICML, UAI
- 2018: NeurIPS