Chaoyue Liu
I am an Assistant Professor in the ECE department at Purdue University. I obtained my Ph.D. in Computer Science from The Ohio State University in 2021, where I was advised by Dr. Misha Belkin. After that, I spent one year at Meta (formerly Facebook) as a research scientist and two years as a postdoc at the Halıcıoğlu Data Science Institute (HDSI) at UC San Diego, working with Dr. Misha Belkin. I also hold B.S. and M.S. degrees in Physics from Tsinghua University.
I am looking for PhD students to join my group. Please email me if you are interested in working with me!
Research Interests: My research focuses on the theoretical foundations of deep learning and their applications. I am enthusiastic about studying fundamental deep learning problems and opening the “black box” of deep learning by theoretically understanding neural network models and the dynamics of neural network training. I am also interested in applying these new findings to solve practical problems.
In the past few years, my research has been devoted to identifying fundamental properties of neural networks and training algorithms that are responsible for fast training in practice. Building on these properties, we have established optimization theories and developed accelerated algorithms for neural networks. More recently, I have also been working on fundamental problems of deep learning, including the properties and training dynamics of attention models, feature learning, and the effect of architecture on feature representation, among other topics.
My research interests also include experimentally discovering new phenomena in deep learning and understanding/explaining them with mathematical tools, as well as the connections between the optimization and generalization performance of neural networks.
News
- 2024/08: Joining Purdue University, ECE department, as an Assistant Professor.
- 2024/05: Our paper on catapult dynamics of SGD was accepted by ICML 2024! (with Libin, Adityanarayanan and Misha) [Preprint]
- 2024/01: Our paper on quadratic models for understanding neural network catapult dynamics was accepted by ICLR 2024! [Preprint]
- 2023/09: One paper was accepted by NeurIPS 2023! Arxiv version: arXiv:2306.02601
- 2023/06: New paper showing that spikes in SGD training loss are catapult dynamics, with Libin Zhu, Adityanarayanan Radhakrishnan, Misha Belkin. See arXiv:2306.04815
- 2023/06: New paper on the large learning rate and fast convergence of SGD for wide neural networks, with Dmitriy Drusvyatskiy, Misha Belkin, Damek Davis and Yi-An Ma. See arXiv:2306.02601
- 2023/06: New paper studying the mechanism underlying clean-priority learning in the noisy-label scenario, with Amirhesam Abedsoltan and Misha Belkin. See arXiv:2306.02533
- 2023/05: New paper showing the effect of ReLU non-linear activation on the NTK condition number, with Like Hui. See arXiv:2305.08813
- 2022/09: I am now a postdoc at the Halıcıoğlu Data Science Institute at UC San Diego.
Publications
ReLU soothes the NTK condition number and accelerates optimization for wide neural networks [pdf]
Chaoyue Liu, Like Hui
arXiv:2305.08813 (In submission)
Presented at the minisymposium “The Dynamical View of Machine Learning” at the SIAM Conference on Mathematics of Data Science (MDS24), 2024.
On the Predictability of Fine-grained Cellular Network Throughput using Machine Learning Models
Omar Basit*, Phuc Dinh*, Imran Khan*, Z. Jonny Kong*, Y. Charlie Hu, Dimitrios Koutsonikolas, Myungjin Lee, Chaoyue Liu
IEEE International Conference on Mobile Ad-Hoc and Smart Systems (MASS), 2024.
Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning [pdf]
Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, Mikhail Belkin
International Conference on Machine Learning (ICML), 2024.
Quadratic models for understanding neural network dynamics [pdf]
Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, Mikhail Belkin
International Conference on Learning Representations (ICLR), 2024.
Aiming towards the minimizers: fast convergence of SGD for overparametrized problems [pdf]
Chaoyue Liu, Dmitriy Drusvyatskiy, Mikhail Belkin, Damek Davis, Yi-An Ma
Neural Information Processing Systems (NeurIPS), 2023.
SGD batch saturation for training wide neural networks [pdf]
Chaoyue Liu, Dmitriy Drusvyatskiy, Mikhail Belkin, Damek Davis, Yi-An Ma
NeurIPS Workshop on Optimization for Machine Learning, 2023.
On Emergence of Clean-Priority Learning in Early Stopped Neural Networks [pdf]
Chaoyue Liu*, Amirhesam Abedsoltan* and Mikhail Belkin
arXiv:2306.02533
Loss landscapes and optimization in over-parameterized non-linear systems and neural networks [pdf]
Chaoyue Liu, Libin Zhu, Mikhail Belkin
Applied and Computational Harmonic Analysis (ACHA), 2022.
Transition to Linearity of Wide Neural Networks is an Emerging Property of Assembling Weak Models [pdf]
Chaoyue Liu, Libin Zhu, Mikhail Belkin
International Conference on Learning Representations (ICLR), 2022. (spotlight paper, 5.2% of all submissions)
Transition to linearity of general neural networks with directed acyclic graph architecture [pdf]
Libin Zhu, Chaoyue Liu, Mikhail Belkin
Neural Information Processing Systems (NeurIPS), 2022.
Understanding and Accelerating the Optimization of Modern Machine Learning [pdf]
Chaoyue Liu
Ph.D. dissertation, The Ohio State University. 2021.
Two-Sided Wasserstein Procrustes Analysis [pdf]
Kun Jin, Chaoyue Liu, Cathy Xia
International Joint Conference on Artificial Intelligence (IJCAI), 2021.
On the linearity of large non-linear models: when and why the tangent kernel is constant [pdf]
Chaoyue Liu, Libin Zhu, Mikhail Belkin
Neural Information Processing Systems (NeurIPS), 2020. (spotlight paper, 3.0% of all submissions)
Accelerating SGD with momentum for over-parameterized learning [pdf]
Chaoyue Liu, Mikhail Belkin
International Conference on Learning Representations (ICLR), 2020. (spotlight paper, 4.2% of all submissions)
OTDA: an unsupervised optimal transport framework with discriminant analysis for keystroke inference [pdf]
Kun Jin, Chaoyue Liu, Cathy Xia
IEEE Conference on Communications and Network Security (CNS), 2020
Parametrized accelerated methods free of condition number [pdf]
Chaoyue Liu, Mikhail Belkin
arXiv:1802.10235
Clustering with Bregman divergences: an asymptotic analysis [pdf]
Chaoyue Liu, Mikhail Belkin
Neural Information Processing Systems (NeurIPS), 2016.
*: equal contribution
Talks
- Why does SGD converge so fast on over-parameterized neural networks, CSE AI Seminar, CSE @ UCSD, Apr 2024.
- Why does SGD converge so fast on over-parameterized neural networks, Information Theory and Application (ITA) workshop, San Diego, Feb 2024.
- Transition to Linearity & Optimization Theories of Wide Neural Networks, Control and Pizza (Co-PI) seminar, ECE @ UCSD, Nov 2023.
- Transition to Linearity of Wide Neural Networks, Math Machine Learning Seminar, Max Planck Institute & UCLA, Apr 2022.
- Large Non-linear Models: Transition to Linearity & An Optimization Theory, NSF-Simons Journal Club, Jan 2021.
- Accelerating SGD with Momentum for over-parameterized learning, MoDL workshop, Dec 2020.
- Clustering with Bregman divergences: an asymptotic analysis, CSE AI seminar, Ohio State University, 2017.
Teaching
- Purdue University, ECE 57000: Artificial Intelligence, Fall 2024. (Instructor, with Xiaoqian Wang)
- OSU CSE 5523: Machine Learning, 17’Sp, 18’Sp, 19’Sp. (Teaching assistant)
- OSU CSE 3421: Intro. to Computer Architecture, 18’Au. (Teaching assistant)
- OSU CSE 2111: Modeling and Problem Solving with Spreadsheets and Databases, 16’Sp. (Teaching assistant)
- OSU CSE 2321: Discrete Structures, 15'Au. (Teaching assistant)
Services
Organizer
Reviewer
- 2024: ICLR, ICML, NeurIPS, TMLR, JMLR
- 2023: ICLR, NeurIPS, ICML, TMLR, IEEE TNNLS, IMA, Neurocomputing
- 2022: ICLR, NeurIPS, ICML, TMLR, AAAI, Swiss NSF grant
- 2021: ICLR, NeurIPS, ICML, JASA, AAAI
- 2020: NeurIPS, ICML, AAAI, UAI
- 2019: NeurIPS, ICML, UAI
- 2018: NeurIPS