I have 6 fantastic students and post-docs who are on the academic job market this year. Here is a short thread summarizing their work along with one representative paper:
Niladri Chatterji (@niladrichat) develops a holistic theoretical understanding of the brave new world of deep learning, capturing optimization and generalization in non-convex and overparameterized settings. Benign overfitting without linearity: arxiv.org/pdf/2202.05928…
Ananya Kumar (@ananyaku) focuses on foundation models for robustness to distribution shift. He develops theory on the role of data in pretraining and how best to fine-tune; these insights lead to SOTA results. Fine-tuning can distort features: arxiv.org/pdf/2202.10054…
Mina Lee (@MinaLee__) studies how humans interact with language models for writing and other tasks. She brings a fresh human-centered perspective to the default automation framing of LMs. Evaluating human-LM interaction: arxiv.org/pdf/2212.09746…
Steve Mussmann (@MussmannSteve) develops theory (upper and lower bounds) for active learning that yields practical insights, for example, explaining the surprising success of uncertainty sampling. Data subset selection via machine teaching: drive.google.com/file/d/1j7K7f5…
John Thickstun (@jwthickstun) develops methods to control generative models without fine-tuning, tackling challenging discrete modalities such as language & music and handling complex controls. Sampling from autoregressive models using Langevin dynamics: arxiv.org/pdf/2105.08164…
Dimitris Tsipras (@tsiprasd) has done seminal work in adversarial robustness. Recently, he has pivoted to language models, understanding in-context learning and making major contributions to the HELM benchmark. Transformers can do in-context learning: arxiv.org/pdf/2208.01066…
They all have distinct research styles and directions, and each has produced exciting and insightful results that have surprised me. Of course, this is a super-compressed summary; check out their work to learn more!