Percy Liang @percyliang
Associate Professor in computer science @Stanford @StanfordHAI @StanfordAILab @stanfordnlp #foundationmodels | Pianist
cs.stanford.edu/~pliang/ · Stanford, CA · Joined October 2009
Tweets: 395 · Followers: 16.2K · Following: 386

Percy Liang @percyliang
3 days ago · While instruction tuning is clearly necessary for producing usable interfaces like ChatGPT, the "magic" of language models comes from self-supervised learning on broad data, which enables emergent behavior like in-context learning and chain-of-thought.
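
In practice, in-context learning means the task is specified entirely inside the prompt, with no weight updates. A minimal sketch of that pattern (the `complete` call is a hypothetical stand-in for whatever LM completion API you use):

```python
# Minimal sketch of in-context learning: the task is conveyed purely through a
# few demonstrations in the prompt, with no gradient updates to the model.
# `complete` is a hypothetical stand-in for an LM completion API.

def build_prompt(demonstrations, query):
    """Format few-shot demonstrations followed by the query to complete."""
    lines = [f"Review: {x}\nSentiment: {y}" for x, y in demonstrations]
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

demos = [
    ("The plot dragged and the acting was wooden.", "negative"),
    ("A delightful, sharply written comedy.", "positive"),
]
prompt = build_prompt(demos, "I couldn't stop smiling the whole time.")
# completion = complete(prompt)   # expected to return "positive"
print(prompt)
```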

Tianyi Zhang @Tianyi_Zh
a week ago · Have large language models solved news summarization? Almost there. Our new study shows that text-davinci-002 is comparable to freelance writers. arxiv.org/abs/2301.13848

Joël Niklaus @joelniklaus
a week ago · I am thrilled to announce our new work "LEXTREME: A Multi-Lingual and Multi-Task Benchmark for the Legal Domain"!
Collab w @MatoshiVeton, Pooja Rani, @aGalaxy42, @maemst, and @KiddoThe2B
📜: arxiv.org/abs/2301.13126
💾: huggingface.co/datasets/joeli…
💻: github.com/JoelNiklaus/LE…
🧵👇1/7

Michael Zhang @mzhangio
a week ago · Our group's been thinking about how AI is having its Linux 🐧 moment. Open source models + community are driving amazing progress. There’s so much to do, and so many ways to get involved! Check out these thoughts at the @HazyResearch blog hazyresearch.stanford.edu/blog/2023-01-3…

Percy Liang @percyliang
a week ago · @joshalbrecht One issue is that even if you eval on your unpublished test set, you still have to send it to an API, which could cause leakage.

Percy Liang @percyliang
2 weeks ago · But this might not be enough either: if we want to measure cross-task generalization, we have to ensure that no examples of a task/domain are represented in the training data. This is essentially impossible.

Percy Liang @percyliang
2 weeks ago · A better solution would be to have all the LM providers agree on a common repository of examples that should be excluded from any training run.
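
One way such a shared exclusion repository could be applied is n-gram-based decontamination of training data. A rough sketch, assuming the repository is just a list of held-out example strings (the 13-gram size and the whitespace/lowercase normalization are illustrative choices, not a standard):

```python
# Sketch of test-set decontamination against a shared exclusion repository.
# Assumes the repository is a plain list of held-out example strings; the
# n-gram size and normalization below are illustrative, not a standard.
import re

def ngrams(text, n=13):
    tokens = re.findall(r"\w+", text.lower())
    return {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def build_exclusion_index(test_examples, n=13):
    index = set()
    for example in test_examples:
        index |= ngrams(example, n)
    return index

def is_contaminated(document, exclusion_index, n=13):
    # A document is flagged if it shares any long n-gram with a held-out example.
    return not ngrams(document, n).isdisjoint(exclusion_index)

# Usage (hypothetical corpus):
# exclusion_index = build_exclusion_index(shared_test_examples)
# training_corpus = [doc for doc in training_corpus
#                    if not is_contaminated(doc, exclusion_index)]
```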

Percy Liang @percyliang
2 weeks ago · I worry about language models being trained on test sets. Recently, we emailed [email protected] to opt out of having our (test) data be used to improve models. This isn't enough though: others running evals could still inadvertently contribute those test sets to training.

LAION @laion_ai
2 weeks ago · We release a new ViT-G/14 CLIP model with OpenCLIP which achieves 80.1% zero-shot accuracy on ImageNet and 74.9% zero-shot image retrieval (Recall@5) on MS COCO. As of January 2023, this is the best open source CLIP model.
laion.ai/blog/giant-ope…
huggingface.co/laion/CLIP-ViT…
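
For reference, zero-shot classification with an OpenCLIP model follows the standard encode-and-compare recipe below. This is a hedged sketch: the model name, pretrained tag, and `example.jpg` are assumptions; check the linked Hugging Face page for the exact identifiers of the ViT-G/14 release.

```python
# Hedged sketch of zero-shot classification with OpenCLIP.
# The model name / pretrained tag are assumptions -- consult the linked
# huggingface page for the exact identifiers of the ViT-G/14 release.
import torch
import open_clip
from PIL import Image

model_name, pretrained = "ViT-bigG-14", "laion2b_s39b_b160k"  # assumed tags
model, _, preprocess = open_clip.create_model_and_transforms(
    model_name, pretrained=pretrained
)
tokenizer = open_clip.get_tokenizer(model_name)

class_names = ["tabby cat", "golden retriever", "school bus"]
text = tokenizer([f"a photo of a {c}" for c in class_names])
image = preprocess(Image.open("example.jpg")).unsqueeze(0)  # hypothetical file

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Cosine similarity between the image and each class prompt.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print(dict(zip(class_names, probs[0].tolist())))
```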

Stanford HAI @StanfordHAI
2 weeks ago"Many of our social relations are now partly or wholly constituted by algorithmic intermediaries," says @sethlazar at the 2023 Tanner Lecture on AI and Human Values @Stanford, co-hosted by @StanfordEthics.
Watch the livestream here: hai.stanford.edu/events/tanner-…

Omar Khattab @lateinteraction
2 weeks ago · Introducing Demonstrate–Search–Predict (𝗗𝗦𝗣), a framework for composing search and LMs w/ up to 120% gains over GPT-3.5.
No more prompt engineering.❌
Describe a high-level strategy as imperative code and let 𝗗𝗦𝗣 deal with prompts and queries.🧵
arxiv.org/abs/2212.14024
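
A rough, plain-Python illustration of the demonstrate–search–predict control flow described above (this is not the DSP library's actual API; `lm` and `search` are hypothetical stand-ins for a language-model call and a retriever):

```python
# Plain-Python sketch of a demonstrate-search-predict pipeline -- NOT the DSP
# library's real API. `lm` and `search` are hypothetical callables: `lm(prompt)`
# returns a completion string, `search(query, k)` returns k passages.

def answer(question, demonstrations, lm, search, k=3):
    # Demonstrate: show the LM worked examples of the whole pipeline.
    demo_block = "\n\n".join(demonstrations)
    # Search: ask the LM for a search query, then retrieve supporting passages.
    query = lm(f"{demo_block}\n\nQuestion: {question}\nSearch query:")
    passages = search(query, k=k)
    # Predict: answer conditioned on the retrieved passages.
    context = "\n".join(passages)
    return lm(f"{demo_block}\n\nContext:\n{context}\n\n"
              f"Question: {question}\nAnswer:")
```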

Ashwin Ramaswami @aramaswamis
2 weeks ago · I wrote a post on open source maintainers: What they need and how to support them -- check it out! linuxfoundation.org/blog/open-sour…

Together @togethercompute
2 weeks ago · Introducing FlashConv, a new technique for training state space models. It runs up to 35X faster than FlashAttention and makes the new H3 language model run 2.4X faster than Transformers! Research by @tri_dao and our own @realDanFu. together.xyz/blog/h3
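
At its core, the operation FlashConv accelerates is a long 1-D convolution over the sequence dimension, computed with FFTs in O(N log N). Below is a plain PyTorch reference of that convolution, not the fused FlashConv kernel itself:

```python
# Reference FFT-based long convolution, the core computation in state space
# layers. FlashConv fuses and optimizes this; the code below is only the plain
# PyTorch baseline for illustration.
import torch

def fft_long_conv(u, k):
    """u: (batch, d, seqlen) input; k: (d, seqlen) per-channel convolution kernel."""
    seqlen = u.shape[-1]
    fft_size = 2 * seqlen                      # zero-pad to avoid circular wraparound
    u_f = torch.fft.rfft(u, n=fft_size)
    k_f = torch.fft.rfft(k, n=fft_size)
    y = torch.fft.irfft(u_f * k_f, n=fft_size)[..., :seqlen]
    return y

u = torch.randn(2, 64, 1024)
k = torch.randn(64, 1024)
print(fft_long_conv(u, k).shape)  # torch.Size([2, 64, 1024])
```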

Dan Fu @realDanFu
2 weeks ago · Attention is all you need... but how much of it do you need? Announcing H3 - a new generative language model that outperforms GPT-Neo-2.7B with only *2* attention layers! Accepted as a *spotlight* at #ICLR2023! 📣 w/ @tri_dao 📜 arxiv.org/abs/2212.14052 1/n

Tiffany Vlaar @TiffanyVlaar
4 weeks ago · Call for reviewers for our #ICLR2023 workshop on Mathematical and Empirical Understanding of Foundation Models.
Fill out this form if you are interested
sites.google.com/view/me-fomo20…
and we will aim to get back to you asap.
Paper deadline: 3 Feb
Tentative reviewing period: 10-24 Feb.

Ben Lorica 罗瑞卡 @bigdata
3 weeks ago · We also discuss #foundationmodels: @percyliang describes the Center for Research on Foundation Models @Stanford, enterprise #foundationmodels, and the likely emergence of decentralized custom models in the future thedataexchange.media/evaluating-lan…

Dan Fu @realDanFu
3 weeks ago · Ce Zhang (@DS3Lab and @togethercompute) has done some crazy stuff in distributed training. In this talk, he goes over the magic behind distributed training and inference on a GLOBAL scale over slow networks! Tune in tomorrow at 3:30 pm Pacific! youtube.com/watch?v=e7o2C0…

Dan Roy @roydanroy
4 weeks ago · You may have received an email today, asking you to split your NeurIPS paper into two separate PDFs: one "main" paper (~9 pages + refs) and another "supplement". Why are we still doing this? Sign this petition to stop this practice forms.gle/h3BuVvhMYgc7hv…

elvis @omarsar0
4 weeks ago · This CS324-LLM course has a really nice set of notes on large language models.
You can also find lots of key material to read.
stanford-cs324.github.io/winter2022/

Percy Liang @percyliang
4 weeks ago · They all have distinct research styles and directions, and each has produced exciting and insightful results that have surprised me. Of course this is a super compressed summary - check out their work to learn more!

Percy Liang @percyliang
4 weeks ago · Dimitris Tsipras (@tsiprasd) has done seminal work in adversarial robustness. Recently, he has pivoted to language models - understanding in-context learning and making major contributions to the HELM benchmark. Transformers can do in-context learning: arxiv.org/pdf/2208.01066…

Percy Liang @percyliang
4 weeks ago · John Thickstun (@jwthickstun) develops methods to control generative models without fine-tuning, tackling challenging discrete modalities such as language & music and handling complex controls. Sampling from autoregressive models using Langevin dynamics: arxiv.org/pdf/2105.08164…
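
For context, the basic (unadjusted) Langevin update is x_{t+1} = x_t + ε ∇ log p(x_t) + sqrt(2ε) · noise. A toy sketch of that update on a 2-D Gaussian, not the paper's controlled-generation setup:

```python
# Basic (unadjusted) Langevin dynamics on a toy 2-D Gaussian:
#   x_{t+1} = x_t + eps * grad log p(x_t) + sqrt(2 * eps) * noise
# Shown only to illustrate the sampler; the paper applies a relaxed version
# to autoregressive sequence models.
import torch

def log_prob(x):
    # Standard 2-D Gaussian log-density (up to an additive constant).
    return -0.5 * (x ** 2).sum()

def langevin_sample(steps=1000, eps=1e-2):
    x = torch.zeros(2, requires_grad=True)
    for _ in range(steps):
        grad = torch.autograd.grad(log_prob(x), x)[0]
        with torch.no_grad():
            x = x + eps * grad + (2 * eps) ** 0.5 * torch.randn_like(x)
        x.requires_grad_(True)
    return x.detach()

print(langevin_sample())  # a sample roughly distributed as N(0, I)
```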

Percy Liang @percyliang
4 weeks ago · Steve Mussmann (@MussmannSteve) develops theory (upper and lower bounds) for active learning that yields practical insights, for example, explaining the surprising success of uncertainty sampling. Data subset selection via machine teaching: drive.google.com/file/d/1j7K7f5…
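
Uncertainty sampling itself is simple: query labels for the unlabeled points the current model is least confident about. A minimal sketch with a scikit-learn classifier (the random data and batch size are purely illustrative):

```python
# Minimal uncertainty sampling: pick the unlabeled points whose predicted class
# probabilities have the highest entropy under the current model.
import numpy as np
from sklearn.linear_model import LogisticRegression

def uncertainty_sample(model, X_unlabeled, batch_size=10):
    probs = model.predict_proba(X_unlabeled)
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    return np.argsort(-entropy)[:batch_size]   # indices of most uncertain points

# Usage: fit on the current labeled pool, then choose the next batch to label.
rng = np.random.default_rng(0)
X_lab, y_lab = rng.normal(size=(50, 5)), rng.integers(0, 2, size=50)
X_unl = rng.normal(size=(500, 5))
model = LogisticRegression().fit(X_lab, y_lab)
print(uncertainty_sample(model, X_unl))
```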

Percy Liang @percyliang
4 weeks ago · Mina Lee (@MinaLee__) studies how humans interact with language models for writing and other tasks. She brings a fresh human-centered perspective to the default automation framing of LMs. Evaluating human-LM interaction: arxiv.org/pdf/2212.09746…

Percy Liang @percyliang
4 weeks ago · Ananya Kumar (@ananyaku) focuses on foundation models for robustness to distribution shift. He develops theory on the role of data in pretraining and how to best fine-tune; these insights lead to SOTA results. Fine-tuning can distort features: arxiv.org/pdf/2202.10054…
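
One practical recipe from this line of work is linear probing then fine-tuning (LP-FT): first train a linear head on frozen pretrained features, then unfreeze the backbone and fine-tune everything at a small learning rate. A hedged PyTorch sketch (the backbone, class count, and hyperparameters are illustrative choices):

```python
# Sketch of the linear-probe-then-fine-tune (LP-FT) recipe: stage 1 trains only
# a linear head on frozen pretrained features; stage 2 unfreezes the backbone
# and fine-tunes at a much smaller learning rate. Backbone, class count, and
# learning rates below are illustrative, not prescribed values.
import torch
import torch.nn as nn
import torchvision

backbone = torchvision.models.resnet50(weights="IMAGENET1K_V2")
backbone.fc = nn.Identity()                 # expose the 2048-d feature vector
head = nn.Linear(2048, 10)                  # 10 downstream classes (assumed)

# Stage 1: linear probing -- backbone frozen, only the head is trained.
for p in backbone.parameters():
    p.requires_grad = False
probe_opt = torch.optim.Adam(head.parameters(), lr=1e-3)

# Stage 2: fine-tuning -- unfreeze everything, use a small learning rate.
for p in backbone.parameters():
    p.requires_grad = True
ft_opt = torch.optim.SGD(
    list(backbone.parameters()) + list(head.parameters()), lr=1e-4, momentum=0.9
)
# (Training loops over the downstream dataset are omitted for brevity.)
```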

Percy Liang @percyliang
4 weeks ago · Niladri Chatterji (@niladrichat) develops holistic theoretical understanding in the brave new world of deep learning, capturing optimization and generalization in non-convex and overparametrized settings. Benign overfitting without linearity: arxiv.org/pdf/2202.05928…

Percy Liang @percyliang
4 weeks ago · I have 6 fantastic students and post-docs who are on the academic job market this year. Here is a short thread summarizing their work along with one representative paper:

Siddharth Karamcheti @siddkaramcheti
a month ago · Want to build robots that adapt to language corrections in real-time?
Introducing "No, to the Right – Online Language Corrections for Manipulation via Shared Autonomy" (arxiv.org/abs/2301.02555) w/ @YuchenCui1, Raj, Nidhya, @percyliang & @DorsaSadigh at #HRI2023 - 🧵👇 (1/N).

Percy Liang @percyliang
a month ago · @sarahdingwang But it's the latest that we have access to right now.