Human-preference LLM arenas are poorly suited to evaluating ASCII art: the ASCII art that most impresses a human is often a verbatim regurgitation of an existing human work, which is rarely true of text.
Votes on ASCII art should be detected and thrown out IMO.
It’s important to remember LLM capability is bounded by the skill of the humans who train them.
The only reason ChatGPT can identify common, short strings given their MD5 or SHA1 hashes is because that’s a completely ordinary talent that many humans have.
If you’re looking for a hard multimodal eval problem, none of my attempts to get ChatGPT, Claude, or Gemini to read the security code Gehn writes in his journal in base-25 D’ni numerals in the 1997 video game Riven: The Sequel to Myst have yet succeeded.
Today at @answerdotai we've got something new for you: FSDP/QDoRA. We've tested it with @AIatMeta Llama3 and the results blow away anything we've seen before.
I believe that this combination is likely to create better task-specific models than anything else at any cost. 🧵
New paper from @OpenAI on prompt injection - it's the most detailed evaluation of the problem I've seen from them so far, and has some very interesting details
Posted some of my notes on the paper on my blog here: simonwillison.net/2024/Apr/23/th…
A claim of consciousness from an LLM has no more evidential value than the same from a character in a dream.
The latter is more plausible a priori as the hardware is known to support it.
New Command R+ from Cohere: 128k context, open weights for non-commercial use, commercial API priced similarly to Claude 3 Sonnet
Tokenizer is designed to be efficient in 10 languages, so definitely consider it for non-English text. Multi-hop tool use sounds interesting too