Saining Xie @sainingxie
researcher in #deeplearning #computervision | assistant professor at @NYU_Courant @nyuniversity | previous: research scientist @metaai (FAIR) @UCSanDiego sainingxie.com Joined July 2020
Tweets 278
Followers 14K
Following 1K
Likes 3K
🥁 Llama3 is out 🥁 8B and 70B models available today. 8k context length. Trained with 15 trillion tokens on a custom-built 24k GPU cluster. Great performance on various benchmarks, with Llama3-8B doing better than Llama2-70B in some cases. More versions are coming over the next…
🎨Spent some time refactoring the 2021 post on diffusion model with new content: lilianweng.github.io/posts/2021-07-… ⬇️ ⬇️ ⬇️ 🎬Then another short piece on diffusion video models: lilianweng.github.io/posts/2024-04-… (Yes, I had an intensive weekend🥹)
The model works, results are great, but 8737 image tokens?! A100 40GB can't handle inference in 4KHD mode. Also, models for hi-res images should be evaluated on the V* dataset. https://t.co/FPRlmhQg1e
I have been working on vision+language models (VLMs) for a decade. And every few years, this community re-discovers the same lesson -- that on difficult tasks, VLMs regress to being nearly blind! Visual content provides minor improvement to a VLM over an LLM, even when these… https://t.co/StilR2HbyO
(1/2) 📢 Introducing LL3M: Large Language, Multimodal, and Moe Model Open Research Plan 👉github.com/jiasenlu/LL3M With the following goals: - Build an open-sourced codebase in Jax / Flax that supports large-scale training in LLM, LMM, and MoE models. - Record and share the…
It was a blast seeing everyone at nyu and getting to learn about all the cool work. this is why nyc (and surroundings) are a great place for computer vision 😊
TacticAI is an AI system that can advise football coaches on tactics & plays deepmind.google/discover/blog/… This was a really fun project to work on in collaboration with my much loved Liverpool FC @LFC - fingers crossed we win the league this year to give Klopp a fitting send off!
🔍 New LLM Research 🔍 Conventional wisdom says that deep neural networks suffer from catastrophic forgetting as we train them on a sequence of data points with distribution shifts. But conventions are meant to be challenged! In our recent paper led by @YanlaiYang, we discovered…
People regularly ask me about interning. I don't have any intern positions this year, but my half-decade-long super-close collaborator Xiaohua has! If you're interested in our last papers (siglip, pali, rl-tuning, all things vlm really) definitely reach out to him.
The Stable Diffusion 3 paper is here 🥳 I think my colleagues have done a great job with this paper so thought I'd do a quick walk-thru thread (1/13)↓
(1/3) Today, we're publishing our research paper that dives into the underlying technology powering Stable Diffusion 3. Prompt: A beautiful painting of flowing colors and styles forming the words “The SD3 research paper is here!”, the background is speckled with drops and…
🚨Please RT🚨 If you're an undergraduate (or know one) interested in AI/ML for materials and (bio)chemical discovery, join us at @SimonsCenterNYU for our 10-week undergraduate summer program, with a $10k stipend! Applications close in a week, March 1st! wp.nyu.edu/sccpc/summer-u…
Ask not what LLMs can do for planning, ask what planning can do for LLMs.
Thanks @_akhaliq for promoting our work! We propose Searchformer, a Transformer that imitate-learns A* search dynamics (i.e., how the search is performed), and can be fine-tuned to still output optimal plans ~94% of the time, but with ~27% shorter search dynamics in Sokoban,…
This classic was the first math book I read for fun as a teenager. It was first published in 1941, but the second edition in 1996 is arguably even better with one more chapter by Ian Stewart on recent progress in mathematics.
Cats are all you need. @phillip_isola @TinghuiZhou
Diffusion Transformer architecture + Flow Matching / Stochastic Interpolants objective? Great work and looking forward to the technical report! In SiT (scalable-interpolant.github.io) we have also studied this new design space under class conditional generation (though on a much…
AK @_akhaliq
307K Followers 3K Following AI research paper tweets, ML @Gradio (acq. by @HuggingFace 🤗) dm for promo follow on Hugging Face: https://t.co/q2Qoey80Gx
Jim Fan @DrJimFan
227K Followers 3K Following @NVIDIA Sr. Research Manager & Lead of Embodied AI (GEAR Lab). Creating foundation models for Humanoid Robots & Gaming. @Stanford Ph.D. @OpenAI's first intern.
Yi Ma @YiMaTweets
71K Followers 120 Following Chair Professor in AI, Director of IDS, Head of CS, HKU; Professor of EECS, Berkeley; Author of Book: High-Dim Data Analysis, https://t.co/gwaqMJp8av.
Jia-Bin Huang @jbhuang0604
51K Followers 285 Following Associate Professor @umdcs; Part-time Research Scientist @Meta. I like pixels.
Soumith Chintala @soumithchintala
185K Followers 871 Following Cofounded and lead @PyTorch at Meta. Also dabble in robotics at NYU. AI is delicious when it is accessible and open-source.
Lucas Beyer (bl16) @giffmana
56K Followers 442 Following Researcher (Google DeepMind/Brain in Zürich, ex-RWTH Aachen), Gamer, Hacker, Belgian. Mostly gave up trying mastodon as [email protected]
Alfredo Canziani @alfcnz
86K Followers 269 Following Musician, math lover, cook, dancer, 🏳️‍🌈, and an ass prof of Computer Science at New York University
Xiaolong Wang @xiaolonw
11K Followers 943 Following Assistant Professor @UCSDJacobs Postdoc @berkeley_ai PhD @CMU_Robotics
Yuandong Tian @tydsh
16K Followers 795 Following Research Scientist and Senior Manager in Meta AI (FAIR). AI-guided Optimization and Representation Learning. Novelist in spare time. PhD in @CMU_Robotics.
Kyunghyun Cho @kchonyc
60K Followers 2K Following a combination of a mediocre scientist, a mediocre manager, a mediocre advisor & a mediocre PC at @nyuniversity (@CILVRatNYU) & @genentech (@PrescientDesign).
Elliott / Shangzhe Wu @elliottszwu
5K Followers 765 Following Postdoc @StanfordSVL working on unsupervised 3D perception and inverse rendering, PhD from @Oxford_VGG. Public office hours: https://t.co/iSSemSi1NQ
Eric Jang @ericjang11
69K Followers 3K Following physical AGI at 1X. Author of "AI is Good for You" https://t.co/eFg4WXhg0p
Rosanne Liu @savvyRL
32K Followers 965 Following Cofounded & running @ml_collective. Host of Deep Learning Classics & Trends. Research at Google DeepMind. DEI/DIA Chair of ICLR & NeurIPS. Writing https://t.co/IbycyGfnDR
Yuliang Xiu @yuliangxiu
5K Followers 4K Following Ph.D. in Vision & Graphics @MPI_IS, previously @USC_ICT. Focusing on democratizing human-centric digitization. Intern at @RealityLabs @Ubisoft
Animesh Garg @animesh_garg
21K Followers 1K Following Foundation Models for Generalizable Autonomy. Assistant Professor in AI Robotics @GeorgiaTech + @NvidiaAI. prev @Stanford @berkeley_ai @UofTCompSci
Ross Wightman @wightmanr
18K Followers 1K Following Computer Vision @ 🤗. Ex head of Software, Firmware Engineering at a Canadian 🦄. Currently building ML, AI systems or investing in startups that do it better.
Horace He @cHHillee
23K Followers 445 Following Working at the intersection of ML and Systems @ PyTorch "My learning style is Horace twitter threads" - @typedfemale
rohan anil @_arohan_
12K Followers 2K Following Principal Engineer, @GoogleDeepMind Gemini. prev PaLM-2. Tinkering with optimization and distributed systems. opinions are my own.
Prof. Anima Anandkuma.. @AnimaAnandkumar
25K Followers 2K Following Bren Professor @caltech, Fmr Sr Director of #AI research @nvidia, Fmr Principal Scientist @awscloud, AI+Science, PDE, Neural operators. Views my own.
Xin Eric Wang @xwang_lk
7K Followers 1K Following Multimodal and Embodied AI Researcher / Professor @UCSC. Director of https://t.co/Y4swOBag21. AI for Humanity in the long run. he/him
Wenkai Sun @SunWenkai
1 Followers 85 Following
Alkın Kaz @alkin_kaz
560 Followers 2K Following physics, computation, society. princeton '23 | fentek '19 | ipho medalist | tweets in 🇹🇷/🇺🇸
三千 @XKpwXpp4rIkV9c7
3 Followers 44 Following
Martin Fan @perfectoid_ai
365 Followers 8K Following
Xiaoyuan Zhang @XiaoyuanZh1907
3 Followers 68 Following
Fred Luv @Billthefreeman
43 Followers 539 Following Software Engineer. Into disruptive technology and value investing
Zhang Xiaojun @zhangxj_hk
0 Followers 12 Following
Jindong Gu @Jindong73504766
136 Followers 540 Following Senior Research Fellow in University of Oxford Faculty Researcher @Google Homepage: https://t.co/YOSVO3jb6h
JingHe @Jingheya
1 Followers 15 Following
Pingyue Zhang @PingyueZhangNU
0 Followers 15 Following
Zhoujun (Jorge) Cheng @ChengZhoujun
408 Followers 345 Following Incoming UCSD Ph.D. | RA @XLangNLP @HKUniversity NLP | Undergrad & M.S. @sjtu1896
VariationalGuy @Variationa94754
1 Followers 5 Following
Yaofeng Xie @xi_9856_xi
0 Followers 38 Following
Alpay Ariyak @AlpayAriyak
1K Followers 1K Following 𝗔𝗜 @RunPod_io | 𝗟𝗲𝗮𝗱: @OpenChatDev (𝟲𝟬𝟬𝗸+ 𝗱𝗼𝘄𝗻𝗹𝗼𝗮𝗱𝘀 on HuggingFace🤗)
cC Ninja @cCninjaX
6 Followers 14 Following
Junhao LI | Indie Mak.. @Junhao_LI_168
2 Followers 28 Following Solo entrepreneur founder of Beffroi Studio https://t.co/QfFTAvU82J https://t.co/mPV0qhGlkb €60 MRR #buildinpublic
Vi @AvimanyuRoy3
574 Followers 1K Following 🍎🕊/🦦☕️/😴🛌/he/him Shouting into the Void (TM) GPU poor peasant
Luca Weihs @LucaWeihs
714 Followers 197 Following Research scientist @allen_ai; stat PhD @UW; math undergrad @UCBerkeley. Political views/opinions are my own.
CHAOI TUAN @KaraTuan1414
0 Followers 15 Following
崔永亮 @Fridemn_C
1 Followers 48 Following
Erin Grant @ermgrant
3K Followers 1K Following Senior Research Fellow @GatsbyUCL & @SWC_Neuro {learning, representations, structure} in 🧠💭🤖 @[email protected] @[email protected]
Mark R. Hinkle @mrhinkle
7K Followers 5K Following I help enterprises understand and use artificial intelligence. Leveraging my 25 years of enterprise software experience in emerging technology to drive results.
Shivanand_Kundargi @Shivark2001
105 Followers 801 Following Computer Vision Researcher | Surviving on air, water, food and GPU
Miles Yan @MgYuanYan
9 Followers 101 Following
顾宝成 @bao_cheng29849
0 Followers 52 Following
Atanu Chakraborty @alecsyde
5 Followers 75 Following
Yan @YanMaci30628514
25 Followers 721 Following
vaadsaara (वाद.. @VaadSaara
7 Followers 69 Following
Nathan Huanrong LIU @HuanrongLIU715
0 Followers 13 Following
SuLvXiangXin @SuLvXiangXin
5 Followers 90 Following
YHW @jerx2y
4 Followers 30 Following
André @andre_li999
0 Followers 86 Following
Towaki Takikawa / 瀧.. @yongyuanxi
5K Followers 2K Following 3D脱サラマン (datsu sararīman) computer graphicist algorizzms former research scientist @NVIDIA
Qodicat @CQodi33534
7 Followers 66 Following
Jonas Bacci @jonasbacci
8 Followers 64 Following
Kevin Messali @Kev1MSL
12 Followers 48 Following Software Engineer that fell in love with Computer Vision & 3D AI✨️ | Math&CS BSc @Polytechnique
Kai Qu @qubill282
38 Followers 86 Following 2011 graduate candidate from Duke University, major in Master of Engineering Management. Interested in Consulting, Media and Finance Service.
Jessie @KkkkkkK075
3 Followers 75 Following
lisarong @lisaronge
0 Followers 30 Following
palmeng @palmeng0608
4 Followers 29 Following
灶桀 @zaojie12339744
9 Followers 134 Following
Shijie Zhou @ShijieZhoucla
89 Followers 97 Following PhD Student @UCLA, 3D Computer Vision | Incoming Intern @Google | Previously @Columbia.
Hongwei Yan @HongweiYan2
26 Followers 128 Following Bio Ph.D. student @Tsinghua_Uni | Previously CS B.S. @ IIIS, THU. Interested in Bio-inspired AI and Continual Learning
Siddhartha Gairola @sidgairo18
1K Followers 974 Following @ELLISforEurope 🇪🇺 PhD student at MPI-INF & IST Austria Previously: @MSFTResearch , @Adobe , @iiit_hyderabad
AK @_akhaliq
307K Followers 3K Following AI research paper tweets, ML @Gradio (acq. by @HuggingFace 🤗) dm for promo follow on Hugging Face: https://t.co/q2Qoey80Gx
Jim Fan @DrJimFan
227K Followers 3K Following @NVIDIA Sr. Research Manager & Lead of Embodied AI (GEAR Lab). Creating foundation models for Humanoid Robots & Gaming. @Stanford Ph.D. @OpenAI's first intern.
Yi Ma @YiMaTweets
71K Followers 120 Following Chair Professor in AI, Director of IDS, Head of CS, HKU; Professor of EECS, Berkeley; Author of Book: High-Dim Data Analysis, https://t.co/gwaqMJp8av.
Jia-Bin Huang @jbhuang0604
51K Followers 285 Following Associate Professor @umdcs; Part-time Research Scientist @Meta. I like pixels.
Kosta Derpanis @CSProfKGD
47K Followers 198 Following #CS Associate Prof @YorkUniversity, #ComputerVision Scientist Samsung #AI, @VectorInst Faculty Affiliate, TPAMI AE, #CVPR2024/#ECCV2024 Publicity Co-chair
Michael Black @Michael_J_Black
58K Followers 638 Following Director, Max Planck Institute for Intelligent Systems (@MPI_IS). Chief Scientist @meshcapade. Building 3D digital humans using vision, graphics, and learning.
Soumith Chintala @soumithchintala
185K Followers 871 Following Cofounded and lead @PyTorch at Meta. Also dabble in robotics at NYU. AI is delicious when it is accessible and open-source.
Lucas Beyer (bl16) @giffmana
56K Followers 442 Following Researcher (Google DeepMind/Brain in Zürich, ex-RWTH Aachen), Gamer, Hacker, Belgian. Mostly gave up trying mastodon as [email protected]
Peyman Milanfar @docmilanfar
67K Followers 261 Following Distinguished Scientist at Google Research. Computational Imaging, Machine Learning, and Vision. Tweets = personal opinions. May change or disappear over time.
Alfredo Canziani @alfcnz
86K Followers 269 Following Musician, math lover, cook, dancer, 🏳️‍🌈, and an ass prof of Computer Science at New York University
Matthias Niessner @MattNiessner
31K Followers 161 Following Professor for Visual Computing & Artificial Intelligence @TU_Muenchen Co-Founder @synthesiaIO
Xiaolong Wang @xiaolonw
11K Followers 943 Following Assistant Professor @UCSDJacobs Postdoc @berkeley_ai PhD @CMU_Robotics
Yuandong Tian @tydsh
16K Followers 795 Following Research Scientist and Senior Manager in Meta AI (FAIR). AI-guided Optimization and Representation Learning. Novelist in spare time. PhD in @CMU_Robotics.
Kyunghyun Cho @kchonyc
60K Followers 2K Following a combination of a mediocre scientist, a mediocre manager, a mediocre advisor & a mediocre PC at @nyuniversity (@CILVRatNYU) & @genentech (@PrescientDesign).
Elliott / Shangzhe Wu @elliottszwu
5K Followers 765 Following Postdoc @StanfordSVL working on unsupervised 3D perception and inverse rendering, PhD from @Oxford_VGG. Public office hours: https://t.co/iSSemSi1NQ
Dmytro Mishkin 🇺.. @ducha_aiki
18K Followers 590 Following Marrying classical CV and Deep Learning. I do things, which work, rather than being novel, but not working.
Jürgen Schmidhuber @SchmidhuberAI
106K Followers 0 Following Invented principles of meta-learning (1987), GANs (1990), Transformers (1991), very deep learning (1991), etc. Our AI is used many billions of times every day.
Eric Jang @ericjang11
69K Followers 3K Following physical AGI at 1X. Author of "AI is Good for You" https://t.co/eFg4WXhg0p
Yi Tay @YiTayML
28K Followers 97 Following Chief scientist & Co-founder @RekaAILabs past: Research Scientist @Google Brain 🧠 currently learning to be a dad 🍼👶
Megan Richards @megan_richards_
127 Followers 287 Following AI Resident @AIatMeta, previously @DukeInnovate. Reliable/Responsible AI.
Erin Grant @ermgrant
3K Followers 1K Following Senior Research Fellow @GatsbyUCL & @SWC_Neuro {learning, representations, structure} in 🧠💭🤖 @[email protected] @[email protected]
udio @udiomusic
26K Followers 0 Following
Open Philanthropy @open_phil
15K Followers 17 Following Open Philanthropy's mission is to help others as much as we can with the resources available to us.
Mike Shou @MikeShou1
1K Followers 285 Following Asst Prof at NUS. Previously at Facebook AI and Columbia U. Passionate about video, multi-modal, AI assistant.
Tianle Cai @tianle_cai
5K Followers 4K Following Machine learning PhD @Princeton. Life-long learner, hacker, and builder. Previously @togethercompute @GoogleDeepMind @MSFTResearch @citsecurities.
Dandan Shan @DandanShan_
893 Followers 1K Following PhD Candidate @UMichCSE; Visiting Scholar @NYU_Courant; Working on Computer Vision
Juliana Freire @jfreirenet
1K Followers 258 Following Juliana Freire is a Professor at the Department of Computer Science and Engineering and Data Science at New York University.
David Hall @dlwh
2K Followers 1K Following Research Engineering Lead at @StanfordCRFM . Previously co-founder at Semantic Machines ⟶ MSFT. Lead developer of Levanter, Breeze. he/him @[email protected]
kache (dingboard.com) @yacineMTB
51K Followers 3K Following go to https://t.co/pWRBfY8kn2 - AI image editing IN YOUR BROWSER! follow to watch a self funded founder beat VC backed AI startups with @dingboard_
Ate-a-Pi @8teAPi
36K Followers 2K Following self aware neuron; historian from 2130; epistemic polluter; 95 yr old man;
Kimin @kimin_le2
1K Followers 333 Following Assistant professor at KAIST. Prev: Research scientist @GoogleAI, Postdoc @berkeley_ai & Ph.D at KAIST.
Adam Karvonen @a_karvonen
1K Followers 285 Following Interested in ML and software. I prefer email to DM.
Peter J. Liu @peterjliu
4K Followers 2K Following Research Scientist @ Google B̵r̵a̵i̵n̵ DeepMind, frontier language models research (aka chatbot engineer). Opinions are my own. 🤖🔄🚀
Tao Xu @txhf
6K Followers 887 Following Learning Machine at OpenAI, previously Airbnb, Quora, Facebook and Microsoft.
Karl Tuyls @karl_tuyls
2K Followers 334 Following Ex: team lead @ DeepMind,@GoogleDeepMind - still working on AGI in a Multi-Agent world. CS professor (Liverpool/Leuven) and LFC fan.
Daniel Han @danielhanchen
7K Followers 924 Following Building @UnslothAI. Finetune LLMs 30x faster https://t.co/aRyAAgKOR7. Prev ML at NVIDIA. Hyperlearn used by NASA. I like maths, making code go fast
Stealth Startup Spy @StealthCoSpy
3K Followers 1 Following Real-time notifications to uncover companies under stealth by tracking the inflows and outflows of talent who work at “Stealth Startups” on LinkedIn!
Anjali Gupta @AnjaliWGupta
8 Followers 24 Following
Nathan Shipley @CitizenPlain
11K Followers 871 Following Exploring AI film & animation. VFX, creative technologist, motion graphics artist. Currently @buck_tv. Prev: Resident at Stochastic Labs, Director of AI @GSP
Robert Scoble @Scobleizer
504K Followers 72K Following Follow me on my new podcast with AI startups, Unaligned. Tech industry color commentator since 1993. Author/Blogger. Former strategist @Microsoft.
Brandon McKinzie @mckbrando
2K Followers 2K Following Multimodal LLMs @Apple. Prev: Physics/CS @UCBerkeley.
Alberto Hojel @AlbyHojel
899 Followers 1K Following vision-based AGI at @berkeley_ai // incoming summer of abundance at @RainmakerCorp // happy receiver of book recs // 🇲🇽🇺🇸; epoch=20
Yu Xiang @YuXiang_IRVL
78 Followers 182 Following Assistant Professor @UT_Dallas, Intelligent Robotics and Vision Lab @IRVLUTD, PhD @UMich, Previously Research Scientist @NVIDIA
Chen Change Loy @ccloy
3K Followers 667 Following Professor @NTUsg Director of @MMLabNTU Computer vision and deep learning
Corey Lynch @coreylynch
10K Followers 1K Following AI at @figure_robot, previously research scientist at @GoogleDeepMind.
Physical Intelligence @physical_int
4K Followers 8 Following Physical Intelligence (Pi), bringing AI into the physical world.
Teortaxes▶️ @teortaxesTex
7K Followers 1K Following Ours is the age of unaligned utilitarians. Other problems are relatively unimportant, but sometimes I tweet about them anyway. (кто/кого)
Mathilde Caron @mcaron31
1K Followers 27 Following Research Scientist @googIeresearch Grenoble ⛰️ Previously PhD student @Inria & @MetaAI (FAIR)
Xuezhe Ma (Max) @MaxMa1987
1K Followers 348 Following Research Lead @USC_ISI and Research Assistant Professor @CSatUSC PhD at CMU ML/NLP @LTIatCMU @CarnegieMellon
Eunsol Choi @eunsolc
5K Followers 811 Following on natural language processing / machine learning. assistant professor @UTCompSci. prev @googleai, @uwcse, @Cornell. opinions are of my own.
Sitan Chen @sitanch
1K Followers 159 Following assistant professor of computer science @hseas, provable algorithms for data science, 🎹
Pietro Schirano @skirano
34K Followers 702 Following Founder @everartai. 🎨✨ Previously, led AI at @brexHQ. @Uber, @Facebook, and @OpenTable.
Grant Rotskoff @GrantRotskoff
812 Followers 101 Following Assistant Professor of Chemistry at Stanford. Biophysics, applied mathematics, machine learning.
Schmidt Futures @SchmidtFutures
14K Followers 93 Following Supporting projects at the intersection of talent and technology.
Kyle Wiggers @Kyle_L_Wiggers
65K Followers 4K Following Technology journalist. Senior Enterprise Reporter @TechCrunch ([email protected]). Pronouns: he/him. Mastodon: https://t.co/wesC0GePag
Zonghan Yang @yang_zonghan
727 Followers 2K Following PhD student at Tsinghua NLP & AIR, obsessed with LLM ∩ Control (alignment and agent; and they are equivalent!); Two drifters with the world to see.
What do you see in these images? These are called hybrid images, originally proposed by Aude Oliva et al. They change appearance depending on size or viewing distance, and are just one kind of perceptual illusion that our method, Factorized Diffusion, can make.
How do model components (conv filters, attn heads) collectively transform examples into predictions? Is it possible to somehow dissect how *every* model component contributes to a prediction? w/ @harshays_ @andrewilyas, we introduce a framework for tackling this question!…
Congrats to @AIatMeta on Llama 3 release!! 🎉 ai.meta.com/blog/meta-llam… Notes: Releasing 8B and 70B (both base and finetuned) models, strong-performing in their model class (but we'll see when the rankings come in @ @lmsysorg :)) 400B is still training, but already encroaching…
A belated post from some of our group (and friends @sainingxie) attending NYC vision day earlier this month!
Holy shit lol
These numbers are insane. I can't even imagine what the larger one(s) will be. Looks like Mistral 7B might be dead as of today though, and maybe even sonnet lol My favorite is the huge gains in coding capabilities
Normalization is an important supporting actor in visual perception (besides convolutional feature extraction). Normalization is what you need interneurons for.
TL;DR New preprint with discovery: many interneuron types in the fly visual system function as highly specific normalizers. x.com/SebastianSeung…
Our computer vision textbook is released! Foundations of Computer Vision with Antonio Torralba and Bill Freeman mitpress.mit.edu/9780262048972/… It’s been in the works for >10 years. Covers everything from linear filters and camera optics to diffusion models and radiance fields. 1/4
Excellent book! Ordered one and will try to get signatures from three authors in person ;)
We be cooking with $1.25B in fresh capital to fund the AI infra of the future 🤖🏗️ Come build with @martin_casado @BornsteinMatt @JenniferHli @stuffyokodraws @appenz @rajko_rad @zanelackey @satishtalluri @Mascobot @bhorowitz @pmarca me & the entire @a16z team
We've raised a $1.25B infrastructure fund! We love all infra, compute, network, storage, databases, data science, gen AI, dev tools ... from silicon to UIs. Infra is the true root of value in tech. And we're deepening our commitment to it. a16z.com/new-funds-new-…
One of the most fun parts for me has been making visualizations. To give a sample, here are a few, showing 1) embeddings layer by layer in an MLP, 2) weight sharing in a CNN, 3) a diffusion model, 4) an image captioning system 2/4
@jeremyphoward as the semi-serious joke goes: if they published it, it's not in gemini
✨Excited to finally drop our new paper: SSMs “look like” RNNs, but we show their statefulness is an illusion🪄🐇 Current SSMs cannot express basic state tracking, but a minimal change fixes this! 👀 w/ @jowenpetty, @Ashish_S_AI arxiv.org/abs/2404.08819
Very great study! This is a much more comprehensive analysis into the 3d/geometric awareness of vision models compared to our telling-left-from-right, while we focus more on correspondence scenarios and how to improve it.
Google presents Probing the 3D Awareness of Visual Foundation Models Visual foundation models can learn representations that encode the depth and orientation of the visible surface but struggle with multiview consistency possibly because they are learning view-dependent…
LLM Fatigue is a variation of Decision fatigue for AI; every day there's a new release, so you stick to the familiar 3 names despite merits
After two good years at Microsoft Research AI4Science, I am very excited to announce that as of this month I have, together with Chad Edwards, co-founded a new startup in the field of molecular and materials discovery.
Looks like not a single one of model releases from academia is notable…
The AI Index editors chose “the most notable model releases of 2023”. 9 of the 15 were Large Language Models. 3 more involved language: text to image and speech models. 2 image models and a watermarking model brought up the rear. aiindex.stanford.edu/report/ #NLProc #BiasedTakes
This is an interesting, timely, and important paper. The takeaway is that "recent self-supervised models such as DINOv2 learn representations that encode depth and surface normals, with StableDiffusion being a close second". This contrasts with vision-language models like CLIP,…
Google announces Probing the 3D Awareness of Visual Foundation Models Recent advances in large-scale pretraining have yielded visual foundation models with strong capabilities. Not only can recent models generalize to arbitrary images for their training task, their
Great work & we're just starting! There’s a rich history in scene perception that’s worth exploring. Over the next few years, my research will focus on connecting these theories to today’s large models -- let's not rush to use that 'F' word just yet 😉🤓 x.com/anand_bhattad/…
Probing the 3D Awareness of Visual Foundation Models @_mbanani, Amit Raj, @kmaninis, Abhishek Kar, Yuanzhen Li, Michael Rubinstein, Deqing Sun, Leonidas Guibas, @jcjohnss, @jampani_varun tl;dr: DINOv2 rules, but read paper. SD is good for semantic cores arxiv.org/abs/2404.08636…