Saketh Rambhatla @rssaketh
PhD student at University of Maryland, College Park rssaketh.github.io College Park, MD Joined October 2010
Tweets 79
Followers 197
Following 584
Likes 7K
Inference time objectives are amazing :) We show that LLMs can be upgraded to multimodal beings by a simple trick :) No training needed! Works on image generation, editing, style transfer and more!
Super excited to share some recent work that shows that pure, text-only LLMs, can see and hear without any training! Our approach, called "MILS", uses LLMs with off-the-shelf multimodal models, to caption images/videos/audio, improve image generation, style transfer, and more!
Super cool to see transformers scaling so effectively for image/video autoencoders! Our model also offers a flexible way to implement variable token length
How can we better animate images solely following text descriptions? We present Motion Focal Loss (MotiF) (arxiv.org/abs/2412.16153) to better align motions with text descriptions in the text-image-to-video (TI2V) task and release TI2V-Bench, a comprehensive TI2V benchmark. (1/n)
Flow matching can transform one distribution to another. So why do text-to-image models map noise to images instead of directly mapping text to images? Wouldn't it be cool to directly connect modalities together? CrossFlow accomplishes exactly that! cross-flow.github.io
How can we make Imitation Learning generalize? In my latest work we show that a keypoint-based representation can generalize to novel instances of an object and is agnostic to background changes.
🚨 Internship in Meta GenAI NYC 🚨 I have an open PhD internship position for 2025! Interested in exploring visual generative models (or any other exciting ideas) inside the team that brought you Movie Gen and Emu Video? 📩 Send me a DM with CV, website, and GScholar profile
Meta Movie Gen is just freakin cool! Generative Video Foundation models with this quality, precise editing and personalization unlock value for creators, new creative tools and enable Agents that can interact in richer ways closing the loop on learning to unlock world models!
I’m thrilled and proud to share our model, Movie Gen, that we've been working on for the past year, and in particular, Movie Gen Edit, for precise video editing. 😍 Look how Movie Gen edited my video! https://t.co/0YawTGo217
Lights, camera, action - introducing Meta's Movie Gen! Our latest breakthrough in AI-powered media generation, setting a new standard for immersive AI content creation. We're also releasing a 92 page detailed report of what we learned, along with evaluation prompts that we hope…
Check out Movie Gen 🎥 Our latest media generation models for video generation, editing, and personalization, with audio generation! 16 second 1080p videos generated through a simple Llama-style 30B transformer. Demo + detailed 92 page technical report 📝⬇️
And not just the paper, early next week we'll be releasing our full evaluation sets - the field of media generation would really benefit from having canonical benchmarks. Stay tuned!
So, this is what we were up to for a while :) Building SOTA foundation models for media -- text-to-video, video editing, personalized videos, video-to-audio One of the most exciting projects I got to tech lead at my time in Meta!
So proud to be part of the Movie Gen project, pushing GenAI boundaries! Two key insights: 1. Amazing team + high-quality data + clean, scalable code + general architecture + GPUs go brr = SOTA video generation. 2. Video editing *without* supervised data: train a *single* model…
Hi friends, say hello to Movie Gen. Over the past couple of months, we've been working hard behind the scenes to bring you the latest advancements in video generation. Movie Gen not only packs text-to-video capability, but also comes with video personalization, editing, and…
And here is the most exciting model we have been working on with special capabilities in text-to-video generation, video personalization, editing, and audio generation! Plus, an invaluable tech report released! Welcome to the world, Movie Gen!
We released 92 pages worth of detail including how to benchmark these models! Super critical for the scientific progress in this field :) We'll also release evaluation benchmarks next week to help the research community 💪
📢 Point tracking 🤝 action recognition at #ECCV2024 We've set the new SoTA of few-shot action recognition by harnessing motion data from point tracking and semantic features from SSL. Curious? Visit Poster #203 Thursday AM to see the future of action recognition🔥. Details:🧵
Website: cs.umd.edu/~pulkit/tats/ Work done in collaboration with @namithap10, Luke Luo, @rssaketh and @abhi2610. 3/3
🚀 Excited to share InstanceDiffusion @cvpr2024! It adds precise instance-level control for image gen: free-form text conditions per instance and diverse location specs—points, scribbles, boxes & instance masks Code: shorturl.at/dtxSW arXiv: shorturl.at/rQS14 1/n

KittyFranklin @COXday7230v2EpP
101 Followers 2K Following
Sukriti Paul @sukritiollie
534 Followers 604 Following PhD @umdcs @ml_umd || Prev @Nonexomics, @AmericanExpress & IISc || Google WTM, CSRMP, GHC, and ACM-W Scholar. || She/Her. viewsOwn()
Steven (Shaobo) Wang @ShaoboWang6
395 Followers 1K Following Ph.D Candidate @sjtu1896, Intern @Alibaba_Qwen. Exploring Data-Centric AI on LLMs, MLLMs, including data synthesis/pruning/distillation/attribution.
Mele @meleawi
102 Followers 2K Following AI, Computer Vision, deep learning, and Autonomous System Motion Planning and Control Software Engineer.
Anh Nguyen (Aengus) @aengusng8
118 Followers 2K Following Son & brother; AI Research Resident @Qualcomm; Contributor @huggingface🤗; Prev: @VinAI_Research. Update gradients in generative dimensions of computer vision.
Ollin Boer Bohan @madebyollin
3K Followers 2K Following Made sdxl-vae-fp16-fix, taesd, that pokemon-emulation-via-dnn thing.
praveen penumaka @praveenpenumaka
133 Followers 415 Following
miru @miru_why
1K Followers 1K Following 3e-4x engineer, unswizzled wagmi. specialization is for warps
Elausterio T Ferreira @Elausterio97035
48 Followers 957 Following
neeks @neeksww
285 Followers 590 Following a cluster of several atoms making a unique me - curious about nature and the signals (and systems) which make it. More here: https://t.co/6bsq8xhtTG
István Kerek @istvankerek
392 Followers 7K Following University Lecturer, Founder of the ChatGPT Hungarian Facebook Group and @ai2knowit, AI Business Development Expert
Crosesez @CrosesezHvFQ7n
24 Followers 38 Following
J @Jstl2bw
15 Followers 673 Following
Chen Sun @jesu9
2K Followers 498 Following Assistant Professor @BrownCSDept; Part-time Research Scientist @GoogleDeepMind. Opinions are my own.
Shijie Wang @ShijieWang20
200 Followers 426 Following Multimodal learning | CS PhD student @BrownUniversity, ex-Intern @GoogleDeepMind @meta, BS @Tsinghua_Uni.
Mara Levy @mlevy1221
80 Followers 104 Following PhD student @umdcs. | Excited to make robots work in the real world!
Fiona @tashethee72832
74 Followers 7K Following Don't wait for a leader; do it alone, one person at a time.
QuantiPhy @DebrupPaul2946
17 Followers 486 Following
Yuval Kirstain @YKirstain
704 Followers 659 Following Research Scientist @Meta | Building GenAI capabilities
Adam Polyak @adam_polyak90
159 Followers 246 Following
Kevin Chih-Yao Ma @chihyaoma
608 Followers 243 Following Building multimodal foundation models @MicrosoftAI | Past: a lead IC & babysitter of Meta's MovieGen, Emu, Imagine, ...
MagHolmes @JCQfep50n0H2He
51 Followers 7K Following
Shraman Pramanick @Shramanpramani2
207 Followers 541 Following PostDoc @AIatMeta Ph.D. @JohnsHopkins | Interned @AIatMeta FAIR, GenAI, @google GDM | Multimodal LLMs
Andrew white @Andreww95636515
135 Followers 2K Following 3d modeling. Gaussian splatting, NeRF, Diffusion models, GANs.
Silas Walkotte @SilasWalkotte
0 Followers 54 Following
jaiswati @jaiswati
22 Followers 450 Following
Jonas Gottschalk @JoSGottschalk
67 Followers 573 Following Building generative AI Solutions | Partner @Deyan7 GmbH & Co. KG | Hiring smart people who are passionate about developing effective digital solutions
Chris Chiasson @ChrisPChiasson
136 Followers 3K Following
Chester Jungseok Roh @chester_roh
4K Followers 2K Following Chester Jungseok Roh / BFACTORY Founder & CEO
Praveen @pravnx
453 Followers 4K Following Software Engineer; Interests: HPC, AI, Product Management, Entrepreneurship
prafulk @prafulk
242 Followers 4K Following
Guilherme @gpmarques1993
36 Followers 873 Following
Filip Kučera @PonekudOnekom
287 Followers 3K Following pronouns: e/acc; *manifesting*; PhD in mechinterp to debias VLMs @Uni_WUE, prev @CVUTPraha
Scott Nguyen @ScottNguye81334
0 Followers 11 Following
Dan @danredblack
60 Followers 2K Following
Maheedhar Gunturu @Vanguard_space
1K Followers 5K Following Mahee is a father, technologist, and a builder - formerly @aws @zscaler @smartthings @scylladb @mapr @VoltActiveData @qualcomm
iman jenabzadeh @imanjenabzadeh
46 Followers 1K Following Awandering with other wanderers in this wonderful world
Sergey Gulyaev @sergeygulyaev
62 Followers 139 Following #Mobile #Telecommunications #Marketing #Pricing #CRM
Nihaar Shah @non_gaussian
215 Followers 3K Following working on AI for new wearables @meta priors: @oxengsci @columbia @geresearch
Chris @chris___sun
324 Followers 2K Following Maker of https://t.co/yGT8C4DRH0 | exAI Scientist & Software Engineer @Microsoft | Get your shit done, nobody cares your tech stack. Learning investment.
Jonathon Luiten @JonathonLuiten
4K Followers 2K Following Head of Volumetric 3D Video at Meta Prev Projects: Hyperscape, MapAnything, Dynamic 3D Gaussian Splatting, SplaTAM, HOTA +more Prev PhD at RWTH + CMU + Oxford
Jascha Sohl-Dickstein @jaschasd
26K Followers 719 Following Member of the technical staff @ Anthropic. Most (in)famous for inventing diffusion models. AI + physics + neuroscience + dynamics.
Lilian Weng @lilianweng
167K Followers 167 Following Co-founder of Thinking Machines Lab @thinkymachines; Ex-VP, AI Safety & robotics, applied research @OpenAI; Author of Lil'Log
Ideogram @ideogram_ai
66K Followers 0 Following Turn your ideas into creative graphic designs, in a matter of seconds. What will you create?
Divya Kothandaraman @DivyaKRaman1
610 Followers 314 Following Sr. Researcher @Dolby. GenAI and Multimodal Learning. Earlier CS PhD, Univ. of Maryland @umdcs, @GoogleDeepMind @AdobeResearch @iitmadras
A Jabri @ajabri
4K Followers 283 Following research @ msl – 🇦🇺🇱🇧🇨🇳- ex @openai @berkeley_ai @princeton
Susan Zhang @suchenzang
34K Followers 661 Following @ Google Deepmind. Past: @MetaAI, @OpenAI, @unitygames, @losalamosnatlab, @Princeton etc. Always hungry for intelligence.
Ollin Boer Bohan @madebyollin
3K Followers 2K Following Made sdxl-vae-fp16-fix, taesd, that pokemon-emulation-via-dnn thing.
Deepti @deeptigp
1K Followers 1K Following Asst. Professor in Computer Vision @ BU; Researcher @runwayml; ex-researcher @MetaAI, @UTCompSci , @iiit_hyderabad
Shijie Wang @ShijieWang20
200 Followers 426 Following Multimodal learning | CS PhD student @BrownUniversity, ex-Intern @GoogleDeepMind @meta, BS @Tsinghua_Uni.
Devi Parikh @deviparikh
26K Followers 211 Following Co-CEO @yutori_ai. Join the waitlist at https://t.co/zD3StYi8db.
Zhuang Liu @liuzhuang1234
11K Followers 1K Following Assistant Professor @PrincetonCS. researcher in deep learning, vision, models. previously @MetaAI, @UCBerkeley, @Tsinghua_Uni
Yuval Kirstain @YKirstain
704 Followers 659 Following Research Scientist @Meta | Building GenAI capabilities
Demis Hassabis @demishassabis
495K Followers 152 Following Nobel Laureate. Co-Founder & CEO @GoogleDeepMind - working on AGI. Solving disease @IsomorphicLabs. Trying to understand the fundamental nature of reality.
Nando de Freitas @NandoDF
105K Followers 788 Following Writing my own AI story. Recent: NPI, AlphaGo tuning, learn to learn, AlphaCode, Gato, ReST, r-Gemma, Imagen3, Veo, Genie, MAI …
Shelly Sheynin @ShellySheynin
1K Followers 209 Following Research Scientist @AIatMeta; Working on Media Generation; Meta Movie Gen, Emu Edit, Make-a-Video 3D, KNN Diffusion, Make-A-Scene
Wei-Ning Hsu @mhnt1580
2K Followers 133 Following Research Scientist @ Meta FAIR / audio generation, self-supervised learning, speech processing
Roshan Sumbaly @rsumbaly
2K Followers 750 Following Senior Director of AI, @AIatMeta - Llama & Movie Gen. Prior life @coursera, @linkedIn, @stanford
Adam Polyak @adam_polyak90
159 Followers 246 Following
Kevin Chih-Yao Ma @chihyaoma
608 Followers 243 Following Building multimodal foundation models @MicrosoftAI | Past: a lead IC & babysitter of Meta's MovieGen, Emu, Imagine, ...
Nathan Lambert @natolambert
57K Followers 858 Following Figuring out AI @allen_ai, open models, RLHF, fine-tuning, etc Contact via email. Writes @interconnectsai Wrote The RLHF Book Mountain runner
Yoshua Bengio @Yoshua_Bengio
26K Followers 211 Following Working towards the safe development of AI for the benefit of all @UMontreal, @LawZero_ & @Mila_Quebec A.M. Turing Award Recipient and most-cited AI researcher.
Ranjay Krishna @RanjayKrishna
6K Followers 435 Following Assistant Professor @ University of Washington, Co-Director of RAIVN lab (https://t.co/f0BWKyjoeA), Director of PRIOR team (https://t.co/l9RzTesMSM)
Tanmay Gupta @tanmay2099
2K Followers 545 Following Senior Research Scientist @allen_ai (Ai2) | Developing the science and art of multimodal AI agents | Prev. CS PhD, UIUC and EE UG, IIT Kanpur
Laurens van der Maate... @lvdmaaten
4K Followers 2K Following Member of Technical Staff at Anthropic. Ex-Meta. t-SNE. Llama 3. DenseNet. Web-scale weakly supervised vision. CrypTen.
Jürgen Schmidhuber @SchmidhuberAI
165K Followers 0 Following Invented principles of meta-learning (1987), GANs (1990), Transformers (1991), very deep learning (1991), etc. Our AI is used many billions of times every day.
Berkeley AI Research @berkeley_ai
228K Followers 379 Following We're graduate students, postdocs, faculty and scientists at the cutting edge of artificial intelligence research.
Lucas Beyer (bl16) @giffmana
110K Followers 524 Following Researcher (now: Meta. ex: OpenAI, DeepMind, Brain, RWTH Aachen), Gamer, Hacker, Belgian. Anon feedback: https://t.co/xe2XUqkKit ✗DMs → email
Neha Kalibhat @NehaKalibhat
2K Followers 576 Following research @GoogleDeepMind | PhD @umdcs | safety and interpretability | she/her
Ajay Jain @ajayj_
7K Followers 4K Following Co-founder @genmoai. Co-created denoising diffusion (DDPM), DreamFusion, Dream Fields. Ex Ph.D. @berkeley_ai, @googleai, @facebookai, @nvidiaai, @mit
Samaneh Azadi @smnh_azadi
956 Followers 110 Following Research Scientist @GenAI @Meta . Ph.D. graduate from Berkeley AI Research and former intern @GoogleBrain and @AdobeResearch.
Quentin Duval @quduval
380 Followers 318 Following Research Engineer in Artificial Intelligence at Meta, Software Engineer and Functional Programming enthusiast.
XuDong Wang @XDWang101
1K Followers 659 Following Research Scientist @AIatMeta | PhD from @Berkeley_AI @UCBerkeley | Prev.: @GoogleDeepMind, FAIR @MetaAI
Andrew Brown @Andrew__Brown__
3K Followers 482 Following Research Scientist GenAI NY @AIatMeta working on video generation (Meta Movie Gen) | PhD @Oxford_VGG with Andrew Zisserman, Previously @oxengsci
ICLR 2026 @iclr_conf
53K Followers 55 Following International Conference on Learning Representations #ICLR2026. SPC is @BharathHarihar3 and GC is @cvondrick
paul @paul_okewunmi
1K Followers 4K Following ML/AI Engineer | MLH Fellow'23 @ Meta | Drone Hobbyist
Center for AI Safety @ai_risks
7K Followers 4 Following Reducing societal-scale risks from AI. https://t.co/5I9YG8IZa7 https://t.co/u91FCIyeSV
toly 🇺🇸 @aeyakovenko
634K Followers 6K Following Co-Founder of Solana Labs. Award winning phone creator. NFA, don’t trust me, mostly technical gibberish. https://t.co/LomgbTpb6h
Kosta Derpanis @CSProfKGD
69K Followers 197 Following #CS Assoc Prof @YorkUniversity, #ComputerVision Scientist Samsung #AI, @VectorInst Faculty Affiliate, TPAMI AE, @ELLISforEurope Member #ICCV2025 Publicity Chair
Chelsea Finn @chelseabfinn
83K Followers 399 Following Asst Prof of CS & EE @Stanford Co-founder of Physical Intelligence @physical_int PhD from @Berkeley_EECS, EECS BS from @MIT
Judea Pearl @yudapearl
80K Followers 279 Following Student of causal inference, human reasoning, and history of ideas, all viewed through the sharp lens of artificial intelligence.
Thomas G. Dietterich @tdietterich
58K Followers 625 Following Distinguished Professor (Emeritus), Oregon State Univ.; Former President, Assoc. for the Adv. of Artificial Intelligence; Robust AI & Comput. Sustainability