📢 Excited to share that our joint work with @MakarandTapaswi's lab, led by two brilliant @iiit_hyderabad undergrads @Prajneya and @_eshk_, comparing attention patterns in models and humans viewing videos for a memorability task has been accepted to WACV2025. 🧵👇 1/15
Can RL fine-tuning endow MLLMs with fine-grained visual understanding?
Using our training recipe, we outperform SOTA open-source MLLMs on fine-grained visual discrimination with ClipCap, a mere 200M param simplification of modern MLLMs!!!
🚨Introducing No Detail Left Behind:…
🚨 Introducing Detect, Describe, Discriminate: Moving Beyond VQA for MLLM Evaluation.
Given an image pair, it is easier for an MLLM to identify fine-grained visual differences during VQA evaluation than to independently detect and describe such differences 🧵(1/n):
820 Followers 2K Following
NLP/Code Generation PhD at FAIR (Meta AI) and INRIA - previously researcher at Stanford University - MS Stanford 22’ - Centrale Paris P2020
235 Followers 316 Following
Research Scientist Internship @AdobeResearch | PhD at Imagine (ENPC) and Willow (Inria) under the supervision of Gül Varol and Cordelia Schmid.
322 Followers 1K Following
Research focused on exploring various aspects of first-person (egocentric) vision. Working with Prof. @dimadamen at @BristolUni.
394 Followers 424 Following
PhD Student at Max Planck Institute. Past @iiit_hyderabad @VectorInst. Interested in better evals, forecasting, and open-endedness.
15K Followers 50 Following
EMNLP 2025 - The 2025 Conference on Empirical Methods in Natural Language Processing, 2025
Hashtag: #EMNLP2025
Dates: November 5-9
Submission Deadline: May 19th
16K Followers 491 Following
Associate Professor @UofT, Vice President of AI Research @nvidia, founding member of @VectorInst. Computer vision, deep learning, 3D. Opinions are my own.
10K Followers 699 Following
Professor of Computer Vision, @BristolUniEng. Senior Research Scientist @GoogleDeepMind - passionate about the temporal stream in our lives.
3K Followers 54 Following
I am a professor of computer science at the Technical University of Darmstadt, focusing on AI-based motion capture and synthesis of digital humans.
6K Followers 435 Following
Assistant Professor @ University of Washington, Co-Director of RAIVN lab (https://t.co/f0BWKyjoeA), Director of PRIOR team (https://t.co/l9RzTesMSM)
7K Followers 245 Following
Director, GenAI Research @Meta. Tech Lead of Movie Gen Meta. Past: MIT TR35, Llama3, Emu Video, ImageBind, DINO. Tweets and opinions are my own.
3K Followers 896 Following
Professor for CS at the Tuebingen AI Center @uni_tue and affiliated Professor at MIT-IBM Watson AI lab @MITIBMLab - Multimodal learning and video understanding
655 Followers 128 Following
🇫🇷 PhD student in computer vision @ImagineEnpc
Interests: deep learning, 3D reconstruction, 3D scene understanding and 3D scene rendering.
119 Followers 132 Following
Postdoc in Narrative Modeling @ Pioneer Institute for AI, in Copenhagen; PhD in Computer Vision; conceptual artist; tortured-philosopher; ex-poet
2K Followers 956 Following
Assistant Professor for Comp. Vision @UvA_Amsterdam
3D human bodies+hands from images.
Past: Res. Scientist at @MPI_IS, PhD at @UniBonn, Diploma @Aristoteleio.
226 Followers 11 Following
I am an ELLIS PhD student in the IMAGINE computer vision team of École des Ponts ParisTech (ENPC) and in the Perceiving Systems Department of MPI
16K Followers 307 Following
Teaching AI to see, model, and interact with our 3D world. Assistant Professor @ MIT, leading the Scene Representation Group (https://t.co/h5gvhLYrtw).