Shanghai AI Lab, General Vision Team. We created InternImage, BEVFormer, VideoMAE, LLaMA-Adapter, Ask-Anything, and many more! [email protected] · github.com/OpenGVLab · Shanghai · Joined January 2023
Top AI Papers of The Week (September 15-21)
- OmniWorld: Multi-Domain 4D World Modeling
- ScaleCUA: Scaling Cross-Platform Agents by @opengvlab
- WebWeaver: Dynamic Outlines for Deep Research
- Scaling Agents via Continual Pre-training
- FlowRL: Matching Reward Distributions for…
ScaleCUA: Master GUIs across 6 OS with our new open-source agent!
This VLM-powered agent sets new SOTA, trained on a massive dataset of 6 OS and 3 task domains, enabling seamless cross-platform operation. Researchers, explore its capabilities!
ScaleCUA 🔥 computer-use agents with cross-platform data, released by @opengvlab
Paper: huggingface.co/papers/2509.15…
Model: huggingface.co/collections/Op…
✨ 3B/7B/72B - Apache2.0
✨ Two modes: Direct Action & Reasoned Action (agent)
✨ Trained on 6 OS + 3 domains with a dual-loop…
🥳We have released #InternVL3, an advanced #MLLM series ranging from 1B to 78B, on @huggingface.
😉InternVL3-78B achieves a score of 72.2 on the MMMU benchmark, setting a new SOTA among open-source MLLMs.
☺️Highlights:
- Native multimodal pre-training: Simultaneous language and…
🚀 Introducing MM-Eureka Series - A Breakthrough in Multimodal Reasoning with Visual Aha Moments!
✨ Reproduced R1-Zero and Visual Aha-Moment Phenomena
🧠 Trained on only 0.05% of the data used for base models, it achieves comparable benchmark math reasoning performance to…
🚀 Introducing #InternVideo 2.5 - The Video Multimodal AI That Sees Longer & Smarter!
✨ Handles videos 6x longer than predecessors
✨ Pinpoints objects/actions with surgical precision
✨ Trained on 300K+ hours of diverse video data
📈 Outperforms SOTA on multiple benchmarks &…
🥳Mini-InternVL has been accepted by Visual Intelligence! The Mini-InternVL series of #MLLMs, with parameter counts ranging from 1B to 4B, achieves 90% of the performance using only 5% of the parameters. This significant efficiency and performance boost makes our model more accessible…
People are paying more and more attention to the quality and details of generated videos. Use a single hand-tuned temperature parameter to enhance your generated video for free! Nice work with our amazing friends @YangL_7, @oahzxl, @shaowenqi126301, @VictorKaiWang1, @VITAGroupUT,…
🥳We have released InternVL2.5, ranging from 1B to 78B, on @huggingface.
😉InternVL2_5-78B is the first open-source #MLLM to achieve over 70% on the MMMU benchmark, matching the performance of leading closed-source commercial models like GPT-4o.
🤗HF Space:…
The tech report is worth reading. It reveals many details about how InternVL 1.5, InternVL 2.0, and now InternVL 2.5 have consistently been the best open-source #vlm foundation models.
huggingface.co/papers/2412.05…