🚨 Thrilled to share our latest progress on computer use agents: ComputerRL, an end-to-end RL method that achieves a 48.1% success rate on the OSWorld benchmark with only a 9B open model, beating OpenAI Operator, Claude Sonnet 4.0, and other previous models for state-of-the-art performance.…
ComputerRL: Scaling End-to-End Online Reinforcement Learning for Computer Use Agents
"To support scalable and robust training, we develop a distributed RL infrastructure capable of orchestrating thousands of parallel virtual desktop environments to accelerate large-scale…
Introducing GLM-4.5 and GLM-4.5 Air: new flagship models designed to unify frontier reasoning, coding, and agentic capabilities.
GLM-4.5: 355B total / 32B active parameters
GLM-4.5-Air: 106B total / 12B active parameters
API Pricing (per 1M tokens):
GLM-4.5: $0.6 Input / $2.2…
Today we release DeepSeek-R1T-Chimera, an open weights model adding R1 reasoning to @deepseek_ai V3-0324 with a novel construction method.
In benchmarks, it appears to be as smart as R1 but much faster, using 40% fewer output tokens.
The Chimera is a child LLM, using V3s…
🥳 I'm releasing Rejax, a lightweight library of fully vectorizable RL algorithms!
⚡ Enjoy lightning-fast speed using jax.jit on the training function
🧬 Use vmap and pmap on hyperparameters
🔙 Log using flexible callbacks
🌐 Available @ github.com/kerajli/rejax
📸 Take a tour!
Sorry to hear that @jsuarez5341, Open RL Benchmark was also rejected from RLC, and we mostly feel the same way about review quality (LLM-generated?). Among other things, we read that "the meaning of "metrics" is never made clear", whereas we have a section dedicated to metrics,…
Which is the best RL agent on the Hub? Now you can find out, thanks to the Open RL leaderboard 🏆!
🧩 Features:
- Automatic evaluation of models on the 🤗 Hub
- Compatible with all torch-based RL libraries
- Supports 87 environments, with more to come 🔥
huggingface.co/spaces/open-rl…
Announcing the Reinforcement Learning Beyond Rewards workshop at the first @RL_Conference.
Think that rewards aren't enough for RL?
Working on RLHF? Thinking of alternative ways of alignment?
Creating a foundation model for RL, or have ideas on task-agnostic RL algorithms? Join us
23 Followers 236 Following
CS Ph.D. student researching RL at @iLZU1909. Founder & maintainer of the Deep Reinforcement Learning (Chinese Community) https://t.co/jwqyVc9Vc7
268 Followers 498 Following
PostDoc @PurdueEngineers @purdue_ie. Ph.D. @PurdueCS. Reinforcement Learning, Robotics, AI in Surgery. Interested in understanding Intelligence and Universe.
18K Followers 809 Following
Reinforcement Learner @periodiclabs. Adjunct Prof at McGill. Ex MSL Meta, DeepMind, Brain, Mila, IIT Bombay. NeurIPS Best Paper
320 Followers 557 Following
Assistant Professor @UM_DACS.
Opinions my own, but should be everyone's.
Anon feedback: https://t.co/dWtWvc41ha
https://t.co/TYaw5PODFv
14K Followers 3K Following
research @MIT_CSAIL @thinkymachines. work on scalable and principled algorithms in #LLM and #MLSys. in open-sourcing I trust 🐳. she/her/hers
4K Followers 276 Following
PhD-ing @UCBerkeley. Part-time @AnthropicAI. Part-time eater. Prev @Tsinghua_Uni.
Try to understand and control intelligence as a human.
80K Followers 1 Following
Democratizing AI research, education, and technologies. Learn how to build with AI in our new AI Academy: https://t.co/zQXQt0Pem8
6K Followers 559 Following
e/λ Currently: Doing some stuff with AI.
Prev founding team of both: @NousResearch and @TTSLabsAI
DM for interesting conversations.
19K Followers 1K Following
Agents @Meta MSL TBD Lab. previously posttraining research @OpenAI train LLMs to do things: deep research, chatgpt agent, etc. CS PhD @LTIatCMU
242 Followers 177 Following
PhD student at the University of Edinburgh, curious about how humans and AI co-exist and co-evolve. Previously at Cohere and Google DeepMind.
9K Followers 866 Following
building @vllm_project |
cs phd @ 🌁 uc berkeley |
machine learning system |
the real agi is the friends we made along the way