Rasool Fakoor @rasoolfa

Research in RL & ML. rasoolfa.github.io Joined December 2012

Tweets

602
Followers

390
Following

935
Likes

2K

Rasool Fakoor @rasoolfa

2 months ago

The application closes on Tuesday (8/12). If you are interested, please apply and don't wait until the last minute.

0 0 0 179 0

Our team is *hiring* interns & researchers! We’re a small team of hardcore researchers & engineers working on foundation models, agentic methods, and embodiment. If you have strong publications and related experience, please fill out the application form. forms.gle/4bUeFfksUhCLap…

1 3 14 2K 5

Robert Yang @GuangyuRobert

2 months ago

Our Excel Agent, Shortcut, is generally available now! Greatly improved trustworthiness & accuracy. ~90% win rate against top first-year analysts. 26 days since early access, 28 versions shipped. So proud of the team, and really appreciate all the feedback from our users!

10 17 233 39K 148

nico @nicochristie

2 months ago

Shortcut – the first superhuman Excel agent – is live. While not perfect, Shortcut beats first-year analysts from McKinsey/Goldman head-to-head 89.1% (220:27) when blindly judged by their managers. We even gave humans 10x more time. Try Shortcut now (before your boss does).

314 706 8K 3.6M 11K

Robert Yang @GuangyuRobert

4 months ago

Many of you have known us as Altera. Today, I'm happy to share that we are now officially @Fundamental Research Labs! We will be unveiling our next big step today, so it felt perfect to reintroduce ourselves: digitalhumanity.substack.com/p/introducing-…

6 10 104 13K 33

Tianwei Ni @twni2016

5 months ago

Can we make LLMs reason effectively without a huge inference time cost? We show a powerful approach through learning and forgetting! Our recipe: 1️⃣ Aggregate reasoning paths from diverse sources: Chain-of-Thought, inference-time search (Tree-of-Thought, Reasoning-via-Planning),…

0 6 23 2K 7

Ke Yang @EmpathYang

8 months ago

Excited to announce that our web agent paper, AgentOccam, has been accepted to ICLR 2025! 🏂🏂🏂 Huge thanks to all collaborators! 😊 Special thanks to my brilliant and considerate mentor, Yao @yaoliucs, for your constant guidance and encouragement! Sapana @Sapana_007 and Rasool…

0 6 16 1K 2

Ke Yang @EmpathYang

12 months ago

👾 Introducing AgentOccam: Automating Web Tasks with LLMs! 🌐 AgentOccam showcases the impressive power of Large Language Models (LLMs) on web tasks, without any in-context examples, new agent roles, online feedback, or search strategies. 🏄🏄🏄 🧙 Link: arxiv.org/abs/2410.13825…

3 28 60 11K 37

Jesse Zhang @Jesse_Y_Zhang

a year ago

I’ll be presenting this work at CoRL 2024 in about a month. Let’s chat about sample-efficient robot adaptation! Website: jessezhang.net/projects/extra… Paper: arxiv.org/abs/2406.17768 Coauthors: @MinhoHeo, @LiuZuxin, @ebiyik_, @JosephLim_AI, @yaoliucs, @rasoolfa

0 2 5 546 2

Jesse Zhang @Jesse_Y_Zhang

a year ago

How can robots efficiently learn **new tasks/in new settings**? Introducing EXTRACT: a reinforcement learning (RL) framework that extracts a discrete + continuously parameterized skill library from offline data for efficient RL on new tasks! Accepted to CoRL 2024: 🧵👇

5 36 128 15K 60

Alex Smola @smolix

a year ago

Proud to release the first LLM from @boson_ai. Higgs-Llama-3-70B, built for characters and gameplay, trained on Boson-3 base. With great MMLU-Pro performance. boson.ai/higgs-opensour…

1 9 50 8K 9

Rasool Fakoor @rasoolfa

2 years ago

Our team at AWS is *hiring* interns and full-time researchers! @yaoliucs, @pratikac, I, and others work on RL, alignment, large models, and ML in general. If you have strong relevant publications in those areas, please fill out this form. forms.gle/5KsNZ1zyKArLF4…

0 4 24 11K 34

Yao Liu @yaoliucs

2 years ago

Offline RL is much harder than online RL or imitation learning as it needs to solve a sequence of counterfactual reasoning problems. That often gives an error of (1+\delta)^H, where \delta is the one-step divergence of policy or extrapolation of Q and H is the horizon. 1/N

1 2 24 3K 9

Yao Liu @yaoliucs

2 years ago

One common misconception about (deep) RL is that it is done by first defining some empirical loss as the objective and then deriving model update rules from GD, just like supervised learning. This is NOT the case for popular RL algorithms like policy gradient or TD-based methods. 1/N