I finally found an explanation for why RL is needed for RLHF that satisfied me. It's actually like playing board games.
The reward model can only judge a full answer, so a "critic" is needed to efficiently improve the intermediate moves (earlier tokens in the answer). 1/4
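A rough sketch of that idea, not taken from the thread itself: assume the reward model returns one score for the whole answer and a critic returns a value estimate for each token. Generalized advantage estimation (a common choice in PPO-style RLHF, swapped in here as an assumption) then spreads the single final score back over the earlier tokens. The function name `token_advantages` and the numbers are hypothetical.

```python
from typing import List


def token_advantages(
    values: List[float],
    final_reward: float,
    gamma: float = 1.0,
    lam: float = 0.95,
) -> List[float]:
    """Per-token advantages when only the last token is rewarded.

    values[t] is the critic's estimate of the final score after token t.
    final_reward is the reward model's score for the complete answer.
    """
    n = len(values)
    advantages = [0.0] * n
    next_value = 0.0  # nothing follows the last token, so its successor value is 0
    running = 0.0
    for t in reversed(range(n)):
        # Sparse reward: every intermediate token gets 0, the last gets the score.
        reward = final_reward if t == n - 1 else 0.0
        # TD error: how much better this step turned out than the critic expected.
        delta = reward + gamma * next_value - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages


# Hypothetical three-token answer scored 1.0 by the reward model.
example = token_advantages([0.2, 0.5, 0.9], final_reward=1.0, lam=1.0)
```

Without the critic, every token would share the same final score as its only learning signal. With it, each token gets credit relative to what was already expected at that point, much like evaluating a single move by how it changed the position rather than by who won the game.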
I have a recommended reading list for Artificial Intelligence, and it hasn't changed since 2019. I give this list to my grad students, but all of the articles are broadly accessible if you're interested. Very short 🧵.
With the rise of RLHF thanks to ChatGPT, I remembered this famous slide by @ylecun. It is funny to think that in this use case the cherry seems to be a key ingredient to enjoy the full cake!
Are you into machine learning?
From 9:00 am to 5:00 pm every day, I try to improve the world with my work. Right after that, I come to Twitter and write about everything I've learned.
If you are into machine learning, say hello, and let's keep building this community!
"You can't use an algorithm unless you understand how it works."
That's what many people say. But I don't believe it.
This is how you can build expertise: ↓