omar khaled @therealokai

AI enthusiast passionate about research and bridging the Arabic gap in artificial intelligence. And, God willing, I will do it. linkedin.com/in/dsomarkhale… The Universe Joined September 2023

Tweets

93
Followers

107
Following

1K
Likes

1K

Connor Davis @connordavis_ai

2 days ago

This one paper might kill the AI scaling hype. While Big Tech burns billions on massive datasets, researchers just achieved state-of-the-art agent performance using 78 samples. And it makes a scary amount of sense. Here's the full breakdown:

25 158 706 55K 858

آلاء. @2alaaaaaaaaaa2

a day ago

October is the month of falling in love. Me in October of 1998:

72 546 6K 255K 577

Dounia @buildwithDB

2 days ago

the closer I get to my MVP being done, the more I don’t want to ship it lol

260 25 752 32K 40

omar khaled @therealokai

2 days ago

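I remember my high school days (and the first two years of college), when some of the teachers and the college teaching assistants treated me weirdly as hell, and when I asked them why, they would say "that's just how it is, I don't like how you look," until I decided to give in and go to the barber.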

Mohamed Fathy @RAGNAR404_

2 days ago

9 2 656 80K 73

0 0 1 350 0

omar khaled @therealokai

2 days ago

Something like "To find out whether your account is strong or not, write your password in a comment; if it shows up as ****, your account is strong."

Ahmad @TheAhmadOsman

2 days ago

Something like "To find out whether your account is strong or not, write your password in a comment; if it shows up as ****, your account is strong."

216 448 9K 451K 1K

0 0 0 95 0

Rohan Paul @rohanpaul_ai

3 days ago

This Tencent paper shows a way to improve reasoning by training only on raw text using reinforcement learning. It is called Reinforcement Learning on Pre-Training data (RLPT) and it removes the need for human labels. Simple “predict the next segment” rewards are enough to…

8 40 191 16K 127

omar khaled @therealokai

3 days ago

"Rule number one, DON'T FKN SAY 'you are absolutely right!' and write code!"

Shreya Shankar @sh_reya

3 days ago

"Rule number one, DON'T FKN SAY 'you are absolutely right!' and write code!"

324 264 6K 252K 200

0 0 0 44 0

Omar Khattab @lateinteraction

2 years ago

DSPy and ColBERT are interesting academic experiments imo. Each is a multi-paper repo that has one coherent artifact, combining our latest research together. We typically release the features as open source—hence get users/feedback—well before writing a paper on the new ideas.

Jacques @JacquesThibs

2 years ago

0 0 1 30K 0

1 10 67 31K 19

Omar Khattab @lateinteraction

a year ago

Talking to grad students, too many think that long-term projects (not scattered papers), proper code releases, thoughtful benchmarks are "not incentivized". Most often they're mistaken. If we're talking incentives, *nothing* matches demonstrating impact! Will blog on this soon.

9 26 287 34K 51

Jerry Tworek @MillionInt

4 days ago

Science of RL optimization is likely humanity’s last open scientific problem

43 62 1K 118K 283

Dimitris Papailiopoulos @DimitrisPapail

4 days ago

Prediction: In ~3 years academia will be the most desirable place to do fundamental AI research
Contributing factors:
- small models improve/become significantly more impactful
- open weights community broadens its reach
- gpus continue to get faster & cheaper
- meaningful…

23 33 470 56K 200

anshuman @athleticKoder

5 days ago

You're in an ML Engineer interview at Meta, and the interviewer asks: "Why does RL work better than supervised learning for LLMs?" Here's how you answer:

14 38 974 88K 1K

Mahmoud Abo Elyazid 🇵🇸🔻 @mahmoud20825671

7 days ago

I was looking for courses on how to build a start-up, so that you understand what is going to happen and broaden your perspective. I came across this course and wanted to share it with you; maybe it will help someone start a new career, God willing ❤️

4 47 640 27K 780

机器之心 JIQIZHIXIN @jiqizhixin

6 days ago

Wow, a new post-training method.
SFT = efficient but capped 🚦
RL = powerful but slow 🐢
Now enter: Guess-Think-Answer (GTA)
GTA fuses guess (SFT), think (reflection), and answer (RL-shaped).
Result:
⚡ Faster convergence than RL
📈 Higher ceiling than SFT
🛠️ Gradient…

7 67 338 20K 272

Bashmohandes Mazen - بشمهندس مازن @BashmohandesM

7 days ago

You know where the problem is? Every time, you get to 80% … then you stop and go back to start over. That last 20% is where everything is.

1 24 183 4K 26

François Chollet @fchollet

3 years ago

"It's autocomplete" is not a helpful analogy to understand LLMs. An LLM is more like a database that lets you query information in natural language. You can query both knowledge, and "patterns" (associative programs seen in the training data, that can be applied to new inputs).

35 141 1K 385K 335

omar khaled @therealokai

6 days ago

“The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin. The bitter lesson is that building in human knowledge is a losing game in the long run.” – Sutton

omar khaled @therealokai

6 days ago

0 0 0 42 0

0 0 0 25 0

omar khaled @therealokai

6 days ago

"We should stop trying to find simple ways to think about the contents of minds, such as simple ways to think about space, objects, multiple agents, or symmetries." Richard Sutton

0 0 0 42 0

Nando de Freitas @NandoDF

7 days ago

Most RL for LLMs involves only 1 step of RL. It’s a contextual bandit problem and there’s no covariate shift because the state (question, instruction) is given. This has many implications, eg DAgger becomes SFT, and it is trivial to design Expectation Maximisation (EM) maximum…

21 62 708 93K 727

Dwarkesh Patel @dwarkesh_sp

a week ago

How does backprop work with RL? The virtue of backprop is that it updates EACH individual parameter in proportion to how much wiggling it affects the loss. This is only possible if you know how changing each parameter affects the loss function. But of course with RL this is…