Bro if this dumbass can get millions of dollars for antisocial vaporware that no one uses ($1k per one-time user) then surely I can just convince some VC to give me like $1M to hire ML researchers for an experiment, right? Is that a thing I can do?
Is it a viable business…
Can you just take a coarse MoE, split each expert into `n` parts, multiply the number of active experts by `n` (or slightly less than `n`), modify the router appropriately, and then continue pretraining from there?
More active experts tends to be Pareto-optimal so yes…
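A minimal sketch of what that split could look like at initialization, under assumptions that are not in the thread: each expert is a plain two-layer ReLU FFN, the router is a linear layer, and gates are a softmax over the selected top-k logits. All names (`moe_forward`, `split_experts`, `out_scale`) are hypothetical. Each expert's hidden units are cut into `n` chunks, the router row is copied `n` times, and the output is scaled by `n` because every copy ends up with `1/n` of the original gate weight.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def moe_forward(x, router_w, w_in, w_out, top_k, out_scale=1.0):
    """Toy MoE layer for one token.

    x: (d,), router_w: (E, d), w_in: (E, h, d), w_out: (E, d, h).
    Gates are a softmax over the selected top-k logits (an assumption).
    """
    logits = router_w @ x
    top = np.argsort(-logits, kind="stable")[:top_k]
    gates = np.exp(logits[top] - logits[top].max())
    gates /= gates.sum()
    y = np.zeros_like(x)
    for g, e in zip(gates, top):
        y += g * (w_out[e] @ relu(w_in[e] @ x))
    return out_scale * y

def split_experts(router_w, w_in, w_out, n):
    """Split every expert into n sub-experts along its hidden dimension."""
    E, h, d = w_in.shape
    assert h % n == 0, "hidden size must be divisible by n"
    # Rows of w_in are hidden units, so a plain reshape chunks them.
    new_w_in = w_in.reshape(E * n, h // n, d)
    # Columns of w_out are hidden units: chunk them, then move the chunk axis forward.
    new_w_out = (
        w_out.reshape(E, d, n, h // n).transpose(0, 2, 1, 3).reshape(E * n, d, h // n)
    )
    # Each sub-expert inherits its parent's router row.
    new_router_w = np.repeat(router_w, n, axis=0)
    return new_router_w, new_w_in, new_w_out

# Check that the split model reproduces the coarse one before any further training.
rng = np.random.default_rng(0)
E, d, h, k, n = 4, 8, 16, 2, 4
x = rng.normal(size=d)
router_w = rng.normal(size=(E, d))
w_in = rng.normal(size=(E, h, d))
w_out = rng.normal(size=(E, d, h))

coarse = moe_forward(x, router_w, w_in, w_out, top_k=k)
fine = moe_forward(x, *split_experts(router_w, w_in, w_out, n), top_k=k * n, out_scale=n)
same = bool(np.allclose(coarse, fine))
```

The copies of a router row start out tied, so the sub-experts only begin to specialize once continued pretraining (or some added noise) breaks the symmetry. Using fewer than `k * n` active experts would no longer reproduce the coarse model exactly.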
I can't wait for the day we can do this with Minecraft. Imagine 1,000 vs 1,000 agent battles with mods, etc. managed by a small central team of human commanders. I did MC modding before ML and this would be such a sick ass project/community event to pull off.
I don't think the…
I tried my best to make a trading strategy to invest in OAI/Ant indirectly just for fun.
For the record I have no idea what I'm doing, my knowledge of financial markets is extremely slim and I didn't write the code.
Don't overindex too much on the relative increases/decreases…
The attention module? Map-reduce.
The matmuls in the modules? Map-reduce.
Activation functions + down projection? Map-reduce!
Distributed training? Believe it or not, also map-reduce!
It's all map-reduce
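To make the joke concrete, here is a toy sketch in plain Python where a dot product, single-query attention, and gradient averaging are each written as a map followed by a reduce. The function names are made up for illustration, and this is not how any real framework implements these operations.

```python
import math
import operator
from functools import reduce

def dot(a, b):
    # map: elementwise products; reduce: sum
    return reduce(operator.add, map(operator.mul, a, b), 0.0)

def matvec(matrix, vec):
    # a matmul is one dot product (map + reduce) per output row
    return [dot(row, vec) for row in matrix]

def attention(q, keys, values):
    # map: one scaled score per key
    scores = [dot(q, k) / math.sqrt(len(q)) for k in keys]
    # reduce: max for numerical stability, then sum for the softmax normalizer
    m = reduce(max, scores)
    weights = [math.exp(s - m) for s in scores]
    z = reduce(operator.add, weights)
    # reduce: weighted sum of the value vectors
    return reduce(
        lambda acc, wv: [a + wv[0] * v / z for a, v in zip(acc, wv[1])],
        zip(weights, values),
        [0.0] * len(values[0]),
    )

def average_gradients(grads_per_worker):
    # map: each worker computes its own gradient (given here as input)
    # reduce: sum across workers, then divide by the worker count
    total = reduce(lambda acc, g: [a + b for a, b in zip(acc, g)], grads_per_worker)
    return [t / len(grads_per_worker) for t in total]
```

For example, `attention` with two identical keys returns the plain average of the two value vectors, since both softmax weights come out equal.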
@CharlesFLehman I want to answer this, because the dark abundance stuff leaves me cold even though I agree with some of the policy prescriptions.
Our prisons are terrible, terrible places where there is immense and unnecessary human suffering. In prison people are subject to random violence…
Dem financial power struggles: "uhm if you'd please consult this 500 page essay and these 12 research papers we clearly need to increase the housing double triple loan default rate by .004% or the economy will collapse"
Republican financial power struggles: "I'm gonna fucking…
How petitioning to skip all my core requirements w/ my research projects looks at me when I can take harder classes instead of trying to get a good GPA for a PhD
Mixing the Mexican rice and biryani I got from the restaurant with my chicken rice in ratios corresponding to perceived cooking skill
We doing Bayesian model averaging now
With such a large diversity in environments, we will also be able to make the first scaling laws for RL Envs. As the number of Envs increases, the overall number of reinforcements needed per task should decrease, and it should be a metric we track.
We should also track the…
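One way such a scaling law could be tracked, as a sketch only: assume reinforcements per task follow a power law in the number of environments, R(N) = a * N^(-b), and fit it with a straight line in log-log space. The functional form, the helper `fit_power_law`, and the numbers below are all assumptions. The data is synthetic and generated from the assumed law itself, so it demonstrates the fitting procedure and nothing else.

```python
import numpy as np

def fit_power_law(num_envs, reinforcements_per_task):
    """Fit R(N) = a * N**(-b) by least squares on log R versus log N."""
    slope, intercept = np.polyfit(np.log(num_envs), np.log(reinforcements_per_task), 1)
    return float(np.exp(intercept)), float(-slope)

# Synthetic, noise-free data drawn from the assumed law (not measurements).
num_envs = np.array([10.0, 100.0, 1000.0, 10000.0])
synthetic = 5000.0 * num_envs ** -0.5

a, b = fit_power_law(num_envs, synthetic)
```

With real measurements, the fitted exponent `b` would be the metric to watch: a larger `b` means each added environment cuts the per-task reinforcement cost faster.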
32 Followers · 208 Following · 1+ year uni student sabbatical, restaurant busser, quant trading (larper), tech building in public (never started), whimsical & delulu. Non ducor, duco.
240 Followers · 483 Following · PhD student at @EPFL🇨🇭 working on improved understanding of deep neural networks and their optimization. Previously did NN training @Tesla_AI @CerebrasSystems
23K Followers · 53 Following · Community account for sharing ClaudeCode related projects and releases. Views/shares independent from @AnthropicAI positions.
2K Followers · 532 Following · Assistant Professor at @TelAvivUni and Research Scientist at @GoogleResearch; previously postdoc at @GoogleDeepMind and @allen_ai
29K Followers · 1K Following · AI, national security, China. Part of the founding team at @CSETGeorgetown (opinions my own). Author of Rising Tide on substack: https://t.co/LKAoyL00iB
14K Followers · 3K Following · research @MIT_CSAIL @thinkymachines. work on scalable and principled algorithms in #LLM and #MLSys. in open-sourcing I trust 🐳. she/her/hers
5K Followers · 8 Following · Interactive AI explainers.
Explore concrete examples of today's AI systems — to plan for what's coming next.
A project of @sage_future_