It's a bird. It's a plane. It's Gaussian splatting.
When @Framestore needed to bring Kryptonian tech to Earth, they turned to Gaussian splatting. In the recently released Superman, every shot of Kal-El's parents is a dynamic splat.
I spoke to the team at Framestore about…
Today we're putting out an update to the JAX TPU book, this time on GPUs. How do GPUs work, especially compared to TPUs? How are they networked? And how does this affect LLM training? 1/n
announcing the @GPU_MODE x @scaleml summer speaker series happening next week, a 5⃣-day series where top researchers will teach about the algorithmic and systems-level advances that underpin `gpt-oss`!
all content will be live-streamed & recorded for FREE on GPU MODE's YouTube!
for those curious about how a 1M-context model is even possible, here is a 47-min deep dive on the minimax-01 open model
in it we cover the lightning linear attention mechanism, the hybridization strategy to make it work, and how to go beyond and make the model multimodal
wild stuff
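For a rough feel of why linear attention scales to million-token contexts, here's a minimal sketch of the generic kernelized linear-attention recurrence. The feature map and the Python loop are illustrative assumptions, not MiniMax's lightning attention kernels:

```python
import numpy as np

def linear_attention(Q, K, V):
    """Minimal causal linear-attention sketch (illustrative only).

    Instead of materializing the T x T softmax matrix, keep a running state
    S_t = sum_{i<=t} phi(k_i) v_i^T and a running normalizer, so the cost
    grows linearly in sequence length.
    """
    phi = lambda x: np.maximum(x, 0.0) + 1.0   # simple positive feature map (assumption)
    T, d = Q.shape
    S = np.zeros((d, V.shape[1]))              # running key-value state
    z = np.zeros(d)                            # running normalizer
    out = np.zeros_like(V)
    for t in range(T):
        q, k = phi(Q[t]), phi(K[t])
        S += np.outer(k, V[t])                 # fold in the current key/value
        z += k
        out[t] = (q @ S) / (q @ z + 1e-6)      # normalized read of the state
    return out
```

The point of the recurrence is that memory and per-token compute stay constant as the context grows, which is what makes 1M tokens plausible at all.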
Sparsity can make your LoRA fine-tuning go brrr 💨
Announcing SparseLoRA (ICML 2025): up to 1.6-1.9x faster LLM fine-tuning (2.2x fewer FLOPs) via contextual sparsity, while maintaining performance on tasks like math, coding, chat, and ARC-AGI 🤯
🧵1/
z-lab.ai/projects/spars…
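As a rough illustration of the headline idea, contextual sparsity, here's a toy where the LoRA update is only computed for a context-dependent subset of output channels. The class name, the saliency proxy, and the top-k rule are assumptions for the sketch, not the SparseLoRA implementation:

```python
import torch
import torch.nn as nn

class SparseLoRALinearSketch(nn.Module):
    """Toy sketch: apply the LoRA update only to a context-dependent subset
    of output channels. Illustrative assumption, not the SparseLoRA code."""

    def __init__(self, base: nn.Linear, rank: int = 8, keep_ratio: float = 0.5):
        super().__init__()
        self.base = base
        for p in self.base.parameters():        # frozen pretrained weights
            p.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.02)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.keep = max(1, int(keep_ratio * base.out_features))

    def forward(self, x):
        y = self.base(x)                                      # dense frozen path
        scores = y.abs().mean(dim=tuple(range(y.dim() - 1)))  # cheap per-channel saliency proxy
        idx = scores.topk(self.keep).indices                  # "contextually" active channels
        delta = (x @ self.A.t()) @ self.B[idx].t()            # low-rank update for kept rows only
        return y.index_add(y.dim() - 1, idx, delta)           # add the update on those channels
```

The FLOP savings in this kind of scheme come from skipping the adapter math (and, in the real system, parts of the base computation) for channels the predictor deems inactive for the current input.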
We know quadratic-time Attention and its linear-time variants, such as linear attention and State Space Models. But what lies in between?
Introducing Log-Linear Attention with:
- Log-linear time training
- Log-cost inference (in both time and memory)
- Hardware-efficient Triton kernels
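To make the "in between" concrete, here is a toy of the general log-linear idea: each query reads O(log T) summaries of power-of-two-sized past segments instead of all T tokens (quadratic) or one running state (linear). Purely illustrative, and not the paper's algorithm or its Triton kernels:

```python
import numpy as np

def log_linear_attention_toy(Q, K, V):
    """Toy only: each position attends to O(log t) mean-pooled segment
    summaries, chosen Fenwick-style. Real log-linear methods maintain these
    summaries incrementally; here they are recomputed for clarity."""
    T, d = Q.shape
    out = np.zeros_like(V)
    for t in range(T):
        # split [0, t] into O(log t) power-of-two blocks by peeling the lowest set bit
        segs, hi = [], t + 1
        while hi > 0:
            lo = hi - (hi & -hi)
            segs.append((lo, hi))
            hi = lo
        ks = np.stack([K[lo:hi].mean(0) for lo, hi in segs])  # one summary key per block
        vs = np.stack([V[lo:hi].mean(0) for lo, hi in segs])  # one summary value per block
        w = np.exp(ks @ Q[t] / np.sqrt(d))
        out[t] = (w / w.sum()) @ vs
    return out
```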
I think after Attention (learnt back in 2020), Flash-Attention is the only machine learning algorithm that gave me immense happiness once I cracked it. #MachineLearning #LLM
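The part that usually makes it "click" is the online softmax: process K/V in blocks while carrying a running max and normalizer, so the full T x T score matrix is never materialized. A minimal single-head NumPy sketch of that trick (illustrative, not the fused CUDA kernel):

```python
import numpy as np

def flash_attention_sketch(Q, K, V, block=64):
    """Blockwise attention with a running max and normalizer (non-causal)."""
    T, d = Q.shape
    out = np.zeros_like(V)
    m = np.full(T, -np.inf)                   # running row-wise max of scores
    l = np.zeros(T)                           # running softmax denominator
    for start in range(0, T, block):
        Kb, Vb = K[start:start + block], V[start:start + block]
        S = Q @ Kb.T / np.sqrt(d)             # scores for this block only
        m_new = np.maximum(m, S.max(axis=1))
        scale = np.exp(m - m_new)             # rescale previous accumulators
        P = np.exp(S - m_new[:, None])
        l = l * scale + P.sum(axis=1)
        out = out * scale[:, None] + P @ Vb
        m = m_new
    return out / l[:, None]
```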
Spent last week building an #LLM completely from scratch, and I mean everything: an optimized BPE tokenizer, Linear layers, SwiGLU, RoPE attention, full Transformer blocks, AdamW, cross_entropy, top-p decoding, and more, then trained it on children's stories.
github.com/bargav25/llm
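Of the pieces listed, top-p decoding is the easiest to show in a few lines; this is a generic sketch of nucleus sampling, not code from the linked repo:

```python
import numpy as np

def top_p_sample(logits, p=0.9, temperature=1.0, rng=None):
    """Nucleus (top-p) sampling: keep the smallest set of tokens whose
    cumulative probability reaches p, renormalize, and sample from it."""
    if rng is None:
        rng = np.random.default_rng()
    scaled = logits / temperature
    probs = np.exp(scaled - np.max(scaled))          # stable softmax
    probs /= probs.sum()
    order = np.argsort(-probs)                       # tokens by descending probability
    csum = np.cumsum(probs[order])
    cutoff = np.searchsorted(csum, p) + 1            # smallest prefix covering mass p
    keep = order[:cutoff]
    kept = probs[keep] / probs[keep].sum()           # renormalize over the nucleus
    return int(rng.choice(keep, p=kept))
```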
7K Followers · 221 Following · Incoming assistant professor at UCSD CSE in MLSys. Currently recruiting students! Also running the kernels team @togethercompute.
15K Followers · 528 Following · Asst. Prof. of CS at Stanford, Google DeepMind. Prev: Anthropic, Google Brain. Co-Creator of MoEs, AlphaChip, Test Time Scaling Laws.
46K Followers · 1K Following · AI Developer Experience @GoogleDeepMind | prev: Tech Lead at @huggingface, AWS ML Hero 🤗 Sharing my own views and AI News 🧑🏻💻 https://t.co/7IosdlNz22
18K Followers · 4K Following · Associate Professor at UC Berkeley. Former Research Scientist at Google DeepMind. ML/AI Researcher working on foundations of LLMs and deep learning.
1K Followers · 8K Following · AI inference, speculative decoding, open source. Built novel decoding algorithms – default in Hugging Face Transformers (150+ ⭐). Making AI faster + cheaper.
26K Followers · 229 Following · getting us to singularity with friends
computers can be understood: https://t.co/doHE1Qv2Sj
x @GoogleDeepMind @Microsoft
tensor core maximalist
83K Followers · 324 Following · All things AI for developers from @NVIDIA.
Additional developer channels: @NVIDIADeveloper, @NVIDIAHPCDev, and @NVIDIAGameDev.
10K Followers · 236 Following · Interpretability/Finetuning @AnthropicAI
Previously: Staff ML Engineer @stripe, Wrote BMLPA by @OReillyMedia, Head of AI at @InsightFellows, ML @Zipcar