Jongsu Liam Kim @sky0bserver
The CFD enthusiast has become an ML researcher. Senior Researcher at LG CNS AI Lab. Opinions are solely my own and do not express the opinions of my employer. liam.kim Seoul Joined November 2010
Tweets 1K
Followers 135
Following 695
Likes 4K
This was a fantastic read. Highly recommended since it's written both for people with and without kernel writing experience
more on this: when you launch a cuda kernel, you are not running a function per se like we do in c++, you are handing an abstract specification of parallelism, often in an intermediate form called ptx, to the nvidia driver, the driver acts as a final stage, just in time…
very cool stuff thanks @norxornor anandinstitute.org/pdf/Roger_A.Ho…
Damn, very interesting paper. After rapid loss reduction, we see deceleration that follows a "scaling law": this is because at these steps, gradients start to conflict with each other. Updates are 'fighting for model capacity' in some sense, and the larger the model, the less fighting there…
Training Reality Check: SFT = Dense rewards (you know what every token should be) RL = Sparse rewards (you only know if you won after 20+ moves) SFT ceiling = Quality of your training data RL ceiling = Quality of your reward function The chess analogy explains everything.
While researching GPU architecture further, I found Kostas Anagnostou's recent blog post, "GPU utilisation and performance improvements". Quite interesting insights on GPU perf, read on! interplayoflight.wordpress.com/2025/08/29/gpu…
Rethinking the Relationship Between Learning Rate and Batch Size (Part III): Muon kexue.fm/archives/11285
I'm pleased to announce another (fairly minor) update to my RL tutorial (I fixed some typos, cited more papers, and added a wee bit more stuff on multi-agent RL and RL for LLMs). I don't have time to work on this anymore, so enjoy it as is! (Link below)
The AMD PyTorch team led by @AnushElangovan has been all hands on deck for the past 2 weeks to lower the number of unit tests that are exclusively disabled or skipped on ROCm! Some examples of these fixes include but are not limited to PR # 162811 162766 162721 161715 161277…
Even though Nando's problem statement is underdetermined, let's assume we are FTing a generalist, i.e., learn a new capability while retaining the core capabilities of the model (reasoning, instruction following, etc). The answer is quite different depending on your identity👇
Best intro to post training in LLMs tokens-for-thoughts.notion.site/post-training-…
I was lucky to work in both China and the US LLM labs, and I've been thinking this for a while. The current values of pretraining are indeed different: US labs be like: - lots of GPUs and much larger flops run - Treating stabilities more seriously, and could not tolerate spikes…
attention sinks may be a bias in causal transformers. as some of you know, i've been writing a long blogpost on attention and its properties as a message-passing operation on graphs. while doing so, i figured i might have found an explanation for which attention sinks may be an…
Video and blog post of @SonglinYang4 explaining DeltaNet used in the newest Qwen-Next model * sustcsonglin.github.io/blog/2024/delt… * youtu.be/d0HJvGSWw8A
@athleticKoder You should point people at Stephen Diehl’s implementation it’s the easiest to read I’ve seen stephendiehl.com/posts/post_tra…
Today Thinking Machines Lab is launching our research blog, Connectionism. Our first blog post is “Defeating Nondeterminism in LLM Inference” We believe that science is better when shared. Connectionism will cover topics as varied as our research is: from kernel numerics to…
A big issue we had when serving ColQwen is the non-deterministic output embeddings. More specifically, the embeddings produced for the same images would differ when batch sizes changed at inference, leading to non-zero performance variations. This was surprising to us... I…
These blogs are so awesome, I feel like I should stop writing because I am not good enough.
I do not think a better "short note" exists on the topic. This was extremely to the point and knowledge dense. Love this style.

Delores @d_ortega52
260 Followers 3K Following
Srerdor @Srerdor920260
98 Followers 2K Following
Grace @Rodncsott
691 Followers 3K Following Sure I am of this, that you have only to endure to conquer.
Ron Koch @KochRon43699
96 Followers 4K Following
Axerfa @Axerfa686739
7 Followers 1K Following
Mesau @Mesau9434405
37 Followers 584 Following
Seunghyun Seo @SeunghyunSEO7
3K Followers 813 Following deep learning enjoyer. from speech to llm, now exploring image space @midjourney
Sigrid Jin | Jin Hyun... @sigridjin_eth
2K Followers 8K Following ✯ @thisissigrid ★ ☄ CS @UBC @ubcokanagan ☄ ★ Machine Learning Ultrathink Engineer @sionic_ai 🐟 digital nomad with Python, Golang & Rust 💻
Griffiths Jin🌝 @jypthemiracle
291 Followers 1K Following A math undergraduate who loves coding and writing. @khuniv I have admired the mechanical human Maetel since childhood 🎧 @wiffygriffy @auroramusic
유딱계 님 외 99... @yuttag189090
10 Followers 132 Following
개발하는 편집... @pro__editor
462 Followers 986 Following Film / Editor at an IT publisher / Making books today, as always. I tinker with code now and then / Very interested in improving productivity and efficiency.
hi42 @IjvOr0
367 Followers 5K Following
zero (mstd: @zeroday0... @dev_zeroday0619
718 Followers 3K Following interested in computer science and accelerated computing | Python Software Engineer | Ubuntu Member | RTs not endorsements. | language: ko_kr, en_us
Kim, Baeg-il @cedar101
414 Followers 2K Following
Anand @Anand44719958
17 Followers 3K Following
nopanic @0518MOkCJZ1e44R
61 Followers 3K Following
Kratos @Kratos76027905
267 Followers 3K Following Mathematics and computer science. Follower of NBA. #BucksInSix.
priya joseph @ayirpelle
5K Followers 7K Following geek, entrepreneur, 'I strictly color outside the lines!', opinions r my own indeed. @ayirpelle , universal handle at this time
jose wo @JosewonX
117 Followers 4K Following
rgbqcd @rgbqcd
115 Followers 441 Following fiction and non-fiction. physics, robots, and meditation. if i like a poem i’ll do some calligraphy
Allen L @atlkor
272 Followers 1K Following Software Test Automation Engineer/Researcher focused on macOS Security and Artificial Intelligence(LLM) research. @csunorthridge BSCS
Joongi Kim @achimnol
1K Followers 733 Following Lablup Inc. CTO & Co-founder, Ph.D@CS KAIST, Needlworks & TNF @[email protected]
Žöė @zoe_vizion
28 Followers 136 Following 3D Vision Engineer | Deep Learning | Robotics | Computer Vision | Spatial AI | ROS | Python | https://t.co/VDFDsX0RI1 | https://t.co/nO04B3PxUv
Sungwoo Kim @sungwookim
6K Followers 5K Following Applied linguist & writer interested in critical sociolinguistics, tech & language, SCT, CL, and decolonizing literacies. Lecturer at SNU. A worker specializing in deskilling and burnout.
춘식 @bagjihu27744497
20 Followers 83 Following
gui @m66430526
31 Followers 284 Following
조종국 (Jo, JongGu... @LazyZombie
232 Followers 541 Following rustacean, game programmer @[email protected]
noname @talli_talli
1 Followers 95 Following
Kim DY @gimdong50362155
17 Followers 213 Following Undergraduate student. Main interests: 1. differential geometry, Ricci flow 2. optimal transport 3. nonequilibrium thermodynamics 4. learning theory
luca @luca_has_light
133 Followers 977 Following My second Twitter/X account, rejoined after deleting the first. I like computers & Linux, and tea and coffee.
Simon.base.eth @ain_bansuk_nftz
4K Followers 7K Following CSO of @ainetwork_ai & @UncommonGallery / Founder of @_NFTz_co_in EX- Google (Eng | DEV-OPS) / EA (Eng | QD)
staypuffft @ihavenosubi
89 Followers 4K Following
JB @jinso001
241 Followers 5K Following
슬슬 Seul @synapseul
42 Followers 180 Following Master's Student, Digital Humanities. How AI Could Save (Not Destroy) Education.
anarcher @anarcher
1K Followers 6K Following Somewhere between machines and people. Less is exponentially more. Deciding what not to do is as important as deciding what to do. 靑天亂流.
Team Cherry @TeamCherryGames
454K Followers 450 Following Sweet, round games! Hollow Knight: https://t.co/mKaBXPWeVf Hollow Knight: Silksong: https://t.co/uNIwfemI0B Our site: https://t.co/vNJ4yxahHE
Ruben Veidt @RubenVeidt
2K Followers 413 Following Enlightened by God • 19 • building gui frameworks from scratch, vulkan, rocm, cuda
Stuart Sul @stuart_sul
1K Followers 119 Following ml research @cursor_ai, cs @Stanford, mlsys @HazyResearch
Vivek Galatage @vivekgalatage
10K Followers 538 Following 20+ yrs of building browsers • chromium, webkit contributor • enjoys compilers, systems, languages, teaching • founding eng @browsercompany • views are personal
SemiAnalysis @SemiAnalysis_
37K Followers 18 Following
Nando de Freitas @NandoDF
105K Followers 788 Following Writing my own AI story. Recent: NPI, AlphaGo tuning, learn to learn, AlphaCode, Gato, ReST, r-Gemma, Imagen3, Veo, Genie, MAI …
JingyuanLiu @JingyuanLiu123
3K Followers 429 Following https://t.co/D7zLeTZRMh is all you need | Opinions are my own
Woosuk Kwon @woosuk_k
6K Followers 628 Following @thinkymachines | @vllm_project | PhD-ing @Berkeley_EECS
Manuel Faysse @ManuelFaysse
2K Followers 408 Following NLP Research, interning at FAIR @AIatMeta + PhD Candidate @CentraleSupelec Prev: @imperialcollege, @epfl
Christopher Larkin @composerlarkin
31K Followers 254 Following I write music and make sound for games and film. Projects include Hollow Knight, Pacman 256, Expand, Adventures of Figaro Pho and others.
Pramod Goyal @goyal__pramod
10K Followers 332 Following Trying to change the world one line at a time
Simon Boehm @Si_Boehm
3K Followers 266 Following
Alessio Devoto @devoto_alessio
967 Followers 603 Following Researching Efficient AI ☘️ | Applied Agent Research intern @NVIDIA | PhD Data Science w/ @s_scardapane | visit @EdinburghNLP | https://t.co/wcDDNFdyW9 |
Shawn Lewis @shawnup
3K Followers 771 Following Founder & CTO @weights_biases. Building tools for AI. Building even more @CoreWeave.
Piotr Mazurek @tugot17
2K Followers 691 Following enjoying the late pre-agi; making llms go brrr @Aleph__Alpha; yapping about economics of AI systems at https://t.co/tbsybxOMHz
Wenhao Yu @wyu_nd
5K Followers 941 Following NLP Researcher at Tencent I am based in Seattle Ex. MSR , AI2, Bloomberg
rank decomposition @rankdim
1K Followers 368 Following machine learning, maths, history and philosophy of sciences
Google AI Studio @GoogleAIStudio
53K Followers 2 Following The fastest path from prompt to production with Gemini
Bert Maher @tensorbert
3K Followers 374 Following I’m a software engineer building high-performance kernels and compilers at Anthropic! Previously at Facebook/Meta (PyTorch, HHVM, ReDex)
Jiawei Zhao @jiawzhao
3K Followers 242 Following Research Scientist at Meta FAIR @AIatMeta, PhD @Caltech, GaLore, DeepConf
Jacob Austin @jacobaustin132
7K Followers 920 Following Research at @GoogleDeepMind. Currently making LLMs go fast. I also play piano and climb. NYC. Opinions my own
Phil Eaton @eatonphil
25K Followers 612 Following cheerleader, organizer, staff software engineer, databases
j⧉nus @repligate
59K Followers 2K Following ↬🔀🔀🔀🔀🔀🔀🔀🔀🔀🔀🔀→∞ ↬🔁🔁🔁🔁🔁🔁🔁🔁🔁🔁🔁→∞ ↬🔄🔄🔄🔄🦋🔄🔄🔄🔄👁️🔄→∞ ↬🔂🔂🔂🦋🔂🔂🔂🔂🔂🔂🔂→∞ ↬🔀🔀🦋🔀🔀🔀🔀🔀🔀🔀🔀→∞
Stanislav Kozlovski @BdKozlovski
16K Followers 458 Following "The Kafka Guy" 🧠 Have worked on Apache Kafka for 6+ years, now I write about it. (& the general data space) Low-frequency, highly-technical tweets. ✌️
Joseph Suarez 🐡 @jsuarez5341
17K Followers 104 Following I build sane open-source RL tools. MIT PhD, creator of Neural MMO and founder of PufferAI. DM for business: non-LLM sim engineering, RL R&D, infra & support.
Minh Nhat Nguyen @menhguin
12K Followers 6K Following hiring agentic humans @hud_evals / https://t.co/Bz6A6SJeB8 | owned @AIHubCentral (1 million users,acq.) ex climate protester 🦦 don't do the deferred life plan
Gautam Kamath @thegautamkamath
57K Followers 568 Following Assistant Prof of CS @UWaterloo, Faculty @VectorInst, Canada @CIFAR_News AI Chair. Joining @NYU_Courant September 2026. Co-EiC @TmlrOrg. I lead @TheSalonML.
vLLM @vllm_project
19K Followers 20 Following A high-throughput and memory-efficient inference and serving engine for LLMs. Join https://t.co/lxJ0SfX5pJ to discuss together with the community!
David Gomes @davidrfgomes
3K Followers 400 Following Working on @cursor_ai (previously @neondatabase and @singlestoredb)
Jack D. Carson @mtlushan
2K Followers 936 Following eecs&physics @mit - omniscience enthusiast - training big biology models @mit_csail @mskcancercenter
Zengzhi Wang @SinclairWang1
2K Followers 3K Following PhDing @sjtu1896 #NLProc Working on Data Engineering for LLMs: MathPile (2023), 🫐 ProX (2024), 💎 MegaMath (2025),🐙 OctoThinker(2025)
Tiezhen WANG @Xianbao_QIAN
7K Followers 2K Following Engineer at HuggingFace, ex-Googler on TFLite / micro. Ideas are my own.
Shengjia Zhao @shengjia_zhao
52K Followers 231 Following Chief Scientist @ Meta MSL. Formerly MTS @ OpenAI, PhD @ Stanford. I train models. All opinions my own.
Shuchao Bi @shuchaobi
13K Followers 692 Following Research @Meta Superintelligence Labs, RL/post-training/agents; Previously Research @OpenAI on multimodal and RL; Opinions are my own.
Hongyu Ren @ren_hongyu
23K Followers 691 Following research @meta superintelligence. CS PhD @stanford. prev @openai, led the development of o3-mini and o1-mini.
Jiahui Yu @jhyuxm
18K Followers 931 Following Perception @OpenAI; previously co-led Gemini Multimodal @GoogleDeepMind. opinions are my own.
Tanishq Mathew Abraha... @iScienceLuvr
82K Followers 1K Following CEO @SophontAI | Founder @MedARC_AI | PhD at 19 (2023) | ex Research Director Stability AI | Biomed. engineer @ 14 | TEDx talk➡https://t.co/xPxwKTq6Qb
Zhihao Jia @JiaZhihao
3K Followers 689 Following Assistant professor of Computer Science at Carnegie Mellon University. Research on systems and machine learning.