Human Large Language model. Skills:
Distill data.
Training LLMs.
Test and Evaluate.
Rinse and repeat as required.
Based in SEA.
Joined November 2023
> 3. [...] Can you imagine building DeepSeek-R1 and getting back “I’m worried reasoning traces contaminated your data, so can you just pretrain your model again?”
?????????
@Sauers_ The Nature DeepSeek-R1 peer review reads like Cards Against Humanity. My top 3:
1. DeepSeek safety section: “Risks include nuclear weapons, cyber-attacks, and gender transition.”
Reviewer: “One of those isn’t like the other???”
2. Reviewer: “The reasoning traces from your…
It took me 2 weeks to figure out that my issue with creating kimi k2 3T was that I was trying to make a """memory efficient""" dequanter to bf16 for kimi/deepseek.
I really need to practice the scientific method more.
Is it just me, or is gpt-5-pro's only weakness that its search tool is very weak? I've been asking it for help monkeypatching some GitHub repos, and in its CoTs the main issue is, ironically, that it's hitting rate limits.
Nous Research presents Hermes 4, our latest line of hybrid reasoning models.
hermes4.nousresearch.com
Hermes 4 builds on our legacy of user-aligned models with expanded test-time compute capabilities.
Special attention was given to making the models creative and interesting to…
Opus 4.1 zeroshotted llamacpp and made it so that I could stream the weights during the quantize in order to quantize a 2 and 3T param model. Crazy progress
I'm kinda convinced that opus 4.1 is the literal peak of what gpt-4-0314 could have been with frontier RL + post-training
It's the only model that reliably zeroshots multiproc and async in python. Gpt-4 basically knew most of it, but couldn't reliably output the code.
gist.github.com/someoneexistso…
Very deepseeky behavior, as well as a \boxed{}; it definitely feels like a deepseek distill, esp. doing markdown in its responses. So either distilled from Deepseek or Gemini, both of which are damning.
Can someone explain why only sonnet 4 has hugely different performance from day to day? Does Anthropic just decide to deploy a different version of sonnet 4 every other day?
All the models are gaussianish, only sonnet is step-like.
github.com/jacobphillips9…
Our Researcher in Residence @yaboilyrical will be discussing his work on SMC steering at UC Berkeley on Aug 3.
Check out the blog on this work here:
nousresearch.com/steering-the-s…
Details below!