NobodyExistsOnTheInternet @nullvaluetensor

Human Large Language model. Skills: Distill data. Training LLMs. Test and Evaluate. Rinse and repeat as required. Based in SEA.

Location: SEA

Joined November 2023

Tweets: 341

Followers: 527

Following: 81

Likes: 2K

NobodyExistsOnTheInternet @nullvaluetensor

2 weeks ago

> 3. [...] Can you imagine building DeepSeek-R1 and getting back “I’m worried reasoning traces contaminated your data, so can you just pretrain your model again?” ?????????

Justin Angel @JustinAngel

2 weeks ago


5 3 49 8K 6


0 0 2 97 0

@Sauers_ The Nature DeepSeek-R1 peer review reads like Cards Against Humanity. My top 3: 1. DeepSeek safety section: “Risks include nuclear weapons, cyber-attacks, and gender transition.” Reviewer: “One of those isn’t like the other???” 2. Reviewer: “The reasoning traces from your…



NobodyExistsOnTheInternet @nullvaluetensor

2 weeks ago

"Thing work. Sometimes not work because math sad. Two smart humans look at code. Used old code others already like."

0 0 1 60 0

NobodyExistsOnTheInternet @nullvaluetensor

3 weeks ago

huggingface.co/NobodyExistsOn…

0 0 1 49 1

NobodyExistsOnTheInternet @nullvaluetensor

3 weeks ago

Are these good/relevant takes for the question: "When do you think we will achieve AGI?"

0 0 0 55 0


NobodyExistsOnTheInternet @nullvaluetensor

4 weeks ago

It took me 2 weeks to figure out that my issue when trying to create kimi k2 3T was that I was trying to make a """memory efficient""" dequanter to bf16 for kimi/deepseek. I really need to practice the scientific method more.

0 0 1 115 1

NobodyExistsOnTheInternet @nullvaluetensor

a month ago

Is it just me, or is gpt-5-pro's only weakness that its search tool is very weak? I've been asking it for help monkeypatching some GitHub repos, and in its CoTs the main issue is that it's hitting rate limits, ironically.

1 0 2 131 0

Nous Research @NousResearch

a month ago

Nous Research presents Hermes 4, our latest line of hybrid reasoning models. hermes4.nousresearch.com Hermes 4 builds on our legacy of user-aligned models with expanded test-time compute capabilities. Special attention was given to making the models creative and interesting to…

141 318 2K 374K 592


NobodyExistsOnTheInternet @nullvaluetensor

a month ago

Opus 4.1 zeroshotted llamacpp and made it so that I could stream the weights during quantization in order to quantize a 2T and a 3T param model. Crazy progress.

0 0 1 93 1

NobodyExistsOnTheInternet @nullvaluetensor

a month ago

In the next version of Bun: Bun.train()

0 0 4 359 0


NobodyExistsOnTheInternet @nullvaluetensor

a month ago

I'm kinda convinced that opus 4.1 is the literal peak of what gpt-4-0314 could have been with frontier RL + post-training. It's the only model that reliably zeroshots multiproc and async in python. Gpt-4 basically knew most of it, but couldn't reliably output the code.

0 0 3 129 1

NobodyExistsOnTheInternet @nullvaluetensor

2 months ago

gist.github.com/someoneexistso… Very deepseeky behavior, as well as a \boxed{}. It definitely feels like a deepseek distill, esp. doing markdown in its responses. So either distilled from Deepseek or Gemini, both of which are damning.

Susan Zhang @suchenzang

2 months ago

15 15 501 222K 403


0 0 1 565 0

NobodyExistsOnTheInternet @nullvaluetensor

2 months ago

Can someone explain why only sonnet 4 has hugely different performance from day to day? Does Anthropic just decide to deploy a different version of sonnet 4 every other day? All the other models are gaussianish, only sonnet is step-like. github.com/jacobphillips9…

0 0 2 112 1


Nous Research @NousResearch

2 months ago

Our Researcher in Residence @yaboilyrical will be discussing his work on SMC steering at UC Berkeley on Aug 3. Check out the blog on this work here: nousresearch.com/steering-the-s… Details below!

nightwing @yaboilyrical

2 months ago
