Hamel Husain @HamelHusain

Researcher focusing on LLMs: https://t.co/iVZDFdIQiE Previously, dev tools and infra for ML. Ex @Github, @Airbnb, @DataRobot. @fastdotai core contributor. hamel.dev Portland, OR Joined September 2012

Tweets

9K
Followers

23K
Following

2K
Likes

11K

Eugene Yan @eugeneyan

14 hours ago

First, figure out WHAT the product is, WHAT problem it solves, WHY people want to use it. Get some traction & usage data. Then, bring in ML folks who can help with HOW to measure & improve, based on the data. Doing it reversed is like building solutions to nonexistent problems.

Hamel Husain @HamelHusain

19 hours ago

0 0 8 7K 1

2 3 30 6K 20

Eugene Yan @eugeneyan

14 hours ago

And ML folks don’t just train models. They’re also bring rigor, data driven analysis, best practices of how to use data and work with non-deterministic output, how to build products based on the above, etc These are things that typical software engineers may not focus on.

Hamel Husain @HamelHusain

19 hours ago

2 2 19 6K 4

2 3 12 4K 6

Hamel Husain @HamelHusain

21 hours ago

After 25 years of working in the ML space I agree that many early stage companies (Series A-ish and earlier) shouldn’t be hiring MLEs However I think companies need a fractional MLE (.05) to help ensure the foundations get built properly for MLEs later on. Data Eng,…

Latent Space Podcast @latentspacepod

a week ago

3 15 128 58K 134

5 10 120 40K 97

Shishir Patil @shishirpatil_

a day ago

Berkeley Function Calling Leaderboard: Introducing Consistent 8 X V100 with pay-as-you-go pricing for measuring costs and latency. In depth: We fix inconsistency in the cost and latency calculation for open-source models, which are now all calculated when serving the model with…

Shishir Patil @shishirpatil_

2 days ago

11 59 308 214K 229

Download Image

2 15 61 14K 24

Vik Paruchuri @VikParuchuri

2 days ago

I made pdftext, a small tool that extracts text like pymupdf, but with an Apache license (mupdf is AGPL). It can pull out blocks and lines or plain text. Find it here - github.com/VikParuchuri/p… .

13 52 484 61K 445

Hamel Husain @HamelHusain

2 days ago

Does anyone know which provider this is referencing?

anton @abacaj

2 days ago

Does anyone know which provider this is referencing?

27 116 774 242K 251

Download Image

3 0 14 10K 12

Logan Kilpatrick @OfficialLoganK

2 days ago

The API product is your developer documentation. If you don’t treat the docs like a first class product, developers will choose to go elsewhere.

19 26 353 41K 48

Hamel Husain @HamelHusain

2 days ago

This is the prompt they are using for Llama-3 function calling (seems to work well even though it's not specifically fine tuned for that): github.com/ShishirPatil/g…

Shishir Patil @shishirpatil_

2 days ago

This is the prompt they are using for Llama-3 function calling (seems to work well even though it's not specifically fine tuned for that): github.com/ShishirPatil/g… https://t.co/lNetEc4f8Y

11 59 308 214K 229

Download Image

7 61 418 59K 535

Download Image

Alex Albert @alexalbert__

3 days ago

Our first Build with Claude contest was a success! We received tons of great submissions from @AnthropicAI devs. Here are the 5 winning projects (in no particular order)🧵

11 41 477 196K 694

Download Image

Hamel Husain @HamelHusain

3 days ago

Jeremy is THE most talented infra/devops person I know. He’s created a specialized IDE + Copilot for DevOps : K8s, Cloud, etc - based on notebooks! It’s open source, accompanied by a blog that shows his thinking 👇

Jeremy Lewi @jeremylewi

3 days ago

3 18 112 26K 92

1 8 124 19K 68

Simon Willison @simonw

3 days ago

"Do stuff and then blog about it" remains one of the most underrated pieces of career advice

vicki @vboykis

3 days ago

"Do stuff and then blog about it" remains one of the most underrated pieces of career advice

10 135 1K 435K 1K

25 305 3K 336K 1K

Wing Lian (caseus) @winglian

3 days ago

I'm up to 96k context for Llama 3 8B. Using PoSE, we did continued pre-training of the base model w 300M tokens to extend the context length to 64k. From there we increased the RoPE theta to further attempt to extend the context length. 🧵

26 66 441 114K 240

Download Image

Hamel Husain @HamelHusain

4 days ago

Love this, just came up IRL and often true

5 16 446 42K 80

Download Image

Daniel Han @danielhanchen

4 days ago

Made a Colab to finetune Llama-3 8b Instruct 2x faster and use 68% less VRAM No endless generations, fixed llama.cpp GGUF conversions, supports 4x longer contexts - 11K vs 2.5K before Also have a 2x faster inference only notebook: colab.research.google.com/drive/1aqlNQi7… colab.research.google.com/drive/1XamvWYi…

2 13 66 4K 63

Hamel Husain @HamelHusain

5 days ago

LOVE how the Meta team went hard on explicitly describing the prompt template llama.meta.com/docs/model-car…

8 30 381 25K 247

Download Image

David Aronchick @aronchick

5 days ago

Cool tool by @HamelHusain (and a primer on why declarative job execution, like with @BacalhauProject is so critical, even against black box models like LLMs!) - Debugging AI With Adversarial Validation bit.ly/3xOWy52

0 2 0 1K 1

Download Image

Alex Albert @alexalbert__

5 days ago

Don’t blindly base your decision on which LLM to use on broken benchmarks like MMLU... If you are serious about choosing the right LLM for your use case, you NEED to create an eval of your own. Let’s talk about how you can make one 🧵 x.com/nearcyan/statu…

near @nearcyan

5 days ago

24 20 364 97K 86

Download Image

7 30 242 63K 225

Hamel Husain @HamelHusain

5 days ago

Does anyone have a reference implementation of function calling on llama 3 + vllm + outlines Surely someone has an open example - its helpful to see how other people are doing it b/c there are lots of ways of accomplishing this