Francesca Lucchetti @fran_lucc

CS PhD Student at Northeastern University | franlucc.github.io | Massachusetts, USA | Joined October 2022

Tweets: 24 | Followers: 73 | Following: 71 | Likes: 15

Arjun Guha @ArjunGuha

8 months ago

We present a new benchmark for reasoning models that reveals capability gaps and failure modes that are not evident in existing benchmarks. E.g., we find that o1 / o3-mini-high are significantly better at verbal reasoning than other models.

5 11 97 11K 44

Jaden Fiotto-Kaufman @jadenfk23

a year ago

🚀 New NNsight features launching today! If you’re conducting research on LLM internals, NNsight 0.3 is now available. This update introduces advanced features, offering deeper insights for complex investigations into model behavior. 👇 Here’s what’s new: colab.research.google.com/github/ndif-te…

1 19 51 5K 20

Jaden Fiotto-Kaufman @jadenfk23

a year ago

Frontier LLMs have capabilities that smaller AIs don't, but up to now there's been no way to crack them open. Now that #Llama3 405b is here, what's the most interesting experiment YOU want to do? 🚀 Apply at NDIF.us/405b.html to make it happen and read for details 🧵⬇️

1 25 47 23K 14

Federico Cassano @ellev3n11

a year ago

Llama-3.1 trains on synthetic translations of Python to low-resource languages (e.g., PHP) to improve performance on MultiPL-E! In our work, conditionally accepted to OOPSLA 2024, we present several experiments in this direction: arxiv.org/abs/2308.09895

6 4 19 2K 3

David Bau @davidbau

a year ago

The National Deep Inference Fabric #NDIF, an @NSF-funded AI research infrastructure project, is awarding 2024 **Summer Engineering Fellowships** in Boston. These are summer visiting positions, for current or recent PhD or undergrads, including stipend, travel and housing costs.

1 27 59 26K 27

AK @_akhaliq

a year ago

NNsight and NDIF Democratizing Access to Foundation Model Internals The enormous scale of state-of-the-art foundation models has limited their accessibility to scientists, because customized experiments at large model sizes require costly hardware and complex engineering

2 25 72 14K 24

Yao Fu @Francis_YAO_

3 years ago

How did the initial #GPT3 evolve to today's #ChatGPT ? Where do the amazing abilities of #GPT3.5 come from? What is enabled by #RLHF ? In this article with ⁦@allen_ai⁩ , we trace the emergent abilities of #LLM to their sources from first principles yaofu.notion.site/How-does-GPT-O…

31 321 1K 0 541

Google DeepMind @GoogleDeepMind

3 years ago

Introducing a generalist neural algorithmic learner, capable of carrying out 30 different reasoning tasks, with a 𝘴𝘪𝘯𝘨𝘭𝘦 graph network. These include: 🔵 Sorting 🔵 Shortest paths 🔵 String matching 🔵 Convex hull finding And more: dpmd.ai/3FC1FqA

12 241 1K 0 318

Sewon Min @sewon__min

3 years ago

Most if not all language models use a softmax that gives a categorical probability distribution over a finite vocab. We introduce NPM: the first nonparametric masked LM that replaces this softmax with a nonparametric distribution over a text corpus. arxiv.org/abs/2212.01349 (1/4)

10 80 428 0 114

Masoud @linguistMasoud

3 years ago

Ok I think it is time to share my "foundations of linguistics" syllabus with you here. It took me a long time to work out the details. I wanted the course to also be a light introduction to philosophy of science in linguistics. As a graduate student ... jasbi.github.io/courses/lin200…

6 23 132 0 67

AI at Meta @AIatMeta

3 years ago

4️⃣ Papers we presented at #NeurIPS2022 that you should know about (and how you can learn more even if you’re not at the conference.) 🧵/6

5 34 192 0 63

Patrick Schwab @schwabpa

3 years ago

You couldn't make it to #NeurIPS2022 this year? Nothing to worry - I curated a summary for you below focussing on key papers, presentations and workshops in the buzzing space of ML in Biology and Healthcare 👇