Andrej Karpathy @karpathy, Twitter Profile

Andrej Karpathy @karpathy

a year ago

(finally got around to reading in full). Amusing to read so many negative result attempts back to back to incorporate previous papers/ideas (at least in the cramming setting). Like the inline experimental result style. Like the nice code release. Like the "cramming" benchmark.

Lucas Beyer (bl16) @giffmana

a year ago


Lucas Beyer (bl16) @giffmana

a year ago

@karpathy I like this paper because we had a similar experience between BiT and ViT, trying A LOT of ResNet "improvements" and none held water (except SqueezeEx). Now a similar thing seems to be happening with ViT sadly.
