(finally got around to reading in full). Amusing to read so many negative result attempts back to back to incorporate previous papers/ideas (at least in the cramming setting). Like the inline experimental result style. Like the nice code release. Like the "cramming" benchmark.
@karpathy I like this paper because we had a similar experience between BiT and ViT, trying A LOT of ResNet "improvements" and none held water (except SqueezeEx). Now a similar thing seems to be happening with ViT sadly.