@giffmana Spoken like a googler who never has to worry about such things. :) There are multiple reasons imagenet is difficult, and I’ve wanted to reduce the barriers for a long time.
@amy_tabb @theshawwn I read your linked post and agree with everything, but it does not contradict my statement? 1. If you do DL, you have min 1 GPU already 2. R50-i1k took me 2w on 1 GPU 5 years ago, now much better. 3. It is _the_ gold standard, but a paper can't spend 2 gpu-weeks or $5 cloud on it? huh?
@giffmana @amy_tabb @theshawwn Let me step in about the $5. As a PhD student, I am much more likely to spend $1000 on a GPU than $5 on AWS. First, the GPU is here to stay. Second, $5 is never $5. You will have lots of unsuccessful experiments, you have to learn. Also, add the eternal fear of forgetting to turn off the instance.
@giffmana @amy_tabb @theshawwn That said, I absolutely did a lot of ImageNet training in the past, when architectures were simpler, ~1 week to train each, but most of the studies were done at 128px to reduce load github.com/ducha-aiki/caf…
@ducha_aiki @amy_tabb @theshawwn yeah, so we agree? It was feasible years ago for a lone PhD student with 1 GPU, and is even more so nowadays. Just need a little bit of patience.
@giffmana @ducha_aiki @amy_tabb People keep telling you it's hard, and you keep denying it's hard. I don't think this attitude is productive. Just because you can do something doesn't mean other people can. And it has a lot less to do with patience than with knowing the right esoteric commands.
@theshawwn @ducha_aiki @amy_tabb It's literally an example in every framework, I'd hardly call that esoteric O.o github.com/pytorch/exampl… github.com/google/flax/tr… - this one even has a nice timing table: 4.3h on 8xV100, meaning ~1½ days on a single GPU! A 2080 is not far behind: lambdalabs.com/blog/best-gpu-… But Dmytro didn't disagree?
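A quick sanity check of the extrapolation in that last tweet, assuming the 8xV100 time scales roughly linearly down to one GPU (an assumption; a real single-GPU run avoids multi-GPU communication overhead but the ballpark holds):

```python
# Reported in the flax example's timing table: 4.3 hours on 8x V100.
hours_on_8_gpus = 4.3
num_gpus = 8

# Total compute budget in GPU-hours.
gpu_hours = hours_on_8_gpus * num_gpus  # 34.4 GPU-hours

# Assuming linear scaling, wall time on a single V100 in days.
days_on_one_gpu = gpu_hours / 24

print(f"{gpu_hours:.1f} GPU-hours ~= {days_on_one_gpu:.1f} days on one V100")
# 34.4 GPU-hours ~= 1.4 days on one V100
```

That 1.4 days is the "~1½ days" figure; a consumer 2080 would be somewhat slower but in the same range.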