@giffmana Spoken like a googler who never has to worry about such things. :) There are multiple reasons imagenet is difficult, and I’ve wanted to reduce the barriers for a long time.
@amy_tabb @theshawwn I read your linked post and agree with everything, but it does not contradict my statement? 1. If you do DL, you have min 1 GPU already 2. R50-i1k took me 2w on 1 GPU 5 years ago, now much better. 3. It is _the_ gold standard, but a paper can't spend 2 gpu-weeks or $5 cloud on it? huh?
@giffmana @amy_tabb @theshawwn Let me step in about the $5. As a PhD student, I am much more likely to spend $1000 on a GPU than $5 on AWS. First, the GPU is here to stay. Second, $5 is never $5. You will have lots of unsuccessful experiments, you have to learn. Also, add the eternal fear of forgetting to turn off the instance.
@giffmana @amy_tabb @theshawwn That said, I absolutely did a lot of ImageNet training in the past, when architectures were simpler, ~1 week to train each, but most of the studies were done at 128px to reduce load github.com/ducha-aiki/caf…
@ducha_aiki @amy_tabb @theshawwn yeah, so we agree? It was feasible years ago for a lone PhD student with 1 GPU, and is even more so nowadays. Just need a little bit of patience.
@giffmana @ducha_aiki @amy_tabb People keep telling you it's hard, and you keep denying it's hard. I don't think this attitude is productive. Just because you can do something doesn't mean other people can. And it has a lot less to do with patience than with knowing the right esoteric commands.
@theshawwn @ducha_aiki @amy_tabb It's literally an example in every framework, I'd hardly call that esoteric O.o github.com/pytorch/exampl… github.com/google/flax/tr… - this one even has a nice timing table: 4.3h on 8xV100, meaning ~1½ days on a single GPU! A 2080 is not far behind: lambdalabs.com/blog/best-gpu-… But Dmytro didn't disagree?
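A quick sanity check of the extrapolation in that last tweet, assuming the 8xV100 time scales roughly linearly down to one GPU (an assumption; a real single-GPU run avoids multi-GPU communication overhead but the ballpark holds):

```python
# Reported in the flax example's timing table: 4.3 hours on 8x V100.
hours_on_8_gpus = 4.3
num_gpus = 8

# Total compute budget in GPU-hours.
gpu_hours = hours_on_8_gpus * num_gpus  # 34.4 GPU-hours

# Assuming linear scaling, wall time on a single V100 in days.
days_on_one_gpu = gpu_hours / 24

print(f"{gpu_hours:.1f} GPU-hours ~= {days_on_one_gpu:.1f} days on one V100")
# 34.4 GPU-hours ~= 1.4 days on one V100
```

That 1.4 days is the "~1½ days" figure; a consumer 2080 would be somewhat slower but in the same range.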