"linear layers" are affine and all the deep learning literature should be corrected.
@theshawwn @francoisfleuret I'm instantly put off when a library or framework treats the nonlinearity as part of another layer (say, as an option on it).
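One concrete instance of the pattern being objected to: Keras's Dense layer takes the activation as a constructor argument, folding the nonlinearity into the affine layer rather than keeping it as a layer of its own. A sketch contrasting the two styles (the model itself is hypothetical, just for illustration):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Style 1: nonlinearity folded into the layer as an option.
fused = keras.Sequential([
    layers.Dense(64, activation="relu"),
    layers.Dense(10),
])

# Style 2: nonlinearity kept as its own layer.
separate = keras.Sequential([
    layers.Dense(64),
    layers.ReLU(),
    layers.Dense(10),
])
```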
@giffmana @francoisfleuret Maybe, but we need a term in between layer and block. A block usually consists of conv -> norm -> relu, repeated three or four times. I'd like a term for that "conv -> norm -> relu" part. I think of that as a "layer", conceptually. But I know that's not quite right.
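In PyTorch terms, the unnamed unit in question is the kind of thing people typically wrap in a small nn.Sequential; a minimal sketch of one "conv -> norm -> relu" unit and a block built from it (the helper name conv_norm_relu is hypothetical):

```python
import torch.nn as nn

def conv_norm_relu(in_ch, out_ch, kernel_size=3):
    """One "conv -> norm -> relu" unit; the thing that needs a name."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size,
                  padding=kernel_size // 2, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

# A "block" in the sense above: the unit repeated three or four times.
block = nn.Sequential(
    conv_norm_relu(64, 64),
    conv_norm_relu(64, 64),
    conv_norm_relu(64, 64),
)
```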
@theshawwn @francoisfleuret That's the thing: having such a term somehow "freezes" that exact combination, or at least conv-normalizer-activation, and disincentivizes trying other stuff. At least that's what I've noticed happening.
@giffmana @francoisfleuret If the term were general enough, it wouldn't necessarily freeze that combo. On the other hand, it's hard to think of a different combo. What else have you used?