"linear layers" are affine and all the deep learning literature should be corrected.
@theshawwn @francoisfleuret I'm instantly put off when a library or framework considers the nonlinearity to be part of another layer (say, as an option on it).
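For concreteness, Keras is one library where this pattern shows up: Dense accepts the nonlinearity as a constructor option, though the same network can be written with the activation as its own layer. A minimal sketch:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Style being criticized: the nonlinearity is an option of the layer.
fused_style = tf.keras.Sequential([
    layers.Dense(64, activation="relu"),
])

# Alternative: the nonlinearity is a separate layer of its own.
separate_style = tf.keras.Sequential([
    layers.Dense(64),
    layers.Activation("relu"),
])
```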
@giffmana @theshawwn @francoisfleuret Do you want it to be part of the same layer or completely separate, as in not a part of any layer?
@giffmana @francoisfleuret Maybe, but we need a term between "layer" and "block". A block usually consists of conv -> norm -> relu repeated three or four times; I'd like a term for that "conv -> norm -> relu" unit. Conceptually I think of it as a "layer", but I know that's not quite right.
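A hedged PyTorch sketch of that in-between unit; the name ConvNormAct is hypothetical (torchvision ships a similar Conv2dNormActivation helper), but the grouping is the conv -> norm -> relu triple described above:

```python
import torch.nn as nn

class ConvNormAct(nn.Sequential):
    """One conv -> norm -> relu unit: the unnamed grouping in question."""
    def __init__(self, in_ch, out_ch, kernel_size=3):
        super().__init__(
            nn.Conv2d(in_ch, out_ch, kernel_size,
                      padding=kernel_size // 2, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

# A "block" is then three or four of these units in a row:
block = nn.Sequential(
    ConvNormAct(64, 64),
    ConvNormAct(64, 64),
    ConvNormAct(64, 64),
)
```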
@giffmana @theshawwn @francoisfleuret I think it's sometimes done because cuDNN provides certain fused conv + activation kernels. So if you don't have a compiler that can do the fusion for you, bundling them into one layer can help performance.
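One way to get such fusion without relying on a compiler, as a hedged sketch: PyTorch's eager-mode fusion utility merges a conv + bn + relu triple into a single fused module. Which kernels actually run downstream depends on the backend; cuDNN exposes fused conv + bias + activation primitives such as cudnnConvolutionBiasActivationForward.

```python
import torch
import torch.nn as nn
from torch.ao.quantization import fuse_modules

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, 3, padding=1)
        self.bn = nn.BatchNorm2d(16)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.bn(self.conv(x)))

model = Net().eval()  # folding bn into conv requires eval mode
fused = fuse_modules(model, [["conv", "bn", "relu"]])

# bn has been folded into the conv's weights/bias, and conv + relu is now
# a single ConvReLU2d module that quantized backends map to one fused kernel.
print(fused.conv)
```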