A similar trick comes up when implementing backprop through discrete stochastic variables: return loss + (grad_obj - grad_obj.detach()). This returns the forward loss unchanged (the extra term is zero in value), but it also attaches grad_obj to the graph, so loss.backward() pushes gradients through this extra path https://t.co/ue4WFbJd1e
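A minimal sketch of the trick in PyTorch, using a REINFORCE-style surrogate as the extra gradient path (the names surrogate_loss, reward, and the toy objective are illustrative assumptions, not from the original thread):

```python
import torch

def surrogate_loss(loss, grad_obj):
    # Forward value equals `loss` exactly: (grad_obj - grad_obj.detach()) is zero in value.
    # Backward: the detached copy blocks gradients, so the extra term contributes
    # d(grad_obj)/d(params), an extra gradient path that `loss` alone would not have.
    return loss + (grad_obj - grad_obj.detach())

# Hypothetical usage: backprop through a discrete sample.
logits = torch.randn(4, requires_grad=True)
probs = torch.softmax(logits, dim=-1)
idx = torch.multinomial(probs, 1)     # discrete sample: non-differentiable by itself
reward = torch.tensor(2.0)            # downstream, non-differentiable objective value
log_prob = torch.log(probs[idx])

# Forward value is `reward`; backward follows reward.detach() * log_prob (score-function gradient).
loss = surrogate_loss(reward, reward.detach() * log_prob)
loss.backward()
print(logits.grad)
```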
@darkproger Can the Gumbel-Softmax trick be used in this case?
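For reference, a short sketch of the Gumbel-Softmax alternative; note that PyTorch's gumbel_softmax with hard=True relies on the same value-plus-detached-difference trick internally (the toy objective below is an assumption for illustration):

```python
import torch
import torch.nn.functional as F

logits = torch.randn(4, requires_grad=True)

# Gumbel-Softmax relaxation of the discrete sample.
# hard=True returns a one-hot sample in the forward pass but routes gradients
# through the soft sample (y_hard - y_soft.detach() + y_soft under the hood).
sample = F.gumbel_softmax(logits, tau=1.0, hard=True)

loss = (sample * torch.arange(4.0)).sum()  # illustrative downstream objective
loss.backward()
print(logits.grad)
```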