Accepted at #ICLR2022 🎉 great work as always by @emiel_hoogeboom :)
@vdbergrianne @emiel_hoogeboom Do you think there's any mileage in using this to sample from a stat mech model? You'd be minimising the KL in the other direction, and you'd lose the benefit of training separately through each step. Might still be interesting, though
@AustenLamacraft @vdbergrianne Good question. Just to check that I understand correctly: you mean sampling from the model to use for a downstream task? For this, one would at least need to use the parallelization to avoid taking too many steps.
@AustenLamacraft @vdbergrianne Further, if you want to backprop, one would also need a differentiable discrete sampling technique (like Gumbel-softmax). But then yes, it should be possible
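For context, the Gumbel-softmax trick mentioned here replaces the non-differentiable argmax in categorical sampling with a temperature-controlled softmax, so gradients can flow back to the logits. A minimal NumPy sketch (function name and temperature value are illustrative, not from the thread):

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, rng=None):
    """Draw a relaxed, differentiable one-hot sample from a categorical.

    Adding Gumbel(0, 1) noise to the logits and taking argmax gives an
    exact categorical sample; replacing argmax with a softmax at
    temperature tau gives a differentiable relaxation of that sample.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Gumbel(0, 1) noise via inverse-CDF: -log(-log(U)), U ~ Uniform(0, 1)
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + gumbel) / tau
    y = y - y.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(y)
    return e / e.sum(axis=-1, keepdims=True)

# As tau -> 0 the output approaches a hard one-hot sample; larger tau
# gives a smoother relaxation with lower-variance gradients.
sample = gumbel_softmax(np.log(np.array([0.1, 0.2, 0.7])), tau=0.5)
```

In a deep-learning framework one would use the framework's built-in version (e.g. `torch.nn.functional.gumbel_softmax`) so the relaxation participates in autodiff.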
@emiel_hoogeboom @vdbergrianne Ah yes, I guess this is one of the big benefits of training these models on data: no sampling to differentiate