Emiel Hoogeboom @emiel_hoogeboom, Twitter Profile

Emiel Hoogeboom @emiel_hoogeboom

2 years ago

Generate data in random order? A masked language model as a generative model? All this and more in "Autoregressive Diffusion Models" with @agritsenko @BastingsJasmijn @poolio @vdbergrianne @TimSalimans. For details see arxiv.org/abs/2110.02037. Some explanations below...

Emiel Hoogeboom @emiel_hoogeboom

2 years ago

Training of Autoregressive Diffusion Models (ARDMs). In their basic form, they are trained like a masked language model, except that the _number_ of masked variables also varies. At each training step, some variables are masked, and those are predicted from the remaining ones.
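The training step described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: `MASK`, `mask_for_training`, `uniform_model`, and `masked_nll` are hypothetical names, the stand-in model just returns a uniform distribution, and drawing the number of masked variables uniformly and averaging the loss over masked positions are assumptions made for the sketch.

```python
import math
import random

MASK = None  # hypothetical placeholder value for a masked variable


def mask_for_training(x, rng):
    """Pick how many variables to mask, then which ones (both at random)."""
    d = len(x)
    num_masked = rng.randint(1, d)  # the *number* of masked variables varies per step
    masked = set(rng.sample(range(d), num_masked))
    inputs = [MASK if i in masked else v for i, v in enumerate(x)]
    return inputs, sorted(masked)


def uniform_model(inputs, num_classes):
    """Stand-in for a neural network: a uniform distribution at every position."""
    return [[1.0 / num_classes] * num_classes for _ in inputs]


def masked_nll(x, inputs, masked, num_classes, model=uniform_model):
    """Average negative log-likelihood over the masked positions only."""
    probs = model(inputs, num_classes)
    return -sum(math.log(probs[i][x[i]]) for i in masked) / len(masked)


rng = random.Random(0)
x = [3, 1, 4, 1, 5]
inputs, masked = mask_for_training(x, rng)
loss = masked_nll(x, inputs, masked, num_classes=8)
```

In a real setup the stand-in would be replaced by a trainable network and `loss` would be minimized by gradient descent; only the masking logic is the point here.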

Emiel Hoogeboom @emiel_hoogeboom

2 years ago

Sampling takes multiple steps. First, a random generation order is picked. Then, one variable at a time, the model does a forward pass and samples a value. Those values are filled in for the next forward pass.
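A rough sketch of that sampling loop, under the same kind of assumptions: `MASK`, `uniform_model`, and `sample` are hypothetical names, and the uniform stand-in replaces what would be a trained network.

```python
import random

MASK = None  # hypothetical placeholder value for a not-yet-generated variable


def uniform_model(inputs, num_classes):
    """Stand-in for a trained network: a uniform distribution at every position."""
    return [[1.0 / num_classes] * num_classes for _ in inputs]


def sample(d, num_classes, rng, model=uniform_model):
    """Generate d variables one at a time, in a random order."""
    order = list(range(d))
    rng.shuffle(order)  # pick a random generation order
    x = [MASK] * d
    for i in order:
        probs = model(x, num_classes)  # one forward pass per variable
        x[i] = rng.choices(range(num_classes), weights=probs[i])[0]
        # x[i] is now filled in and visible to the next forward pass
    return x, order


x, order = sample(d=6, num_classes=4, rng=random.Random(0))
```

With this scheme, generating `d` variables costs `d` forward passes.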

Emiel Hoogeboom @emiel_hoogeboom

2 years ago

Extensions include parallelization (using multiple predictions at once) and generation in stages (predicting more significant bits first).
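The parallelization idea can be illustrated by filling in several variables from a single forward pass. The sketch below is an assumption-laden toy: `sample_parallel`, `chunk_size`, and `uniform_model` are hypothetical, and a fixed chunk size is used purely for simplicity, which is not necessarily how the paper allocates steps. Generation in stages is not covered here.

```python
import random

MASK = None  # hypothetical placeholder value for a not-yet-generated variable


def uniform_model(inputs, num_classes):
    """Stand-in for a trained network: a uniform distribution at every position."""
    return [[1.0 / num_classes] * num_classes for _ in inputs]


def sample_parallel(d, num_classes, chunk_size, rng, model=uniform_model):
    """Generate d variables in random order, chunk_size variables per forward pass."""
    order = list(range(d))
    rng.shuffle(order)
    x = [MASK] * d
    passes = 0
    for start in range(0, d, chunk_size):
        probs = model(x, num_classes)  # a single forward pass for the whole chunk
        passes += 1
        for i in order[start:start + chunk_size]:
            # Variables in the same chunk are sampled independently of each other,
            # conditioned only on what was filled in before this pass.
            x[i] = rng.choices(range(num_classes), weights=probs[i])[0]
    return x, passes


x, passes = sample_parallel(d=10, num_classes=4, chunk_size=3, rng=random.Random(0))
```

The trade-off is fewer forward passes in exchange for ignoring dependencies between variables sampled in the same pass.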

Ohad Rubin @OhadRubin

2 years ago

@emiel_hoogeboom Sounds very similar to the mask-predict non-autoregressive decoding algorithm
