Happy to announce DreamFusion, our new method for Text-to-3D! dreamfusion3d.github.io We optimize a NeRF from scratch using a pretrained text-to-image diffusion model. No 3D data needed! Joint work w/ the incredible team of @BenMildenhall @ajayj_ @jon_barron #dreamfusion
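Since the tweet compresses the method into one line, here is a minimal, self-contained Python sketch of the Score Distillation Sampling (SDS) gradient it refers to: render an image from the NeRF, add noise, ask the diffusion model to predict that noise, and push the residual back through the renderer. The renderer, denoiser, noise schedule, and weighting w(t) below are all dummy stand-ins of my own choosing, not DreamFusion's actual components:

```python
import numpy as np

# Hypothetical stand-ins: in DreamFusion these are a differentiable NeRF
# renderer and a pretrained text-to-image diffusion model; here they are
# dummies so the sketch runs on its own.
def render_nerf(params, camera):
    return np.tanh(params)  # "image" as a flat array

def predict_noise(noisy_image, t, prompt):
    return 0.9 * noisy_image  # a real denoiser would condition on the prompt

def sds_gradient(params, camera, prompt, rng):
    """One SDS step: grad = w(t) * (eps_hat - eps) * d(image)/d(params),
    skipping the diffusion model's own Jacobian."""
    image = render_nerf(params, camera)
    t = rng.uniform(0.02, 0.98)                  # random noise level
    alpha, sigma = np.cos(t * np.pi / 2), np.sin(t * np.pi / 2)  # toy schedule
    eps = rng.standard_normal(image.shape)
    noisy = alpha * image + sigma * eps          # forward diffusion
    eps_hat = predict_noise(noisy, t, prompt)
    w = sigma**2                                 # one common weighting choice
    d_image_d_params = 1.0 - np.tanh(params)**2  # Jacobian of the dummy renderer
    return w * (eps_hat - eps) * d_image_d_params

rng = np.random.default_rng(0)
params = np.zeros(16)
grad = sds_gradient(params, camera=None, prompt="a corgi", rng=rng)
params -= 0.01 * grad  # gradient descent step on the NeRF parameters
```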
@poolio @karpathy @BenMildenhall @ajayj_ @jon_barron How do you change the rendering angle (to feed the NeRF)? By using prompts such as "from the top/front"? How do you keep the scene consistent between those angles?
@divideconcept @karpathy @BenMildenhall @ajayj_ @jon_barron We condition the diffusion model with view-dependent prompts based on the azimuth and elevation by appending "front view", "back view", etc. Consistency comes from the 3D model + making sure all the views in between look good too :)
@poolio @divideconcept @karpathy @BenMildenhall @ajayj_ @jon_barron So is there a manual process for selecting the different views?
@juliendorra @divideconcept @karpathy @BenMildenhall @ajayj_ @jon_barron nope, automatic and described in the appendix: for azimuth we split into four equally sized quadrants: "front view" for 0-90, "side view" for 90-180, "back view" for 180-270, and "side view" again for 270-360. If elevation > 60 degrees we swap to "overhead view"
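A minimal sketch of how that quadrant logic might look in Python. Only the angle ranges come from the thread; the function name, the comma-joined suffix, and the exact boundary handling are my own assumptions:

```python
def view_prompt(base_prompt: str, azimuth_deg: float, elevation_deg: float) -> str:
    """Append a view-dependent suffix based on camera azimuth and elevation.
    Ranges follow the thread; everything else is a guess at the details."""
    if elevation_deg > 60:
        suffix = "overhead view"
    else:
        az = azimuth_deg % 360
        if az < 90:
            suffix = "front view"
        elif az < 180:
            suffix = "side view"
        elif az < 270:
            suffix = "back view"
        else:
            suffix = "side view"
    return f"{base_prompt}, {suffix}"

print(view_prompt("a DSLR photo of a corgi", azimuth_deg=200, elevation_deg=30))
# -> "a DSLR photo of a corgi, back view"
```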
@poolio @divideconcept @karpathy @BenMildenhall @ajayj_ @jon_barron Do you feel there's a path for diffusion models to really become multi-view aware, or are there specific models to reinvent for that? (and thus going back to square one with 3D datasets several orders of magnitude bigger 😬)