Text-to-image generation systems such as DALL-E 2 and Stable Diffusion have gained popularity rapidly in recent years.

“To obtain a 3D object from the text, we first sample the image using the text-image model and then sample the 3D object based on this sample. Both of these steps can be completed in a matter of seconds and do not require expensive optimization procedures,” write the model's authors.

If you enter a text query such as “a cat eating a burrito”, Point-E first generates a synthetic rendered view of a cat eating a burrito using a text-to-image model. It then runs that image through a series of diffusion models to produce a 3D RGB point cloud of the depicted object.
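The two-stage pipeline described above can be sketched as follows. This is a minimal illustration, not the actual openai/point-e API: every function name and array shape here is an assumed placeholder, with random stand-ins for the two diffusion models.

```python
# Hypothetical sketch of Point-E's two-stage text -> image -> point-cloud
# pipeline. Function names and shapes are illustrative placeholders only.
import numpy as np

def sample_image_from_text(prompt: str, size: int = 64) -> np.ndarray:
    """Stage 1 stand-in: text-to-image diffusion.
    Returns a synthetic RGB view as an (H, W, 3) array with values in [0, 1)."""
    rng = np.random.default_rng(abs(hash(prompt)) % (2**32))
    return rng.random((size, size, 3))

def sample_point_cloud_from_image(image: np.ndarray,
                                  num_points: int = 1024) -> np.ndarray:
    """Stage 2 stand-in: image-conditioned point-cloud diffusion.
    Returns num_points rows of (x, y, z, r, g, b)."""
    rng = np.random.default_rng(0)
    xyz = rng.standard_normal((num_points, 3))
    # Color each point by sampling the conditioning image at random pixels.
    h, w, _ = image.shape
    ys = rng.integers(0, h, num_points)
    xs = rng.integers(0, w, num_points)
    rgb = image[ys, xs]
    return np.concatenate([xyz, rgb], axis=1)

def text_to_point_cloud(prompt: str) -> np.ndarray:
    image = sample_image_from_text(prompt)        # stage 1: text -> image
    return sample_point_cloud_from_image(image)   # stage 2: image -> cloud

cloud = text_to_point_cloud("a cat eating a burrito")
print(cloud.shape)  # (1024, 6): xyz coordinates plus RGB per point
```

The key design point the authors highlight is that both stages are plain sampling passes, so there is no per-object optimization loop, which is what keeps generation down to seconds.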

Each of these diffusion models was trained on “millions” of 3D models converted into a standard format. The team acknowledges that while the method falls short of existing methods in this evaluation, it produces samples in a very short time. If you want to try your hand, OpenAI has hosted the project code on GitHub.

Source: Ferra
