An AI-based Web3 ecosystem that lets users create high-quality 3D assets in just a few minutes from text or 2D image prompts.
Image prompts are either generated by DALL-E 3 or extracted from SA-1B.
All text prompts are generated by GPT-4. Click on the cards to view extracted GLB files.
Manipulate targeted local regions of a given 3D asset according to given text or image prompts.
We introduce 3D Alchemy, a family of large generative models designed for high-quality and versatile 3D generation.
The models combine sparse structures with powerful visual features, enabling efficient and detailed 3D modeling. Generation can be conditioned on text prompts or images, which makes the models flexible across a range of applications.
3D Alchemy uses a two-step generation process. First, it creates the sparse structure, and then it fills in latent vectors for non-empty voxels. We use rectified flow transformers as the backbone, optimized to handle the sparsity.
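The two-step process can be illustrated with a minimal sketch. This is not the 3D Alchemy implementation: `toy_velocity` is a hypothetical stand-in for a trained rectified flow transformer, `toy_occupancy` and the per-voxel latent targets are invented for illustration, and the convention that t=1 is noise and t=0 is data is an assumption. The sketch only shows the control flow: sample a sparse structure first, then sample latents for the non-empty voxels alone.

```python
import random

def euler_sample(velocity, x, steps=10):
    """Euler-integrate dx/dt = velocity(x, t) from t=1 (noise) to t=0 (data)."""
    dt = 1.0 / steps
    for i in range(steps):
        t = 1.0 - i * dt
        v = velocity(x, t)
        x = [xi - dt * vi for xi, vi in zip(x, v)]
    return x

def toy_velocity(target):
    """Hypothetical stand-in for a trained transformer.

    For the straight path x_t = (1 - t) * x0 + t * noise with one known x0,
    the velocity is (x_t - x0) / t, so the Euler steps end at x0.
    """
    return lambda x, t: [(xi - ti) / t for xi, ti in zip(x, target)]

def toy_occupancy(grid):
    """Toy stage-1 target: +1 inside a ball, -1 outside (illustrative only)."""
    c = (grid - 1) / 2.0
    r2 = (grid / 2.0 - 0.25) ** 2
    cells = [(i, j, k) for i in range(grid) for j in range(grid) for k in range(grid)]
    return {cell: 1.0 if sum((a - c) ** 2 for a in cell) <= r2 else -1.0
            for cell in cells}

def generate(grid=4, steps=10, seed=0):
    rng = random.Random(seed)
    occupancy = toy_occupancy(grid)
    cells = list(occupancy)
    # Stage 1: denoise one occupancy value per cell, then threshold it.
    noise = [rng.gauss(0.0, 1.0) for _ in cells]
    dense = euler_sample(toy_velocity([occupancy[c] for c in cells]), noise, steps)
    active = [c for c, value in zip(cells, dense) if value > 0.0]
    # Stage 2: denoise a latent vector only for the non-empty voxels.
    latents = {}
    for cell in active:
        target = [a / grid for a in cell]  # toy latent target, an assumption
        noise = [rng.gauss(0.0, 1.0) for _ in target]
        latents[cell] = euler_sample(toy_velocity(target), noise, steps)
    return active, latents

active, latents = generate()
```

Because stage 2 loops only over `active`, its cost scales with the number of occupied voxels rather than with the full grid, which is the practical benefit of handling sparsity explicitly.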
The model places local latents on active voxels that intersect the object's surface. These latents are encoded by fusing image features from multiple rendered views of the 3D model, capturing fine geometric and visual details that the coarse voxel structure alone cannot represent.
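As a rough sketch of the idea, the snippet below marks the voxels hit by surface samples as active and fuses per-view features inside each one. The helper names, the input layout, and the use of a plain average as the fusion rule are all assumptions made for illustration. The text does not specify how features are fused, and the learned encoder that turns fused features into latents is omitted.

```python
def voxel_index(point, grid):
    """Map a point in [0, 1)^3 to integer voxel coordinates."""
    return tuple(min(int(c * grid), grid - 1) for c in point)

def fuse_view_features(views, grid):
    """Average per-point features from all views into the voxels they hit.

    views: one list per rendered view, holding (surface_point, feature) pairs
    for the surface points visible in that view.
    Returns {voxel: mean feature}; only voxels touched by the surface appear.
    """
    sums, counts = {}, {}
    for view in views:
        for point, feat in view:
            v = voxel_index(point, grid)
            prev = sums.get(v, [0.0] * len(feat))
            sums[v] = [s + f for s, f in zip(prev, feat)]
            counts[v] = counts.get(v, 0) + 1
    return {v: [s / counts[v] for s in total] for v, total in sums.items()}

# Two toy views of the same surface with 2-dimensional features.
views = [
    [((0.1, 0.1, 0.1), [1.0, 0.0]), ((0.9, 0.9, 0.9), [0.0, 2.0])],
    [((0.2, 0.2, 0.2), [3.0, 0.0])],
]
fused = fuse_view_features(views, grid=2)
```

Here the voxel `(0, 0, 0)` is seen in both views and receives the mean of the two features, while voxels that no surface point falls into never appear in `fused` and stay empty.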