4c data type
#2
by enddl22 - opened
Thanks for the nice and interesting work.
I am just wondering about the data type of the 4-channel input. Is it 32 bits per pixel (RGB plus an 8-bit depth channel converted from 16-bit), 128 bits per pixel (all channels converted to float32), or something else?
Best,
Thank you for your interest in our work.
The original depth channel was indeed 16-bit. In our implementation, all channels, including the RGB channels and the original 16-bit depth channel, are converted to float32.
Best,
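For readers who want to see what the answer above implies for storage, here is a minimal NumPy sketch. It is not the authors' code: the function name `to_float32_rgbd` is hypothetical, and scaling each channel to [0, 1] is an assumption, since the reply only confirms that all four channels end up as float32.

```python
import numpy as np


def to_float32_rgbd(rgb_u8: np.ndarray, depth_u16: np.ndarray) -> np.ndarray:
    """Stack 8-bit RGB and 16-bit depth into one float32 array of shape (H, W, 4).

    Hypothetical helper for illustration only.
    """
    if rgb_u8.dtype != np.uint8 or depth_u16.dtype != np.uint16:
        raise TypeError("expected uint8 RGB and uint16 depth")
    if rgb_u8.shape[:2] != depth_u16.shape[:2]:
        raise ValueError("RGB and depth must share height and width")

    # Assumption: each channel is scaled to [0, 1] by its integer maximum.
    # The thread does not state which normalization the model actually uses.
    rgb = rgb_u8.astype(np.float32) / 255.0
    depth = depth_u16.astype(np.float32) / 65535.0

    # Append depth as the fourth channel.
    return np.concatenate([rgb, depth[..., None]], axis=-1)


# Tiny synthetic example: a 2x3 image at full intensity and maximum depth.
rgb = np.full((2, 3, 3), 255, dtype=np.uint8)
depth = np.full((2, 3), 65535, dtype=np.uint16)

rgbd = to_float32_rgbd(rgb, depth)

# Four float32 channels, 4 bytes each, 8 bits per byte.
bits_per_pixel = rgbd.shape[-1] * rgbd.dtype.itemsize * 8
print(rgbd.shape, rgbd.dtype, bits_per_pixel)
```

Under these assumptions each pixel occupies 4 × 32 = 128 bits, which matches the second option raised in the question.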
Thanks for the reply, it's clear now.
I guess ',' and '.' may need to be swapped :).
estellea changed discussion status to closed