SDXLCannyControlNetModel
Canny-edge-conditioned ControlNet pipeline built on Stable Diffusion XL 1.0.
Takes an input image and a text prompt. Canny edge maps are extracted using
OpenCV with configurable hysteresis thresholds, then fed as spatial
conditioning into the diffusers/controlnet-canny-sdxl-1.0 ControlNet
backbone together with the stabilityai/stable-diffusion-xl-base-1.0
diffusion pipeline and the madebyollin/sdxl-vae-fp16-fix VAE. The result
is a high-resolution image (up to 1024 x 1024 px) that closely follows the
structural edges of the original while adhering to the text prompt.
Requires opencv-python (pip install opencv-python).
References
- [1] Zhang & Agrawala, "Adding Conditional Control to Text-to-Image Diffusion Models", ICCV 2023. https://arxiv.org/abs/2302.05543
- [2] https://huggingface.co/diffusers/controlnet-canny-sdxl-1.0
Parameters
- canny_low_threshold : integer, default=100 - Lower threshold for Canny edge detection (range 0-255). Edges with gradient below this value are discarded. Lower values detect more edges, including weaker ones. Typical range: 50-150.
- canny_high_threshold : integer, default=200 - Upper threshold for Canny edge detection (range 0-255). Edges with gradient above this value are detected. Higher values produce fewer but stronger edges. Typical range: 150-250. Must be greater than canny_low_threshold.
- num_inference_steps : integer, default=20 - Number of denoising steps. SDXL Canny achieves good quality with 20-30 steps. More steps improve quality at the cost of generation time.
- controlnet_conditioning_scale : number, default=0.5 - Weight of the Canny edge conditioning (range 0.0-2.0). At 0.5 the edges guide the composition loosely. At 1.0 the output closely follows the input edges. Higher values produce more rigid edge adherence.
- device : string, default=CPU - Hardware device for inference. GPU is strongly recommended for SDXL. CPU inference is very slow for this large model.
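The threshold constraints above (both values in 0-255, with the high threshold strictly greater than the low) can be checked up front. A minimal sketch; `validate_canny_thresholds` is an illustrative helper, not part of the class:

```python
def validate_canny_thresholds(low: int = 100, high: int = 200) -> tuple:
    """Enforce the documented constraints before running edge detection."""
    if not (0 <= low <= 255 and 0 <= high <= 255):
        raise ValueError("Canny thresholds must lie in the range 0-255")
    if high <= low:
        raise ValueError("canny_high_threshold must be greater than canny_low_threshold")
    return low, high
```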
Methods
generate(self, input: Tuple[Image.Image, str]) -> List[Any]
Generate output images from the model, conditioned on the input image's Canny edges and the text prompt.
Parameters
- input : Tuple[Image.Image, str]
- Input image and text prompt.
Returns
- List[Any]
- Generated output images in a list.
get_schema(cls) -> dict
Generates the JSON Schema of the component. (Inherited from ConfigObject.)
Returns
- dict
- Dictionary representing the JSON Schema of the component.
validate_and_transform(self, raw_data: dict) -> dict
Takes the user-provided data used to initialize the model and returns it augmented with all the objects the model needs to run. (Inherited from ConfigObject.)
Parameters
- raw_data : dict
- A dictionary with the data provided by the user to initialize the model.
Returns
- dict
- A validated dictionary with the necessary objects.