Stable Diffusion XL (SDXL) is a major upgrade from SD 1.5, offering enhanced realism, sharper details, better composition, and improved text rendering. However, SDXL behaves differently from earlier models and requires specific techniques for best results. This guide gives you professional-grade results with clear explanations and actionable settings.
This guide is based on practical testing and trusted resources from:
- Stability AI SDXL documentation
- Hugging Face SDXL model insights
- Community research and expert workflows
- StableDiffusionArt.com
- Industry prompt engineers and workflow experts
✅ Table of Contents
- What makes SDXL different?
- Recommended software & environments
- SDXL model structure explained
- Best generation settings (tested configurations)
- Prompting for SDXL – structure and techniques
- Negative prompting best practices
- Refiner usage – when and how to apply
- Resolution and aspect ratios
- SDXL LoRA and training compatibility
- SDXL with ControlNet
- Upscaling strategies
- Performance tips
- Troubleshooting
- Conclusion
1. What Makes SDXL Different?
SDXL is not just a bigger version of Stable Diffusion 1.5; it is a next-generation model built for realism, composition control, and high-resolution output. Understanding how it differs from SD1.5 is key to using it effectively.
SDXL improves over SD1.5 in most respects, but it requires more GPU VRAM (minimum 8GB recommended). Key differences:
| Feature | SD 1.5 | SDXL |
|---|---|---|
| Output quality | Good | High |
| Detail & realism | Medium | High |
| Text generation | Poor | Improved |
| Handles complex prompts | Limited | Yes |
| Base resolution | 512×512 | 1024×1024 |
| VRAM required | 6GB | 8–12GB |
Important: SDXL responds differently to prompts. Short tag-based prompts carried over from SD1.5 do not work well; SDXL prefers descriptive, sentence-style prompts written as full sentences rather than keyword lists.
2. Recommended Software
SDXL works well in the following UIs:
| Software | Why use it |
|---|---|
| ComfyUI | Best for SDXL workflows and refiners |
| Stable Diffusion WebUI Forge | Faster SDXL performance |
| AUTOMATIC1111 (latest) | Works but slower |
| InvokeAI | Best for inpainting & unified canvas |
ComfyUI and Forge are highly recommended for SDXL.
3. SDXL Model Structure
SDXL uses two models:
- Base model – Creates initial image structure
- Refiner model – Improves details and textures
Both can be used together for optimal quality, especially in portrait and product rendering use cases.
4. Best SDXL Settings (Recommended for Quality and Stability)
These settings are based on benchmark testing across Forge, ComfyUI, and A1111 environments and balance quality with render time:
| Setting | Value |
|---|---|
| Steps | 25–35 |
| Sampler | DPM++ 2M Karras |
| CFG Scale | 5–7 |
| Refiner switch | 0.75 (fraction of total steps) |
| Seed | -1 (random) |
| Resolution | 1024×1024 (base) |
For portraits use: Euler a or DPM++ SDE.
5. Prompting for SDXL – Best Practices
Unlike SD1.5, which prefers short tag-style prompts, SDXL works best with natural-language prompts. Write descriptive phrases the way a photographer or filmmaker would: think storytelling, not just keywords.
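As an illustration, the sentence-style structure this section describes can be assembled from labeled parts. The helper below is a sketch (the function and its slot names mirror the prompt template later in this guide, not any tool's API):

```python
def build_sdxl_prompt(subject, scene="", lighting="", camera="", style="", details=""):
    """Join template slots ([Subject], [Scene], [Lighting], ...) into one
    descriptive SDXL prompt, skipping any slot left empty."""
    parts = [subject, scene, lighting, camera, style, details]
    return ", ".join(p for p in parts if p)

prompt = build_sdxl_prompt(
    subject="portrait of a cyberpunk woman",
    scene="rainy neon street",
    lighting="dramatic rim light",
    camera="85mm lens, shallow depth of field",
    style="cinematic film look",
    details="detailed skin texture",
)
```

Filling only some slots still yields a coherent comma-joined prompt, which makes it easy to iterate on one aspect (lighting, camera) at a time.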
✅ SD1.5 vs SDXL Prompt Example
| Model | Effective Prompt | Result |
|---|---|---|
| SD1.5 | "cyberpunk girl, neon" | ✅ Works well |
| SDXL | "portrait of a cyberpunk woman, neon lights, dramatic rim light, shallow depth of field, detailed skin" | ✅ Best results |
✅ Prompt Template for SDXL
[Subject], [Scene], [Lighting], [Camera], [Style], [Details]

✅ Good SDXL Prompt Example
Cinematic portrait of a Scandinavian woman with freckles, soft studio lighting, 85mm lens photography, film look, ultra detailed skin texture, sharp depth of field, magazine editorial style

6. Negative Prompting for SDXL
Negative prompts tell SDXL what you don't want in the image. SDXL does not need the long negative lists common with SD 1.5, so you can keep them short.
✅ Recommended Negative Prompt
low quality, blurry, pixelated, distorted, extra limbs, watermark, text, deformed hands

Optional Advanced Negative Prompt
bad anatomy, low detail, overexposed, underexposed, noisy, overly saturated, cartoonish, artifacts

7. SDXL Refiner – When and How to Use It
SDXL includes a base model and an optional refiner model. The refiner is a polish pass that improves fine details such as eyes, skin, shadows, and edges, and it can make a noticeable difference in final quality.
When to Use the Refiner
| Use Case | Refiner Needed? |
|---|---|
| Portraits | ✅ Yes |
| Realistic Photography | ✅ Yes |
| Products/Logos | ✅ Yes |
| Anime/Concept Art | Optional |
| Fast Preview Tests | ❌ No |
Recommended Refiner Settings
| Setting | Value |
|---|---|
| Refiner Switch | 0.65 – 0.80 |
| Steps (Base + Refiner) | 15 + 10 |
| Sampler | DPM++ 2M Karras |
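To see how the switch point divides work between base and refiner, here is a small sketch (`split_steps` is a hypothetical helper for illustration, not a setting in any UI):

```python
def split_steps(total_steps, refiner_switch):
    """Split a step budget at the refiner switch fraction.
    For example, 25 total steps with a 0.6 switch gives
    15 base steps + 10 refiner steps."""
    base = round(total_steps * refiner_switch)
    return base, total_steps - base
```

With the recommended 0.65 to 0.80 range, the base model handles composition for most of the schedule and the refiner only touches the final, detail-heavy steps.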
8. Aspect Ratios & Resolution for SDXL
SDXL was trained at 1024×1024 but supports flexible resolutions.
Best Resolutions for SDXL
| Ratio | Resolution |
|---|---|
| Square | 1024×1024 |
| Portrait | 832×1216 / 896×1152 |
| Landscape | 1152×896 / 1216×832 |
| Ultra-Wide | 1536×640 |
Avoid unusual values like 1000×1000 or 900×900; resolutions that stray from the trained grid reduce output quality.
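The pattern behind the resolutions in the table is that both sides are multiples of 64 and the total pixel count stays near the native 1024×1024 training area. The check below is a heuristic sketch of that rule of thumb, not an official constraint:

```python
def is_sdxl_friendly(width, height, tolerance=0.25):
    """Heuristic: both dimensions divisible by 64, and total area within
    `tolerance` of SDXL's native 1024*1024 training area."""
    native_area = 1024 * 1024
    on_grid = width % 64 == 0 and height % 64 == 0
    area_ok = abs(width * height - native_area) / native_area <= tolerance
    return on_grid and area_ok
```

All resolutions listed above pass this check, while 1000×1000 and 900×900 fail the multiple-of-64 test.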
9. SDXL with ControlNet
ControlNet works well with SDXL but requires SDXL-compatible models.
Recommended ControlNet Models for SDXL
| Model | Use Case |
|---|---|
| controlnet-canny-sdxl | Edge maps |
| controlnet-depth-sdxl | Depth & lighting |
| controlnet-openpose-sdxl | Human poses |
Enable pixel-perfect for best results.
10. Using LoRA with SDXL – Best Practices
LoRA models for SD 1.5 are not compatible with SDXL. You must use SDXL LoRAs only.
Correct LoRA Folder Paths
Place LoRA files here:
models/Lora/

Recommended LoRA Strengths
| Type | Strength |
|---|---|
| Character LoRA | 0.6 – 0.9 |
| Style LoRA | 0.4 – 0.7 |
| Clothing/Item LoRA | 0.3 – 0.6 |
Use no more than 3 LoRAs at once to maintain model stability.
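The strength ranges and the three-LoRA limit can be checked programmatically. This is an illustrative sketch (`check_lora_stack` and the type names are hypothetical, not part of any UI):

```python
# Suggested strength ranges from the table above (illustrative only).
LORA_RANGES = {
    "character": (0.6, 0.9),
    "style": (0.4, 0.7),
    "clothing": (0.3, 0.6),
}

def check_lora_stack(loras, max_count=3):
    """Warn about LoRAs outside the suggested strength range or stacks
    larger than `max_count`. `loras` is a list of
    (name, type, strength) tuples."""
    warnings = []
    if len(loras) > max_count:
        warnings.append(
            f"{len(loras)} LoRAs loaded; more than {max_count} can destabilize SDXL"
        )
    for name, kind, strength in loras:
        lo, hi = LORA_RANGES.get(kind, (0.0, 1.0))
        if not lo <= strength <= hi:
            warnings.append(
                f"{name}: strength {strength} outside suggested {lo}-{hi} for {kind} LoRA"
            )
    return warnings
```

An empty return value means the stack stays within the guide's recommendations.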
11. Upscaling for SDXL – High Quality Strategy
SDXL images can be upscaled without losing detail.
Best Upscaling Methods
| Method | Tool | Quality |
|---|---|---|
| HighRes Fix | A1111/Forge | ⭐ Good |
| Latent Upscale | ComfyUI | ⭐⭐ Better |
| 4x-UltraSharp | ComfyUI/ESRGAN | ⭐⭐⭐ Excellent |
Recommended HighRes Fix Settings
| Option | Value |
|---|---|
| Denoise strength | 0.35 – 0.45 |
| Upscale by | 1.5x – 2x |
| Steps | 15 – 20 |
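When applying the "Upscale by" factor, the target size should stay divisible by 8 (latent-space dimensions require it). A minimal sketch of that calculation (`highres_fix_size` is a hypothetical helper, not a UI setting):

```python
def highres_fix_size(width, height, scale=1.5):
    """Compute a HighRes Fix target size, snapped down to multiples of 8,
    since latent-space dimensions must be divisible by 8."""
    def snap(value):
        return int(value * scale) // 8 * 8
    return snap(width), snap(height)
```

For the standard SDXL resolutions this snapping is a no-op at whole-number scales, but it protects against odd sizes when experimenting with factors like 1.3x.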
12. Recommended Samplers for SDXL
Based on testing:
| Goal | Sampler |
|---|---|
| Fast previews | Euler a |
| Best balanced quality | DPM++ 2M Karras |
| Portraits | DPM++ SDE |
| Sharp details | DPM++ 3M SDE |
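The table above can be expressed as a simple lookup for scripted workflows. This is a sketch (the goal keys are arbitrary labels; sampler names follow A1111/Forge conventions):

```python
# Goal-to-sampler mapping from the table above.
SDXL_SAMPLERS = {
    "preview": "Euler a",
    "balanced": "DPM++ 2M Karras",
    "portrait": "DPM++ SDE",
    "sharp": "DPM++ 3M SDE",
}

def pick_sampler(goal):
    # Fall back to the balanced all-rounder for unknown goals.
    return SDXL_SAMPLERS.get(goal, "DPM++ 2M Karras")
```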
13. SDXL Workflow Examples
Workflow A – Standard SDXL (Beginner Friendly)
- Load SDXL Base model
- Set resolution 1024×1024
- Steps 30, Sampler DPM++ 2M Karras
- Generate base image
- Optional: Apply upscaler (4x UltraSharp)
Workflow B – SDXL with Refiner (High Quality)
- Generate with SDXL Base (70% of steps)
- Switch to SDXL Refiner (30% of steps)
- Use DPM++ SDE for refined detail
14. Performance Tips (VRAM Saving)
| GPU VRAM | Recommended SDXL Settings |
|---|---|
| 4–6GB | Use 768×768 + medvram |
| 8GB | 1024×1024 default |
| 12GB+ | 2-pass upscale workflow |
Tips:
- Lower resolution first, upscale later
- Use fast samplers (e.g. Euler a) for previews
- Avoid too many LoRAs (VRAM heavy)
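The VRAM table above can be sketched as a small settings selector (a hypothetical helper; the `--medvram` flag follows A1111/Forge CLI conventions):

```python
def sdxl_settings_for_vram(vram_gb):
    """Map available VRAM to the guide's suggested starting settings.
    A sketch of the table above, not an exhaustive tuning guide."""
    if vram_gb < 8:
        return {"resolution": (768, 768), "flags": ["--medvram"]}
    if vram_gb < 12:
        return {"resolution": (1024, 1024), "flags": []}
    return {"resolution": (1024, 1024), "flags": [], "workflow": "2-pass upscale"}
```

The idea is to generate at a VRAM-safe base resolution first and move heavy detail work into a later upscale pass.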
15. Troubleshooting
| Problem | Solution |
|---|---|
| Flat images | Lower CFG to 5–6 |
| Washed out images | Increase contrast in prompt |
| Blurry output | Use refiner |
| Hands look bad | Use ControlNet “openpose” |
| Missing detail | Increase steps to 35 |
Related Guides
- FLUX in ComfyUI: /blog/flux-comfyui-guide
- FLUX in Stable Diffusion Forge: /blog/flux-forge-guide
- Stable Diffusion Prompting: /blog/stable-diffusion-prompting-guide
- Install Stable Diffusion on Mac: /blog/stable-diffusion-apple-guide
✅ Conclusion
SDXL is a powerful model for realistic and artistic image generation when used correctly. With the right settings, refined prompting, and control workflows, it produces significantly better detail and coherence than SD1.5; the key is to treat it as its own model rather than reusing SD1.5 habits.