Stable Diffusion on Apple Silicon: M1/M2/M3 Setup


October 23, 2025
7 min read

Apple Silicon has strong support for local AI workloads thanks to the Metal Performance Shaders (MPS) backend. With the right setup, you can run Stable Diffusion locally on your Mac, even without an NVIDIA GPU.

This guide is written for beginners and walks through the full setup.

It includes:

  • ✅ Installation using Automatic1111 WebUI
  • ✅ One-click Mac alternatives (DiffusionBee, Draw Things)
  • ✅ Optimized settings for speed & memory
  • ✅ Fixes for common Mac errors

✅ System Requirements

| Component | Minimum | Recommended |
| --- | --- | --- |
| Chip | M1 | M2 Pro/Max or M3 |
| RAM | 8GB | 16GB+ |
| macOS | 13.3 or newer | Latest version |
| Storage | 10–30GB | 50GB+ |
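Before installing anything, it is worth confirming the machine really is Apple Silicon rather than Intel. A minimal check using only `uname`, which ships with macOS:

```shell
# Check whether this Mac is Apple Silicon (arm64) or Intel (x86_64)
arch="$(uname -m)"
echo "Architecture: $arch"
if [ "$arch" = "arm64" ]; then
  echo "Apple Silicon detected: the MPS settings in this guide apply."
else
  echo "Not arm64: the MPS-specific flags in this guide will not help."
fi
```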

✅ Method 1: Easiest Option (No Terminal)

Option A — DiffusionBee (One Click Installer)

  • Download: diffusionbee.com
  • Pros: Easiest option; beginner-friendly, just download and run
  • Cons: Limited customization

Option B – Draw Things

  • Download from Mac App Store
  • Pros: Local processing, supports LoRA & ControlNet
  • Cons: Slower workflow

If you want full control and extensions such as ControlNet, LoRA, upscaling, and custom models, continue to Method 2.


✅ Method 2: Install AUTOMATIC1111 WebUI (Full Control)

This is the recommended setup using Terminal.

Step 1: Install Homebrew

Open Terminal and run:

```shell
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
```

Install Python and Git:

```shell
brew install python git
```
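To confirm both tools are on your PATH before continuing, a quick check that assumes nothing beyond a POSIX shell:

```shell
# Report the installed versions of git and python3, or flag a missing tool
for tool in git python3; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: $("$tool" --version 2>&1 | head -n 1)"
  else
    echo "$tool: NOT FOUND - revisit the brew install step"
  fi
done
```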

Step 2: Clone Stable Diffusion WebUI

```shell
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
cd stable-diffusion-webui
```

Step 3: Configure for Apple Silicon

Edit webui-user.sh (create if missing):

```shell
echo 'export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --no-half"' >> webui-user.sh
```

Add MPS support:

```shell
echo 'export PYTORCH_ENABLE_MPS_FALLBACK=1' >> webui-user.sh
```
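After running both echo commands, `webui-user.sh` should contain these two lines (shown here as the file's contents, not commands to run):

```shell
# webui-user.sh - Apple Silicon launch configuration
export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --no-half"
export PYTORCH_ENABLE_MPS_FALLBACK=1
```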

Step 4: Install Requirements + Launch

```shell
./webui.sh
```

The WebUI runs at: http://127.0.0.1:7860
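From a second terminal you can check whether the server is up yet. This sketch uses curl, which ships with macOS; a 200 means the WebUI is serving, anything else means it is still starting (the first launch downloads several gigabytes of dependencies):

```shell
# Print the HTTP status code of the local WebUI, or a note if it is not up yet
curl -s -o /dev/null -w "%{http_code}\n" http://127.0.0.1:7860 || echo "WebUI not reachable yet"
```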


✅ Download a Model (SDXL or SD1.5)

Download a model from Hugging Face and place it into:

stable-diffusion-webui/models/Stable-diffusion/

Example (SDXL base model): stabilityai/stable-diffusion-xl-base-1.0
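One way to fetch a checkpoint is the Hugging Face CLI (`pip install huggingface_hub`). The filename below is the one published in the stabilityai repo at the time of writing and may change, so treat it as an example; the final `ls` confirms what the WebUI will see on startup either way:

```shell
# Download the SDXL base checkpoint into the WebUI's model folder (if the
# Hugging Face CLI is installed), then list whatever checkpoints are present.
model_dir="stable-diffusion-webui/models/Stable-diffusion"
mkdir -p "$model_dir"
if command -v huggingface-cli >/dev/null 2>&1; then
  huggingface-cli download stabilityai/stable-diffusion-xl-base-1.0 \
    sd_xl_base_1.0.safetensors --local-dir "$model_dir"
fi
ls "$model_dir"/*.safetensors 2>/dev/null || echo "No checkpoints in $model_dir yet"
```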


✅ Enable Optimizations for Mac

In Settings → Optimization:

  • ✓ Enable MPS support
  • ✓ Reduce VRAM usage
  • ✓ Add --medvram or --lowvram to the launch arguments on 8GB Macs

Optionally, run this in the terminal before launching to disable the MPS memory allocation limit:

```shell
export PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0
```
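If you launch the WebUI often, you can persist the MPS-related variables in your shell profile instead of exporting them each session. This assumes the default zsh shell on modern macOS:

```shell
# Append the MPS-related variables to ~/.zshrc so every new terminal has them
cat >> ~/.zshrc <<'EOF'
export PYTORCH_ENABLE_MPS_FALLBACK=1
export PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0
EOF
```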

✅ Common Errors and Fixes

| Error | Fix |
| --- | --- |
| Torch not compiled with MPS | Install the standard macOS PyTorch wheels, which include MPS support: `pip install torch torchvision` |
| Slow generation | Use a smaller model (SD 1.5) instead of SDXL |
| WebUI crashes | Add `--no-half` to the launch arguments |
| Out of memory | Enable `--lowvram` |

✅ Recommended Generation Settings

| Model | Steps | Sampler | CFG |
| --- | --- | --- | --- |
| SD 1.5 | 20–25 | Euler a | 6–8 |
| SDXL | 18–24 | DPM++ 2M | 5–7 |
| Realistic Vision | 22 | DPM++ | 7 |

✅ ControlNet & LoRA Support on Mac

Both work on Apple Silicon, though more slowly than on NVIDIA GPUs. Place LoRA files here:

models/Lora/

Install ControlNet via Extensions → Available → Search “ControlNet”.


✅ Performance Tips

  • Close all other apps
  • Use image size 768×768 for speed
  • Use SD1.5 instead of SDXL if slow
  • Use Euler a sampler for fastest speed
  • Lower batch size to 1 on M1

✅ Conclusion

You now have Stable Diffusion running natively on Apple Silicon using the AUTOMATIC1111 WebUI. This setup supports model experimentation, LoRA, ControlNet, and upscaling.


✅ Bonus: Install ComfyUI on Apple Silicon (M1/M2/M3)

ComfyUI also works on macOS and is often faster than AUTOMATIC1111.

Step 1: Install dependencies

```shell
brew install git python
```

Step 2: Clone ComfyUI

```shell
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
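Before launching, you can sanity-check the environment; this snippet handles a missing torch install gracefully. If torch imports but reports MPS as unavailable, the ComfyUI README suggests installing a newer (nightly) PyTorch build for Apple Silicon:

```shell
# Verify that Python runs and that torch (if installed) can see the MPS device
python3 - <<'EOF'
import sys
print("python:", sys.version.split()[0])
try:
    import torch
    print("torch:", torch.__version__, "| MPS available:", torch.backends.mps.is_available())
except ImportError:
    print("torch not installed yet - re-run: pip install -r requirements.txt")
EOF
```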

Step 3: Launch ComfyUI

```shell
python main.py --force-fp16
```

Open in browser: http://127.0.0.1:8188

✅ Supports LoRA, ControlNet, and SDXL.


✅ Bonus: Install Forge (Stable Diffusion WebUI Forge) on Mac

Forge is a fork of AUTOMATIC1111 that is generally faster and supports memory-efficient attention, making it a solid alternative.

```shell
git clone https://github.com/lllyasviel/stable-diffusion-webui-forge.git
cd stable-diffusion-webui-forge
chmod +x webui.sh
./webui.sh --no-half --skip-torch-cuda-test
```

✅ Good for SDXL on M1/M2.



✅ Install FLUX on Apple Silicon (M1/M2/M3)

FLUX can run on Apple Silicon using CPU + Metal acceleration. It is significantly slower than on NVIDIA GPUs, but works well enough for experimentation.

Step 1: Install Required Dependencies

From your main AI folder:

```shell
cd stable-diffusion-webui
source venv/bin/activate || true
pip install transformers accelerate safetensors
```

Step 2: Download FLUX Models

Create a folder for FLUX assets:

models/FLUX/

Download a FLUX checkpoint (a .safetensors file) from Hugging Face and place the main checkpoint here:

stable-diffusion-webui/models/Stable-diffusion/

Step 3: Install FLUX Support

Install ComfyUI FLUX nodes (best support on macOS):

```shell
cd ComfyUI/custom_nodes
git clone https://github.com/city96/ComfyUI-FLUX.git
```

Restart ComfyUI.

Step 4: FLUX Settings for Mac

Use these settings for best performance:

| Setting | Value |
| --- | --- |
| Precision | fp16 |
| Batch size | 1 |
| Mode | float fallback enabled |

✅ Recommended: Use FLUX Schnell on Mac for faster generation.



⚡ FLUX Performance Optimization on Apple Silicon

Running FLUX on Apple Silicon is slower than on NVIDIA, but with the right settings you can improve generation speed and avoid memory crashes.

Use these flags when launching ComfyUI to improve stability:

```shell
python main.py --force-fp16 --disable-smart-memory --lowvram
```

✅ Environment Variables for Better Stability

Add these to your terminal before launching:

```shell
export PYTORCH_ENABLE_MPS_FALLBACK=1
export PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0
```

✅ Memory Optimization

| Setting | Recommendation |
| --- | --- |
| Resolution | 768×768 (max for M1/M2 16GB) |
| Batch Size | 1 only |
| Model | Use FLUX.1-schnell (faster) |
| Sampler | Euler or DPM++ 2M |
| Precision | fp16 |

✅ Speed Boost Options

| Method | Gain |
| --- | --- |
| Use float32 fallback | Prevents model crashes on MPS |
| Use schnell model | 2× faster than dev |
| Disable VAE decoding | Slight speed gain |
| Close Chrome tabs | Frees unified memory |



🛠 Troubleshooting Summary

| Problem | Cause | Solution |
| --- | --- | --- |
| WebUI crashes | Missing MPS args | Add `--skip-torch-cuda-test --no-half` |
| Slow generation | SDXL too heavy | Use SD1.5 or FLUX-schnell |
| RuntimeError: MPS fallback | No GPU ops available | Set `PYTORCH_ENABLE_MPS_FALLBACK=1` |
| Out of memory | Mac RAM limit | Use `--lowvram` and 768×768 |
| Cannot load model | Wrong folder | Move to `models/Stable-diffusion/` |


✅ Final Notes

This guide provides a reliable and tested setup for Stable Diffusion on Apple Silicon systems using Automatic1111, ComfyUI, and Forge, with support for SD1.5, SDXL, and FLUX on macOS. Performance will not match CUDA GPUs, but the setup is stable and functional for local generation.