Stable Diffusion installation on Apple Silicon (Diffusers method)

Stable Diffusion

General install instructions


Python environment

Create a virtual Python environment to keep everything in one place:

mkdir sd
cd sd
python3 -m venv env
. ./env/bin/activate
pip install --upgrade pip
pip install --upgrade diffusers transformers scipy spacy ftfy
pip install huggingface_hub
huggingface-cli login
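
Before moving on, it can help to confirm that the environment is active and that the packages can be imported. The following is a minimal sketch using only the standard library. The helper names in_virtualenv and missing_packages are illustrative (they are not part of any installed library), and the package list is assumed to mirror the pip commands above.

```python
import importlib.util
import sys

# Import names of the packages installed above
# (assumption: the import names match the pip package names).
REQUIRED = ["diffusers", "transformers", "scipy", "spacy", "ftfy"]


def in_virtualenv():
    """Return True when the interpreter is running inside a venv."""
    # Inside an environment created with `python3 -m venv`, sys.prefix points
    # at the environment directory while sys.base_prefix points at the base
    # Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def missing_packages(names):
    """Return the names that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("virtual environment active:", in_virtualenv())
print("missing packages:", missing_packages(REQUIRED))
```

An empty list on the second line suggests the installation step completed; any listed name can be reinstalled with pip inside the environment.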

Image generation script

Based on instructions at

Create a script file with the following contents:

#!/usr/bin/env python
from datetime import datetime
import sys
import torch
from torch import autocast
from diffusers import StableDiffusionPipeline

if len(sys.argv) != 2:
    raise ValueError('Usage: ' + sys.argv[0] + ' "<prompt>"')

# Use date time as output filename
dt = datetime.now()
filename = dt.strftime('%Y-%m-%d_%H%M%S') + '.png'

# Configure generation model and device
model_id = "CompVis/stable-diffusion-v1-4"
device = "cpu" # options are: cpu, cuda, mps
pipe = StableDiffusionPipeline.from_pretrained(model_id, use_auth_token=True)
pipe = pipe.to(device)

## Image generation options
prompt = sys.argv[1]

# Either height or width must be 512; don't change both at once.
# Both values must be a multiple of 8.
width = 512
height = 512

# More steps generally yield higher quality, but take longer to run
num_inference_steps = 50

# How strictly to follow prompt. Values between 7 and 8.5 are best
guidance_scale = 7.5

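with autocast(device):
    # Recent diffusers releases return the images in the .images attribute
    # (older releases used ["sample"] instead)
    image = pipe(prompt, height=height, width=width, num_inference_steps=num_inference_steps, guidance_scale=guidance_scale).images[0]

# Save the generated image under the timestamped filename
image.save(filename)

# Log prompt for each generated image
with open("log.txt", "a") as flog:
    flog.write("\"{0}\",\"{1}\"\n".format(filename, prompt))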
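
The size constraints mentioned in the script comments (both values a multiple of 8, and one of them kept at 512) are easy to violate when experimenting. A small check such as the following sketch could be called before the pipeline runs. The function name validate_size is hypothetical and not part of diffusers.

```python
def validate_size(width, height):
    """Check the width/height rules described in the script comments.

    Raises ValueError when a value is not a positive multiple of 8,
    or when neither value is 512. Returns (width, height) otherwise.
    """
    for name, value in (("width", width), ("height", height)):
        if value <= 0 or value % 8 != 0:
            raise ValueError(f"{name} must be a positive multiple of 8, got {value}")
    if width != 512 and height != 512:
        raise ValueError("either width or height should stay at 512")
    return width, height


# Example: a portrait image that keeps the width at 512.
width, height = validate_size(512, 768)
```

Failing early here avoids waiting for the model to load only to hit an error or a poor result caused by an unsupported size.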

Make the script executable:

chmod +x

Run the script (this must be done within the activated virtual environment):


Apple Silicon compatibility

PyTorch nightly

Install PyTorch nightly (instructions from table):

pip uninstall torch
pip install --pre torch --extra-index-url
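
After installing, it may be worth checking that the installed build actually exposes the MPS backend. The sketch below assumes a PyTorch version that ships torch.backends.mps (1.12 or later) and falls back gracefully when it is absent. The function name mps_report is illustrative.

```python
def mps_report():
    """Summarize whether the installed torch build can use the MPS backend."""
    try:
        import torch
    except ImportError:
        # torch is not installed in this environment at all.
        return {"torch": None, "mps_built": False, "mps_available": False}

    # Older torch releases have no torch.backends.mps module.
    mps = getattr(torch.backends, "mps", None)
    return {
        "torch": torch.__version__,
        # True when this torch build was compiled with MPS support.
        "mps_built": bool(mps is not None and mps.is_built()),
        # True when the MPS device can actually be used on this machine.
        "mps_available": bool(mps is not None and mps.is_available()),
    }


print(mps_report())
```

If mps_built is True but mps_available is False, the build supports MPS but the machine or macOS version does not.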

Edit ./env/lib/python3.10/site-packages/torch/nn/ (credit to

Change the line:

return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)

to:

return torch.layer_norm(input.contiguous(), normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
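
Because this edit is lost whenever torch is reinstalled, it could be scripted instead of done by hand. The following is a rough sketch, not a tested installer step: patch_layer_norm and patch_file are hypothetical helpers, and the path to the file being edited must be supplied by the caller, since it depends on the local Python version and environment.

```python
import shutil
from pathlib import Path

OLD = ("return torch.layer_norm(input, normalized_shape, weight, bias, eps, "
       "torch.backends.cudnn.enabled)")
NEW = ("return torch.layer_norm(input.contiguous(), normalized_shape, weight, bias, eps, "
       "torch.backends.cudnn.enabled)")


def patch_layer_norm(source):
    """Return (patched_source, changed) for the text of the torch file."""
    if NEW in source:
        # Already patched; leave the text untouched.
        return source, False
    if OLD not in source:
        raise ValueError("expected layer_norm line not found; torch version may differ")
    return source.replace(OLD, NEW), True


def patch_file(path):
    """Patch the file at `path` in place, keeping a .bak copy of the original."""
    path = Path(path)
    patched, changed = patch_layer_norm(path.read_text())
    if changed:
        shutil.copy2(path, str(path) + ".bak")  # backup before overwriting
        path.write_text(patched)
    return changed
```

Calling patch_file with the location of the file edited above applies the change once; running it again is a no-op.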

Modifications to the image generation script

Several changes also need to be made to the image generation script.

Change:

device = "cpu"

to:

device = "mps"

Comment out:

with autocast(device):

And unindent the image = pipe(... line.
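
As an alternative to editing the script by hand for each machine, the device choice and the optional autocast block could be wrapped in small helpers. This is a sketch under assumptions: pick_device, detect_device and inference_context are hypothetical names, and preferring cuda over mps over cpu is just one reasonable ordering. On mps the helper returns a do-nothing context, which has the same effect as commenting out the autocast line.

```python
from contextlib import nullcontext


def pick_device(cuda_available, mps_available):
    """Choose a device name from availability flags (cuda, then mps, then cpu)."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device():
    """Query torch for available backends; fall back to cpu if torch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    mps = getattr(torch.backends, "mps", None)  # absent in older torch releases
    return pick_device(torch.cuda.is_available(),
                       mps is not None and mps.is_available())


def inference_context(device):
    """Return autocast for cpu/cuda, and a no-op context for mps."""
    if device == "mps":
        return nullcontext()
    from torch import autocast  # imported lazily, only when needed
    return autocast(device)


# Intended use inside the generation script:
#   device = detect_device()
#   with inference_context(device):
#       image = pipe(...)
```

With these helpers the same script can run unchanged on Apple Silicon and on other machines.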

