Stable Diffusion installation on Apple Silicon (Diffusers method)
August 23, 2022 • 378 words
Stable Diffusion
Main URL: https://huggingface.co/CompVis/stable-diffusion
Current version URL: https://huggingface.co/CompVis/stable-diffusion-v1-4
Public release notice URL: https://stability.ai/blog/stable-diffusion-public-release
General install instructions
Requirements
- Created huggingface.co account
- Access token from https://huggingface.co/settings/tokens
- Accepted terms of use at https://huggingface.co/CompVis/stable-diffusion-v1-4
- Rust (install from Homebrew, e.g. brew install rust)
Python environment
Create a virtual Python environment to keep everything in one place:
mkdir sd
cd sd
python3 -m venv env
. ./env/bin/activate
pip install --upgrade pip
pip install --upgrade diffusers transformers scipy spacy ftfy
pip install --upgrade huggingface_hub
huggingface-cli login
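Before moving on, it can help to confirm that the packages actually landed in the virtual environment. The following is a minimal sketch, not part of the original instructions; the helper name missing_packages is made up for illustration, and huggingface_hub is listed because it is the package that provides huggingface-cli.

```python
import importlib.util

# Import names of the packages installed with pip above.
REQUIRED = ["diffusers", "transformers", "scipy", "spacy", "ftfy", "huggingface_hub"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

Run it with the virtual environment activated; anything it reports as missing can be installed with pip before continuing.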
Image generation script
Based on instructions at https://huggingface.co/blog/stable_diffusion
Create a script file named sd.py with the following contents:
#!/usr/bin/env python
from datetime import datetime
import sys
import torch
from torch import autocast
from diffusers import StableDiffusionPipeline
if len(sys.argv) != 2:
    raise ValueError('Usage: ' + sys.argv[0] + ' "<prompt>"')
# Use date time as output filename
dt = datetime.now()
filename = dt.strftime('%Y-%m-%d_%H%M%S') + '.png'
# Configure generation model and device
model_id = "CompVis/stable-diffusion-v1-4"
device = "cpu" # options are: cpu, cuda, mps
pipe = StableDiffusionPipeline.from_pretrained(model_id, use_auth_token=True)
pipe = pipe.to(device)
##
## Image generation options
##
prompt = sys.argv[1]
# Either height or width must be 512; don't change both at once.
# Both values must be a multiple of 8.
width = 512
height = 512
# More steps generally yield higher quality, but take longer
num_inference_steps = 50
# How strictly to follow the prompt. Values between 7 and 8.5 usually work
# well; higher values follow the prompt more closely but reduce diversity
guidance_scale = 7.5
# Note: newer diffusers releases return the images as .images[0] instead of ["sample"][0]
with autocast(device):
    image = pipe(prompt, height=height, width=width, num_inference_steps=num_inference_steps, guidance_scale=guidance_scale)["sample"][0]
image.save(filename)
# Log prompt for each generated image
with open("log.txt", "a") as flog:
    flog.write("\"{0}\",\"{1}\"\n".format(filename, prompt))
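The log line in the script wraps the filename and prompt in quotes by hand, so a prompt that itself contains a double quote would produce a malformed row. One possible alternative, sketched here as an assumption rather than as part of the original script, is to let Python's csv module do the quoting. The function name append_log_row and the sample filename are hypothetical.

```python
import csv
import io


def append_log_row(stream, filename, prompt):
    """Write one fully quoted CSV row: "filename","prompt"."""
    writer = csv.writer(stream, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow([filename, prompt])


# Demonstration on an in-memory buffer. In sd.py the stream would be
# open("log.txt", "a", newline="") instead.
buffer = io.StringIO()
append_log_row(buffer, "2022-08-23_120000.png", 'a sign that says "hello"')
print(buffer.getvalue(), end="")
```

Embedded quotes are doubled by the csv module, so the log stays readable by spreadsheet tools and by csv.reader.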
Make the script executable:
chmod +x sd.py
Run the script with a prompt (must be done within the virtual environment):
./sd.py "<prompt>"
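The comments in the script state two constraints on the image size: both values must be a multiple of 8, and either width or height should stay at 512. A small check like the one below could be added to sd.py before the pipeline is called. It is only a sketch of those two rules as written in the comments; check_dimensions is a hypothetical helper, not a diffusers function.

```python
def check_dimensions(width, height):
    """Raise ValueError unless both sizes are multiples of 8 and one is 512."""
    for name, value in (("width", width), ("height", height)):
        if value % 8 != 0:
            raise ValueError(f"{name} must be a multiple of 8, got {value}")
    if 512 not in (width, height):
        raise ValueError("either width or height should stay at 512")


# Both of these satisfy the constraints and pass silently.
check_dimensions(512, 512)
check_dimensions(768, 512)
```

Failing early with a clear message is cheaper than waiting for the model to load and then hitting an error during generation.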
Apple Silicon compatibility
PyTorch nightly
Install the PyTorch nightly build (instructions from the table at https://pytorch.org/):
pip uninstall torch
pip install --pre torch --extra-index-url https://download.pytorch.org/whl/nightly/cpu
Edit ./env/lib/python3.10/site-packages/torch/nn/functional.py (adjust python3.10 to match the Python version in your environment; credit to https://github.com/CompVis/stable-diffusion/issues/25#issuecomment-1221416526).
Change
return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
to
return torch.layer_norm(input.contiguous(), normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
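Because the path above depends on the Python version, it may be easier to let Python report where functional.py should live and whether it still contains the unpatched call. The sketch below assumes torch was installed into the active virtual environment's site-packages; functional_path and patch_layer_norm are hypothetical helper names. It only inspects the file and does not write to it.

```python
import sysconfig
from pathlib import Path

# The unpatched and patched forms of the call shown above.
OLD = "torch.layer_norm(input, normalized_shape"
NEW = "torch.layer_norm(input.contiguous(), normalized_shape"


def functional_path():
    """Best-guess location of torch/nn/functional.py in the active environment."""
    return Path(sysconfig.get_paths()["platlib"]) / "torch" / "nn" / "functional.py"


def patch_layer_norm(source):
    """Return source with the layer_norm call patched; unchanged if already patched."""
    return source.replace(OLD, NEW)


if __name__ == "__main__":
    path = functional_path()
    print("Expected location:", path)
    if path.exists():
        print("Needs patching:", OLD in path.read_text())
    else:
        print("File not found; is torch installed in this environment?")
```

If you decide to apply patch_layer_norm to the file contents and write them back, keep a copy of the original file first, since reinstalling or upgrading torch will overwrite the change anyway.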
Modifications to the image generation script
Several changes also need to be made to sd.py, the image generation script.
Change
device = "cpu"
to
device = "mps"
Comment out:
with autocast(device):
And unindent the image = pipe(... line.
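Instead of hard-coding the device, the script could pick one at startup. The following is a rough sketch under the assumption that the installed PyTorch build exposes torch.backends.mps (present in recent releases and nightlies); pick_device is a hypothetical helper, and the fallback order of mps, then cuda, then cpu is a choice made for this example.

```python
def pick_device(mps_available, cuda_available):
    """Prefer Apple's Metal backend, then CUDA, then fall back to the CPU."""
    if mps_available:
        return "mps"
    if cuda_available:
        return "cuda"
    return "cpu"


try:
    import torch

    # Older torch builds have no torch.backends.mps, so guard the lookup.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = mps_backend is not None and mps_backend.is_available()
    device = pick_device(mps_ok, torch.cuda.is_available())
except ImportError:
    # torch is not installed in this environment.
    device = "cpu"

print("Using device:", device)
```

The autocast change described above still applies when the selected device is mps.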