Stable Diffusion installation on Apple Silicon (Diffusers method)

Stable Diffusion

Main URL: https://huggingface.co/CompVis/stable-diffusion
Current version URL: https://huggingface.co/CompVis/stable-diffusion-v1-4
Public release notice URL: https://stability.ai/blog/stable-diffusion-public-release

General install instructions

Requirements

Python environment

Create a virtual Python environment to keep everything in one place:

mkdir sd
cd sd
python3 -m venv env
. ./env/bin/activate
pip install --upgrade pip
pip install --upgrade diffusers transformers scipy spacy ftfy
pip install --upgrade huggingface_hub
huggingface-cli login

Image generation script

Based on instructions at https://huggingface.co/blog/stable_diffusion

Create a script file named sd.py with the following contents:

#!/usr/bin/env python
from datetime import datetime
import sys
import torch
from torch import autocast
from diffusers import StableDiffusionPipeline

if len(sys.argv) != 2:
    sys.exit('Usage: ' + sys.argv[0] + ' "<prompt>"')

# Use date time as output filename
dt = datetime.now()
filename = dt.strftime('%Y-%m-%d_%H%M%S') + '.png'

# Configure generation model and device
model_id = "CompVis/stable-diffusion-v1-4"
device = "cpu" # options are: cpu, cuda, mps
pipe = StableDiffusionPipeline.from_pretrained(model_id, use_auth_token=True)
pipe = pipe.to(device)

##
## Image generation options
##
prompt = sys.argv[1]

# Either height or width must be 512; don't change both at once.
# Both values must be a multiple of 8.
width = 512
height = 512

# More steps generally yield higher quality, but take longer
num_inference_steps = 50

# How strictly to follow the prompt. Higher values follow it more closely
# at the cost of diversity. Values between 7 and 8.5 usually work well
guidance_scale = 7.5

# On diffusers 0.3.0 and later, read the result with .images[0] instead of ["sample"][0]
with autocast(device):
    image = pipe(prompt, height=height, width=width, num_inference_steps=num_inference_steps, guidance_scale=guidance_scale)["sample"][0]

image.save(filename)

# Log the prompt for each generated image
with open("log.txt", "a") as flog:
    flog.write("\"{0}\",\"{1}\"\n".format(filename, prompt))
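
The log line written by sd.py wraps each field in double quotes by hand, so a prompt that itself contains a double quote would produce a row that CSV readers cannot parse. A minimal sketch of one alternative, using Python's standard csv module; the helper name append_log_row and the sample filename and prompt are hypothetical and not part of the original script:

```python
import csv
import io


def append_log_row(stream, filename, prompt):
    """Write one fully quoted CSV row: the image filename, then the prompt."""
    writer = csv.writer(stream, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow([filename, prompt])


# Demonstration against an in-memory buffer instead of log.txt
buffer = io.StringIO()
append_log_row(buffer, "2022-09-01_120000.png", 'a sign that says "hello"')
logged = buffer.getvalue()
```

In sd.py the same helper could be given the file object returned by open("log.txt", "a", newline="") in place of the buffer.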

Make the script executable:

chmod +x sd.py

Run the script (must be done within virtual environment):

./sd.py "<prompt>"
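
The size rules stated in the comments of sd.py (both values a multiple of 8, and one of them left at 512) can be checked before the model is loaded, which avoids waiting for the pipeline only to hit an error. The following is a sketch of such a check; the function name check_dimensions is hypothetical, and the rules it encodes are only the ones from the script comments:

```python
def check_dimensions(width, height):
    """Return a list of problems with the requested size (empty if none)."""
    problems = []
    for name, value in (("width", width), ("height", height)):
        if value <= 0 or value % 8 != 0:
            problems.append(f"{name}={value} is not a positive multiple of 8")
    if width != 512 and height != 512:
        problems.append("either width or height should stay at 512")
    return problems
```

Calling check_dimensions(width, height) right after the two values are set, and exiting if the list is not empty, would be one way to use it.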

Apple Silicon compatibility

PyTorch nightly

Install the PyTorch nightly build (instructions from the table at https://pytorch.org/):

pip uninstall torch
pip install --pre torch --extra-index-url https://download.pytorch.org/whl/nightly/cpu

Edit ./env/lib/python3.10/site-packages/torch/nn/functional.py, adjusting python3.10 to match the Python version of the virtual environment (credit to https://github.com/CompVis/stable-diffusion/issues/25#issuecomment-1221416526).
Change

return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)

to

return torch.layer_norm(input.contiguous(), normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
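
Because reinstalling or upgrading torch overwrites functional.py, the edit may need to be repeated. The following is a minimal sketch of applying it with a script instead of by hand. It assumes the original line appears in the file exactly as quoted above; the function name patch_layer_norm and the .orig backup suffix are hypothetical choices:

```python
from pathlib import Path

OLD = "return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)"
NEW = "return torch.layer_norm(input.contiguous(), normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)"


def patch_layer_norm(path):
    """Apply the edit in place; return True if the file was changed."""
    path = Path(path)
    source = path.read_text()
    if NEW in source:
        return False  # already patched, nothing to do
    if OLD not in source:
        raise ValueError(f"expected layer_norm call not found in {path}")
    # Keep a copy of the unmodified file next to the original
    path.with_name(path.name + ".orig").write_text(source)
    path.write_text(source.replace(OLD, NEW))
    return True
```

It would be called with the path to functional.py given above. If the line has changed in a later nightly build, the function raises an error and leaves the file untouched.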

Modifications to the image generation script

Several changes also need to be made to sd.py, the image generation script:

Change

device = "cpu"

to

device = "mps"

Comment out:

with autocast(device):

Then unindent the image = pipe(... line that follows it.
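
As an alternative to editing the script by hand for each machine, the device and the autocast block could be chosen at run time. This is only a sketch: the helper names choose_device and inference_context are hypothetical, and it assumes a PyTorch build recent enough to expose torch.backends.mps (the getattr guard covers older builds):

```python
import contextlib


def choose_device():
    """Return "cuda", "mps", or "cpu", depending on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed; nothing else to probe
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)  # absent in older PyTorch builds
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


def inference_context(device):
    """Return autocast for cpu and cuda, and a do-nothing context for mps."""
    if device == "mps":
        return contextlib.nullcontext()
    from torch import autocast
    return autocast(device)
```

In sd.py this would mean setting device = choose_device() and replacing with autocast(device): by with inference_context(device):, leaving the image = pipe(... line indented as it was.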

macOS compose key

gen-compose

URL: https://pypi.org/project/gen-compose/

Generates key binding dictionaries for the text system key bindings facility built into macOS, described at https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/EventOverview/TextDefaultsBindings/TextDefaultsBindings.html

Comes with a companion utility called gen-compose-convert that can accept X compose files as input. Usage:

gen-compose-convert xcompose ./path/to/Compose --keep-comments > xcompose.yaml

Use the output of the above command as input for gen-compose. Usage:

gen-compose xcompose.yaml > ~/Library/KeyBindings/DefaultKeyBinding.dict

This generates a key binding dictionary that uses the non-US backslash character (§) as the first character in the chain. What's missing is a way to directly type § on a US English keyboard.

hidutil

Since macOS 10.12 Sierra, it has been possible to specify simple key remappings using the hidutil utility. This functionality is described at https://developer.apple.com/library/archive/technotes/tn2450/_index.html

Command to check current key mappings:

hidutil property --get "UserKeyMapping"

Command to map right Option key to §:

hidutil property --set '{"UserKeyMapping":[{"HIDKeyboardModifierMappingSrc":0x7000000E6,"HIDKeyboardModifierMappingDst":0x700000064}]}'
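
The two long numbers in the command are not arbitrary: each is a keyboard usage ID combined (bitwise OR) with 0x700000000, as described in the technote. 0xE6 is the usage ID for the right Option (Alt) key and 0x64 is the one for the non-US backslash key. A small sketch that builds the same string from usage IDs, which may help when remapping other keys; the function and constant names are hypothetical:

```python
HID_KEYBOARD_PAGE = 0x700000000  # keyboard usage page (0x07) in the high bits


def user_key_mapping(pairs):
    """Build the hidutil property string from (source, destination) usage IDs."""
    entries = ",".join(
        '{{"HIDKeyboardModifierMappingSrc":0x{:X},"HIDKeyboardModifierMappingDst":0x{:X}}}'.format(
            HID_KEYBOARD_PAGE | src, HID_KEYBOARD_PAGE | dst
        )
        for src, dst in pairs
    )
    return '{"UserKeyMapping":[' + entries + "]}"


RIGHT_OPTION = 0xE6      # Keyboard Right Alt
NON_US_BACKSLASH = 0x64  # Keyboard Non-US \ and |

mapping = user_key_mapping([(RIGHT_OPTION, NON_US_BACKSLASH)])
```

The resulting string is the argument passed to hidutil property --set in the command above.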

Auto-run at startup

Mappings set with hidutil do not persist across reboots. For the key remapping to apply automatically, the command needs to run at startup. One way to do this is with a LaunchAgent.

Save the following as ~/Library/LaunchAgents/local.RemapRightOptionToNonUsBackslash.plist:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>local.RemapRightOptionToNonUsBackslash</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/bin/hidutil</string>
        <string>property</string>
        <string>--set</string>
        <string>{"UserKeyMapping":[{"HIDKeyboardModifierMappingSrc":0x7000000E6,"HIDKeyboardModifierMappingDst":0x700000064}]}</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
</dict>
</plist>
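
The same property list can also be produced with Python's standard plistlib module, which takes care of the XML escaping. This is a sketch rather than a required step; the variable names are hypothetical, and the label and mapping are copied from the file above:

```python
import plistlib

LABEL = "local.RemapRightOptionToNonUsBackslash"
MAPPING = '{"UserKeyMapping":[{"HIDKeyboardModifierMappingSrc":0x7000000E6,"HIDKeyboardModifierMappingDst":0x700000064}]}'

agent = {
    "Label": LABEL,
    "ProgramArguments": ["/usr/bin/hidutil", "property", "--set", MAPPING],
    "RunAtLoad": True,
}

# plistlib writes the XML property list format by default
plist_bytes = plistlib.dumps(agent)
```

Writing plist_bytes to ~/Library/LaunchAgents/local.RemapRightOptionToNonUsBackslash.plist should give a file equivalent to the one above, although the keys may appear in a different order.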