## Introduction
This post explores the deployment of Stable Diffusion WebUI (AUTOMATIC1111) on a Kubernetes cluster. By leveraging GitOps principles with ArgoCD, we ensure a declarative and reproducible environment for high-performance AI image generation.
## Project Overview
The Stable Diffusion deployment is part of the broader llama project, which aims to provide a comprehensive local AI infrastructure. Key features of this setup include:
- Stable Diffusion WebUI: A feature-rich interface for interacting with Stable Diffusion models.
- Hardware Acceleration: Direct access to NVIDIA GPUs within the Kubernetes cluster for rapid image generation.
- Persistent Model Storage: Centralized storage for large model checkpoints using NFS.
- GitOps Management: Automated synchronization and drift detection via ArgoCD.
## Architecture and Components
The deployment is defined through a set of Kubernetes manifests, optimized for performance and stability:
### 1. Storage Configuration
Large AI models require significant storage that remains accessible across the cluster. We utilize a PersistentVolume (PV) and PersistentVolumeClaim (PVC) backed by an NFS server:
- `sd-webui-pv` & `sd-webui-pvc`: Provisioned with 40Gi of storage using the `kubenfs` storage class. The access mode is set to `ReadWriteMany` (RWX), allowing the volume to be mounted read-write by multiple pods across the cluster. The models are stored on an external ZFS pool (`<nfs-server-ip>:<nfs-path>/llama/sd-webui`).
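A minimal sketch of such a PV/PVC pair might look like the following. The names, size, storage class, and access mode come from the description above; the explicit `volumeName` binding and the placeholder NFS server/path are assumptions:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: sd-webui-pv
spec:
  capacity:
    storage: 40Gi
  accessModes:
    - ReadWriteMany
  storageClassName: kubenfs
  nfs:
    server: <nfs-server-ip>          # placeholder: your NFS server
    path: <nfs-path>/llama/sd-webui  # placeholder: export on the ZFS pool
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: sd-webui-pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: kubenfs
  volumeName: sd-webui-pv  # bind directly to the pre-provisioned PV (assumed)
  resources:
    requests:
      storage: 40Gi
```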
### 2. Stable Diffusion Deployment
The core of the application is the `sd-webui-deployment`, which uses a specialized Docker image from ai-dock:

- Image: `ghcr.io/ai-dock/stable-diffusion-webui:latest-cuda`
- Resource Management:
  - CPU: Requests of `2000m` and limits of `4000m`.
  - Memory: Requests of `8Gi` and limits of `16Gi`.
  - GPU Acceleration: Specifically requests `nvidia.com/gpu: 1`, ensuring the pod is scheduled on a node with an available NVIDIA GPU.
- Environment Variables:
  - `WEBUI_FLAGS`: Configured with `--listen --api --xformers --enable-insecure-extension-access` to enable remote access and optimize performance with xFormers.
  - `NVIDIA_VISIBLE_DEVICES` & `NVIDIA_DRIVER_CAPABILITIES`: Set to `all` to ensure full GPU functionality within the container.
- Volume Mounting: The model storage is mounted at `/workspace/stable-diffusion-webui/models`, ensuring that downloaded checkpoints persist across pod restarts.
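Putting those pieces together, the Deployment spec could be sketched roughly as below. The image, resources, environment variables, and mount path are taken from the list above; the `app: sd-webui` labels and container name are illustrative assumptions:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sd-webui-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sd-webui  # assumed label
  template:
    metadata:
      labels:
        app: sd-webui
    spec:
      containers:
        - name: sd-webui
          image: ghcr.io/ai-dock/stable-diffusion-webui:latest-cuda
          env:
            - name: WEBUI_FLAGS
              value: "--listen --api --xformers --enable-insecure-extension-access"
            - name: NVIDIA_VISIBLE_DEVICES
              value: "all"
            - name: NVIDIA_DRIVER_CAPABILITIES
              value: "all"
          resources:
            requests:
              cpu: 2000m
              memory: 8Gi
            limits:
              cpu: 4000m
              memory: 16Gi
              nvidia.com/gpu: 1  # schedules the pod onto a GPU node
          volumeMounts:
            - name: models
              mountPath: /workspace/stable-diffusion-webui/models
      volumes:
        - name: models
          persistentVolumeClaim:
            claimName: sd-webui-pvc
```

Note that `nvidia.com/gpu` only needs to appear under `limits`; Kubernetes treats extended resources as having equal requests and limits.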
### 3. Service Exposure
Internal communication is handled by a ClusterIP service:
- `sd-webui-service`: Exposes the WebUI on its default port `17860`. This service can be further fronted by an Ingress or HTTPRoute for external access.
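As a sketch, the ClusterIP Service might look like this (the `app: sd-webui` selector is an assumed pod label):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: sd-webui-service
spec:
  type: ClusterIP
  selector:
    app: sd-webui  # assumed: must match the Deployment's pod labels
  ports:
    - name: http
      port: 17860
      targetPort: 17860
```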
## GitOps with ArgoCD
Like the rest of the llama stack, Stable Diffusion is managed as an ArgoCD Application. This allows for:
- Version Control: All manifest changes are tracked in Git.
- Automated Sync: ArgoCD ensures the cluster state matches the Git repository.
- Sync Waves: We utilize ArgoCD sync waves to ensure resources are created in the correct order (e.g., the PV at wave `0`, the PVC at wave `1`, and the Deployment at wave `2`).
- Easy Rollbacks: Quickly revert to a previous configuration if needed.
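Sync waves are expressed as an annotation on each resource. For example, the PV (wave `0`) would carry:

```yaml
metadata:
  name: sd-webui-pv
  annotations:
    argocd.argoproj.io/sync-wave: "0"  # PVC uses "1", Deployment uses "2"
```

ArgoCD applies all resources in one wave and waits for them to be healthy before moving on to the next, so the claim never tries to bind before its volume exists.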
## Conclusion
Deploying Stable Diffusion WebUI on Kubernetes provides a scalable and robust platform for AI image generation. By combining GPU passthrough with persistent NFS storage and GitOps management, we’ve created a high-performance environment that is both easy to manage and highly resilient.