SAeUron: Interpretable Concept Unlearning in Diffusion Models with Sparse Autoencoders

Paper on arXiv | GitHub repo

This repository contains the sparse autoencoders (SAEs) trained in our work for the U-Net blocks up.1.1 and up.1.2.

After cloning our GitHub repo, you can use them as follows:

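# Run from the root of the cloned GitHub repo so that the SAE package is importable.
from SAE.sae import Sae

device = "cuda"  # device the SAE weights are loaded onto
hookpoint = "unet.up_blocks.1.attentions.1"  # U-Net module the SAE was trained on

# Download the SAE weights for this hookpoint from the Hugging Face Hub.
sae = Sae.load_from_hub("bcywinski/SAeUron", hookpoint=hookpoint, device=device)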
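The loading call takes a full hookpoint string, while the blocks are referred to by short names such as up.1.1. The sketch below shows one way to build those strings. It assumes that up.1.2 follows the same naming pattern as the hookpoint shown above, which this page does not confirm. `hookpoint_for` is a hypothetical helper and is not part of the repo.

```python
import re


def hookpoint_for(block: str) -> str:
    """Map a short block name such as "up.1.1" to a hookpoint string.

    Hypothetical helper: the naming pattern is inferred from the single
    hookpoint shown above and is an assumption for any other block.
    """
    match = re.fullmatch(r"up\.(\d+)\.(\d+)", block)
    if match is None:
        raise ValueError(f"unrecognized block name: {block!r}")
    up_block, attention = match.groups()
    return f"unet.up_blocks.{up_block}.attentions.{attention}"


# Build the hookpoint strings for both blocks mentioned above.
hookpoints = [hookpoint_for(name) for name in ("up.1.1", "up.1.2")]
print(hookpoints)
```

With the repo cloned, each resulting string could then be passed as the `hookpoint` argument of `Sae.load_from_hub`, in the same way as in the snippet above.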

📚 BibTeX

@article{cywinski2025saeuron,
  title={SAeUron: Interpretable Concept Unlearning in Diffusion Models with Sparse Autoencoders},
  author={Cywi{\'n}ski, Bartosz and Deja, Kamil},
  journal={arXiv preprint arXiv:2501.18052},
  year={2025}
}