siliconflow/onediff


| Documentation | Community | Contribution | Discord |


onediff is an out-of-the-box acceleration library for diffusion models. It provides:

  • Out-of-the-box acceleration for popular UIs/libs (such as HF diffusers and ComfyUI)
  • PyTorch code compilation tools and strongly optimized GPU kernels for diffusion models

We're hiring! If you are interested in working on onediff at SiliconFlow, we have roles open for Interns and Engineers in Beijing (near Tsinghua University).

If you have contributed significantly to open-source software and are interested in remote work, you can contact us at [email protected] with onediff in the email title.


onediff is the abbreviation of "one line of code to accelerate diffusion models".
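
As a rough illustration of the "one line" idea, the sketch below loads an SDXL pipeline from HF diffusers and swaps its UNet for a compiled one. It assumes the OneFlow backend is installed and that `oneflow_compile` is importable from `onediff.infer_compiler`; the entry point may differ between onediff versions. The helper names `accelerate_unet` and `run_sdxl_demo` are hypothetical and exist only for this example.

```python
def accelerate_unet(pipe):
    """Replace pipe.unet with an onediff-compiled version and return pipe."""
    # Assumption: the OneFlow backend is installed and onediff exposes
    # oneflow_compile at this import path.
    from onediff.infer_compiler import oneflow_compile

    pipe.unet = oneflow_compile(pipe.unet)  # the "one line" of acceleration
    return pipe


def run_sdxl_demo(prompt="a photo of a cat", steps=30):
    """Hypothetical end-to-end demo; needs an NVIDIA GPU and model weights."""
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        torch_dtype=torch.float16,
        variant="fp16",
        use_safetensors=True,
    ).to("cuda")
    pipe = accelerate_unet(pipe)
    # The first call includes compilation, so it is slower than later calls.
    result = pipe(prompt, height=1024, width=1024, num_inference_steps=steps)
    return result.images[0]
```

Calling `run_sdxl_demo()` on a machine with a supported GPU would return a PIL image; everything except the `oneflow_compile` line is ordinary diffusers code.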

  • Model stabilityai/stable-diffusion-xl-base-1.0;
  • Image size 1024*1024, batch size 1, steps 30;
  • NVIDIA A100 80G SXM4;

  • Model stabilityai/stable-video-diffusion-img2vid-xt;
  • Image size 576*1024, batch size 1, steps 25, decoder chunk size 5;
  • NVIDIA A100 80G SXM4;

Note that as of Feb 29, 2024, we had not found a way to run SVD with TensorRT.

We also maintain a repository for benchmarking the quality of generation after acceleration: odeval

Note: The versions pinned below are examples; you can use newer versions of diffusers or transformers if you prefer.

python3 -m pip install "torch" "transformers==4.27.1" "diffusers[torch]==0.19.3"

You need only one of OneFlow and Nexfort; install whichever fits your case:

  • For DiT-based models or H100 devices, Nexfort is recommended.

  • For all other cases, OneFlow is recommended. Note that the optimizations in OneFlow will gradually move to Nexfort in the future.
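
The recommendation above can be written down as a small decision helper. This is only a sketch: `recommend_backend` is a hypothetical function, not part of onediff, and matching on the substring "h100" in the device name is an assumption about how the device is identified.

```python
def recommend_backend(is_dit_model: bool, device_name: str) -> str:
    """Return "nexfort" or "oneflow" following the guidance above."""
    # Assumption: H100 devices report a name containing "H100".
    on_h100 = "h100" in device_name.lower()
    if is_dit_model or on_h100:
        return "nexfort"
    return "oneflow"


# Example: a non-DiT model such as SDXL on an A100.
print(recommend_backend(False, "NVIDIA A100-SXM4-80GB"))
```

With PyTorch installed, the device name could be taken from `torch.cuda.get_device_name()`.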

Installing Nexfort is optional. A detailed introduction to Nexfort is here.

python3 -m pip install -U torch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 torchao==0.1
python3 -m pip install -U nexfort

Installing OneFlow is optional.

NOTE: We update OneFlow frequently for onediff, so please install OneFlow from the links below.

  • CUDA 11.8

    For NA/EU users

    python3 -m pip install -U --pre oneflow -f https://.com/siliconflow/oneflow_releases/releases/expanded_assets/community_cu118

    For CN users

    python3 -m pip install -U --pre oneflow -f https://oneflow-pro.oss-cn-beijing.aliyuncs.com/branch/community/cu118

OneFlow packages for other CUDA versions:

  • CUDA 12.1

    For NA/EU users

    python3 -m pip install -U --pre oneflow -f https://.com/siliconflow/oneflow_releases/releases/expanded_assets/community_cu122

    For CN users

    python3 -m pip install -U --pre oneflow -f https://oneflow-pro.oss-cn-beijing.aliyuncs.com/branch/community/cu122
  • CUDA 12.2

    For NA/EU users

    python3 -m pip install -U --pre oneflow -f https://.com/siliconflow/oneflow_releases/releases/expanded_assets/community_cu122

    For CN users

    python3 -m pip install -U --pre oneflow -f https://oneflow-pro.oss-cn-beijing.aliyuncs.com/branch/community/cu122
  • From PyPI
python3 -m pip install --pre onediff
  • From source
git clone https://.com/siliconflow/onediff.git
cd onediff && python3 -m pip install -e .

Or install for development:

# install for dev
cd onediff && python3 -m pip install -e '.[dev]'

# code formatting and linting
pip3 install pre-commit
pre-commit install
pre-commit run --all-files

NOTE: If you intend to use the plugins for ComfyUI/StableDiffusion-WebUI, we highly recommend installing OneDiff from source rather than from PyPI, because you will need to manually copy (or soft-link) the relevant code into the extension folder of these UIs/libs.
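
After installation, a quick sanity check is to see whether onediff and at least one of the two backends can be found by Python. The sketch below uses only the standard library; `installed_components` and `check_install` are hypothetical helper names, and the check assumes the packages are importable as `onediff`, `oneflow`, and `nexfort`.

```python
import importlib.util


def installed_components(find_spec=importlib.util.find_spec):
    """Map each package name to True if Python can locate it."""
    names = ("onediff", "oneflow", "nexfort")
    return {name: find_spec(name) is not None for name in names}


def check_install(status):
    """Return a list of human-readable problems (empty if all is well)."""
    problems = []
    if not status["onediff"]:
        problems.append("onediff is not installed")
    if not (status["oneflow"] or status["nexfort"]):
        problems.append("no backend found: install OneFlow or Nexfort")
    return problems


for problem in check_install(installed_components()):
    print("warning:", problem)
```

The check only confirms that the packages are discoverable; it does not verify that the installed OneFlow build matches your CUDA version.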

| Functionality | Details |
| Compiling Time | About 1 minute (SDXL) |
| Deployment Methods | Plug and Play |
| Dynamic Image Size Support | Support with no overhead |
| Model Support | SD1.5~2.1, SDXL, SDXL Turbo, etc. |
| Algorithm Support | SD standard workflow, LoRA, ControlNet, SVD, InstantID, SDXL Lightning, etc. |
| SD Framework Support | ComfyUI, Diffusers, SD-webui |
| Save & Load Accelerated Models | Yes |
| Time of LoRA Switching | Hundreds of milliseconds |
| LoRA Occupancy | Tens of MB to hundreds of MB |
| Device Support | NVIDIA GPU 3090 RTX/4090 RTX/A100/A800/A10 etc. (Compatibility with Ascend in progress) |

onediff supports acceleration for SOTA models.

  • stable: released for public usage, with long-term support;
  • beta: released for professional usage, with long-term support;
  • alpha: early release for expert usage; use with care.

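| AIGC Type | Models | HF diffusers | | ComfyUI | | SD web UI | |
| | | Community | Enterprise | Community | Enterprise | Community | Enterprise |
| Image | SD 1.5 | stable | stable | stable | stable | stable | stable |
| | SD 2.1 | stable | stable | stable | stable | stable | stable |
| | SDXL | stable | stable | stable | stable | stable | stable |
| | LoRA | stable | stable | stable |
| | ControlNet | stable | stable |
| | SDXL Turbo | stable | stable |
| | LCM | stable | stable |
| | SDXL DeepCache | alpha | alpha | alpha | alpha |
| | InstantID | beta | beta |
| Video | SVD (Stable Video Diffusion) | stable | stable | stable | stable |
| | SVD DeepCache | alpha | alpha | alpha | alpha |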

Compile and save the compiled result offline, then load it online for serving.
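
A minimal sketch of this offline/online split is shown below. It assumes the compiled module exposes `save_graph(path)` and `load_graph(path)` methods; treat those names as an assumption to verify against your onediff version. `compile_with_cache` is a hypothetical helper, and `compile_fn` and `warmup_fn` are caller-supplied callables (for example, a compile function and one inference call).

```python
import os


def compile_with_cache(unet, graph_path, compile_fn, warmup_fn):
    """Compile unet, reusing a saved graph at graph_path when one exists."""
    compiled = compile_fn(unet)
    if os.path.exists(graph_path):
        # Online path: reuse the result saved by an earlier offline run.
        # Assumption: the compiled module has a load_graph method.
        compiled.load_graph(graph_path)
    else:
        # Offline path: run once so compilation actually happens,
        # then persist the result for later serving.
        warmup_fn(compiled)
        # Assumption: the compiled module has a save_graph method.
        compiled.save_graph(graph_path)
    return compiled
```

Running the offline path once at build time keeps the compile cost (about a minute for SDXL, per the table above) out of the serving process.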

For distributed inference, you can use onediff's compiler for single-device acceleration inside a distributed inference engine such as xDiT.

If you need Enterprise-level Support for your system or business, you can email us at [email protected], or contact us through the website: https://siliconflow.cn/pricing

| | Onediff Enterprise Solution |
| More extreme compiler optimization for diffusion process | Usually another 20%~30% or more performance gain |
| End-to-end workflow speedup solutions | Sometimes 200%~300% performance gain |
| End-to-end workflow deployment solutions | Workflow to online model API |
| Technical support for deployment | High priority support |

@misc{2022onediff,
  author={OneDiff Contributors},
  title = {OneDiff: An out-of-the-box acceleration library for diffusion models},
  year = {2022},
  publisher = {},
  journal = { repository},
  howpublished = {\url{https://.com/siliconflow/onediff}}
}