Introduction
ComfyUI + Nunchaku FLUX.1-dev combines ComfyUI's node-based workflow interface with SVDQuant, the 4-bit quantization technique from MIT Han Lab. For FLUX.1-dev, the Nunchaku project reports a 3.0× speedup over the NF4 W4A16 baseline and a 3.6× memory reduction relative to the BF16 model. In my testing on Windows 11 with an RTX 3080 10GB, generation time per image dropped from over 40 seconds to around 11-12 seconds while maintaining excellent quality. This makes Nunchaku FLUX.1-dev one of the most practical solutions for local AI image generation in 2025.
Features
- High Performance: SVDQuant's 4-bit quantization delivers a 3.0× speedup over the NF4 W4A16 baseline while maintaining visual fidelity
- Memory Efficiency: 3.6× memory reduction enables 12B FLUX.1-dev to run comfortably on 8GB+ RTX cards without CPU offloading
- Easy Installation: Nunchaku provides pre-built wheels, so there is no need to compile the CUDA kernels from source
- Broad GPU Compatibility: Native support for RTX 20xx, 30xx, 40xx, and 50xx series cards through optimized CUDA kernels (50xx cards use the FP4 model files rather than the INT4 files used in this guide)
- Professional Workflow Integration: Seamless ComfyUI integration with LoRA, ControlNet, and multi-model support
- Research-Backed: The underlying SVDQuant method was published as an ICLR 2025 Spotlight paper
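As a rough sanity check on the memory claim, the weight footprint can be estimated from the parameter count and the bytes stored per parameter. This is back-of-the-envelope arithmetic, not a measurement: it counts only the 12B transformer weights and ignores the text encoders, VAE, and activations. The function name is illustrative.

```python
def weight_gib(num_params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in GiB for a model."""
    return num_params * bytes_per_param / 2**30


PARAMS = 12e9  # FLUX.1-dev transformer size, as stated above

bf16 = weight_gib(PARAMS, 2.0)  # 16-bit weights: 2 bytes each
int4 = weight_gib(PARAMS, 0.5)  # 4-bit weights: half a byte each (ideal case)
quoted = bf16 / 3.6             # footprint implied by the quoted 3.6x reduction

print(f"BF16 weights:        {bf16:.1f} GiB")
print(f"Ideal 4-bit weights: {int4:.1f} GiB")
print(f"At 3.6x reduction:   {quoted:.1f} GiB")
```

The quoted 3.6× is a little below the ideal 4×, presumably because not every tensor is stored in 4 bits. Either way the estimate lands well under 8 GiB, which is consistent with the claim that 8GB cards can hold the model without CPU offloading.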
Prerequisites
- Operating System: Windows 11 (tested) or Windows 10 with latest updates
- GPU: NVIDIA RTX series with 8GB+ VRAM (10GB+ recommended for FLUX.1-dev)
- System RAM: 16GB minimum, 32GB recommended
- Storage: 15GB+ free space for models and dependencies
- Python: Python 3.12 recommended (ComfyUI Desktop handles this automatically)
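The minimums above can be turned into a quick self-check. The sketch below is illustrative: the function names are made up for this example, VRAM and RAM are entered by hand rather than detected, and only free disk space is read from the system.

```python
import shutil
import sys


def check_prerequisites(vram_gb: float, ram_gb: float, free_disk_gb: float) -> list:
    """Return a list of problems; an empty list means every listed minimum is met."""
    problems = []
    if vram_gb < 8:
        problems.append(f"VRAM is {vram_gb:g} GB; the guide lists 8 GB as the minimum")
    if ram_gb < 16:
        problems.append(f"System RAM is {ram_gb:g} GB; the guide lists 16 GB as the minimum")
    if free_disk_gb < 15:
        problems.append(f"Free disk space is {free_disk_gb:.0f} GB; at least 15 GB is needed")
    return problems


def free_disk_gb(path: str = ".") -> float:
    """Free space on the drive containing `path`, in GB."""
    return shutil.disk_usage(path).free / 1e9


# Example values for an RTX 3080 10GB machine with 32 GB of RAM.
issues = check_prerequisites(vram_gb=10, ram_gb=32, free_disk_gb=free_disk_gb())
print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
for line in issues or ["All listed minimums are met"]:
    print(line)
```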
Installing ComfyUI Desktop
ComfyUI Desktop provides the most streamlined installation experience, eliminating Python environment management complexities. [Download Link]
Essential File Downloads
- The following models are required for Nunchaku FLUX.1-dev operation. Download each file to its specified directory within your ComfyUI installation:
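Once the downloads are done, a short script can confirm that nothing is missing. In this sketch the file names are the ones used in the workflow settings later in this guide, but the folder names and the file-to-folder pairing are assumptions based on ComfyUI's usual `models/` layout, so adjust the mapping to match your installation.

```python
from pathlib import Path

# Assumed layout: folder names follow ComfyUI's usual models/ structure.
# Edit this mapping to match where you actually placed each file.
EXPECTED_FILES = {
    "models/diffusion_models": ["svdq-int4_r32-flux.1-dev.safetensors"],
    "models/text_encoders": ["t5xxl_fp16.safetensors", "clip_l.safetensors"],
}


def missing_files(comfy_root, expected=EXPECTED_FILES):
    """Return the relative paths of expected files that are not present."""
    root = Path(comfy_root)
    return [
        Path(folder) / name
        for folder, names in expected.items()
        for name in names
        if not (root / folder / name).is_file()
    ]


# Usage: point it at your ComfyUI folder, for example
#   for path in missing_files(r".\ComfyUI"):
#       print("missing:", path)
```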
Installing ComfyUI-nunchaku Plugin
- The Nunchaku plugin provides essential nodes for 4-bit quantized model loading and inference.
Run [ComfyUI]
→ [Manager]
→ [Custom Nodes Manager]
→ [ComfyUI-nunchaku] (Check)
→ [Install]
→ Restart [ComfyUI]
Installing Nunchaku Backend
- This step installs the actual quantization engine that powers the performance improvements.
Run [ComfyUI]
→ [Workflow]
→ [Open]
→ install_wheel.json (Double Click)
→ [Nunchaku Wheel Installer] (Click)
→ version: [v0.3.1] (Select)
→ [Preview Any] (Click)
→ [▷ Execute] (Click)
→ Wait for confirmation: "Successfully installed nunchaku..."
→ Restart [ComfyUI]
[Advanced] Manual Nunchaku Backend Installation
- For users requiring manual control or troubleshooting installation issues:
PS> cd .\ComfyUI\
PS> .\.venv\Scripts\Activate.ps1
PS> pip install -r custom_nodes\ComfyUI-nunchaku\requirements.txt
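PS> # The "nunchaku" name on PyPI belongs to an unrelated package. Install the wheel from the Nunchaku releases page that matches your Python and PyTorch versions (the file name below is a placeholder).
PS> pip install .\nunchaku-<version>.whl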
PS> pip install facexlib insightface onnxruntime
PS> python -c "import nunchaku; print(nunchaku.__version__)"
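If the one-line check fails, it helps to know whether the package is missing or merely failing to import. The sketch below reads the installed version from package metadata without importing the package. Run it inside the same virtual environment; the helper name is illustrative.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


for name in ("nunchaku", "torch"):
    version = installed_version(name)
    print(f"{name}: {version or 'not installed in this environment'}")
```

Compare the reported version with the one selected in the wheel installer (v0.3.1 in this guide).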
Running Your First Nunchaku FLUX.1-dev Generation
Run [ComfyUI]
→ [Workflow]
→ [Open]
→ nunchaku-flux.1-dev.json (select)
→ Set your prompt in the text input node
→ [▷ Run]
- I applied the following additional settings to the example workflow provided by Nunchaku and ran multiple image generation tests. Generation averaged 11-12 seconds per image with high-quality output.
Nunchaku Flux DiT Loader
* model_path: [svdq-int4_r32-flux.1-dev.safetensors]
* cache_threshold: 0
* attention: [nunchaku-fp16]
* data_type: [bfloat16]
Nunchaku Flux.1 LoRA Loader
* lora_name: [flux-1.turbo-alpha.safetensors]
* lora_strength: 1.0
Nunchaku Flux.1 LoRA Loader
* lora_name: [flux_realism_lora.safetensors]
* lora_strength: 0.7
Nunchaku Text Encoder Loader
* text_encoder1: [t5xxl_fp16.safetensors]
* text_encoder2: [clip_l.safetensors]
FluxGuidance
* guidance: 5
BasicScheduler
* scheduler: [beta]
* steps: 8
Multiply Sigmas
* factor: 0.960
* start: 0.950
* end: 0.980
Width
* value: 896
Height
* value: 1152
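For repeated tests it can be easier to apply such settings in code than to click through each node. ComfyUI can export a workflow in API format, a JSON object that maps node ids to a `class_type` and its `inputs`. The sketch below patches inputs by `class_type`; the toy workflow, its node ids, and its starting values are made up for illustration, and only two of the nodes listed above are covered.

```python
import copy

# Settings taken from the list above, keyed by node class_type.
OVERRIDES = {
    "FluxGuidance": {"guidance": 5.0},
    "BasicScheduler": {"scheduler": "beta", "steps": 8},
}


def apply_overrides(workflow: dict, overrides: dict) -> dict:
    """Return a copy of an API-format workflow with matching node inputs updated."""
    patched = copy.deepcopy(workflow)
    for node in patched.values():
        new_inputs = overrides.get(node.get("class_type"), {})
        node.setdefault("inputs", {}).update(new_inputs)
    return patched


# Toy two-node workflow, for illustration only.
toy = {
    "1": {"class_type": "FluxGuidance", "inputs": {"guidance": 3.5}},
    "2": {"class_type": "BasicScheduler",
          "inputs": {"scheduler": "simple", "steps": 20, "denoise": 1.0}},
}

patched = apply_overrides(toy, OVERRIDES)
print(patched["1"]["inputs"])
print(patched["2"]["inputs"])
```

The original dictionary is left untouched, so the same base workflow can be patched with several setting combinations and compared.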
[Tip] Multiply Sigmas: Maximizing Detail in Mechanical and Portrait Generation
Multiply Sigmas functions as an independent node in ComfyUI that significantly enhances detail quality in mechanical objects and portraits, effectively reducing the characteristic AI-generated appearance. [Related Link]
- The most recommended configuration is: Guidance: 4.5 + Scheduler: Beta + Multiply Sigmas: 0.96.
- This feature becomes available after installing the ComfyUI-Detail-Daemon custom node package in ComfyUI.
Launch [ComfyUI]
→ [Manager]
→ [Custom Nodes Manager]
→ Search [ComfyUI-Detail-Daemon]
→ [Install]
→ Restart [ComfyUI]
- After installation, you can add the Multiply Sigmas node to your workflow as follows:
(Right-click on empty space in workflow canvas)
→ [Add Node]
→ [sampling]
→ [custom_sampling]
→ [sigmas]
→ [Multiply Sigmas (stateless)]
→ factor: 0.96
→ start: 0.95
→ end: 0.98
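Conceptually, the node scales part of the sampler's sigma (noise level) schedule by a constant factor. The sketch below shows that idea on a plain Python list. It assumes `start` and `end` are fractions of the schedule length that select a range of indices; the real node's index mapping may differ, and the schedule values here are made up.

```python
def multiply_sigmas(sigmas, factor, start, end):
    """Scale the sigmas whose index falls in [start, end) of the schedule.

    start and end are fractions of the schedule length (0.0 to 1.0).
    Returns a new list; the input list is not modified.
    """
    n = len(sigmas)
    first, last = int(start * n), int(end * n)
    return [s * factor if first <= i < last else s for i, s in enumerate(sigmas)]


# Made-up 5-value schedule, scaled over its second half for illustration.
schedule = [1.0, 0.75, 0.5, 0.25, 0.0]
print(multiply_sigmas(schedule, factor=0.96, start=0.5, end=1.0))
```

A factor slightly below 1.0 tells the sampler that the image is a little less noisy than it really is, which is the usual explanation for why more fine detail survives.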
[Tip] Face Detailer: Maximizing Facial Detail Enhancement for Characters
Face Detailer is a powerful feature that detects and enhances facial details in generated images. This is particularly useful for full-body character shots where facial details tend to be significantly degraded. Face Detailer helps maintain and improve these crucial details.
- This feature becomes available after installing both the ComfyUI Impact Pack and ComfyUI Impact Subpack custom node packages in ComfyUI.
Launch [ComfyUI]
→ [Manager]
→ [Custom Nodes Manager]
→ Search [ComfyUI Impact Pack]
→ [Install]
→ Search [ComfyUI Impact Subpack]
→ [Install]
→ Restart [ComfyUI]
- After installation, you can add the FaceDetailer node to your workflow as follows:
(Right-click on empty space in workflow canvas)
→ [Add Node]
→ [ImpactPack]
→ [FaceDetailer]
→ guide_size: 512
→ guide_size_for: [crop_region]
→ max_size: 1024
→ steps: 8
→ cfg: 1.0
→ sampler_name: [euler]
→ scheduler: [beta]
→ denoise: 0.50
→ feather: 5
→ drop_size: 10
- Negative prompt:
low quality, blurry, bad anatomy, worst quality, low resolution, heavy makeup, rough skin, harsh texture, skin imperfections, overly detailed skin, artificial skin, dirty skin, acne, blackheads, wrinkles, aged skin, damaged skin, oily skin, uneven skin tone, harsh skin texture, large pores, visible pores, textured skin, coarse skin, bumpy skin, weathered skin, leathery skin, sun damaged skin, scarred skin, blemished skin, unsmooth skin, grainy skin, patchy skin, peach fuzz, vellus hair
[Tip] res_2s + bong_tangent: Superior Image Generation with Advanced Sampling
- Sampler res_2s combined with Scheduler bong_tangent delivers the highest quality image generation. [Related Link]
- Technical Details:
res_2s: Uses 2-stage substeps per step, requiring two model calls per step (slower but higher quality than single-stage samplers)
bong_tangent: BONGMATH technology enables bidirectional denoising, processing both forward and backward simultaneously for more accurate sampling
- These features are available by installing the RES4LYF custom node package in ComfyUI.
Launch [ComfyUI]
→ [Manager]
→ [Custom Nodes Manager]
→ Search [RES4LYF]
→ [Install]
→ Restart [ComfyUI]
- Once installed, you can configure them in KSamplerSelect and BasicScheduler as follows:
KSamplerSelect
* sampler_name: [res_2s]
BasicScheduler
* scheduler: [bong_tangent]
* steps: 8
* denoise: 1.00
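The cost of the two-stage sampler is easy to estimate from the description above: each step calls the model once per stage. The sketch below is simple arithmetic with an illustrative helper name. It counts model calls only, so actual timings will also include text encoding, VAE decoding, and other overhead.

```python
def model_calls(steps: int, stages_per_step: int = 1) -> int:
    """Number of diffusion model evaluations for one image."""
    return steps * stages_per_step


steps = 8
single_stage = model_calls(steps)                  # e.g. a one-call-per-step sampler
two_stage = model_calls(steps, stages_per_step=2)  # res_2s, per the description above

print(f"single-stage sampler: {single_stage} model calls")
print(f"res_2s:               {two_stage} model calls")
```

At 8 steps, res_2s therefore does about the same sampling work as a 16-step run with a single-stage sampler.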
[Tip] FLUX.1-Krea-dev Best Practices & Optimization
FLUX.1-Krea-dev is a collaborative model released by Black Forest Labs and Krea AI. It takes an opinionated aesthetic approach that emphasizes natural texture, realistic tone, and enhanced detail rendering, aiming to remove the characteristic AI look of FLUX models (plastic-like skin and oversaturation) in pursuit of photorealism.
- The model demonstrates improved prompt adherence capabilities compared to the base FLUX.1-dev model. Detailed descriptions of temporal context, color grading, composition, and fine details particularly leverage the model's strengths in natural texture and realistic rendering.
- It keeps full architectural compatibility with FLUX.1-dev and works as a drop-in replacement. Recommended settings:
  - model: svdq-int4_r32-flux.1-krea-dev.safetensors (Nunchaku version)
  - sampler_name: res_2s
  - scheduler: bong_tangent
  - steps: 8
  - denoise: 1.0
  - guidance: 5.0
  - width x height: 864 x 1152
  - loras:
    - lora_name: Flux_Krea_Blaze_Lora-rank32.safetensors, lora_strength: 1.00
    - lora_name: [your-style-lora], lora_strength: 0.50
    - lora_name: [your-character-lora], lora_strength: 0.50
    - lora_name: SameFace_Fix.safetensors, lora_strength: -0.70
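The strengths in the LoRA list, including the negative one, can be read as scale factors on each LoRA's low-rank weight update. The sketch below uses the standard LoRA formulation, W' = W + strength × (up · down), on tiny made-up matrices. It leaves out the alpha/rank scaling and says nothing about how Nunchaku applies LoRAs to quantized weights internally.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, up, down, strength):
    """Return weight + strength * (up @ down) as a new matrix."""
    delta = matmul(up, down)
    return [
        [w + strength * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Tiny made-up example: a 2x2 weight with a rank-1 update.
W = [[1.0, 0.0], [0.0, 1.0]]
up = [[1.0], [2.0]]   # shape 2x1
down = [[0.5, 0.5]]   # shape 1x2

print(apply_lora(W, up, down, strength=1.0))   # adds the full update
print(apply_lora(W, up, down, strength=-0.7))  # moves the weights the opposite way
```

A negative strength subtracts the update instead of adding it, which is why a LoRA loaded at -0.70 steers the output away from whatever that LoRA was trained to produce.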
[Tip] FLUX.1-Kontext-dev Best Practices & Optimization
- Preserve Original Image Size: Set the FluxKontextImageScale node to Bypass mode to maintain the input image's original dimensions. This node normally scales images to resolutions suited to FLUX processing (usually under 2.1MP), which also reduces VRAM usage; bypassing it keeps your desired output size.
- Minimize Facial Changes: Set the denoise strength parameter to 0.85 or lower in the KSampler or BasicScheduler nodes. The default value of 1.0 completely replaces the input image with noise, while lower values preserve more of the original image's characteristics. Values between 0.75 and 0.85 provide a good balance between edit quality and identity preservation.
- Use Multiple FLUX.1-dev LoRAs: You can load and combine multiple LoRA models trained on the FLUX.1-dev base model. Connect Nunchaku FLUX LoRA Loader nodes to the output of the Nunchaku FLUX DiT Loader node and specify your desired LoRA files.
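To see why a lower denoise value preserves more of the input, it helps to look at what happens to the noise schedule. The sketch below assumes BasicScheduler handles denoise below 1.0 by building a longer schedule of `int(steps / denoise)` steps and keeping only its last `steps + 1` values, so sampling starts from a lower noise level. The evenly spaced schedule is a stand-in for illustration, not the real beta or bong_tangent curve.

```python
def linear_sigmas(total_steps: int) -> list:
    """Stand-in schedule: evenly spaced values from 1.0 down to 0.0."""
    return [1.0 - i / total_steps for i in range(total_steps + 1)]


def partial_denoise_sigmas(steps: int, denoise: float) -> list:
    """Keep only the tail of a longer schedule when denoise is below 1.0."""
    if denoise <= 0.0:
        return []
    total_steps = steps if denoise >= 1.0 else int(steps / denoise)
    return linear_sigmas(total_steps)[-(steps + 1):]


for d in (1.0, 0.85, 0.75):
    sigmas = partial_denoise_sigmas(steps=8, denoise=d)
    print(f"denoise={d}: starts at sigma {sigmas[0]:.2f}, runs {len(sigmas) - 1} steps")
```

The step count stays at 8 in every case; only the starting noise level drops, which leaves more of the original image intact.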
Personal Note
- After extensive testing across various hardware configurations, Nunchaku FLUX.1-dev has become my go-to solution for high-quality, fast AI image generation. The combination of academic rigor (ICLR 2025 Spotlight), practical performance gains, and seamless ComfyUI integration makes this the most compelling FLUX.1-dev implementation available in 2025. The 12-20 second generation times on RTX 3080 10GB represent a significant improvement that makes AI image generation genuinely practical for iterative creative workflows.
References
- https://github.com/mit-han-lab/nunchaku
- https://hanlab.mit.edu/blog/svdquant
- https://github.com/mit-han-lab/ComfyUI-nunchaku
- https://huggingface.co/black-forest-labs/FLUX.1-dev
- https://docs.comfy.org/
- https://comfy.icu/extension/mit-han-lab__ComfyUI-nunchaku
- https://huggingface.co/collections/mit-han-lab/nunchaku-6837e7498f680552f7bbb5ad
- FLUX.1-Krea & the Rise of Opinionated Models - Drew Breunig