🚀 ERNIE-Image Dedicated Quantizer

Convert the massive ERNIE-Image and ERNIE-Image-Turbo models to lower precisions (FP8, FP16, BF16).

Memory Management: This tool processes the model one shard at a time. The largest file is the 9.31 GB transformer shard, and converting it to FP8 peaks near 14 GB of RAM. The script frees memory after each step to avoid running out of RAM on the free tier.
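The 14 GB figure can be sanity-checked with a rough estimate. The sketch below is not the tool's code. It assumes the source shard is stored in a 16-bit format and that the source tensors and their converted copy are both in RAM at the peak; the function names are hypothetical.

```python
# Rough peak-RAM estimate for converting one shard at a time.
# Assumptions (not confirmed by the tool): the source shard is 16-bit,
# and the source and the converted copy coexist in memory at the peak.

BYTES_PER_ELEMENT = {"fp8": 1, "fp16": 2, "bf16": 2}


def estimate_peak_gb(shard_gb, source="bf16", target="fp8"):
    """Source shard plus its converted copy, in GB."""
    ratio = BYTES_PER_ELEMENT[target] / BYTES_PER_ELEMENT[source]
    return shard_gb * (1 + ratio)


def peak_over_shards(shard_sizes_gb, source="bf16", target="fp8"):
    """With shard-by-shard processing, the peak is set by the largest
    shard, not by the sum of all shards."""
    return max(estimate_peak_gb(size, source, target) for size in shard_sizes_gb)


if __name__ == "__main__":
    # 9.31 GB source + roughly 4.66 GB FP8 copy
    print(f"{estimate_peak_gb(9.31):.1f} GB")
```

Under these assumptions the estimate for the 9.31 GB shard lands just under 14 GB, which is consistent with the figure quoted above. A 16-bit to 16-bit cast (for example BF16 to FP16) would need about twice the shard size by the same reasoning.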

Hugging Face Token (Write Access Required)

Your Hugging Face Username

Source Repository

Components to Quantize

Select which folders should be cast to the new precision. Unselected folders will be copied as-is.

pe, text_encoder, transformer, vae
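The cast-or-copy rule can be pictured as a simple per-file plan. This is a minimal sketch, not the tool's implementation. It assumes that only `.safetensors` weight files inside a selected top-level folder are cast and that everything else, including config files in selected folders, is copied unchanged. The function name and the file names in the usage note are hypothetical.

```python
from pathlib import PurePosixPath


def plan_actions(repo_files, selected_folders):
    """Map each repository file to "cast" or "copy".

    Assumption: a file is cast only if its top-level folder is selected
    and it is a .safetensors weight file; all other files are copied.
    """
    plan = {}
    for name in repo_files:
        path = PurePosixPath(name)
        # Files at the repository root have no top-level folder.
        top = path.parts[0] if len(path.parts) > 1 else ""
        is_weight = path.suffix == ".safetensors"
        plan[name] = "cast" if (top in selected_folders and is_weight) else "copy"
    return plan
```

For example, with only `transformer` selected, a file such as `transformer/part-1.safetensors` would be planned as "cast", while `vae/weights.safetensors` and `transformer/config.json` would be planned as "copy".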

Target Precision

Target Repository (Auto-generated)
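The page does not state how the target repository name is generated. The sketch below shows one plausible scheme, purely as an assumption: the username, then the source model name with the precision appended. The function name and the `some-org` namespace are hypothetical.

```python
def default_target_repo(username, source_repo, precision):
    """Build a target repo id such as "<user>/<model>-<PRECISION>".

    Hypothetical naming scheme; the tool's real rule is not documented here.
    """
    # Keep only the model name, dropping the source namespace.
    model_name = source_repo.rstrip("/").split("/")[-1]
    return f"{username.strip()}/{model_name}-{precision.upper()}"


if __name__ == "__main__":
    print(default_target_repo("alice", "some-org/ERNIE-Image-Turbo", "fp8"))
```

Deriving the name from the three form fields keeps it predictable, and changing the username, source repository, or precision would change the suggestion accordingly.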

Operation Logs