🚀 ERNIE-Image Dedicated Quantizer

Convert the large ERNIE-Image and ERNIE-Image-Turbo models to a lower precision (FP8, FP16, or BF16).

Memory Management: This tool processes the files shard by shard. The largest file is the 9.31 GB transformer shard, which peaks near 14 GB of RAM during FP8 conversion. The script frees memory aggressively after each step so that it does not crash on the free tier.
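
The 14 GB figure is consistent with a simple back-of-the-envelope estimate, sketched here in Python. This is not the tool's own accounting. It assumes, without confirmation from the text above, that the source shard is stored at 16 bits per parameter and that the original shard and its converted copy are both held in RAM at the peak. The names `BYTES_PER_PARAM` and `estimate_peak_gb` are hypothetical, and framework overhead is ignored.

```python
# Bytes per parameter for each precision the tool offers.
BYTES_PER_PARAM = {"FP8": 1, "FP16": 2, "BF16": 2}


def estimate_peak_gb(shard_gb, source="BF16", target="FP8"):
    """Rough peak RAM if the source shard and its converted copy coexist.

    Assumption: the whole shard is loaded, converted, and only then released.
    """
    ratio = BYTES_PER_PARAM[target] / BYTES_PER_PARAM[source]
    return shard_gb * (1 + ratio)


if __name__ == "__main__":
    # 9.31 GB source plus a half-size FP8 copy: roughly 14 GB.
    print(f"{estimate_peak_gb(9.31):.1f} GB")
```

Under the same assumptions, converting to FP16 or BF16 from a 16-bit source would peak near twice the shard size, since the converted copy is no smaller than the original.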

Source Repository
Components to Quantize
Select which folders should be cast to the new precision. Unselected folders will be copied as-is.
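
The cast-or-copy rule above can be sketched as follows. This is a minimal illustration, not the tool's actual code: `plan_components` and `copy_unselected` are hypothetical helpers, and the sketch assumes that each component is a top-level folder of the downloaded repository.

```python
import shutil
from pathlib import Path


def plan_components(repo_dir, selected):
    """Split top-level folders into those to cast and those to copy unchanged."""
    folders = sorted(p.name for p in Path(repo_dir).iterdir() if p.is_dir())
    to_cast = [name for name in folders if name in selected]
    to_copy = [name for name in folders if name not in selected]
    return to_cast, to_copy


def copy_unselected(repo_dir, out_dir, to_copy):
    """Copy every unselected folder to the output directory as-is."""
    for name in to_copy:
        shutil.copytree(Path(repo_dir) / name, Path(out_dir) / name,
                        dirs_exist_ok=True)
```

Folders returned in `to_cast` would then go through the shard-by-shard conversion, while everything in `to_copy` is duplicated byte for byte.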
Target Precision