torchvision==0.23.0
cupy-cuda12x
transformers==4.46.2
controlnet-aux==0.0.7
imageio
imageio[ffmpeg]
safetensors
einops
sentencepiece
protobuf
modelscope
ftfy
flash-attn-3 @ https://huggingface.co/alexnasa/flash-attn-3/resolve/main/128/flash_attn_3-3.0.0b1-cp39-abi3-linux_x86_64.whl