MPS is running slower than CPU on Mac M1 Pro

polodealvarado · September 9, 2022, 9:39am

Hello everyone.

I have recently been testing the new diffusers version 0.3.0 on my M1 Pro. Following the steps from How to use Stable Diffusion in Apple Silicon (M1/M2), I get these average execution times for GPU (MPS) and CPU on similar prompts:

GPU: 331 s
CPU: 222 s

Has anyone else tested this?

pcuenq · September 9, 2022, 11:42am

Hi @polodealvarado! Your CPU numbers are very similar to the ones I get on my M1 Max, but as reported on the page you mentioned, I see much faster speeds when using the GPU. Would you mind sharing a couple of details so I can take a look? These would be useful:

The amount of RAM your computer has.
The version of PyTorch you installed.
Your macOS version.
A small code snippet, only if you made any changes to the example we provided.
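The details requested above could be gathered in one go with a short script. This is only a minimal sketch: the helper names `total_ram_gb` and `environment_report` are made up for illustration, and the RAM lookup assumes a Unix-like system that exposes `SC_PHYS_PAGES` (it returns `None` otherwise).

```python
import os
import platform


def total_ram_gb():
    """Return physical RAM in GiB, or None if the platform does not expose it."""
    try:
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        return None
    return round(pages * page_size / 2**30, 1)


def environment_report():
    """Collect RAM, macOS, Python, PyTorch and diffusers details into a dict."""
    report = {
        "ram_gb": total_ram_gb(),
        "macos": platform.mac_ver()[0] or None,  # empty string when not on macOS
        "python": platform.python_version(),
    }
    try:
        import torch  # optional: only reported when installed

        report["torch"] = torch.__version__
        mps = getattr(torch.backends, "mps", None)  # absent in old PyTorch builds
        report["mps_available"] = bool(mps and mps.is_available())
    except ImportError:
        report["torch"] = None
    try:
        import diffusers  # optional as well

        report["diffusers"] = diffusers.__version__
    except ImportError:
        report["diffusers"] = None
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Pasting the printed lines into a reply covers everything except the number of GPU cores, which this sketch does not try to detect.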

Thanks a lot!

polodealvarado · September 9, 2022, 2:41pm

Hi @pcuenq! Thank you for answering.

Here are all the details, and more:

RAM: 16 GB
GPU cores: 16
macOS version: 12.5.1
Python version: 3.9.13
Diffusers version: 0.3.0
Torch version: 1.13.0.dev20220908

I have been using the same code without changing it. I also tried another Jupyter notebook from this repository, and the results are quite similar (CPU is faster than MPS).

polodealvarado · September 9, 2022, 3:21pm

I am following this thread on running the mps backend. @pcuenq

pcuenq · September 9, 2022, 4:55pm

That’s a very interesting thread! They specifically say that random operations are not yet optimized; however, diffusers’ code generates the random latents on the CPU when using the mps device.
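The pattern described above, drawing the random latents on the CPU and only then moving them to the target device, might look roughly like this. It is a sketch of the idea rather than the actual diffusers code; the function name `make_latents` and the default seed are assumptions for illustration.

```python
def make_latents(shape, device, seed=0):
    """Draw Gaussian latents on the CPU, then move them to `device`."""
    import torch  # imported lazily so this file loads without PyTorch

    # Seeded CPU generator: sampling never touches the mps random ops.
    generator = torch.Generator(device="cpu").manual_seed(seed)
    latents = torch.randn(shape, generator=generator, device="cpu")
    # Only the finished tensor is transferred to the accelerator.
    return latents.to(device)
```

For example, `make_latents((1, 4, 64, 64), "mps")` would produce latents for a single 512x512 Stable Diffusion image, assuming the usual 4-channel latent space with 8x downsampling. Because the sampling happens on the CPU, the same seed should give the same latents whichever device they end up on.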

I’ll do some testing, thanks!

reobertwt7 · November 1, 2022, 6:47am

This also happens to me, guys… my CPU takes around 4m 30s, but my GPU (mps) takes more than 20 minutes?
Same code; I was simply changing:

pipe = pipe.to("mps")

To

pipe = pipe.to("cpu")

RAM: 16 GB
GPU cores: 16
macOS version: 12.6
Python version: 3.10.4
Diffusers version: 0.6.0
Torch version: 1.14.0.dev20221031
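For anyone wanting to reproduce a CPU versus mps comparison like this, a small timing harness keeps the measurements consistent. This is a hedged sketch: `time_calls` and `slowdown` are illustrative helpers, not part of diffusers or PyTorch, and the warm-up run is there on the assumption that the first call pays one-off setup costs.

```python
import statistics
import time


def time_calls(fn, repeats=3, warmup=1):
    """Return the mean wall-clock seconds of `fn()` over `repeats` timed runs."""
    for _ in range(warmup):
        fn()  # untimed warm-up call
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations)


def slowdown(candidate_seconds, baseline_seconds):
    """How many times slower the candidate is than the baseline."""
    return candidate_seconds / baseline_seconds


if __name__ == "__main__":
    # Figures from the first post: GPU (MPS) 331 s, CPU 222 s.
    print(f"MPS was {slowdown(331, 222):.2f}x slower than CPU")
```

With an already loaded pipeline `pipe` and a `prompt` string (both assumed here), one would call `time_calls(lambda: pipe(prompt))` once after `pipe.to("cpu")` and once after `pipe.to("mps")`, then compare the two means.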

pcuenq · November 1, 2022, 4:56pm

We are going to release a new version of diffusers this week optimized for PyTorch 1.13, which was released last Saturday.

In the meantime, TL;DR:

Install the production version of PyTorch, not the nightly one. You should get version 1.13.0.
Use the main branch of diffusers instead of the release from PyPI (pip install git+https://github.com/huggingface/diffusers).
Use attention slicing to optimize memory usage and prevent swapping (pipe.enable_attention_slicing() after you create your Stable Diffusion pipeline).
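Put together, the three recommendations above could be sketched as follows. The helper names `is_stable_torch` and `build_pipeline` are made up for this example, and the nightly check simply assumes that nightly builds carry a `.dev` marker in their version string, as the versions quoted earlier in this thread do.

```python
def is_stable_torch(version):
    """True for release versions such as '1.13.0', False for nightly
    builds such as '1.13.0.dev20220908'."""
    return ".dev" not in version


def build_pipeline(model_id, device="mps"):
    """Load Stable Diffusion, move it to `device`, enable attention slicing."""
    import torch  # lazy imports: the helper above works without these packages
    from diffusers import StableDiffusionPipeline

    if not is_stable_torch(torch.__version__):
        print(f"warning: PyTorch {torch.__version__} looks like a nightly build")
    pipe = StableDiffusionPipeline.from_pretrained(model_id)
    pipe = pipe.to(device)
    pipe.enable_attention_slicing()  # lowers peak memory use to avoid swapping
    return pipe
```

A call such as `pipe = build_pipeline("CompVis/stable-diffusion-v1-4")` followed by `image = pipe("a photo of an astronaut").images[0]` would then run on the GPU; the model id and prompt are assumptions, so substitute whatever checkpoint you have access to.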
