Cheaper fine-tuning + hosting for Llama/Mistral — would this help anyone

BTip · August 22, 2025, 7:37pm

I’ve noticed a lot of people here talking about the high cost of fine-tuning and running models like Llama and Mistral.

I’m exploring an idea: what if you could upload your dataset, fine-tune a model, and get back a hosted endpoint — but instead of paying AWS/OpenAI rates, it ran on idle/off-peak GPUs for ~70% less?

Curious to hear from others here:
– Is cost the main blocker for you, or is it complexity/reliability?
– If something like this existed, would you try it?
– What would make it useful (or useless) for your projects?

I’m not selling anything, just testing the waters, and would love feedback.

MarkusEicher · August 22, 2025, 7:45pm

Hi there. To be honest, when I read "70% less" and "hosted GPU", you had my attention instantly. I would be interested in such an option. A form of BYOGPU (by which I mean Buy Your Own GPU), where we could buy older GPU rigs located in a datacenter and then pay for colocation, networking, and so on, could be an especially interesting model for us. Having the trained model hosted as an endpoint for a fair price would also be nice.
