I would not solve this by renting the card out or mining on it. That is a good way to turn a clean on-prem deployment into a security, accounting, support, and liability mess.
The first mistake is treating this as a GPU utilization problem. It is not. It is a capacity and ownership problem.
If the customer requires on-prem, then they are not buying 100 percent utilization. They are buying local capacity, data control, predictable latency, and the right to run the workload at peak without waiting for somebody else’s cloud quota. Expensive idle capacity is not unusual. Hospitals, banks, factories, backup systems, HA clusters, and DR environments all work like this. Most of that hardware is “wasted” until the day it is not.
What I would clarify very early is whether the H100 is required continuously or only at peak. If it is a peak requirement, say that plainly. Otherwise someone will run nvidia-smi two weeks after installation and conclude that the project bought a very expensive heater.
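A minimal sketch of how to put numbers on "peak versus continuous", assuming you have already collected utilization samples (for example by polling `nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits`). The function name, the threshold, and the sample data are hypothetical and only illustrate the shape of the report.

```python
from statistics import mean

def summarize_utilization(samples, busy_threshold=50):
    """Summarize GPU utilization samples given as percentages (0-100).

    busy_threshold is an assumed cutoff for calling a sample "busy".
    """
    if not samples:
        raise ValueError("need at least one sample")
    busy = [s for s in samples if s >= busy_threshold]
    return {
        "mean_pct": round(mean(samples), 1),
        "peak_pct": max(samples),
        "busy_fraction": round(len(busy) / len(samples), 2),
    }

# Hypothetical day: 20 idle samples, then one short burst at full load.
samples = [0] * 20 + [98, 100, 97, 99]
report = summarize_utilization(samples)
print(report)
```

A low mean next to a peak near 100 is exactly the profile of a peak requirement, and it is better to show that in the proposal than to have someone discover it later.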
The normal enterprise pattern is not “make the ISV keep the GPU busy”. The normal pattern is that the client owns the idle capacity. Your application gets a reservation or priority, and the client can use the remaining time for their own batch inference, embeddings, document processing, evaluation jobs, internal ML experiments, or whatever else fits their governance. Slurm or Kubernetes can do the scheduling if they already have that kind of environment. MIG can split the card into isolated slices for smaller independent workloads, but it is not magic: each slice gets only a fraction of the card’s compute and memory. If your model needs the whole card, it needs the whole card.
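The priority arrangement can be sketched as a toy simulation. This is not how Slurm or Kubernetes implement it; it only illustrates the policy you would ask them to enforce (roughly, preemption by a higher-priority partition or priority class). The function and parameter names are hypothetical, and time is divided into equal slots for simplicity.

```python
def schedule(timeline_len, vendor_arrivals, client_backlog):
    """Simulate one GPU shared under strict priority.

    vendor_arrivals: dict mapping a time slot to the number of slots of
        vendor (priority) work that arrives at that time.
    client_backlog: total slots of preemptible client batch work.
    Returns a list with one owner label per time slot.
    """
    vendor_pending = 0
    timeline = []
    for t in range(timeline_len):
        vendor_pending += vendor_arrivals.get(t, 0)
        if vendor_pending > 0:
            # Priority workload always wins the slot.
            timeline.append("vendor")
            vendor_pending -= 1
        elif client_backlog > 0:
            # Client batch work fills whatever is left.
            timeline.append("client")
            client_backlog -= 1
        else:
            timeline.append("idle")
    return timeline

timeline = schedule(8, vendor_arrivals={2: 2, 6: 1}, client_backlog=4)
print(timeline)
```

The point of the sketch is that the vendor workload never waits, and the client still gets every slot the vendor does not need. Real preemption has costs this ignores, such as checkpointing and reloading model weights.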
I would be very careful with the sales story here. “We will keep the GPU busy” is the wrong promise. The honest promise is “this workload needs this class of hardware when it runs, and on-prem means paying for availability rather than consumption.”
So the contract needs to state one of three things clearly: the GPU is dedicated to your product; the GPU is shared, with your product having priority; or you provide the whole thing as a managed appliance or capacity service.
Leaving that undecided is how these projects become awkward. The bad outcome is not an idle H100. The bad outcome is an idle H100 that nobody is allowed to use, that nobody budgeted for correctly, and that everyone blames on the software vendor.