Sunsetting Serverless GPUs

February 1, 2024

Hello everyone, today we’re announcing the sunsetting of the Banana Serverless GPU platform.

On March 31st, two months from now, Banana infrastructure will be shut down at noon PST. Please ensure that your GPU services are migrated to a new provider by then.

Later in this article, I’ll provide a guide for a clean migration.

We wish we could have made it work.

With the advancement of AI, we’re all in the most exciting times of our lives, working with technology that will change the world. It has been an absolute thrill to be a part of it.

Serverless GPUs are inevitable. Given there are workloads that must be run on GPUs, there will be a platform orchestrating those workloads.

The outstanding questions, then, are:

In 2022-2023, we got a taste of product-market fit, and held onto it desperately, afraid to let it slip away. I called it “Promise market fit”: the idea that users will absolutely pay for a product, provided a certain spec is hit. Reliable, cost-effective, fast, and easy. Simple spec.

Unfortunately, the realities of business have reared their heads. Given current runway, traction, retention, shifting AI macro trends, supply-constrained GPU markets, and a deeper understanding of the engineering required, we’ve realized that we do not have the time and resources to hit that spec.

I’d like to write up a much more detailed blog about these business dynamics, but for now, my focus is dedicated to leading Banana through a successful pivot.


Yes, it’s a hassle you’d rather avoid.

We’re software engineers too, and we know how annoying it can be to have to spend time reacting to vendor changes, especially from a vendor as fundamental as compute. As a result, we delayed this decision until it was absolutely obvious it needed to happen.

Now, you’re looking for alternatives.

Life hack: Can it just be an API?

Often, users on Banana would deploy simple base-model Whisper or Stable Diffusion models from Hugging Face. It is great to feel a sense of control, but in many cases there are model-as-an-API providers serving those same models in highly optimized, multi-tenant environments, which allows them to be significantly faster and cheaper than running your own deployment on Banana.

If you don’t have a good reason for hosting custom code, such as having a nonstandard finetune or special pre/post processing logic, you may be pleasantly surprised by the quality of managed APIs. In this case, check out:

If custom code is the way:

Thankfully, products within the “Serverless GPU” market are all subject to the same infra constraints, so code running on one provider looks structurally identical to code on another.
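As a rough illustration of that shared shape, here is a minimal sketch in plain Python. It assumes the common pattern of a one-time init step that loads the model and a per-request handler that reuses it. The names `init`, `handler`, and `serve_once` are hypothetical and not taken from any particular provider's SDK, and the "model" is a stand-in function rather than a real GPU model.

```python
from typing import Any, Callable, Dict

# Hypothetical names throughout: `init`, `handler`, and `serve_once` are
# illustrative only and do not belong to any specific provider's SDK.


def init() -> Dict[str, Any]:
    """Runs once when the container starts: load weights, warm caches, etc."""
    # Stand-in for loading a real model onto the GPU.
    model: Callable[[str], str] = lambda prompt: prompt.upper()
    return {"model": model}


def handler(context: Dict[str, Any], request_json: Dict[str, Any]) -> Dict[str, Any]:
    """Runs once per request, reusing whatever init() loaded."""
    prompt = request_json.get("prompt", "")
    output = context["model"](prompt)
    return {"output": output}


def serve_once(request_json: Dict[str, Any]) -> Dict[str, Any]:
    """Tiny driver standing in for the provider's HTTP server loop."""
    context = init()
    return handler(context, request_json)
```

When porting, the body of `init` and `handler` is usually what carries over; the decorator or registration call that wires them into the server is the part that tends to differ between providers.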

All providers have had to:

Recommended Serverless GPU providers:

If you’re ambitious and want to hand-roll infra, check out these Open Source projects:

Or, if you want to simply stand up one or more always-on VMs, check out our friends at:

The ultimate “quick and dirty” way to get a functional inference endpoint would be to:

  1. take your existing Banana project
  2. build the docker image
  3. run the built image on a Shadeform or Brev instance with the Potassium port exposed
  4. configure your client-side HTTP client or Banana SDK to point at that instance’s IP:port

To be clear, this is not recommended. This is an always-on instance, so it will cost significant money; it is not autoscaled; and the container can crash and need manual intervention. But if that’s your vibe, it gets the job done.
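For the last step, pointing a plain HTTP client at the instance might look roughly like the sketch below, using only the Python standard library. The IP address, the port (8000 here), the `/` route, and the `{"prompt": ...}` payload are all placeholders assumed for illustration; substitute whatever your own container actually exposes and expects.

```python
import json
import urllib.request
from typing import Any, Dict

# Assumptions (not from the original post): the address, port, route, and
# payload shape below are placeholders for your own deployment's values.
INSTANCE_HOST = "203.0.113.10"  # documentation-only example IP
INSTANCE_PORT = 8000


def build_request(host: str, port: int, payload: Dict[str, Any],
                  path: str = "/") -> urllib.request.Request:
    """Build a JSON POST request aimed at the instance's IP:port."""
    url = f"http://{host}:{port}{path}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_endpoint(host: str, port: int, payload: Dict[str, Any],
                  timeout: float = 60.0) -> Dict[str, Any]:
    """Send the request and decode the JSON response body."""
    req = build_request(host, port, payload)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (requires a running instance):
# result = call_endpoint(INSTANCE_HOST, INSTANCE_PORT, {"prompt": "hello"})
```

Because the container can crash without anything restarting it, a generous timeout and some retry or alerting logic around `call_endpoint` is worth considering if this setup lives longer than a few days.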

Final Notes

We’ll try to be responsive in our Discord for quick questions. We cannot promise any level of support beyond pointing you in the right direction, as we’re a small team with many customers, and need to focus on what’s next for us.

Once you’ve migrated, please contact me at and we’ll cash out any balance you had remaining in your Banana account (excluding free credit deals).

I am deeply grateful to you for choosing Banana as a provider, and I wish you the best of luck.

Godspeed, programmers