Banana

Deploy Whisper in ~13 seconds.

Use our 1-click OpenAI Whisper model or customize your own version.

How to Deploy Whisper

Video tutorial demonstrating how to deploy Whisper to serverless GPUs.

Serverless GPUs = Affordable.

Processing 1 hour of audio with Whisper on Banana costs less than 30 cents.

Code editor graphic with the lines of code needed to deploy machine learning on Banana.

What does Banana do?


Banana provides inference hosting for ML models in three easy steps and a single line of code.

With our serverless GPUs, you can deploy models to production faster and at lower cost than by building the infrastructure yourself.

Serverless Pricing

Only pay for the resources you use. That's the power of Banana.

Usage Pricing

Only pay for GPU compute you use

$0.00051992/second
  • 1 hour of FREE credits 💸
  • Run on A100 GPUs
  • ML Models up to 16GB
  • Network Payload up to 50MB
  • Autoscaling
  • Spike Tolerance (up to 10 replicas)
Sign Up
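
To make the per-second rate easier to reason about, here is a minimal sketch of the arithmetic in Python. The rate is the one quoted above; the function names `usage_cost` and `seconds_within_budget` are hypothetical helpers for illustration and are not part of any Banana SDK.

```python
# Per-second GPU rate quoted under Usage Pricing.
RATE_PER_SECOND_USD = 0.00051992


def usage_cost(gpu_seconds: float) -> float:
    """Return the cost in USD for a given amount of GPU compute time.

    Hypothetical helper: it only multiplies time by the quoted rate.
    """
    if gpu_seconds < 0:
        raise ValueError("gpu_seconds must be non-negative")
    return gpu_seconds * RATE_PER_SECOND_USD


def seconds_within_budget(budget_usd: float) -> float:
    """Return how many GPU seconds a budget buys at the quoted rate."""
    if budget_usd < 0:
        raise ValueError("budget_usd must be non-negative")
    return budget_usd / RATE_PER_SECOND_USD


if __name__ == "__main__":
    # Cost of one full hour (3600 seconds) of GPU compute.
    print(f"One GPU hour: ${usage_cost(3600):.2f}")
    # GPU time that fits inside a 30-cent budget.
    print(f"GPU seconds for $0.30: {seconds_within_budget(0.30):.0f}")
```

At this rate one GPU hour comes to about $1.87, and 30 cents buys roughly 577 GPU seconds. The claim that an hour of Whisper audio costs less than 30 cents therefore implies transcription running at roughly 6x real time or faster. That throughput figure is inferred from the arithmetic and is not stated on this page.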

Volume Pricing

Prepaid hosting credits at a discount

10-40% off the usage rate
  • Everything in Usage Pricing
  • Minimum purchase of $1,000
  • More you buy = more you save
  • Dedicated SLA Response Time
  • Increased Spike Tolerance (25+ replicas)
Contact Us
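
As a rough illustration of what the discount range means, the sketch below estimates how many GPU hours a prepaid purchase might buy. It assumes the discount applies directly to the per-second usage rate. The page does not say which purchase size earns which discount, so the discount is passed in as a parameter, and the function names are hypothetical.

```python
RATE_PER_SECOND_USD = 0.00051992  # usage rate quoted on this page
MIN_PURCHASE_USD = 1_000          # minimum prepaid purchase quoted on this page


def discounted_rate(discount: float) -> float:
    """Return the per-second rate after a fractional discount (0.10 to 0.40).

    Assumption: the discount is applied directly to the usage rate.
    """
    if not 0.10 <= discount <= 0.40:
        raise ValueError("the page quotes discounts between 10% and 40%")
    return RATE_PER_SECOND_USD * (1 - discount)


def gpu_hours_for(purchase_usd: float, discount: float) -> float:
    """Estimate the GPU hours a prepaid purchase buys at a given discount."""
    if purchase_usd < MIN_PURCHASE_USD:
        raise ValueError("purchase is below the $1,000 minimum")
    return purchase_usd / discounted_rate(discount) / 3600


if __name__ == "__main__":
    # Compare the two ends of the quoted discount range for the minimum purchase.
    for discount in (0.10, 0.40):
        hours = gpu_hours_for(MIN_PURCHASE_USD, discount)
        print(f"{discount:.0%} off: about {hours:.0f} GPU hours for $1,000")
```

Under these assumptions, the $1,000 minimum buys somewhere between roughly 590 and 890 GPU hours, depending on where in the 10-40% range the discount falls.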

Why use Serverless GPUs?

Ready to Scale

When you need to scale up and down with demand while keeping a great customer experience.

Cost Savings

When you need better cost efficiency and “always-on” GPUs are too expensive.

Speed to Market

When you need a reliable hosting solution quickly and/or prefer moving fast over building in-house.

Use Banana for scale.
