Banana dev - Serverless GPUs - Changelog 001

45-80% Faster Cold Starts of Models

When a customer calls a model that is not currently running, a cold start is required. Previously, a cold start added anywhere from 30 to 120 seconds of extra latency on average. Cold start times on Banana are now 45-80% lower (depending on your model size), which significantly reduces latency. This is a huge win for our customers and another leap forward as we unlock a truly serverless experience.

70% Decrease in Pipeline Time

Previously, Banana's systems added up to 800ms of network time to each inference call. In keeping with our goal of speed improvements for our customers, we brought this network time down to 220ms with some in-house optimizations to our systems.
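
For anyone who wants to check the headline figure, here is a minimal sketch of the arithmetic using the two numbers quoted above (800ms before, 220ms after). The function and variable names are illustrative only and are not part of any Banana API.

```python
def percent_reduction(before_ms: float, after_ms: float) -> float:
    """Return the percentage decrease going from before_ms to after_ms."""
    if before_ms <= 0:
        raise ValueError("before_ms must be positive")
    return (before_ms - after_ms) / before_ms * 100.0


# Figures quoted in this changelog entry; names are hypothetical.
overhead_before_ms = 800.0
overhead_after_ms = 220.0

reduction = percent_reduction(overhead_before_ms, overhead_after_ms)
print(f"Pipeline overhead reduced by {reduction:.1f}%")
```

With these inputs the reduction works out to 72.5%, which the heading rounds to 70%.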

If you have any feature suggestions, improvements, or bug reports, send us a message or let us know in #support or #feature-requests on Discord.