How to Deploy Galactica to Production on Serverless GPUs

In this video tutorial we show you the easiest way to deploy Galactica on serverless GPUs and how to actually call your Galactica model in production.

The tutorial is 5 minutes long, but the deployment of Galactica will only take you ~3 minutes. That's pretty freaking quick. If you're curious about how we made this process so quick, you can read more about our 1-click model deploys.

What is Galactica?

Galactica is a large language model (LLM) developed by Meta AI in collaboration with Papers with Code. It grew out of the observation that scientific researchers face major information overload: papers are published constantly, and it keeps getting harder to tell which ones are relevant.

Galactica's purpose is to help sort through scientific information and reason through the content for you. The model was trained on a combination of 48 million papers, lectures, textbooks, scientific resources, compounds and proteins, and additional datasets.

How to Deploy and Run Galactica

Tutorial Notes & Resources:

We mentioned a few resources and links in the tutorial. Here they are.

In the tutorial we used a virtual environment on our machine to run our demo model. If you want to create your own virtual environment, use these commands (Mac):

  • create virtual env: python3 -m venv venv
  • start virtual env: source venv/bin/activate
  • packages to install: pip install banana_dev diffusers transformers
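
Once the packages are installed, calling the deployed model from Python could look roughly like the sketch below. It is a minimal example, not the exact code from the video. It assumes a banana_dev release that exposes run(api_key, model_key, model_inputs); newer releases of the SDK may use a different interface. The payload field names ("prompt", "max_new_tokens") and the environment variable names are hypothetical, so match them to whatever your deployed model expects.

```python
import os
from typing import Any, Callable, Dict, Optional


def build_model_inputs(prompt: str, max_new_tokens: int = 60) -> Dict[str, Any]:
    """Build the JSON payload sent to the model.

    The field names here are hypothetical; use the ones your
    deployed Galactica handler actually reads.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {"prompt": prompt, "max_new_tokens": max_new_tokens}


def call_galactica(
    prompt: str,
    api_key: str,
    model_key: str,
    run: Optional[Callable[[str, str, Dict[str, Any]], Any]] = None,
) -> Any:
    """Send a prompt to the deployed model and return the raw response.

    `run` can be swapped out for a stub in tests. By default it is
    banana_dev's run function (assumed signature: api_key, model_key,
    model_inputs).
    """
    if run is None:
        # Imported lazily so the helper above works without the SDK installed.
        import banana_dev as banana
        run = banana.run
    return run(api_key, model_key, build_model_inputs(prompt))


if __name__ == "__main__":
    # Hypothetical variable names; keep real keys out of source control.
    api_key = os.environ.get("BANANA_API_KEY")
    model_key = os.environ.get("BANANA_MODEL_KEY")
    if api_key and model_key:
        print(call_galactica("The Transformer architecture [START_REF]", api_key, model_key))
    else:
        print("Set BANANA_API_KEY and BANANA_MODEL_KEY to call your model.")
```

Passing the run function in as a parameter keeps the sketch easy to try locally: you can check the payload with a stub before spending any GPU time.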

In Closing:

Let us know what you are building with Galactica! We'd love to know and share your projects that have been deployed with Banana. The best place to reach our team is in our Discord or by tweeting at us on Twitter.