Anyscale Launches Aviary: Open Source Infrastructure to Simplify Large Language Model Deployment

New open source project helps application developers rapidly integrate LLMs into their applications.

Anyscale, the AI infrastructure company built by the creators of Ray, the world’s fastest-growing open source unified framework for scalable computing, launched Aviary, a new open source project designed to help developers simplify the painstaking process of choosing and integrating the best open source large language models (LLMs) into their applications. Once the best model is selected, Aviary makes the transition to production effortless.

“Our goal is to ensure that any developer can integrate AI into their products and to make it easy to develop, scale, and productionize AI applications without building and managing infrastructure. With the Aviary project, we are giving developers the tools to leverage LLMs in their applications,” said Robert Nishihara, Co-founder and CEO of Anyscale. “AI is moving so rapidly that many companies are finding that their infrastructure choices prevent them from taking advantage of the latest LLM capabilities. They need access to a platform that lets them leverage the entire open source LLM ecosystem in a future-proof, performant and cost-effective manner.”

The AI Imperative
Generative AI has taken the world by storm, quickly becoming a competitive imperative for technology applications in every industry. In response to the popularity of general-purpose LLM-as-a-service offerings, dozens of open source alternatives have emerged in recent weeks, offering potential advantages over proprietary LLMs, including low-latency model serving, deployment flexibility, reduced compute costs, full data control, and vendor independence.

Gartner® Research writes, “By 2026, 75% of newly developed enterprise applications will incorporate AI- or ML-based models, up from less than 5% in 2023.”1

Selecting the right open source LLM for an application is a complex process that requires significant expertise in machine learning and distributed systems. Developers are not only tasked with selecting the right model for their application, but also with understanding future operating costs and the capabilities their application will need at scale.

Integrating LLM capabilities presents a host of challenges for application developers:

  • Managing multiple models - scaling models independently and deploying them across shared compute resources
  • Application integration - customizing models and integrating application logic
  • Productionization - upgrading models and applications without downtime and ensuring high availability
  • Scale - scaling up and down automatically based on demand and leveraging multiple GPUs to speed up inference
  • Cost - maximizing GPU utilization to lower costs

_______________
1 Gartner Inc., “Critical Capabilities for Cloud AI Developer Services”, May 22, 2023
GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and is used herein with permission. All rights reserved.

Introducing Aviary
Aviary is the first fully open source, free, cloud-based LLM-serving infrastructure designed to help developers choose and deploy the right technologies and approach for their LLM-based applications. Aviary makes it easy to submit test prompts to a portfolio of leading open source LLMs, including CarperAI, Dolly 2.0, Llama, Vicuna, StabilityAI, and Amazon’s LightGPT.
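
To make that workflow concrete, here is a minimal sketch of what comparing models through a running Aviary backend could look like over HTTP. The endpoint URL, route, payload fields, and model identifiers below are illustrative assumptions, not Aviary's documented interface; see the project repository for the actual API.

    # Hypothetical sketch: send one prompt to several hosted open source models
    # and compare the answers. The URL and JSON field names are assumptions,
    # not Aviary's documented API.
    import requests

    AVIARY_URL = "http://localhost:8000/query"   # assumed local Aviary endpoint
    MODELS = ["databricks/dolly-v2-7b", "amazon/LightGPT"]   # example model IDs

    def query(model: str, prompt: str) -> str:
        resp = requests.post(
            AVIARY_URL,
            json={"model": model, "prompt": prompt},   # assumed payload shape
            timeout=60,
        )
        resp.raise_for_status()
        return resp.json().get("generated_text", "")

    for model in MODELS:
        print(model, "->", query(model, "Summarize Ray Serve in one sentence."))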

Using Aviary, application developers can rapidly test, deploy, manage and scale one or more pre-configured open source LLMs that work out of the box, while maximizing GPU utilization and reducing cloud costs. With Aviary, Anyscale is democratizing LLM technology and putting it in the hands of any application developer who needs it, whether they work at a small startup or a large enterprise.

Optimized for Production LLM Deployment
Aviary is built on Ray Serve, Anyscale’s popular open source offering for serving and scaling AI applications, including LLMs. Ray Serve provides production-grade features including fault tolerance, dynamic model deployment, and request batching. Aviary offers dynamic cost optimization by intelligently partitioning models across GPUs. It also enables model management and autoscaling across heterogeneous compute, saving costs by rightsizing infrastructure.
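
For readers unfamiliar with Ray Serve, the sketch below illustrates the building blocks Aviary draws on: a GPU-backed deployment, an autoscaling policy, and request batching. It uses Ray Serve's public API, but the model loading and generation logic are placeholders rather than Aviary's actual code.

    # Minimal Ray Serve sketch: a GPU-backed deployment that autoscales with
    # traffic and batches concurrent requests to keep the GPU well utilized.
    # The "model" is a stand-in; Aviary's real deployments wrap open source LLMs.
    from ray import serve
    from starlette.requests import Request

    @serve.deployment(
        ray_actor_options={"num_gpus": 1},                  # one GPU per replica
        autoscaling_config={"min_replicas": 1, "max_replicas": 4},
    )
    class LLMDeployment:
        def __init__(self):
            # Placeholder for loading an open source LLM (e.g. via transformers).
            self.generate_batch = lambda prompts: [f"echo: {p}" for p in prompts]

        @serve.batch(max_batch_size=8, batch_wait_timeout_s=0.1)
        async def generate(self, prompts):
            # Ray Serve groups concurrent requests into a single batch.
            return self.generate_batch(prompts)

        async def __call__(self, request: Request) -> str:
            prompt = (await request.json())["prompt"]
            return await self.generate(prompt)

    app = LLMDeployment.bind()
    # serve.run(app) starts the service; POST {"prompt": ...} to http://localhost:8000/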

Aviary offers a unified framework to deploy multiple LLMs quickly and add new models in minutes, and is designed for multi-LLM orchestration. Aviary enables continuous testing, allowing developers to A/B test models over time for the best functionality, performance, and cost, delivering the best experience for the end user.
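
The continuous-testing idea can be pictured as a thin routing layer in front of two candidate model deployments. The sketch below is illustrative rather than Aviary's implementation, and it assumes a recent Ray Serve release whose deployment handles return awaitable responses.

    # Illustrative A/B router: send a configurable share of traffic to each of
    # two candidate model deployments. Names and split logic are assumptions.
    import random

    from ray import serve
    from starlette.requests import Request

    @serve.deployment
    class ABRouter:
        def __init__(self, model_a, model_b, split: float = 0.5):
            self.model_a, self.model_b, self.split = model_a, model_b, split

        async def __call__(self, request: Request) -> str:
            prompt = (await request.json())["prompt"]
            handle = self.model_a if random.random() < self.split else self.model_b
            # Forward the prompt to the chosen deployment and await its reply.
            return await handle.generate.remote(prompt)

    # Wiring, reusing the LLMDeployment sketch above for both candidates:
    # app = ABRouter.bind(LLMDeployment.bind(), LLMDeployment.bind(), split=0.5)
    # serve.run(app)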

Aviary also aids developers on the journey to production deployment of LLMs. It includes libraries, tooling, examples, documentation, and sample code - all open source and readily adaptable for small experiments or large evaluations.

“Open source models and infrastructure are a great breakthrough for democratizing LLMs,” said Clem Delangue, CEO of Hugging Face. “We’ve attracted an enormous community of open source model builders at Hugging Face. Anything that makes it easier for them to develop and deploy is a win, especially something like Aviary coming from the team behind Ray.”

Source: LaunchSquad
