Ray 2.4.0: Infrastructure for LLM training, tuning, inference, and serving

Ray 2.4.0, the latest release of the open-source distributed computing framework Ray, brings updates across the Ray ecosystem. The release is designed to support large language model (LLM) and generative AI workloads.

The release includes enhancements to Ray Data that target stability, observability, and ease of use. Improved Ray Serve observability lets users view their models' performance and monitor model metrics through a customizable dashboard.

The update also includes several improvements to Ray's scalability on large clusters. Separately, it introduces an RLlib module for custom reinforcement learning, which lets users build custom reinforcement learning models for a variety of applications.

One of the significant additions in Ray 2.4 is a set of new working examples showing how to use Ray with Stable Diffusion and with LLMs such as GPT-J. GPT-J is a 6B-parameter language model trained on the Pile dataset, while Stable Diffusion is a text-to-image diffusion model that can be fine-tuned with techniques such as DreamBooth.
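To put the 6B-parameter figure in perspective, here is a rough back-of-the-envelope sketch of how much memory the GPT-J weights alone would occupy. It assumes exactly 6 billion parameters (a rounded figure) and standard 4-byte fp32 or 2-byte fp16 storage; the function name is hypothetical. Gradients, optimizer state, and activations are not counted, so real training needs considerably more.

```python
# Rounded parameter count taken from the article's "6B parameter" description.
GPT_J_PARAMS = 6_000_000_000

# Bytes needed to store one parameter in each numeric format.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def weights_gib(num_params: int, dtype: str) -> float:
    """Return the size of the model weights in GiB (weights only)."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weights_gib(GPT_J_PARAMS, dtype):.1f} GiB")
```

Even in half precision the weights take up a large share of a single accelerator's memory, which is one reason a model of this size tends to be trained and served with distributed tooling.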

The new AccelerateTrainer lets users run Hugging Face Accelerate and DeepSpeed on Ray with minimal code changes, enabling additional large-model workloads on Ray. It can also be used for distributed hyperparameter tuning, with each trial dispatched as a distributed training job.
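The idea of running each tuning trial as its own distributed training job can be illustrated with a toy sketch. This is not the Ray or AccelerateTrainer API: it uses only the Python standard library, with threads standing in for workers, and every name in it (DATA, shard_gradient, distributed_train, tune) is hypothetical. Each trial fits y = w * x by data-parallel gradient descent, and the tuner picks the learning rate with the lowest final loss.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy dataset for y = 3x; a hypothetical stand-in for real training data.
DATA = [(float(x), 3.0 * x) for x in range(1, 9)]


def shard_gradient(w, shard):
    """Gradient of the mean squared error for y = w * x on one worker's shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)


def distributed_train(lr, num_workers=2, steps=20):
    """One trial: data-parallel gradient descent across num_workers workers."""
    # Each worker gets an equal-sized, interleaved shard of the data.
    shards = [DATA[i::num_workers] for i in range(num_workers)]
    w = 0.0
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        for _ in range(steps):
            # Workers compute gradients on their shards; results are averaged.
            grads = list(pool.map(lambda s: shard_gradient(w, s), shards))
            w -= lr * sum(grads) / num_workers
    loss = sum((w * x - y) ** 2 for x, y in DATA) / len(DATA)
    return w, loss


def tune(learning_rates):
    """Run one distributed trial per learning rate and return the best one."""
    results = {lr: distributed_train(lr) for lr in learning_rates}
    best_lr = min(results, key=lambda lr: results[lr][1])
    return best_lr, results


if __name__ == "__main__":
    best_lr, results = tune([0.001, 0.01, 0.03])
    print("best learning rate:", best_lr)
```

In this sketch the trials run one after another for simplicity. A real tuning framework would typically schedule trials concurrently across the cluster, with each trial holding its own group of workers.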

The LightningTrainer is another significant addition that lets users scale PyTorch Lightning training on Ray. It succeeds the existing ray_lightning integration and offers better compatibility with other Ray libraries such as Ray Tune, Ray Data, and Ray Serve through Ray's connecting APIs.

The Ray 2.4.0 release ships with numerous examples and tutorials, including fine-tuning GPT-J, fine-tuning Stable Diffusion with DreamBooth, scalable batch inference with GPT-J, serving GPT-J, serving Stable Diffusion, and integrating Ray with LangChain.

Overall, Ray 2.4.0 is a significant step toward supporting large language models and generative AI workloads in the open-source community. Its enhancements make Ray easier to use and help developers scale their AI workloads more efficiently.
