Red Hat acquired inference performance engineering startup Neural Magic
Red Hat is acquiring Neural Magic, an MIT spin-off specializing in AI inference optimization, to help make large language models more accessible and cost-effective across hybrid cloud environments.
Red Hat recently announced its acquisition of Neural Magic, an MIT spinout that develops high-performance software to accelerate deep learning inference workloads. The press release confirming the transaction highlights how Neural Magic's expertise complements Red Hat's mission to make AI workloads more accessible by running them wherever its enterprise customers' use cases and data reside across the hybrid cloud.
Neural Magic's involvement in the open-source vLLM project is especially noteworthy. vLLM is a community-driven project for open model serving that originated at UC Berkeley; it supports the most popular model families, incorporates advanced inference acceleration research, and runs on hardware backends including AMD GPUs, AWS Neuron, Google TPUs, Intel Gaudi, NVIDIA GPUs, and x86 CPUs. In addition to its vLLM leadership, Neural Magic has made several advances in model acceleration research, builds and maintains the LLM Compressor library, and maintains a repository of vLLM-ready pre-optimized models.
Red Hat plans to combine Neural Magic's achievements with its own advances in lowering the cost and skill barriers to AI adoption. These include Red Hat Enterprise Linux AI (RHEL AI), which enables enterprise users to develop, test, and run applications based on IBM Granite models in Linux deployments; Red Hat OpenShift AI, a toolkit to "develop, train, serve and monitor machine learning models across distributed Kubernetes environments on-site, in the public cloud or at the edge"; and InstructLab, an open-source community project that enables the community to collaboratively improve the IBM Granite model suite through fine-tuning.