
Deci launches a generative AI development platform and the Deci-Nano model

Deci is debuting a series of LLMs and a Gen AI Development Platform. The first model is Deci-Nano, which offers competitive benchmark scores, high throughput, and an affordable price. Deci-Nano and other supported models can be deployed through Deci's platform via API, in a VPC, or on-premises.

by Ellie Ramirez-Camara

Updated March 15, 2024

Deci's Gen AI Development Platform includes an inference engine, an AI cluster management solution, and a series of proprietary, fine-tunable LLMs, starting with Deci-Nano. This small model scores higher than Mistral-7b-instruct-v0.2 and Gemma-7b-it on MT Bench and delivers higher throughput than both: when benchmarked on NVIDIA A100 GPUs, Deci-Nano is 38% faster than Mistral-7b-instruct-v0.2 and 33% faster than Gemma-7b-it. Deci-Nano also has a modest 8K-token context window and an affordable price of $0.10 per 1M tokens. Together, these features make Deci-Nano well suited to real-time production applications.
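To put the quoted price in perspective, the following is a minimal back-of-the-envelope sketch. It assumes the $0.10 per 1M tokens rate applies uniformly to all tokens processed, which the announcement does not specify. The function name and the example workload are hypothetical and chosen only for illustration.

```python
# Price quoted in the article: $0.10 per 1M tokens.
PRICE_PER_MILLION_TOKENS_USD = 0.10


def estimate_cost_usd(total_tokens: int,
                      price_per_million: float = PRICE_PER_MILLION_TOKENS_USD) -> float:
    """Estimate the cost in USD of processing `total_tokens` tokens.

    Assumption: a single flat per-token rate for all tokens.
    """
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return total_tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    # Hypothetical workload: 50,000 requests per day at about 1,200 tokens each.
    daily_tokens = 50_000 * 1_200
    print(f"Estimated daily cost: ${estimate_cost_usd(daily_tokens):.2f}")
    # prints "Estimated daily cost: $6.00"
```

Under these assumptions, a workload of 60 million tokens per day would cost about $6.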

Deci-Nano's features are complemented by a variety of deployment options in the Gen AI Development Platform. Deci-Nano and other supported models can be deployed via Deci's API, in a virtual private cloud (VPC), or on-premises. API deployments include serverless instances with per-token pricing and dedicated instances for easier fine-tuning and increased privacy protection. Private cloud deployments can either run containerized models for increased control or use a customized managed inference service deployed within Kubernetes clusters for a hands-off approach. Finally, on-premises deployments let users integrate containerized models into private data centers. The containers host the chosen model and the Infery SDK, giving users complete control over the deployment and the highest level of data privacy and security. Deci also guarantees that customers can migrate between options as their needs evolve, without affecting customized and fine-tuned models.

Every deployment option benefits from the higher throughput of the Deci models. In virtual private cloud deployments, higher throughput means more work done per GPU hour, which makes the deployment more cost-effective and lets organizations serve more users at once. In on-premises deployments, it saves energy and frees GPUs for additional tasks. Deci also says its software-only approach reduces environmental impact by lowering carbon emissions during the training and inference phases.
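The link between throughput and cost per GPU hour can be illustrated with some simple arithmetic. The sketch below assumes that "38% faster" means 38% more tokens generated per second on the same hardware, which is one possible reading of the benchmark figures. The GPU hourly rate and the baseline throughput are hypothetical placeholders, not figures from Deci.

```python
def cost_per_million_tokens(gpu_hour_usd: float, tokens_per_second: float) -> float:
    """Cost in USD to generate 1M tokens on one GPU at a given hourly rate."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hour_usd / tokens_per_hour * 1_000_000


def relative_saving(speedup: float) -> float:
    """Fractional drop in cost per token when throughput rises by `speedup`.

    A speedup of 0.38 means 1.38x the baseline tokens per second (assumption).
    """
    return 1 - 1 / (1 + speedup)


if __name__ == "__main__":
    gpu_hour_usd = 2.00      # hypothetical GPU rental price per hour
    baseline_tps = 1_000.0   # hypothetical baseline throughput, tokens/second

    baseline = cost_per_million_tokens(gpu_hour_usd, baseline_tps)
    faster = cost_per_million_tokens(gpu_hour_usd, baseline_tps * 1.38)
    print(f"Baseline: ${baseline:.3f} per 1M tokens")
    print(f"At 1.38x throughput: ${faster:.3f} per 1M tokens")
    print(f"Saving: {relative_saving(0.38):.1%}")  # prints "Saving: 27.5%"
```

Under this reading, a 38% throughput gain lowers the GPU cost per token by roughly 27.5%, regardless of the actual hourly rate.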
