During Data Phoenix and Hewlett Packard Enterprise's (HPE) recent webinar, Accelerating Generative AI with HPE: From Development to Deployment, Jordan Nanos (Master Technologist, HPE's North America CTO Office) and Volodymyr Saviak (High Performance Computing Sales Manager in Central Eastern Europe at HPE) offered a comprehensive overview of HPE's AI solutions, highlighting an approach that prioritizes flexibility and scalability across all of the company's offerings, not only its generative AI full stack.

The first part of the webinar, presented by Volodymyr Saviak, opened by calling attention to one of HPE's most important strengths as an end-to-end AI solutions provider: building a capable, full-featured solution, whether a developer kit or a supercomputer, is more than putting chips or servers together. It requires access to specific tools, and the expertise to combine those ingredients into a single, easily maintained system. HPE's experience building enterprise-grade AI infrastructure, and its approach to working with customers, underpin its ability to deliver solutions that integrate hardware, middleware, and software to run AI models at scale, all maintained through a single point of responsibility.

The versatility and adaptability of HPE's solutions were showcased through a selection of differently sized offerings. The full-stack lineup ranges from a single self-hosted server with two NVIDIA L40/L40S GPUs, capable of running up to two 12B-parameter LLMs, to the HPE AI Factory Cloud Platform, a large-scale offering geared toward large enterprises and cloud providers that deliver generative AI solutions to their own customers. The breadth of that range makes clear that HPE can work with a diversity of customers to ensure each acquires the solution that best suits their needs and intended use cases.
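As a quick sanity check on that entry-level sizing (our back-of-envelope arithmetic, not a figure quoted in the webinar): a 12B-parameter model served in FP16 occupies roughly two bytes per parameter in weights, plus headroom for the KV cache and activations, which fits comfortably within a single 48 GB L40S.

```python
# Back-of-envelope serving footprint for a 12B-parameter LLM.
# Assumptions (ours): FP16 weights at 2 bytes/parameter, plus ~20%
# overhead for KV cache and activations at modest batch sizes.
params = 12e9
weights_gib = params * 2 / 1024**3   # ~22.4 GiB of weights
footprint_gib = weights_gib * 1.2    # ~26.8 GiB total, inside one 48 GB L40S
print(f"weights: {weights_gib:.1f} GiB, footprint: {footprint_gib:.1f} GiB")
```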

In the second part of the webinar, Jordan Nanos dove deep into the HPE Machine Learning Platform, a comprehensive software solution that stands out for its versatility, being available for deployment on any system, not exclusively HPE hardware. The HPE Machine Learning Platform addresses the three critical stages of the machine learning lifecycle: data processing and pipelining, model development and optimization, and model deployment and monitoring.

Notably, the platform's modules are powered by open-source tools from HPE's partner network and acquisitions, such as Pachyderm and Determined AI, both of which still maintain their open-source projects. This gives HPE's solutions an additional layer of flexibility by avoiding lock-in to single-vendor proprietary software. Also of particular interest is HPE's pricing strategy, which is not based on a per-token or per-user licensing model. Instead, HPE's solutions follow a tiered pricing model, in which customers are offered the tier that best accommodates their present requirements while leaving room for growth.

The first component is the HPE Machine Learning Data Management software (MLDM), powered by Pachyderm, which Nanos described as a "GitHub for data management." It offers data repositories with S3-bucket-like scalability, complete with commit tracking and pipeline process monitoring; the latter is similar to what one can achieve with Airflow, Kubeflow, or Argo Workflows. Moreover, MLDM can handle parallel processing jobs without requiring additional Spark, Ray, or Dask knowledge. MLDM thus sets itself apart by effectively combining the functionality of multiple tools, such as Git LFS, Airflow, and Spark, into a single, unified solution.
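Pachyderm itself is open source, so the workflow Nanos described can be sketched with its public CLI and pipeline spec. Everything below is illustrative rather than taken from the demo: the repo name, image, and script are hypothetical, and a real container would read its input datums from /pfs/raw-data and write results to /pfs/out.

```shell
# Create a versioned data repository; every write becomes a tracked commit.
pachctl create repo raw-data
pachctl put file raw-data@master:/reviews.csv -f reviews.csv

# Define a pipeline that re-runs whenever new data is committed. The "/*"
# glob splits the input into per-file datums that Pachyderm fans out in
# parallel -- no Spark, Ray, or Dask code required.
cat > clean.json <<'EOF'
{
  "pipeline": { "name": "clean" },
  "input": { "pfs": { "repo": "raw-data", "glob": "/*" } },
  "transform": {
    "image": "example/clean:latest",
    "cmd": ["python", "/code/clean.py"]
  },
  "parallelism_spec": { "constant": 4 }
}
EOF
pachctl create pipeline -f clean.json
```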

For the training phase, the HPE Machine Learning Development Environment (MLDE), powered by Determined AI, integrates seamlessly with MLDM or other data preparation pipelines. It provides comprehensive features for model training, including advanced hyperparameter search, experiment visualization and debugging, checkpoint management, and collaborative notebooks. The platform's flexibility is evident in its support for various hardware configurations, including NVIDIA and AMD GPUs, cloud TPUs, and CPU-based HPC clusters running Slurm or PBS. As a result of recent work focused on LLM training, MLDE can run fine-tuning jobs through the Hugging Face Trainer API.
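Determined AI's experiment configuration format is public, so the kind of training run MLDE manages can be sketched as follows. The entrypoint, metric, and values here are illustrative assumptions rather than details from the webinar, and exact field names vary by Determined version.

```yaml
# Illustrative Determined AI experiment config; launched with:
#   det experiment create search.yaml .
name: sentiment-finetune
entrypoint: model_def:SentimentTrial  # hypothetical user-defined Trial class
resources:
  slots_per_trial: 2                  # GPUs allocated to each trial
hyperparameters:
  global_batch_size: 32
  learning_rate:
    type: double                      # sampled per trial by the searcher
    minval: 1.0e-5
    maxval: 1.0e-3
searcher:
  name: adaptive_asha                 # adaptive search with early stopping
  metric: validation_loss
  smaller_is_better: true
  max_trials: 16
```

Here, adaptive_asha stops underperforming trials early and reallocates their compute, which is what an advanced hyperparameter search buys over a plain grid search.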

The final component, HPE Machine Learning Inference Software (MLIS), based on KServe, addresses the crucial challenge of model deployment and scaling. It enables users to serve trained models as API endpoints, handling auto-scaling and load balancing across multiple GPUs automatically. In addition to supporting models from popular platforms such as Hugging Face and NVIDIA NIM, the solution also accommodates custom models. MLIS likewise lets users implement comprehensive metrics and logging services independently of the individual models they deploy on the platform.
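Because MLIS is built on KServe, the basic shape of a deployment can be sketched with a plain open-source KServe manifest; MLIS layers its own model packaging, access control, and observability on top. The manifest below uses a model URI from KServe's public examples rather than anything shown in the webinar.

```yaml
# Minimal KServe InferenceService: a trained model becomes an HTTP endpoint
# that load-balances and autoscales between the replica bounds below.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris
spec:
  predictor:
    minReplicas: 1    # keep at least one replica warm
    maxReplicas: 4    # scale out under request load
    sklearn:
      storageUri: gs://kfserving-examples/models/sklearn/1.0/model
```

Once ready, the service answers standard prediction calls (e.g. `curl http://<ingress-host>/v1/models/sklearn-iris:predict -d @input.json`), which is the kind of endpoint MLIS provisions, meters, and logs for each deployed model.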

A short demonstration of HPE MLIS showcased the software's various features, which cover practically everything customers may want to do with LLMs. From comparing the speeds of different model deployments, and configuring and controlling the cluster resources available to each model, to ensuring that data shared through model endpoints is not tracked by external third parties, the demonstration highlighted the most important strengths of HPE's software platform: flexibility, scalability, privacy, and security.

During the Q&A session, the speakers emphasized that HPE's solutions aren't exclusively for large enterprises. Startups can benefit from discounts on the developer kit and potentially become certified partners for delivering HPE AI solutions. A key advantage highlighted was the platform's ability to eliminate common bottlenecks in hybrid deployments by removing the need for deep Kubernetes expertise when building high-performance, auto-scaling clusters.

Looking ahead, HPE demonstrated its commitment to future-proofing its platform by ensuring support for diverse accelerators within a single cluster, while maintaining its focus on observability and monitoring features, crucial for enterprise-grade AI deployments.

Special announcement: schedule a complimentary consultation with HPE!

Data Phoenix and HPE are offering webinar attendees the opportunity to schedule a 30-minute session in which HPE experts will:

  • Analyze current and planned AI workloads.
  • Provide tailored hardware recommendations.
  • Run benchmarks for your use case in HPE’s AI Lab when necessary.
  • Develop a customized budget plan for your AI initiatives.

This offer is available for organizations in Central Eastern Europe; more details and a registration form are available here.