Deploying AI Workloads on IBM Cloud

This guide walks through provisioning and configuring Red Hat OpenShift AI (RHOAI) clusters with NVIDIA GPUs on IBM Cloud, hosting large language models (LLMs), and securing AI workloads with Security and Compliance Center Workload Protection.

Overview

Deploying AI workloads on IBM Cloud involves several key components that work together to provide a secure, scalable, and compliant environment:

Red Hat OpenShift AI cluster with GPU support for model training and inference
RHEL AI on IBM Cloud for hosting and testing foundation models using InstructLab
Security and Compliance Center Workload Protection for continuous security monitoring and compliance

After the foundational architecture is in place, you can adapt components such as networking and security to suit your business needs.

Security and compliance for AI workloads

Securing AI workloads requires policy enforcement, control implementation, runtime monitoring, and real-time threat detection. Security and Compliance Center Workload Protection (SCC Workload Protection) provides policy-driven posture enforcement and real-time threat detection, and gives organizations visibility into both.

A layered security approach that combines visibility, posture management, and behavioral detection serves as the foundation for securing AI workloads on IBM Cloud. This includes:

Full visibility into all AI components, including unsanctioned "shadow AI" deployments
Continuous risk management across the AI lifecycle
Policy-driven posture enforcement
Real-time threat detection and incident response
Vulnerability identification and compliance checking
Runtime threat blocking

Before you begin

Before you deploy this reference architecture, ensure you have the following prerequisites:

IBM Cloud account: An active IBM Cloud account with appropriate permissions.
IAM access policies: Required Identity and Access Management (IAM) access policies for provisioning resources.
SSH key pair: A VPC public and private SSH key pair whose public key is not already registered in the deployment region.
IBM Cloud API key: An API key for the user or service ID with the correct IAM access policies.
Landing zone planning: Review Planning for the landing zone deployable architectures.
Red Hat OpenShift subscriptions: Valid Red Hat OpenShift AI (RHOAI) entitlement. On IBM Cloud, this entitlement is typically provided as part of the managed OpenShift AI add-on. You do not need to procure a separate Red Hat "Advanced" subscription unless you are running OpenShift AI self-managed.
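
One way to check the SSH key prerequisite is to compare the fingerprint of your public key with the fingerprints of keys that already exist in the target region. The following is a minimal sketch, assuming your key listing reports OpenSSH-style SHA256 fingerprints. The function names and the `regional_fingerprints` input are hypothetical, and you gather that set yourself from your region's key list.

```python
import base64
import hashlib


def ssh_fingerprint(public_key: str) -> str:
    """Return the OpenSSH-style SHA256 fingerprint of a public key line."""
    # A public key line has the form: "<type> <base64 blob> [comment]"
    parts = public_key.strip().split()
    if len(parts) < 2:
        raise ValueError("expected '<type> <base64-blob> [comment]'")
    blob = base64.b64decode(parts[1], validate=True)
    digest = hashlib.sha256(blob).digest()
    # OpenSSH prints the digest base64-encoded with the '=' padding removed
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def key_already_registered(public_key: str, regional_fingerprints: set) -> bool:
    """Hypothetical helper: True if the key's fingerprint is in the given set."""
    return ssh_fingerprint(public_key) in regional_fingerprints
```

If `key_already_registered` returns True for the region you plan to deploy in, generate or choose a different key pair before you start.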

Provisioning the architecture

The deployment process consists of three main components that can be deployed independently or together:

Red Hat OpenShift AI cluster with LLM deployment
RHEL AI on IBM Cloud for model hosting and testing
Security and Compliance Center Workload Protection for security monitoring

Deploying Red Hat OpenShift AI with LLM

You can deploy a large language model to a Red Hat OpenShift AI cluster using one of the following methods:

Solution tutorial: Follow the step-by-step Deploying LLM to Red Hat OpenShift AI cluster solution tutorial for detailed instructions.
Deployable architecture: Use the Red Hat OpenShift AI deployable architecture to automate the provisioning and configuration process, then deploy your LLM.

The deployment includes:

Provisioning an OpenShift cluster with GPU-enabled worker nodes
Installing the Red Hat OpenShift AI operator
Configuring GPU resources and drivers
Setting up model serving infrastructure
Deploying and testing your LLM
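
After the GPU drivers are configured, a small smoke-test pod can confirm that a worker node actually exposes a GPU to containers. The sketch below builds such a pod manifest as a Python dictionary and prints it as JSON. It assumes the NVIDIA device plugin is installed, so that GPUs are advertised under the `nvidia.com/gpu` resource name. The pod name, the function name, and the image reference are placeholders rather than values from this guide, and the image must include the `nvidia-smi` tool.

```python
import json


def gpu_smoke_test_pod(name: str, image: str, gpus: int = 1) -> dict:
    """Build a Pod manifest that requests GPUs and runs nvidia-smi once."""
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Run once and stop; the pod logs hold the nvidia-smi output
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "gpu-check",
                    "image": image,  # placeholder image reference
                    "command": ["nvidia-smi"],
                    # GPUs are requested through resource limits
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    pod = gpu_smoke_test_pod("gpu-check", "example.com/cuda-base:latest")
    print(json.dumps(pod, indent=2))
```

You can save the printed JSON to a file and apply it to the cluster. If the pod stays in the Pending state, the scheduler probably found no node with a free GPU.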

Deploying RHEL AI on IBM Cloud

RHEL AI provides a platform for hosting and testing foundation models using InstructLab. To deploy RHEL AI:

Follow the Deploying and Running a model in RHEL AI on IBM Cloud solution tutorial.
Provision a Virtual Server Instance (VSI) with appropriate GPU resources.
Install and configure RHEL AI components.
Set up InstructLab for model testing and experimentation.
Run and validate foundation models on the hosted VSI.

Deploying Security and Compliance Center Workload Protection

Security and Compliance Center Workload Protection provides comprehensive security monitoring and compliance management for your AI workloads. The deployment process includes:

Provision an SCC Workload Protection instance: Create a new instance of Security and Compliance Center Workload Protection in your IBM Cloud account.
Configure data sources: Connect data sources by configuring agents on your OpenShift clusters and RHEL AI instances.
Launch the web UI: Access the SCC Workload Protection dashboard to monitor your AI workloads.
Configure security policies: Set up policies for vulnerability scanning, compliance checking, and runtime threat detection.
Enable continuous monitoring: Configure continuous monitoring to identify vulnerabilities, check compliance, block runtime threats, and respond to incidents.

For detailed deployment steps, see Getting started with Security and Compliance Center Workload Protection.

Additional services

You can enhance your AI workload deployment by adding the following services:

Account Infrastructure base: Create and configure the foundational components of an IBM Cloud account, including resource groups, access groups, and service authorizations. See Account Infrastructure base.
IBM Cloud Observability: Provision and configure logging, monitoring, and activity tracking services to gain visibility into your AI workloads. See IBM Cloud Observability.
IBM Cloud Event Notifications: Route notifications about critical events in your IBM Cloud account, such as security findings, to destinations like email, SMS, and webhooks. See IBM Cloud Event Notifications.
IBM Cloud Internet Services: Use Cloudflare-powered services for fast, reliable, and secure internet connectivity. See IBM Cloud Internet Services.

Network and security customization

After deploying the foundational architecture, you can customize networking and security components to meet your specific requirements:

Configure Virtual Private Cloud (VPC) networking with custom subnets and security groups
Implement network segmentation and Zero Trust policies
Set up private endpoints for secure service connectivity
Configure firewall rules and access control lists (ACLs)
Enable encryption for data in transit and at rest
Implement identity and access management (IAM) policies
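
When you plan custom subnets, it can help to carve the address range into per-zone blocks before you create anything. The following sketch uses Python's standard `ipaddress` module to assign one subnet per zone from a larger prefix. The CIDR range, the zone names, and the function name are illustrative assumptions. Treat the result as a planning aid, not a provisioning call.

```python
import ipaddress


def plan_subnets(vpc_prefix: str, zones: list, new_prefix: int = 24) -> dict:
    """Assign one non-overlapping subnet of size /new_prefix to each zone."""
    network = ipaddress.ip_network(vpc_prefix)
    # subnets() yields consecutive blocks of the requested size
    candidates = network.subnets(new_prefix=new_prefix)
    plan = {}
    for zone in zones:
        try:
            plan[zone] = str(next(candidates))
        except StopIteration:
            raise ValueError("prefix too small for the number of zones") from None
    return plan


if __name__ == "__main__":
    zones = ["us-south-1", "us-south-2", "us-south-3"]
    for zone, cidr in plan_subnets("10.10.0.0/18", zones).items():
        print(zone, cidr)
```

Because every block comes from the same parent prefix, the subnets cannot overlap, which keeps security group and ACL rules easier to reason about.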

Validation and testing

After deployment, validate your AI workload environment:

Verify that all components are running and properly configured
Test LLM inference endpoints for functionality and performance
Validate security policies and compliance posture
Review monitoring dashboards and alerts
Conduct security scans and vulnerability assessments
Test incident response procedures
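
To test an inference endpoint for functionality and rough latency, you can send it a small prompt and time the response. The sketch below assumes that your model server exposes an OpenAI-compatible `/v1/completions` route, which some serving runtimes provide but this guide does not confirm. The base URL, model name, token, and function names are all hypothetical.

```python
import json
import time
import urllib.request
from typing import Optional


def build_completion_request(
    base_url: str, model: str, prompt: str, token: Optional[str] = None
) -> urllib.request.Request:
    """Build a POST request for an assumed OpenAI-compatible completions route."""
    payload = {"model": model, "prompt": prompt, "max_tokens": 32}
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def timed_completion(request: urllib.request.Request, timeout: float = 30.0):
    """Send the request; return the parsed JSON body and elapsed seconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body, time.perf_counter() - start
```

A typical check builds the request with your endpoint URL, calls `timed_completion`, and confirms that the response contains generated text and that the elapsed time is within your target.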

Next steps

After deploying your AI workloads:

Review Key Features and Guardrails to understand security controls
Explore Use Cases for practical implementation examples
Set up continuous monitoring and alerting
Establish operational procedures for model updates and security patches
Configure automated compliance reporting