Deploying AI Workloads on IBM Cloud

This guide provides step-by-step guidance for provisioning and configuring Red Hat OpenShift AI (RHOAI) clusters on IBM Cloud with NVIDIA GPUs, hosting large language models (LLMs), and securing AI workloads with Security and Compliance Center Workload Protection.

Overview

Deploying AI workloads on IBM Cloud involves several key components that work together to provide a secure, scalable, and compliant environment:

  • Red Hat OpenShift AI cluster with GPU support for model training and inference
  • RHEL AI on IBM Cloud for hosting and testing foundation models using InstructLab
  • Security and Compliance Center Workload Protection for continuous security monitoring and compliance

This deployment approach allows you to adapt various components, such as networking and security, to better suit your business needs after the foundational architecture has been established.

Security and compliance for AI workloads

Securing AI workloads requires a comprehensive approach that includes policy enforcement, control implementation, runtime monitoring, and real-time threat detection. Security and Compliance Center Workload Protection (SCC+WP) provides organizations with visibility into policy-driven posture enforcement and real-time threat detection capabilities.

A layered security approach that combines visibility, posture management, and behavioral detection serves as the foundation for securing AI workloads on IBM Cloud. This includes:

  • Full visibility into all AI components, including unsanctioned "shadow AI" deployments
  • Continuous risk management across the AI lifecycle
  • Policy-driven posture enforcement
  • Real-time threat detection and incident response
  • Vulnerability identification and compliance checking
  • Runtime threat blocking

Before you begin

Before you deploy this reference architecture, ensure you have the following prerequisites:

IBM Cloud account
An active IBM Cloud account with appropriate permissions.
IAM access policies
Required Identity and Access Management (IAM) access policies for provisioning resources.
SSH key pair
A VPC public and private SSH key pair that is not in the deployment region.
IBM Cloud API key
An API key for the user or service ID with the correct IAM access policies.
Landing zone planning
Review Planning for the landing zone deployable architectures.
Red Hat OpenShift subscriptions
Valid Red Hat OpenShift AI (RHOAI) entitlement. On IBM Cloud, this entitlement is typically provided as part of the managed OpenShift AI add-on. You do not need to procure a separate Red Hat "Advanced" subscription unless you are running OpenShift AI self-managed.

Provisioning the architecture

The deployment process consists of three main components that can be deployed independently or together:

  1. Red Hat OpenShift AI cluster with LLM deployment
  2. RHEL AI on IBM Cloud for model hosting and testing
  3. Security and Compliance Center Workload Protection for security monitoring

Deploying Red Hat OpenShift AI with LLM

You can deploy a large language model to a Red Hat OpenShift AI cluster using one of the following methods:

Solution tutorial
Follow the step-by-step Deploying LLM to Red Hat OpenShift AI cluster solution tutorial for detailed instructions.
Deployable architecture
Use the Red Hat OpenShift AI deployable architecture to automate the provisioning and configuration process, then deploy your LLM.

The deployment includes:

  • Provisioning an OpenShift cluster with GPU-enabled worker nodes
  • Installing the Red Hat OpenShift AI operator
  • Configuring GPU resources and drivers
  • Setting up model serving infrastructure
  • Deploying and testing your LLM

Deploying RHEL AI on IBM Cloud

RHEL AI provides a platform for hosting and testing foundation models using InstructLab. To deploy RHEL AI:

  1. Follow the Deploying and Running a model in RHEL AI on IBM Cloud solution tutorial.
  2. Provision a Virtual Server Instance (VSI) with appropriate GPU resources.
  3. Install and configure RHEL AI components.
  4. Set up InstructLab for model testing and experimentation.
  5. Run and validate foundation models on the hosted VSI.

Deploying Security and Compliance Center Workload Protection

Security and Compliance Center Workload Protection provides comprehensive security monitoring and compliance management for your AI workloads. The deployment process includes:

  1. Provision an SCC Workload Protection instance

    Create a new instance of Security and Compliance Center Workload Protection in your IBM Cloud account.

  2. Configure data sources

    Connect data sources by configuring agents on your OpenShift clusters and RHEL AI instances.

  3. Launch the web UI

    Access the SCC Workload Protection dashboard to monitor your AI workloads.

  4. Configure security policies

    Set up policies for vulnerability scanning, compliance checking, and runtime threat detection.

  5. Enable continuous monitoring

    Configure continuous monitoring to identify vulnerabilities, check compliance, block runtime threats, and respond to incidents.

For detailed deployment steps, see Getting started with Security and Compliance Center Workload Protection.

Additional services

You can enhance your AI workload deployment by adding the following services:

Account Infrastructure base
Create and configure the foundational components of an IBM Cloud account, including resource groups, access groups, and service authorizations. See Account Infrastructure base.
IBM Cloud Observability
Provision and configure logging, monitoring, and activity tracking services to gain visibility into your AI workloads. See IBM Cloud Observability.
IBM Cloud Event Notifications
Implement a high-throughput message bus built with Apache Kafka for event-driven architectures. See IBM Cloud Event Notifications.
IBM Cloud Internet Services
Leverage Cloudflare-powered services for fast, highly performant, reliable, and secure internet connectivity. See IBM Cloud Internet Services.

Network and security customization

After deploying the foundational architecture, you can customize networking and security components to meet your specific requirements:

  • Configure Virtual Private Cloud (VPC) networking with custom subnets and security groups
  • Implement network segmentation and Zero Trust policies
  • Set up private endpoints for secure service connectivity
  • Configure firewall rules and access control lists (ACLs)
  • Enable encryption for data in transit and at rest
  • Implement identity and access management (IAM) policies

Validation and testing

After deployment, validate your AI workload environment:

  1. Verify that all components are running and properly configured
  2. Test LLM inference endpoints for functionality and performance
  3. Validate security policies and compliance posture
  4. Review monitoring dashboards and alerts
  5. Conduct security scans and vulnerability assessments
  6. Test incident response procedures

Next steps

After deploying your AI workloads:

  • Review Key Features and Guardrails to understand security controls
  • Explore Use Cases for practical implementation examples
  • Set up continuous monitoring and alerting
  • Establish operational procedures for model updates and security patches
  • Configure automated compliance reporting