Deploying AI Workloads on IBM Cloud
This guide walks through provisioning and configuring Red Hat OpenShift AI (RHOAI) clusters with NVIDIA GPUs on IBM Cloud, hosting large language models (LLMs), and securing AI workloads with Security and Compliance Center Workload Protection.
Overview
Deploying AI workloads on IBM Cloud involves several key components that work together to provide a secure, scalable, and compliant environment:
- Red Hat OpenShift AI cluster with GPU support for model training and inference
- RHEL AI on IBM Cloud for hosting and testing foundation models using InstructLab
- Security and Compliance Center Workload Protection for continuous security monitoring and compliance
Once the foundational architecture is in place, you can adapt components such as networking and security to better suit your business needs.
Security and compliance for AI workloads
Securing AI workloads requires a comprehensive approach that includes policy enforcement, control implementation, runtime monitoring, and real-time threat detection. Security and Compliance Center Workload Protection (SCC Workload Protection) provides visibility, policy-driven posture enforcement, and real-time threat detection.
A layered security approach that combines visibility, posture management, and behavioral detection serves as the foundation for securing AI workloads on IBM Cloud. This includes:
- Full visibility into all AI components, including unsanctioned "shadow AI" deployments
- Continuous risk management across the AI lifecycle
- Policy-driven posture enforcement
- Real-time threat detection and incident response
- Vulnerability identification and compliance checking
- Runtime threat blocking
Before you begin
Before you deploy this reference architecture, ensure you have the following prerequisites:
- IBM Cloud account: An active IBM Cloud account with appropriate permissions.
- IAM access policies: The Identity and Access Management (IAM) access policies required for provisioning resources.
- SSH key pair: A VPC public and private SSH key pair that does not already exist in the deployment region.
- IBM Cloud API key: An API key for the user or service ID that holds the correct IAM access policies.
- Landing zone planning: Review Planning for the landing zone deployable architectures.
- Red Hat OpenShift subscriptions: A valid Red Hat OpenShift AI (RHOAI) entitlement. On IBM Cloud, this entitlement is typically provided as part of the managed OpenShift AI add-on. You do not need to procure a separate Red Hat "Advanced" subscription unless you are running OpenShift AI self-managed.
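Some of the prerequisites above can be checked locally before you start provisioning. The following is a minimal sketch, not part of any IBM tooling: the environment variable names `IBMCLOUD_API_KEY` and `DEPLOY_REGION` and the `preflight` helper are assumptions made for illustration. It cannot verify IAM access policies or entitlements, which must be confirmed in your account.

```python
from pathlib import Path

# Hypothetical variable names chosen for this sketch; rename to match your tooling.
REQUIRED_ENV = ("IBMCLOUD_API_KEY", "DEPLOY_REGION")


def preflight(env, ssh_public_key_path):
    """Return a list of problems found; an empty list means the local checks passed."""
    problems = []
    # Each required variable must be present and non-blank.
    for name in REQUIRED_ENV:
        if not env.get(name, "").strip():
            problems.append(f"environment variable {name} is not set")
    # The public half of the SSH key pair must exist and contain data.
    key_path = Path(ssh_public_key_path).expanduser()
    if not key_path.is_file() or not key_path.read_text().strip():
        problems.append(f"SSH public key missing or empty: {key_path}")
    return problems
```

A typical call would pass `os.environ` and the path to your public key, then stop the deployment if the returned list is not empty.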
Provisioning the architecture
The deployment process consists of three main components that can be deployed independently or together:
- Red Hat OpenShift AI cluster with LLM deployment
- RHEL AI on IBM Cloud for model hosting and testing
- Security and Compliance Center Workload Protection for security monitoring
Deploying Red Hat OpenShift AI with LLM
You can deploy a large language model to a Red Hat OpenShift AI cluster using one of the following methods:
- Solution tutorial: Follow the step-by-step Deploying LLM to Red Hat OpenShift AI cluster solution tutorial for detailed instructions.
- Deployable architecture: Use the Red Hat OpenShift AI deployable architecture to automate the provisioning and configuration process, then deploy your LLM.
The deployment includes:
- Provisioning an OpenShift cluster with GPU-enabled worker nodes
- Installing the Red Hat OpenShift AI operator
- Configuring GPU resources and drivers
- Setting up model serving infrastructure
- Deploying and testing your LLM
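After the GPU configuration step, one way to confirm that worker nodes expose their GPUs to the scheduler is to inspect the node list. The sketch below assumes the NVIDIA device plugin advertises GPUs under the `nvidia.com/gpu` extended resource name and that you feed it the JSON produced by `oc get nodes -o json`. The `gpus_per_node` helper is a hypothetical name used for illustration.

```python
# Extended resource name advertised by the NVIDIA device plugin (assumed here).
GPU_RESOURCE = "nvidia.com/gpu"


def gpus_per_node(node_list):
    """Map each node name to its allocatable GPU count, given a NodeList as a dict."""
    counts = {}
    for node in node_list.get("items", []):
        name = node["metadata"]["name"]
        allocatable = node.get("status", {}).get("allocatable", {})
        # Nodes without the resource key have no schedulable GPUs.
        counts[name] = int(allocatable.get(GPU_RESOURCE, "0"))
    return counts


def total_gpus(node_list):
    """Sum allocatable GPUs across every node in the list."""
    return sum(gpus_per_node(node_list).values())
```

To use it, load the command output with `json.load` and pass the resulting dictionary to `gpus_per_node`. A total of zero on a cluster with GPU worker nodes suggests the drivers or device plugin are not ready yet.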
Deploying RHEL AI on IBM Cloud
RHEL AI provides a platform for hosting and testing foundation models using InstructLab. To deploy RHEL AI:
- Follow the Deploying and Running a model in RHEL AI on IBM Cloud solution tutorial.
- Provision a Virtual Server Instance (VSI) with appropriate GPU resources.
- Install and configure RHEL AI components.
- Set up InstructLab for model testing and experimentation.
- Run and validate foundation models on the hosted VSI.
Deploying Security and Compliance Center Workload Protection
Security and Compliance Center Workload Protection provides comprehensive security monitoring and compliance management for your AI workloads. The deployment process includes:
- Provision an SCC Workload Protection instance: Create a new instance of Security and Compliance Center Workload Protection in your IBM Cloud account.
- Configure data sources: Connect data sources by configuring agents on your OpenShift clusters and RHEL AI instances.
- Launch the web UI: Access the SCC Workload Protection dashboard to monitor your AI workloads.
- Configure security policies: Set up policies for vulnerability scanning, compliance checking, and runtime threat detection.
- Enable continuous monitoring: Configure continuous monitoring to identify vulnerabilities, check compliance, block runtime threats, and respond to incidents.
For detailed deployment steps, see Getting started with Security and Compliance Center Workload Protection.
Additional services
You can enhance your AI workload deployment by adding the following services:
- Account Infrastructure base: Create and configure the foundational components of an IBM Cloud account, including resource groups, access groups, and service authorizations. See Account Infrastructure base.
- IBM Cloud Observability: Provision and configure logging, monitoring, and activity tracking services to gain visibility into your AI workloads. See IBM Cloud Observability.
- IBM Cloud Event Notifications: Route events from IBM Cloud services to destinations such as email, SMS, and webhooks so that teams are alerted when something needs attention. See IBM Cloud Event Notifications.
- IBM Cloud Internet Services: Use Cloudflare-powered services for fast, reliable, and secure internet connectivity. See IBM Cloud Internet Services.
Network and security customization
After deploying the foundational architecture, you can customize networking and security components to meet your specific requirements:
- Configure Virtual Private Cloud (VPC) networking with custom subnets and security groups
- Implement network segmentation and Zero Trust policies
- Set up private endpoints for secure service connectivity
- Configure firewall rules and access control lists (ACLs)
- Enable encryption for data in transit and at rest
- Implement identity and access management (IAM) policies
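When reviewing firewall rules and security groups, a common check is to flag inbound rules that expose sensitive ports to any source address. The sketch below illustrates the idea with a simplified, hypothetical rule schema (`direction`, `remote`, `port_min`, `port_max`). It is not the response format of any IBM Cloud API, so you would need to map real rule data onto it.

```python
import ipaddress


def world_open_rules(rules, sensitive_ports=(22,)):
    """Return inbound rules that allow any source address to reach a sensitive port."""
    flagged = []
    for rule in rules:
        if rule["direction"] != "inbound":
            continue
        # A prefix length of 0 (0.0.0.0/0 or ::/0) matches every address.
        network = ipaddress.ip_network(rule["remote"], strict=False)
        if network.prefixlen != 0:
            continue
        if any(rule["port_min"] <= port <= rule["port_max"] for port in sensitive_ports):
            flagged.append(rule)
    return flagged
```

For example, a rule that permits inbound port 22 from `0.0.0.0/0` is flagged, while the same port restricted to `10.0.0.0/8` is not. Extend `sensitive_ports` with any management or API ports that should stay private in your environment.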
Validation and testing
After deployment, validate your AI workload environment:
- Verify that all components are running and properly configured
- Test LLM inference endpoints for functionality and performance
- Validate security policies and compliance posture
- Review monitoring dashboards and alerts
- Conduct security scans and vulnerability assessments
- Test incident response procedures
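The inference endpoint check in the list above could be sketched as a small smoke test. This example assumes the model is served behind an OpenAI-style `/v1/completions` route that returns a `choices` list, which depends on the serving runtime you chose and may not match your deployment. The base URL, model name, and token are placeholders you supply, and `build_request` and `smoke_test` are hypothetical helper names.

```python
import json
import time
import urllib.request


def build_request(base_url, model, prompt, token=None):
    """Build a POST request for an OpenAI-style completions route (an assumption)."""
    body = json.dumps({"model": model, "prompt": prompt, "max_tokens": 16}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    url = base_url.rstrip("/") + "/v1/completions"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def smoke_test(request, opener=urllib.request.urlopen, timeout=30):
    """Send the request and report whether non-empty text came back, with latency."""
    start = time.monotonic()
    with opener(request, timeout=timeout) as response:
        payload = json.loads(response.read())
    elapsed = time.monotonic() - start
    text = payload["choices"][0]["text"]
    return {"ok": bool(text.strip()), "latency_s": elapsed, "text": text}
```

The `opener` parameter exists so the function can be exercised without network access by passing a stand-in. In normal use, leave the default and compare `latency_s` against your own performance target.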
Next steps
After deploying your AI workloads:
- Review Key Features and Guardrails to understand security controls
- Explore Use Cases for practical implementation examples
- Set up continuous monitoring and alerting
- Establish operational procedures for model updates and security patches
- Configure automated compliance reporting