Gen AI Pattern for Watsonx on IBM Cloud

This reference architecture summarizes the best practices for Watsonx Gen AI Pattern deployment on IBM Cloud.

AI holds the promise to transform life and business but raises concerns around trust, security, and regulatory compliance. Understanding Gen AI and its infrastructure is vital for navigating its complex landscape. This reference architecture showcases how IBM Cloud and Watsonx provide a secure environment for deploying and governing Gen AI applications.

A more specific use case of this pattern is Retrieval Augmented Generation (RAG). RAG helps foundation models produce factually grounded outputs by retrieving relevant content and supplying it to the model as context. It suits any business scenario in which a user must consult a large body of documentation to provide confident answers.

The following diagram shows the flow of a RAG solution. It is not the entire reference architecture, but the portion that highlights the options and key IBM Cloud components for an end-to-end RAG flow: data administration and processing, the end-user gen AI application, the conversational flow, and gen AI task inferencing from LLMs or foundation models. For example, watsonx Assistant can provide the conversational flow. It requires indexed data in Elasticsearch or Watson Discovery for content retrieval and includes an embedded LLM for response generation. In the watsonx.ai option, watsonx.ai provides endpoints for querying, offers options such as in-memory retrieval and Elasticsearch for content retrieval, and uses watsonx.ai LLMs or foundation models for response generation.

Figure 1. RAG Pattern
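
The retrieve-then-generate flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the watsonx.ai or watsonx Assistant API: retrieval is a simple in-memory term-frequency cosine match standing in for Elasticsearch or Watson Discovery, the function names (tokenize, cosine, retrieve, build_prompt) are hypothetical, and the final call to a foundation model is left out, so the sketch stops at the assembled prompt.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query (in-memory retrieval)."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, passages: list[str]) -> str:
    """Assemble the prompt that would be sent to a foundation model."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


# Hypothetical sample corpus, used only for illustration.
DOCS = [
    "Direct Link connects on-premises resources to cloud resources.",
    "Key Protect manages encryption keys.",
    "Transit Gateway connects VPCs.",
]
```

For example, build_prompt(question, retrieve(question, DOCS, k=1)) yields a prompt whose context contains only the best-matching passage. A production deployment would replace the term-frequency match with embeddings stored in a vector database and send the prompt to a model endpoint.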

Architecture diagram

The following diagram represents the architecture for Gen AI on IBM Cloud. It reuses the best practices from the IBM Cloud for Financial Services and VPC reference architectures.

Figure 2. Reference Architecture

Central to the architecture are three VPCs, which provide for separation of concerns between provider management functionality and consumer workloads.

Management VPC
Provides compute, storage, and network services to enable the client or service provider's administrators to monitor, operate, and maintain the environment.

Workload VPC
Provides compute, storage, and network services to support hosted applications and operations that deliver services to the consumer.

Edge VPC
Enhances boundary protection for the workload VPC by allowing consumers to reach the Gen AI user interface from the public internet through the edge VPC.

Other features of the reference architecture:

  • Can reside in one or more multizone regions to provide additional resiliency.

  • Enables access to the management VPC from the application provider's enterprise environment through IBM Cloud Virtual Private Network Gateway for VPC.

  • Provides connectivity from the consumer's enterprise environment to the workload VPC through Direct Link.

  • Connects the management, workload, and edge VPCs by using IBM Cloud Transit Gateway.
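
The connectivity described above can be modeled as a small graph to reason about which path a given party takes into the environment. The sketch below is illustrative only: the node names are hypothetical labels, the Transit Gateway is modeled as a single hub, and a path in this graph shows network adjacency, not permitted traffic (ACLs, security groups, and firewalls still decide what is allowed).

```python
from collections import deque

# Undirected connections taken from the description above.
# Keys are (endpoint, endpoint); values name the connecting service.
CONNECTIONS = {
    ("provider-enterprise", "management-vpc"): "VPN Gateway for VPC",
    ("consumer-enterprise", "workload-vpc"): "Direct Link",
    ("public-internet", "edge-vpc"): "public access to the Gen AI user interface",
    ("management-vpc", "transit-gateway"): "Transit Gateway",
    ("workload-vpc", "transit-gateway"): "Transit Gateway",
    ("edge-vpc", "transit-gateway"): "Transit Gateway",
}


def neighbors(node: str):
    """Yield every node directly connected to the given node."""
    for a, b in CONNECTIONS:
        if a == node:
            yield b
        elif b == node:
            yield a


def route(start: str, goal: str):
    """Breadth-first search for the shortest hop sequence, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in neighbors(path[-1]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None
```

Calling route("public-internet", "workload-vpc") shows that internet traffic reaches the workload only by way of the edge VPC and the Transit Gateway, which is the boundary-protection property the edge VPC is meant to provide.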

Design concepts

Below is the Architecture Framework Design heatmap that covers design considerations and architecture decisions for the following aspects and domains:

  • Data: Artificial Intelligence
  • Compute: Virtual Servers, Containers, Serverless
  • Storage: Primary Storage, Backup
  • Networking: Enterprise Connectivity, Load Balancing, Domain Name Services
  • Security: Data Security, Identity & Access, Application Security, Infrastructure & Endpoints, Governance, Risk & Compliance
  • DevOps: Build & Test, Delivery Pipeline, Code Repository
  • Resiliency: High Availability
  • Service Management: Monitoring, Logging, Auditing / tracking, Automated Deployment

Figure 3. Architecture design scope

Requirements

The following table outlines the requirements that are addressed in this architecture.

Table 1. Requirements, grouped by aspect

Compute
  • Provide properly isolated compute resources with adequate compute capacity for the applications.

Storage
  • Provide storage that meets the application and database performance requirements.

Networking
  • Deploy workloads in an isolated environment and enforce information flow policies.
  • Provide secure, encrypted connectivity to the cloud's private network for management purposes.
  • Distribute incoming application requests across available compute resources.

Security
  • Ensure all operator actions are executed securely through a bastion host.
  • Protect the boundaries of the application against denial-of-service and application-layer attacks.
  • Encrypt all application data in transit and at rest to protect from unauthorized disclosure.
  • Encrypt all security data (operational and audit logs) to protect from unauthorized disclosure.
  • Encrypt all data using customer-managed keys to meet regulatory compliance requirements for additional security and customer control.
  • Protect secrets through their entire lifecycle and secure them using access control measures.
  • Configure firewalls restrictively to prevent all traffic, both inbound and outbound, except that which is required, documented, and approved.

DevOps
  • Deliver software and services at the speed the market demands by enabling teams to iterate and experiment rapidly and to deploy new versions frequently, driven by feedback and data.

Resiliency
  • Support application availability targets and business continuity policies.
  • Ensure availability of the application in the event of planned and unplanned outages.
  • Back up application data to enable recovery in the event of unplanned outages.
  • Provide highly available storage for security data (logs) and backup data.

Service Management
  • Monitor system and application health metrics and logs to detect issues that might impact the availability of the application.
  • Generate alerts and notifications about issues that might impact the availability of applications to trigger appropriate responses and minimize downtime.
  • Monitor audit logs to track changes and detect potential security problems.
  • Provide a mechanism to identify and send notifications about issues found in audit logs.
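
The restrictive firewall requirement above amounts to a default-deny policy: traffic is dropped unless an explicit, documented rule allows it. The following sketch shows that evaluation logic using Python's standard ipaddress module. The Rule fields, the CIDR ranges, and the reasons are hypothetical examples, not values from this architecture or from any IBM Cloud API.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    direction: str  # "inbound" or "outbound"
    cidr: str       # remote address range the rule applies to
    port: int       # destination port
    reason: str     # documented and approved justification


def is_allowed(rules: list[Rule], direction: str, address: str, port: int) -> bool:
    """Default deny: allow only when some rule matches all three criteria."""
    ip = ipaddress.ip_address(address)
    return any(
        rule.direction == direction
        and rule.port == port
        and ip in ipaddress.ip_network(rule.cidr)
        for rule in rules
    )


# Hypothetical allow list: each entry carries its documented reason.
RULES = [
    Rule("inbound", "10.10.0.0/24", 443, "HTTPS from the edge subnet"),
    Rule("outbound", "10.20.0.0/24", 5432, "database access from app servers"),
]
```

Because there is no catch-all rule, anything not listed is rejected in both directions. Keeping the reason alongside each rule is one way to make the "documented and approved" part of the requirement auditable.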

Components

The following table outlines the products or services used in the architecture for each aspect.

Table 2. Components
Each aspect lists its architecture components, followed by how the component is used.

Data
  • watsonx Assistant: Conversational artificial intelligence platform
  • Watson Discovery: Automates the discovery of information and insights with advanced Natural Language Processing and Understanding
  • watsonx.ai: Brings together new generative AI capabilities powered by foundation models and traditional machine learning (ML) into a powerful studio spanning the AI lifecycle
  • watsonx.data: Enables you to scale analytics and AI with all your data, wherever it resides
  • watsonx.governance: Directs, manages, and monitors the artificial intelligence activities
  • Elasticsearch: Database to store vector representations (also known as embeddings) created by using machine learning algorithms
  • Milvus: A vector database that stores, indexes, and manages massive embedding vectors that are developed by deep neural networks and other machine learning (ML) models

Compute
  • Virtual Servers for VPC: Web, app, and database servers
  • Code Engine: Abstracts the operational burden of building, deploying, and managing workloads in Kubernetes so that developers can focus on what matters most to them: the source code
  • Red Hat OpenShift Kubernetes Service (ROKS): A managed offering to create your own cluster of compute hosts where you can deploy and manage containerized apps on IBM Cloud

Storage
  • Cloud Object Storage: Web app static content, backups, logs (application, operational, and audit logs)
  • VPC Block Storage: Web app storage if needed

Networking
  • VPC Virtual Private Network (VPN): Remote access to manage resources in the private network
  • Virtual Private Endpoint (VPE): Private network access to Cloud Services, e.g., Key Protect, COS, etc.
  • VPC Load Balancers: Application load balancing for web servers, app servers, and database servers
  • Direct Link 2.0: Seamlessly connects on-premises resources to cloud resources
  • Transit Gateway (TGW): Connects the Workload and Management VPCs within a region
  • Cloud Internet Services (CIS): Global load balancing between regions
  • Access Control List (ACL): Controls all incoming and outgoing traffic in Virtual Private Cloud

Security
  • IAM: IBM Cloud Identity & Access Management
  • Key Protect: A full-service encryption solution that allows data to be secured and stored in IBM Cloud
  • BYO Bastion Host on VPC VSI: Remote access with Privileged Access Management
  • App ID: Adds authentication to web and mobile apps
  • Secrets Manager: Certificate and secrets management
  • Security and Compliance Center (SCC): Implements controls for secure data and workload deployments, and assesses security and compliance posture
  • Hyper Protect Crypto Services (HPCS): Hardware security module (HSM) and Key Management Service
  • Virtual Network Function (VNF): Virtualized network services running on virtual machines

DevOps
  • Continuous Integration (CI): A pipeline that tests, scans, and builds the deployable artifacts from the application repositories
  • Continuous Deployment (CD): A pipeline that generates all of the evidence and change request summary content
  • Continuous Compliance (CC): A pipeline that continuously scans deployed artifacts and repositories
  • Container Registry: Highly available and scalable private image registry

Resiliency
  • VPC VSIs and VPC Block Storage across multiple zones in two regions: Web, app, and database high availability and disaster recovery

Service Management
  • IBM Cloud Monitoring: Apps and operational monitoring
  • IBM Log Analysis: Apps and operational logs
  • Activity Tracker Event Routing: Audit logs

Compliance

CI / CD / CC Pipelines
The Continuous Integration (CI), Continuous Deployment (CD), and Continuous Compliance (CC) pipelines, collectively referred to as DevSecOps Application Lifecycle Management, are used to deploy the application, check for vulnerabilities, and ensure auditability. The following are some of the important compliance features of DevSecOps Application Lifecycle Management:

  • Vulnerability Scans
    Vulnerability scans involve using specialized tools to look for security vulnerabilities in the code. This is crucial to identify and fix potential security issues before they become a problem in production.

  • Sign Build Artifacts
    The code is compiled and built into software or application artifacts (like executable files or libraries). These artifacts are then digitally signed to ensure their authenticity and integrity.

  • Evidence Gathering
    This involves collecting and storing evidence of the development process, such as commit logs, build logs, and other relevant data. It helps in tracing back and understanding what happened at different stages of development.

  • Evidence Locker
    The evidence locker is where the gathered evidence, such as commit logs, build logs, and other relevant data, is stored and retained. This helps in tracing back and understanding what happened at different stages of development.
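
The ideas of signing build artifacts and recording evidence can be sketched with the Python standard library. This is a simplified illustration, not the DevSecOps pipeline's implementation: a keyed HMAC stands in for the asymmetric digital signature a real pipeline would use, and the function names and the evidence record fields are hypothetical.

```python
import hashlib
import hmac
import json


def digest(artifact: bytes) -> str:
    """SHA-256 digest that identifies the exact artifact contents."""
    return hashlib.sha256(artifact).hexdigest()


def sign(artifact: bytes, key: bytes) -> str:
    """Keyed HMAC-SHA256 tag (a stand-in for a real digital signature)."""
    return hmac.new(key, artifact, hashlib.sha256).hexdigest()


def verify(artifact: bytes, key: bytes, signature: str) -> bool:
    """Check integrity and authenticity using a constant-time comparison."""
    return hmac.compare_digest(sign(artifact, key), signature)


def evidence_record(stage: str, artifact: bytes, result: str) -> str:
    """Serialize one piece of evidence tying a pipeline stage to an artifact."""
    record = {
        "stage": stage,
        "artifact_sha256": digest(artifact),
        "result": result,
    }
    return json.dumps(record, sort_keys=True)
```

Any change to the artifact after signing makes verify return False, and the digest in each evidence record lets an auditor confirm which artifact a scan or build result refers to.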

Security and Compliance Center (SCC)
This reference architecture uses the Security and Compliance Center (SCC), which defines policy as code, implements controls for secure data and workload deployments, and assesses security and compliance posture. Two profiles are used for this reference architecture: the IBM Cloud Framework for Financial Services and AI ICT Guardrails. A profile is a grouping of controls that can be evaluated for compliance.
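
The notion of a profile as a grouping of controls that can be evaluated can be illustrated with a small data structure. The sketch below is conceptual and does not use the SCC API: the control IDs, descriptions, and resource attributes are hypothetical and are not taken from either of the profiles named above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Control:
    control_id: str
    description: str
    check: Callable[[dict], bool]  # returns True when the resource complies


# A hypothetical profile: simply a named grouping of controls.
DEMO_PROFILE = [
    Control(
        "demo-encryption",
        "Data is encrypted with a customer-managed key",
        lambda resource: resource.get("encrypted_with_cmk", False),
    ),
    Control(
        "demo-private-only",
        "Resource has no public endpoint",
        lambda resource: not resource.get("public_endpoint", True),
    ),
]


def evaluate(profile: list[Control], resource: dict) -> dict[str, bool]:
    """Evaluate every control in the profile against one resource."""
    return {control.control_id: control.check(resource) for control in profile}


def is_compliant(profile: list[Control], resource: dict) -> bool:
    """A resource is compliant only if every control passes."""
    return all(evaluate(profile, resource).values())
```

Expressing controls as data plus a check function is one way to read "policy as code": the same profile can be evaluated repeatedly against changing resources, and the per-control results show exactly which control failed.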