Understanding your responsibilities when you use IBM Spectrum LSF

Learn about the management responsibilities and terms and conditions that you have when you use the IBM Spectrum LSF deployable architecture.

Overview of shared responsibilities

IBM Spectrum LSF is a product that is deployed to user resources under the IBM Cloud shared responsibility model. Start by reviewing the following table, which shows who is responsible for particular cloud resources for IBM Spectrum LSF. Then, view more granular tasks for shared responsibilities in the sections that follow.

If you use other IBM Cloud products such as Object Storage, responsibilities that are marked as yours in the following table, such as disaster recovery for Data, might be IBM's or shared. Consult those products' documentation for your responsibilities.

Responsibilities by resource
Resource | Incident and operations management | Change management | Security and regulation compliance | Disaster recovery
Data | You | You | You | You
Application Orchestration | You | You | You | You
Observability | Shared | IBM | Shared | IBM
App networking | You | You | You | You
Cluster networking | Shared | Shared | Shared | You
Cluster version | Shared | Shared | Not applicable | Not applicable
Management nodes | Shared | Shared | Shared | You
Compute nodes | Shared | Shared | Shared | You
Virtual storage | Shared | Shared | Shared | You
Virtual network | Shared | Shared | Shared | You
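
If you want to query the table programmatically, for example in an internal runbook or compliance checklist, the following minimal sketch transcribes the rows into a Python dictionary. The names `AREAS`, `MATRIX`, and `resources_owned_by` are illustrative assumptions and are not part of any IBM API or tool; the values are copied from the table.

```python
# Responsibility matrix transcribed from the "Responsibilities by resource" table.
# AREAS, MATRIX, and resources_owned_by are hypothetical names used for illustration only.
AREAS = (
    "Incident and operations management",
    "Change management",
    "Security and regulation compliance",
    "Disaster recovery",
)

# Each tuple lists the responsible party per area, in the same order as AREAS.
MATRIX = {
    "Data": ("You", "You", "You", "You"),
    "Application Orchestration": ("You", "You", "You", "You"),
    "Observability": ("Shared", "IBM", "Shared", "IBM"),
    "App networking": ("You", "You", "You", "You"),
    "Cluster networking": ("Shared", "Shared", "Shared", "You"),
    "Cluster version": ("Shared", "Shared", "Not applicable", "Not applicable"),
    "Management nodes": ("Shared", "Shared", "Shared", "You"),
    "Compute nodes": ("Shared", "Shared", "Shared", "You"),
    "Virtual storage": ("Shared", "Shared", "Shared", "You"),
    "Virtual network": ("Shared", "Shared", "Shared", "You"),
}


def resources_owned_by(owner, area):
    """Return the resources whose responsible party for `area` equals `owner`."""
    idx = AREAS.index(area)  # raises ValueError for an unknown area
    return [name for name, owners in MATRIX.items() if owners[idx] == owner]


if __name__ == "__main__":
    # List every resource for which disaster recovery is marked as yours.
    for name in resources_owned_by("You", "Disaster recovery"):
        print(name)
```

A lookup like this only mirrors the summary table; the detailed task lists in the sections that follow remain the authoritative description of each responsibility.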

Review the following sections for the specific responsibilities for you and for IBM when you use the IBM Spectrum LSF deployable architecture.

Incident and operations management

Incident and operations management includes tasks such as monitoring, event management, high availability, problem determination, recovery, and full state backup and recovery.

Responsibilities for incident and operations
IBM Responsibilities Your Responsibilities
Management nodes
  • Deploy highly available dedicated management nodes in a secured, IBM-owned infrastructure account for each cluster.
  • Ensure the health of management nodes at the OS level.
Use the provided console tools to request that management nodes are rebooted or reloaded, and troubleshoot issues such as when the management nodes are in an unhealthy state.
Compute nodes
  • Provision compute nodes in VPC under your IBM Cloud infrastructure account.
  • Ensure that compute nodes successfully provision when the user account and permissions are correctly set up, and a sufficient quota exists.
  • Fulfill requests for more infrastructure, such as adding, reloading, updating, and removing compute nodes.
  • Provide tools, such as the LSF Resource Connector, to extend your cluster infrastructure.
  • Fulfill automation requests to help recover compute nodes.
  • Ensure the health of compute nodes at the OS level.
  • Use the provided API, CLI, or console tools to adjust storage capacity to meet the needs of your workload.
  • Deploy applications and tools in the cluster.
Cluster networking
  • Set up cluster management components, such as public or private cloud service endpoints.
  • Fulfill requests for more infrastructure, such as attaching compute nodes to an existing VPC or subnets when a compute pool is resized.
  • Provide the ability to set up a VPN connection with on-premises resources such as through the strongSwan IPSec VPN service or the IBM Cloud VPC VPN.
  • Provide the ability to isolate network traffic with login nodes.
Use IBM Cloud VPC tools to adjust networking configuration to meet the needs of your workload.
Observability
  • Provide standard IBM Spectrum LSF tools for monitoring the status of the LSF cluster.
  • Provide a standard IBM Cloud console for monitoring the status of VPC resources (VSIs, network, storage, and so on).
Set up and monitor your cluster health metrics.

Change management

Change management includes tasks such as deployment, configuration, upgrades, patching, configuration changes, and deletion.

You and IBM share responsibilities for keeping your clusters at the supported platform and operating system versions, along with recovering infrastructure resources that might require changes. You are responsible for change management of your application data.

Responsibilities for change management
Management nodes
IBM responsibilities:
  • Provide operating system (OS) patches, version updates, and security updates for the management node image that is used for new cluster creation.
Your responsibilities:
  • Use the IBM Cloud tools to apply the provided (existing) management node updates, which include operating system updates, or to request that management nodes are rebooted.
Compute nodes
IBM responsibilities:
  • Provide operating system (OS) patches, version updates, and security updates for compute nodes. These updates are not supported on existing running VSIs; they apply only to new VSIs that use the latest image.
Your responsibilities:
  • Use IBM Cloud tools to apply the provided compute node updates, which include operating system patches, or raise a ticket to request that compute nodes are rebooted.
Cluster version
IBM responsibilities:
  • Provide an image with the new version of LSF for new cluster creation.
Your responsibilities:
  • Update existing management nodes and compute nodes to the new LSF version, or create a new cluster with the latest image to run the new cluster version.

Identity and access management

Identity and access management includes tasks such as authentication, authorization, access control policies, and approving, granting, and revoking access.

You and IBM share responsibilities for controlling access to your IBM Spectrum LSF instances. For IBM Cloud® Identity and Access Management responsibilities, consult that product's documentation. You are responsible for identity and access management to your application data.

Responsibilities for identity and access management
Observability
IBM responsibilities:
  • Provide the ability to integrate IBM Cloud Activity Tracker with your cluster to audit the actions that users take in the cluster.
Your responsibilities:
  • Set up IBM Cloud Activity Tracker or other capabilities to track user activity in the cluster.

Security and regulation compliance

Security and regulation compliance includes tasks such as security controls implementation and compliance certification.

IBM is responsible for the security and compliance of HPC Clusters on IBM Cloud. Compliance with industry standards varies depending on the infrastructure provider that you use for the cluster. You are responsible for the security and compliance of any workloads that run in the cluster and your application data.

Responsibilities for security and regulation compliance
General
IBM responsibilities:
  • Provide security controls commensurate with best practices for IBM Spectrum LSF in the cloud.
  • Provide options for cluster network connectivity, such as public and private cloud service endpoints.
Your responsibilities:
  • Set up and maintain security and regulation compliance for your apps and data. For example, choose how to set up your cluster network, protect sensitive information such as with IBM Key Protect encryption, and configure further security settings to meet your workload's security and compliance needs. If applicable, configure your firewall.
Management nodes
Your responsibilities:
  • As part of your incident and operations management responsibilities for the management nodes, apply the provided security patch updates.
Compute nodes
IBM responsibilities:
  • Disable certain insecure actions for compute nodes; for example, users are not permitted to SSH into the host.
Your responsibilities:
  • As part of your incident and operations management responsibilities for the compute nodes, apply the provided security patch updates.

Disaster recovery

Disaster recovery includes tasks such as providing dependencies on disaster recovery sites, provisioning disaster recovery environments, data and configuration backup, replicating data and configuration to the disaster recovery environment, and failover on disaster events.

IBM is responsible for the recovery of Spectrum Computing on IBM Cloud components if there is a disaster. You are responsible for the recovery of the workloads that run in the cluster and your application data. If you integrate with other IBM Cloud services such as file, block, object, cloud database, logging, or audit event services, consult those services' disaster recovery information.

Responsibilities for disaster recovery
IBM Responsibilities Your Responsibilities
General
  • Set up and maintain disaster recovery capabilities for your apps and data. For example, to prepare your cluster for HA/DR scenarios, follow the guidance in High availability on IBM Cloud. Note that persistent storage of data such as application logs and cluster metrics is not set up by default.
  • Create resources in a secondary region and manage disaster recovery for your applications and data.

Applications and data

You are responsible for the applications, workloads, and data that you deploy to IBM Cloud. However, IBM provides various tools to help you set up, manage, secure, integrate, and optimize your applications.

Responsibilities for applications and data
Data
How IBM helps:
  • Maintain platform-level standards so that your data can be stored with controls commensurate with a minimum set of security compliance standards (refer to the IBM File Storage statement).
  • Integrate with IBM Cloud services that you can use to store and manage your data, such as File Storage and Block Storage.
What you can do:
  • Maintain responsibility for your data and how your apps consume the data.
Applications
How IBM helps:
  • Provision clusters with Spectrum LSF, Data Manager, and License Scheduler.
  • Generate an API key that is used to access infrastructure permissions for each resource group and region.
What you can do:
  • Maintain responsibility for your apps, data, and their complete lifecycle.
  • Use the provided tools and features to configure and deploy; keep up to date; set up resource requests and limits; size your compute pool to have enough resources to run your apps; set up permissions; integrate with other services; externally serve; save, back up, and restore data; and otherwise manage your highly available and resilient workloads.