IBM Cloud Docs
Getting started with Red Hat AI InstructLab on IBM Cloud

Get ready to dive into AI with Red Hat® AI InstructLab on IBM Cloud®! AI is the capability to acquire, process, create, and apply knowledge in the form of a model to make predictions, recommendations, or decisions.

InstructLab is an open source project created by IBM and Red Hat to be a cost-effective entry point into the world of machine learning, a branch of artificial intelligence (AI) and computer science that focuses on the use of data and algorithms to imitate the way that humans learn, gradually improving the accuracy of AI models.

Get familiar with the capabilities

To use InstructLab, you do not need to have any preexisting knowledge. You do not even need to have an idea for what to create yet. Let's start by just getting familiar with the concepts and what kinds of things you can do with the technology.

Generative AI is a class of AI algorithms that can produce various types of content, including text, source code, imagery, audio, and synthetic data. It starts with a large language model (LLM), which is a language model with many parameters that is trained on a large quantity of text. With a prompt, these models can take sets of data and provide a statistically probable output for that prompt. You can automatically generate a data set that is similar to real data, and then use it to train the model to get the most probable output possible.

With InstructLab, you can use an existing, pre-trained LLM compiled by a community of contributors, and then generate the data to further train the model. With IBM Cloud, you have a place to store the taxonomy, which is the informational structure for the model, as you modify and train it on an ongoing basis.

Figure: Task flow diagram for generating a model with the service.

Prerequisites

If this is the first time InstructLab is being used in the account, complete these tasks.

Optional: Prepare your taxonomy

If you don't already have a taxonomy, you can use the InstructLab community taxonomy to start.

To create your own taxonomy instead, see Preparing taxonomies for more information.

  1. Optional: Fork the community taxonomy repo and clone it to your local machine.

  2. Optional: Make updates to the taxonomy in your fork. The following example adds rhyming questions to the linguistics directory.

    a. In your fork, create a /instructlab-taxonomy/compositional_skills/grounded/linguistics/rhyming_words/qna.yaml file.

    b. In the qna.yaml file, add a question related to rhyming words.

    - answer: 'Here are two rhyming words for "cave":
        1. Brave

        2. Gave'
      question: 'Give me two words that rhyme with cave'
    

    c. If your additions include reference documents in GitHub, as in the following example, you can use public github.com repositories.

    document:
      repo: https://github.com/<organization>/<repository>
      commit: <commit_sha>
      patterns:
        - <filename>.md
    

    d. Save the changes and push them to your fork.

    e. Optional: Learn more about how to modify the taxonomy for the model.

    f. Optional: Validate the updated taxonomy.

  3. In a browser, open the Releases page for your GitHub repository. For example: https://github.com/<my-org>/taxonomy/releases.

  4. Click Create a release.

  5. Create a tag, select a target branch, and enter a name for the release.

  6. Click Publish a release.

  7. Download the packaged TAR file that was automatically generated from the release by clicking Source code (tar.gz).

  8. Optional: If you are using a private repository for your taxonomy knowledge documents, complete the following steps.

    1. Follow the GitHub documentation to create a classic personal access token (PAT).

    2. In the Repository access section, scope your PAT to your taxonomy repo.

    3. In the Repository permissions section, select Contents > read-only and Metadata > read-only.
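
Before you create the release, a quick local check can catch incomplete entries. The following Python sketch is illustrative only. It assumes that the qna.yaml entries and the document reference are already loaded into plain Python dictionaries (for example, with a YAML parser), and it checks only the keys that appear in the examples in this section. The function names are hypothetical, this is not the InstructLab validator, and the full taxonomy schema might require more fields.

```python
from typing import Any


def check_qna_entries(entries: list[dict[str, Any]]) -> list[str]:
    """Return a list of problems found in question-and-answer entries."""
    problems = []
    for index, entry in enumerate(entries):
        # Only the two keys shown in the rhyming example are checked here.
        for key in ("question", "answer"):
            value = entry.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"entry {index}: missing or empty '{key}'")
    return problems


def check_document_reference(document: dict[str, Any]) -> list[str]:
    """Return a list of problems found in a document reference block."""
    problems = []
    repo = document.get("repo", "")
    # Assumption: reference documents live in a github.com repository.
    if not isinstance(repo, str) or not repo.startswith("https://github.com/"):
        problems.append("repo should be a https://github.com/ URL")
    if not document.get("commit"):
        problems.append("commit is missing")
    patterns = document.get("patterns")
    if not isinstance(patterns, list) or not patterns:
        problems.append("patterns should be a non-empty list")
    return problems


# Sample data: the first entry mirrors the example in this section,
# the second is deliberately incomplete to show a reported problem.
entries = [
    {
        "question": "Give me two words that rhyme with cave",
        "answer": 'Here are two rhyming words for "cave":\n1. Brave\n2. Gave',
    },
    {"question": "Give me a word that rhymes with moon", "answer": ""},
]
for problem in check_qna_entries(entries):
    print(problem)
```

A check like this complements, and does not replace, the optional taxonomy validation step.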

Upload your taxonomy by using the console

Complete the following steps to store your taxonomy in Object Storage.

  1. From the Projects page, select your project.

  2. Click Taxonomies.

  3. Click Upload and enter the following details.

    Taxonomy file
    Select your .tar.gz file.
    Taxonomy name
    Give the taxonomy an alphanumeric name.
    Private repository access
    Enable this option if your taxonomy knowledge documents are in a private repo.
    Secrets Manager service instance: Select an existing instance or create one.
    Secrets Manager secret: Select an existing secret or create one. If you are creating a secret, select the Key-value secret type and add your personal access token in the following format. The value for github_url must contain https://. Use the same URL that you used in the repo section of your taxonomy document reference.
    {
      "github_url": "https://...",
      "github_pat": "xxxxx"
    }
    

    For more information, see Creating Key-value secrets.

    Cloud storage
    Either select an Object Storage instance and bucket to use, or create an instance and bucket.
    Service authorization
    Check the box to allow InstructLab to write your taxonomy to Object Storage.
    Optional storage settings
    Specify the path where you want to store the taxonomy tar.gz file in Object Storage.
  4. Click Upload.
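
If you prepare the key-value secret body outside the console, generating it programmatically avoids quoting mistakes. The following Python sketch builds the two-field JSON object described above. The function name is hypothetical, and the check treats "must contain https://" as "must start with https://", which is an assumption about the intended rule.

```python
import json


def build_git_secret(github_url: str, github_pat: str) -> str:
    """Return the key-value secret body as a JSON string."""
    # Assumption: the URL is expected to begin with the https:// scheme.
    if not github_url.startswith("https://"):
        raise ValueError("github_url must start with https://")
    if not github_pat:
        raise ValueError("github_pat must not be empty")
    return json.dumps({"github_url": github_url, "github_pat": github_pat}, indent=2)


# Placeholder values only; do not hard-code a real token in a script.
print(build_git_secret("https://github.com/<organization>/<repository>", "xxxxx"))
```

Passing a URL without the scheme raises a ValueError, so the mistake is caught before the secret is created.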

Add the taxonomy to Object Storage by using the CLI

Complete the following steps to store your taxonomy in Object Storage.

You can use the set command to save values such as Object Storage bucket details and credentials so that you do not need to repeat them in later commands. When you use the set command, you must set each value individually. For more information, see the Config command reference.

  1. Log in to your IBM Cloud account from the CLI.
    ibmcloud login -a https://cloud.ibm.com --sso -r us-east
    
  2. If you plan to allow InstructLab to create IBM Cloud® Object Storage Instance resources for you, target a resource group.
    ibmcloud target -g <resource_group>
    
    Example:
    ibmcloud target -g Default
    
  1. Create the authorization policy for InstructLab and Object Storage.

    ibmcloud iam authorization-policy-create Writer --source-service-name instructlab --target-service-name cloud-object-storage
    
  2. Optional: If you are using a private repo to store your taxonomy knowledge documents, complete the following steps.

    1. Create a service authorization to allow InstructLab to access your Secrets Manager instance and secrets.

      ibmcloud iam authorization-policy-create Writer --source-service-name instructlab --target-service-name secrets-manager
      
    2. Add your personal access token (PAT) to Secrets Manager by creating a Key-value secret. Make sure your key-value details are stored in the following format.

      {
        "github_url": "https://...",
        "github_pat": "xxxxx"
      }
      

      Example command for creating a key-value secret.

      ibmcloud secrets-manager secret-create --secret-prototype='{"name": "my-secret","description": "Description of my key-value secret.","secret_type": "kv","secret_group_id": "67d025e1-0248-418f-83ba-deb0ebfb9b4a","labels": ["dev","us-south"],"data": {"github_url": "https://...","github_pat": "xxxxx"},"custom_metadata": {"metadata_custom_key": "metadata_custom_value"},"version_custom_metadata": {"custom_version_key": "custom_version_value"}}'
      

      For more information, see Creating Key-value secrets.

    3. List your Secrets Manager instances.

      ibmcloud resource service-instances --service-name secrets-manager
      
    4. Get your instance details.

      ibmcloud resource service-instance INSTANCE
      
  3. Run the taxonomy add --help command and review the command options.

    ibmcloud ilab taxonomy add --help
    
  4. Optional: If you have an existing Object Storage instance that you want to use, get your service instance details.

    1. List your Object Storage instances.
      ibmcloud resource service-instances --service-name cloud-object-storage
      
    2. Get your instance details.
      ibmcloud resource service-instance INSTANCE
      
  5. Add your taxonomy to your Object Storage bucket. Review the following example commands.

    Quick start: Example command to automatically create an Object Storage instance and bucket in your account and upload a taxonomy from your Downloads folder to it.

    ibmcloud ilab taxonomy add \
    --name example-taxonomy-1 \
    --taxonomy-path-cos "taxonomies/taxonomy.tar.gz" \
    --taxonomy-path "Downloads/taxonomy.tar.gz"
    

    Example command to upload a taxonomy from a taxonomy folder on your machine to an existing Object Storage instance and bucket.

    ibmcloud ilab taxonomy add \
    --name example-taxonomy-name-1 \
    --taxonomy-path-cos "taxonomies/taxonomy.tar.gz" \
    --taxonomy-path "./taxonomy" \
    --cos-bucket example-cloud-object-storage-bucket-1 \
    --cos-endpoint https://s3.us-east.cloud-object-storage.appdomain.cloud \
    --cos-id 628e4348-2183-42fa-a03a-6f0f78453530
    

    Example command to use an existing Object Storage instance and bucket as well as Secrets Manager credentials.

    ibmcloud ilab taxonomy add \
    --name example-taxonomy-1 \
    --taxonomy-path-cos taxonomies/taxonomy.tar.gz \
    --taxonomy-path "Downloads/taxonomy.tar.gz" \
    --cos-endpoint https://s3.us-east.cloud-object-storage.appdomain.cloud \
    --cos-id 628e4348-2183-42fa-a03a-6f0f78453530 \
    --secrets-manager-git-id SEC-MGR-ID \
    --secrets-manager-git-url https://URL
    

    Example command to upload a taxonomy from your /Users/USER/instructlab-taxonomy folder to a new, automatically created bucket.

    ibmcloud ilab taxonomy add \
    --name test \
    --taxonomy-path "/Users/USER/instructlab-taxonomy" \
    --cos-endpoint https://s3.us-east.cloud-object-storage.appdomain.cloud \
    --cos-id 628e4348-2183-42fa-a03a-6f0f78453530
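
When these commands are scripted, building the argument list in code avoids a missing line-continuation backslash or a misquoted path. The following Python sketch assembles an `ibmcloud ilab taxonomy add` invocation from the flags shown in the examples above. The function name is hypothetical, and treating `--name` and `--taxonomy-path` as the only required options is an assumption; confirm the options with `ibmcloud ilab taxonomy add --help`.

```python
import shlex
from typing import Optional


def taxonomy_add_args(
    name: str,
    taxonomy_path: str,
    taxonomy_path_cos: Optional[str] = None,
    cos_bucket: Optional[str] = None,
    cos_endpoint: Optional[str] = None,
    cos_id: Optional[str] = None,
) -> list[str]:
    """Build the argument list for `ibmcloud ilab taxonomy add`."""
    args = [
        "ibmcloud", "ilab", "taxonomy", "add",
        "--name", name,
        "--taxonomy-path", taxonomy_path,
    ]
    # Flags taken from the example commands; each is added only when set.
    optional_flags = {
        "--taxonomy-path-cos": taxonomy_path_cos,
        "--cos-bucket": cos_bucket,
        "--cos-endpoint": cos_endpoint,
        "--cos-id": cos_id,
    }
    for flag, value in optional_flags.items():
        if value is not None:
            args += [flag, value]
    return args


# Mirrors the quick start example: no existing instance or bucket is named.
args = taxonomy_add_args(
    name="example-taxonomy-1",
    taxonomy_path="Downloads/taxonomy.tar.gz",
    taxonomy_path_cos="taxonomies/taxonomy.tar.gz",
)
# Print a shell-quoted form of the command for review before running it.
print(shlex.join(args))
```

After you review the printed command, the same list can be passed to `subprocess.run(args, check=True)` on a machine where the CLI and the plug-in are installed and you are logged in.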
    

What's next?

Generate data from the taxonomy.