Getting started with Red Hat AI InstructLab on IBM Cloud
Get ready to dive into AI with Red Hat® AI InstructLab on IBM Cloud®! Red Hat AI InstructLab is an open source project created by IBM and Red Hat to be a cost-effective entry point into the world of machine learning.
InstructLab is available to allowlisted accounts only.
Prerequisites
Before you begin, you must have a paid IBM Cloud® account.
The following account types are supported:
- Pay-As-You-Go
- Subscription
Trial accounts are not supported. For more information or to upgrade your account, see Account types.
Get familiar with the capabilities
If you are new to machine learning, you are in the correct place. To use InstructLab, you do not need to have any preexisting knowledge. You do not even need to have an idea for what to create yet. Let's start by just getting familiar with the concepts and what kinds of things you can do with the technology.
Generative AI starts with a large language model (LLM). With a prompt, these models can take sets of data and provide a statistically probable output for that prompt. You can automatically generate a data set that is similar to real data, and then use it to train the model to get the most probable output possible.
With InstructLab, you can use an existing, pre-trained LLM compiled by a community of contributors, and then generate the data to further train the model. By incorporating IBM Cloud, you have a place to store the taxonomy, which is the informational structure, for the model as you modify and train it on an ongoing basis.
Install the CLIs
- Optional: Install the Git CLI to store and manage your taxonomies.
- Install the IBM Cloud CLI.
- Install the Cloud Object Storage (COS) plug-in.
ibmcloud plugin install cos
- Install the InstructLab plug-in.
ibmcloud plugin install ilab
- Log in to your IBM Cloud account from the CLI.
ibmcloud login -a https://cloud.ibm.com --sso -r us-east
- If you plan to allow InstructLab to create IBM Cloud® Object Storage instance resources for you, target a resource group.
ibmcloud target -g <resource_group>
Example:
ibmcloud target -g Default
- Create a project in the InstructLab instance.
ibmcloud resource service-instance-create <project_name> instructlab instructlab-pricing-plan us-east
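Before you run the steps above, it can help to confirm that the required command-line tools are on your PATH. The following is a minimal sketch, not part of the InstructLab tooling: `missing_tools` is a hypothetical helper name, and it only checks for the `git` and `ibmcloud` executables, not for the `cos` or `ilab` plug-ins.

```shell
# Hypothetical helper: print the name of every command in the argument list
# that cannot be found on PATH. Prints nothing when all commands are found.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example: check the two CLIs named in the steps above.
# missing=$(missing_tools git ibmcloud)
# [ -z "$missing" ] || echo "Install first: $missing"
```

If the function prints nothing, both CLIs were found and you can continue with the plug-in installation and login steps.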
Give InstructLab permission to create and update COS artifacts
Give InstructLab the Writer access role for the COS service. The logged-in user must also have the same permission.
- Create the authorization policy for InstructLab.
ibmcloud iam authorization-policy-create Writer --source-service-name instructlab --target-service-name cloud-object-storage
If you already have COS resources to use, you can scope the authorization to only those resources.
ibmcloud iam authorization-policy-create Writer --source-service-name instructlab --target-service-name cloud-object-storage --target-service-instance-id <cloud-object-storage-instance-id> --target-resource <cloud-object-storage-bucket> --target-resource-type bucket
- Verify that the authorization policy was created.
ibmcloud iam authorization-policies
Result when authorization is not scoped to a specific COS bucket:
Getting authorization policies under account abc1234 as user...
OK
ID: <id>
Source service name: instructlab
Source service instance: All instances
Target service name: cloud-object-storage
Target service instance: All instances
Roles: Writer
Result when authorization is scoped to a specific COS bucket:
Getting authorization policies under account abc1234 as user...
OK
ID: <id>
Source service name: instructlab
Source service instance: All instances
Target service name: cloud-object-storage
Target service instance: bucket
Roles: Writer
- If necessary, give the Writer permission to the logged-in user. Include the Object Storage service instance ID from the previous step.
ibmcloud iam user-policy-create <user> --roles Writer --service-instance <cloud-object-storage-instance-id>
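The verification step above can be scripted instead of read by eye. The following is a rough sketch under stated assumptions: `has_writer_policy` is a hypothetical helper, it assumes the field labels match the sample result shown in this section, and it only checks that the three fields appear in order, so it does not reliably separate several policies in one listing.

```shell
# Hypothetical helper: succeeds when the text on standard input contains an
# instructlab -> cloud-object-storage entry with the Writer role.
# Assumption: field labels match the sample result in this guide. The check
# looks for the three fields in order and does not separate multiple policies.
has_writer_policy() {
  local text
  # Collapse runs of whitespace so line breaks and column alignment do not matter.
  text=$(tr -s ' \t\n' ' ')
  case "$text" in
    *"Source service name: instructlab "*"Target service name: cloud-object-storage "*"Roles: Writer"*)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Example (requires a logged-in IBM Cloud CLI session):
# ibmcloud iam authorization-policies | has_writer_policy && echo "Writer policy found"
```

Treat a positive result as a hint only, and confirm the policy details in the command output itself.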
Create an authorization policy for InstructLab
Give InstructLab the Writer access role for the COS service. The logged-in user must also have the same permission.
- Create an authorization policy that gives InstructLab Writer access to the COS service.
a. In the user interface, click Manage > Access (IAM) > Authorizations.
b. For the source service, select the InstructLab service.
c. For the target service, select the Cloud Object Storage service.
d. In the Roles section, for Service access, select Writer.
e. Click Authorize.
- If necessary, give the Writer permission to the logged-in user.
a. In the user interface, click Manage > Access (IAM) > Users.
b. Click the user name > Access and in the Assign policies section, click Assign access.
c. Select the Cloud Object Storage service and complete the prompts.
d. In the Roles and Actions section, for Service access, select Writer.
Create a project in the InstructLab instance
- From the console, click Projects > Create.
- In the Service name field, enter a name for the project.
- Accept the license agreement and click Create.
- Click the Manage tab to return to the Projects page.
- Click the project that you created.
Optional: Create a COS instance and bucket
- If you don't have a service instance yet, provision a COS instance in your account.
- Create a new bucket and make a note of the bucket name for later.
Optional: Create a COS instance and bucket by using the CLI
- If you don't have a service instance yet, provision a COS instance in your account. Make a note of your instance ID.
ibmcloud resource service-instance-create <instance-name> cloud-object-storage <plan> global
- Create a new bucket and make a note of the bucket name for later.
ibmcloud cos bucket-create --bucket <bucket-name> [--class <class-name>] [--ibm-service-instance-id <instance-id>] [--region REGION] [--output FORMAT]
Prepare a taxonomy
In this example, use the Git CLI to clone and update the InstructLab community taxonomy.
- Fork the community taxonomy repo by clicking Fork and completing the steps.
- Clone your fork to your local machine.
git clone https://github.com/<my-org>/taxonomy
- Optional: Make updates to the taxonomy in your fork. This example adds rhyming questions to the linguistics directory.
a. In your cloned fork, create a /instructlab-taxonomy/compositional_skills/grounded/linguistics/rhyming_words/qna.yaml file.
b. In the qna.yaml file, add a question related to rhyming words.
- answer: 'Here are two rhyming words for "cave": 1. Brave 2. Gave'
  question: 'Give me two words that rhyme with cave'
c. If your additions include reference documents in GitHub, such as this example, you can use public github.com repositories and IBM internal github.ibm.com repositories.
document:
  repo: https://github.ibm.com/<organization>/<repository>
  commit: <commit_sha>
  patterns:
    - <filename>.md
If you are using a private repository, you must give the instructlab-ibm user read access to the repository. Click Settings > Collaborators and in the Manage Access section, click Add people. Invite instructlab-ibm. The invitation is labeled as pending for 1-2 business days until it is accepted. Until then, you can continue to work with the taxonomy and generate data, but wait to complete the training steps.
d. Save the changes and push them to the fork.
e. Optional: Learn more about how to modify the taxonomy for the model.
f. Optional: Validate the updated taxonomy.
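The file-creation sub-steps above can also be done from the shell. This is a minimal sketch, assuming the clone lives in a directory you pass as the first argument; `write_rhyming_qna` is a hypothetical helper name. It writes only the fragment shown in this guide, and a complete qna.yaml file may need additional fields that the taxonomy's own documentation describes.

```shell
# Hypothetical helper: write the example rhyming entry under a taxonomy root.
# Usage: write_rhyming_qna <path-to-cloned-fork>
write_rhyming_qna() {
  local root="$1"
  local dir="$root/compositional_skills/grounded/linguistics/rhyming_words"
  mkdir -p "$dir" || return 1
  # The quoted 'EOF' keeps the shell from expanding anything inside the YAML.
  cat > "$dir/qna.yaml" <<'EOF'
- answer: 'Here are two rhyming words for "cave": 1. Brave 2. Gave'
  question: 'Give me two words that rhyme with cave'
EOF
}

# Example, followed by the usual Git steps to publish the change to your fork:
# write_rhyming_qna ./taxonomy
# git -C ./taxonomy add compositional_skills
# git -C ./taxonomy commit -m "Add rhyming_words example"
# git -C ./taxonomy push
```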
Add your taxonomy to COS
After you receive access to InstructLab, store your taxonomy in COS.
- Create a packaged TAR file of the contents of the GitHub taxonomy repository by creating a release.
a. In a browser, open the Releases page for your GitHub repository. Example: https://github.com/<my-org>/taxonomy/releases
b. Click Create a release.
c. Create a tag, select a target branch, and enter a name for the release.
d. Click Publish a release.
e. To download the packaged TAR file that was automatically generated, in the release that was created, click Source code (tar.gz).
- Upload the taxonomy to your project.
a. Click Taxonomies.
b. Click Upload.
c. Select the .tar.gz file, give the taxonomy an alphanumeric name, select the COS instance and bucket to use (or create a new one), then click Upload.
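Because the taxonomy name must be alphanumeric, a quick local check can catch a bad name before you upload. This sketch assumes "alphanumeric" means ASCII letters and digits only, which this guide does not spell out; `is_alphanumeric` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds only when the argument is non-empty and made
# up entirely of ASCII letters and digits.
# Assumption: this is what the service means by "alphanumeric".
is_alphanumeric() {
  case "$1" in
    ''|*[!A-Za-z0-9]*) return 1 ;;   # empty, or contains some other character
    *) return 0 ;;
  esac
}

# Example:
# is_alphanumeric "rhymes1" && echo "name looks fine"
# is_alphanumeric "my-taxonomy" || echo "remove the hyphen"
```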
Add your taxonomy to COS by using the CLI
After you receive access to InstructLab, store your taxonomy in COS.
- Optional: Run the set command to set and save COS bucket details and credentials, which can simplify your commands going forward. You must set each value individually.
ibmcloud ilab config set <component> <value>
Understanding this command's components
Component: Description
cos-bucket <bucket_name>: If you are adding a taxonomy to an existing bucket, include the name. You can find this name on the Buckets tab of your COS instance. If you want the bucket to be created for you, you can enter a name for it. If no name is specified and a bucket does not exist yet, a bucket is created that is named instructlab_TIME, where TIME is the current epoch time.
cos-endpoint <endpoint>: Use the public, regional endpoint. For example, https://s3.us-east.cloud-object-storage.appdomain.cloud. You can find the endpoints in the Endpoints tab of the COS console.
cos-id <service_id>: If you have a COS service instance to use, include the service ID. In the user interface for the COS service instance, click Details. Note the CRN, which can be used for the service instance ID. If you want one to be created for you, it is created with the name InstructLab.
project-id: Your InstructLab project ID.
secrets-manager-git-id: The git ID for your Secrets Manager instance.
secrets-manager-url: The URL to your Secrets Manager instance.
taxonomy-path <local_directory_path>: The local directory path to the taxonomy file.
taxonomy-path-cos <directory_path>: The relative directory path within the COS bucket to the taxonomy file.
Example command to save the name of the COS bucket to use.
ibmcloud ilab config set cos-bucket test-1
Example command to save the local taxonomy path.
ibmcloud ilab config set taxonomy-path /Users/USER/instructlab-taxonomy
- Add the taxonomy to a COS bucket.
ibmcloud ilab taxonomy add --name <name>
Understanding this command's components
Parameter: Description
--name <taxonomy_name>: Required. The name of the taxonomy as it is to be displayed in the COS bucket. Use alphanumeric characters in the taxonomy name.
--taxonomy-path <local_directory_path>: Required if not specified with the set command. The local directory path to the taxonomy.
--taxonomy-path-cos <directory_path>: Optional. The relative directory path within the COS bucket to the taxonomy file.
--cos-bucket <bucket_name>: Optional. If you are adding a taxonomy to an existing bucket, include the name. You can find this name on the Buckets tab of your COS instance. If you want the bucket to be created for you, you can enter a name for it. If no name is specified and a bucket does not exist yet, a bucket is created that is named instructlab_TIME.
--cos-endpoint <endpoint>: Optional. Use the public, regional endpoint. For example, https://s3.us-east.cloud-object-storage.appdomain.cloud. You can find the endpoints in the Endpoints tab of the COS console.
--cos-region <region>: Optional. The default value is us-east.
--cos-id <service_id>: Optional. If you have a COS service instance to use, include the service ID. In the user interface for the COS service instance, click Details. Note the CRN, which can be used for the service instance ID. If you want one to be created for you, it is created with the name InstructLab.
Example command to use the details that were saved with the set command.
ibmcloud ilab taxonomy add --name test
Example command to have the COS bucket created for you in an existing service instance.
ibmcloud ilab taxonomy add \
  --name test \
  --taxonomy-path /Users/USER/instructlab-taxonomy \
  --cos-id existing_service_instance_id
Example command to use an existing bucket.
ibmcloud ilab taxonomy add \
  --name test \
  --taxonomy-path /Users/USER/instructlab-taxonomy \
  --cos-bucket existing-instruct-lab-bucket
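The endpoint parameter described above expects the public, regional endpoint. As a small convenience, the sketch below builds that URL from a region name. It assumes other regions follow the same hostname pattern as the us-east example in this guide, so confirm the value on the Endpoints tab of the COS console; `cos_public_endpoint` is a hypothetical helper name.

```shell
# Hypothetical helper: print the public regional COS endpoint for a region,
# following the pattern of the us-east example in the parameter descriptions.
# Assumption: other regions use the same hostname pattern.
cos_public_endpoint() {
  local region="${1:-us-east}"   # us-east mirrors the documented --cos-region default
  printf 'https://s3.%s.cloud-object-storage.appdomain.cloud\n' "$region"
}

# Example (requires a logged-in IBM Cloud CLI session with the ilab plug-in):
# ibmcloud ilab taxonomy add \
#   --name test \
#   --taxonomy-path /Users/USER/instructlab-taxonomy \
#   --cos-bucket existing-instruct-lab-bucket \
#   --cos-endpoint "$(cos_public_endpoint us-east)"
```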