Red Hat® AI InstructLab on IBM Cloud® is a business-ready, private, and secure generative AI solution, powered by Red Hat Enterprise Linux AI.
Getting started with Red Hat AI InstructLab on IBM Cloud
Get ready to dive into AI with Red Hat® AI InstructLab on IBM Cloud®!
InstructLab is an open source project created by IBM and Red Hat to be a cost-effective entry point into the world of machine learning. You can use InstructLab to make contributions to a large language model without having to own and operate hardware infrastructure.
What is Red Hat AI InstructLab?
To use InstructLab, you don't need to have any preexisting knowledge. You don't even need to have an idea for what to create yet. Let's start by just getting familiar with the concepts.
InstructLab is a project for enhancing large language models (LLMs), which are AI models that utilize machine learning techniques to generate human language. You start by providing knowledge and skills that matter most to your business in what's known as a taxonomy, or a directory of data. The taxonomy is used to generate synthetic data, which is then used to train the model through multiple phases of fine-tuning. This process aligns your LLM with your goals by providing not just general knowledge, but the specific skills and contexts that are most important for your unique business needs.
For more information, see How it works. Or, jump in and get started by preparing and uploading your first taxonomy.
Set up your IBM Cloud® account
Make sure you have the following before continuing.
- A Pay-As-You-Go or Subscription IBM Cloud® account. Trial accounts are not supported. For more information or to upgrade the account, see Account types.
- Optional: If you are using a private repo to store your taxonomy knowledge documents, create a Secrets Manager instance.
Optional: Prepare a taxonomy
- Fork the IBM Cloud taxonomy or create your own.
  - If you don't already have a taxonomy, you can fork the IBM Cloud taxonomy repo and clone it to your local machine. This taxonomy has the correct file structure already created for you. You can add knowledge and skills in the corresponding directories.
  - To create your own taxonomy instead, see Preparing taxonomies for more information. You can also use the IBM Cloud community Jupyter notebook to create your taxonomy. For more information, see the redhat-ai-instructlab-jupyter-notebooks GitHub repo.
- Make updates to your taxonomy. The following example adds rhyming questions to the linguistics directory.
  a. In your fork, in the compositional_skills/linguistics directory, create a qna.yaml file.
  b. In the qna.yaml file, add a question related to rhyming words.
       - answer: 'Here are two rhyming words for "cave": 1\. Brave 2\. Gave'
         question: 'Give me two words that rhyme with cave'
  c. If your additions include reference documents, such as web articles or files in GitHub, you can reference the public GitHub repository and the commit SHA of a file, as in this example.
       document:
         repo: https://github.com/<organization>/<repository>
         commit: <commit_sha>
         patterns:
           - <filename>.md
  d. Save the changes and push them to the fork.
  e. Optional: Validate the updated taxonomy.
- In a browser, open the Releases page for your GitHub repository. For example: https://github.com/<my-org>/taxonomy/releases.
- Click Create a release.
- Create a tag, select a target branch, and enter a name for the release.
- Click Publish a release.
- Download the packaged tar.gz file that was automatically generated from the release by clicking Source code (tar.gz).
- Optional: If you are using a private repository for your taxonomy knowledge documents, complete the following steps.
  - Follow the GitHub documentation to create a classic personal access token (PAT).
  - In the Repository access section, scope your PAT to your taxonomy repo.
  - In the Repository permissions section, select Contents > read-only and Metadata > read-only.
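Before you upload the archive, it can help to confirm that the downloaded release tarball actually contains your qna.yaml files. The following Python sketch is one possible way to do that with the standard library; it is not part of the InstructLab tooling. The helper names (`list_qna_files`, `build_demo_archive`) and the `taxonomy-1.0/` top-level folder are illustrative assumptions, and the demo archive is built in memory only so that the example runs without a real download.

```python
import io
import tarfile
from pathlib import PurePosixPath


def list_qna_files(archive):
    """Return the sorted paths of all qna.yaml files inside a .tar.gz archive.

    `archive` is any binary file object, for example open("release.tar.gz", "rb").
    """
    with tarfile.open(fileobj=archive, mode="r:gz") as tar:
        return sorted(
            member.name
            for member in tar.getmembers()
            if member.isfile() and PurePosixPath(member.name).name == "qna.yaml"
        )


def build_demo_archive():
    """Build a tiny in-memory archive that mimics a release tarball (illustrative only)."""
    files = {
        "taxonomy-1.0/compositional_skills/linguistics/qna.yaml": b"# placeholder\n",
        "taxonomy-1.0/README.md": b"demo\n",
    }
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name, body in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(body)
            tar.addfile(info, io.BytesIO(body))
    buffer.seek(0)
    return buffer


qna_paths = list_qna_files(build_demo_archive())
print(qna_paths)
```

To check a real download, open the file in binary mode and pass the file object to `list_qna_files` in place of the demo archive. An empty result suggests that the release was created from a branch that does not include your changes.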
Upload your taxonomy by using the console
Complete the following steps to store your taxonomy in Object Storage.
- From the Projects page, select your project.
- Click Taxonomies.
- Click Upload and enter the following details.
- Taxonomy file
  - Select your .tar.gz file.
- Taxonomy name
  - Give the taxonomy an alphanumeric name.
- Private repository access
  - Enable this option if your taxonomy knowledge documents are in a private repo.
  - Secrets Manager service instance: Select an existing instance or create one.
  - Secrets Manager secret: Select an existing secret or create one. If you are creating a secret, select the Key-value secret type and add your personal access token in the following format. Note that the value for github_url must contain https://. The URL is the same URL that you used in the repo section of your taxonomy document reference.
      { "github_url": "https://...", "github_pat": "xxxxx" }
    For more information, see Creating Key-value secrets.
- Cloud storage
  - Either select an Object Storage instance and bucket to use or create an instance and bucket.
- Service authorization
  - Check the box to allow InstructLab to write your taxonomy to Object Storage.
- Optional storage settings
  - Specify the path where you want to store the taxonomy tar.gz in Object Storage.
- Click Upload.
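The key-value secret format described in the upload details is easy to get wrong by hand, for example by omitting the https:// scheme. The following Python sketch shows one way to assemble and check the payload before pasting it into Secrets Manager. The function name `build_git_secret` and the example repository URL are assumptions made for illustration, and the sketch interprets "must contain https://" as "must start with https://".

```python
import json


def build_git_secret(github_url, github_pat):
    """Return the key-value secret payload as a JSON string.

    Raises ValueError if the URL lacks the https:// scheme or the token is empty.
    """
    if not github_url.startswith("https://"):
        raise ValueError("github_url must start with https://")
    if not github_pat:
        raise ValueError("github_pat must not be empty")
    return json.dumps({"github_url": github_url, "github_pat": github_pat})


# Hypothetical values; replace them with your own repo URL and token.
payload = build_git_secret("https://github.com/example-org/taxonomy", "xxxxx")
print(payload)
```

Generating the JSON with `json.dumps` rather than typing it avoids quoting mistakes, and the early checks surface a missing scheme before the secret is stored.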
Add the taxonomy to Object Storage by using the CLI
Complete the following steps to store your taxonomy in Object Storage.
You can use the set command to save values such as Object Storage bucket details and credentials so that you don't have to repeat them in later commands. When you use the set command, you must set each value individually.
For more information, see the Config command reference.
- Log in to your IBM Cloud account from the CLI.
    ibmcloud login -a https://cloud.ibm.com --sso -r us-east
- If you plan to allow InstructLab to create IBM Cloud® Object Storage instance resources for you, target a resource group.
    ibmcloud target -g <resource_group>
  Example:
    ibmcloud target -g Default
- Create the authorization policy for InstructLab and Object Storage.
    ibmcloud iam authorization-policy-create Writer --source-service-name instructlab --target-service-name cloud-object-storage
- Optional: If you are using a private repo to store your taxonomy knowledge documents, complete the following steps.
  - Create a service authorization to allow InstructLab to access your Secrets Manager instance and secrets.
      ibmcloud iam authorization-policy-create Reader --source-service-name instructlab --target-service-name secrets-manager
  - Add your personal access token (PAT) to Secrets Manager by creating a Key-value secret. Make sure your key-value details are stored in the following format.
      { "github_url": "https://...", "github_pat": "xxxxx" }
    Example command for creating a key-value secret.
      ibmcloud secrets-manager secret-create --secret-prototype='{"name": "my-secret","description": "Description of my key-value secret.","secret_type": "kv","secret_group_id": "67d025e1-0248-418f-83ba-deb0ebfb9b4a","labels": ["dev","us-south"],"data": {"github_url": "https://...","github_pat": "xxxxx"},"custom_metadata": {"metadata_custom_key": "metadata_custom_value"},"version_custom_metadata": {"custom_version_key": "custom_version_value"}}'
    For more information, see Creating Key-value secrets.
  - List your Secrets Manager instances.
      ibmcloud resource service-instances --service-name secrets-manager
  - Get your instance details.
      ibmcloud resource service-instance INSTANCE
- Run the taxonomy add --help command and review the command options.
    ibmcloud ilab taxonomy add --help
- Optional: If you have an existing Object Storage instance that you want to use, get your service instance details.
  - List your Object Storage instances.
      ibmcloud resource service-instances --service-name cloud-object-storage
  - Get your instance details.
      ibmcloud resource service-instance INSTANCE
- Add your taxonomy to your Object Storage bucket. Review the following example commands.
  Quick start: Example command to automatically create an Object Storage instance and bucket in your account and upload a taxonomy from your ./taxonomy folder to it.
      ibmcloud ilab taxonomy add \
        --name example-taxonomy-1 \
        --taxonomy-path "./taxonomy"
  Example command to upload a taxonomy from a taxonomy folder on your machine to an existing Object Storage instance and bucket.
      ibmcloud ilab taxonomy add \
        --name example-taxonomy-name-1 \
        --taxonomy-path-cos "taxonomies/taxonomy.tar.gz" \
        --taxonomy-path "./taxonomy" \
        --cos-bucket example-cloud-object-storage-bucket-1 \
        --cos-endpoint https://s3.us-east.cloud-object-storage.appdomain.cloud
  Example command to use an existing Object Storage instance and bucket as well as Secrets Manager credentials.
      ibmcloud ilab taxonomy add \
        --name example-taxonomy-1 \
        --taxonomy-path-cos taxonomies/taxonomy.tar.gz \
        --taxonomy-path "./taxonomy" \
        --cos-endpoint https://s3.us-east.cloud-object-storage.appdomain.cloud \
        --secrets-manager-git-id SEC-MGR-ID --secrets-manager-git-url https://URL
  Example command to upload a taxonomy from your /Users/USER/instructlab-taxonomy folder to a new, automatically created bucket.
      ibmcloud ilab taxonomy add \
        --name test \
        --taxonomy-path "/Users/USER/instructlab-taxonomy" \
        --cos-endpoint https://s3.us-east.cloud-object-storage.appdomain.cloud \
        --cos-id 628e4348-2183-42fa-a03a-6f0f78453530
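If you script your uploads, assembling the command as an argument list can avoid shell quoting problems with paths and URLs. The following Python sketch builds the same kind of command line as the examples above. The helper name `taxonomy_add_command` is hypothetical, the sketch only composes the command and does not run it, and it uses only flags that appear in the example commands; check `ibmcloud ilab taxonomy add --help` for the authoritative list.

```python
import shlex


def taxonomy_add_command(name, taxonomy_path, **options):
    """Assemble an `ibmcloud ilab taxonomy add` command as a list of arguments."""
    args = [
        "ibmcloud", "ilab", "taxonomy", "add",
        "--name", name,
        "--taxonomy-path", taxonomy_path,
    ]
    for key, value in options.items():
        # Keyword names map to flags, for example cos_bucket -> --cos-bucket.
        args += ["--" + key.replace("_", "-"), value]
    return args


command = taxonomy_add_command(
    "example-taxonomy-name-1",
    "./taxonomy",
    taxonomy_path_cos="taxonomies/taxonomy.tar.gz",
    cos_bucket="example-cloud-object-storage-bucket-1",
    cos_endpoint="https://s3.us-east.cloud-object-storage.appdomain.cloud",
)

# Print a shell-safe rendering of the command for review.
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where the IBM Cloud CLI and the plug-in that provides the `ilab` commands are installed and you are logged in.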
What's next?
Now that you've uploaded a taxonomy, the next step is to Generate data from your taxonomy.