Training models for InstructLab

Complete the following steps to train your model on generated data. Then test the model to verify the results.

You cannot pass configuration information or files to the model for fine-tuning.

Prerequisites

  1. Prepare your taxonomy.
  2. Add the taxonomy TAR to your Object Storage bucket. You can use the CLI or the UI.
  3. Generate data from your taxonomy.

Aligning models by using the console

  1. From the InstructLab Projects page, click your project > Aligned models > Align.

  2. Enter an alphanumeric name for the model, select the training data to use, and click Align. The state is queued at first, then running. Wait until the state is completed. This process can take minutes or hours. When the alignment is complete, a trained_models directory that contains logs for troubleshooting is created in your Object Storage bucket.

Training models by using the CLI

  1. Get the ID for the data to use.

    ibmcloud ilab data list
    
  2. Run the following command to start training the model on the generated data. Note the model ID that is returned.

    ibmcloud ilab model train --name testmodel --data-id <data_id>
    
  3. Check the details of your model training. Include the ID for the model. The state is queued at first, then running. Wait until the state is completed. This process can take minutes or hours.

    ibmcloud ilab model get --id <model_id>
    

    Example model get command with the --output json option.

    ibmcloud ilab model get --id daef9836-631f-4686-ad18-e0e6a0910f5d --output json 
    

    Example JSON output

    {
      "base_model": "granite-7b",
      "created_at": "2024-10-10T16:06:05.000Z",
      "data_id": "8b1433c0-e375-4b00-b36d-2ad00697014e",
      "id": "daef9836-631f-4686-ad18-e0e6a0910f5d",
      "model_metrics": {
        "mmlu": {
          "overall_average": 0.51,
          "scores": {
            "mmlu_abstract_algebra": 0.3,
            "mmlu_anatomy": 0.43,
            "mmlu_astronomy": 0.49
          }
        },
        "mt_bench": {
          "error_rate": 0.01,
          "overall_average": 6.86,
          "scores": {
            "turn_one": 7.25,
            "turn_two": 6.47
          }
        },
        "mt_bench_branch": {
          "error_rate": 0.01,
          "improvements": {
            "compositional_skills/STEM/math/time_series/qna.yaml": 8.67,
            "compositional_skills/extraction/invoice/csv/qna.yaml": 8.4
          },
          "no_change": {
            "compositional_skills/roleplay/explain_like_i_am/primary_schooler/qna.yaml": 0,
            "compositional_skills/writing/freeform/technical/proposal/qna.yaml": 0
          },
          "regressions": {
            "compositional_skills/extraction/inference/qualitative/sentiment/qna.yaml": -9,
            "compositional_skills/extraction/information/named_entities/places/qna.yaml": -9
          }
        }
      },
      "name": "test",
      "state": "completed",
      "status": "completed",
      "taxonomy_id": "e62ccea5-97e6-4568-86bf-2f359987b115"
    }
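
The branch metrics in this output can also be read programmatically, for example to list which taxonomy entries regressed. The following is a minimal Python sketch, not part of the InstructLab tooling. The function name `summarize_branch` is hypothetical, the field names are taken from the example output above, and the sketch assumes the JSON has already been parsed into a dictionary.

```python
def summarize_branch(model: dict, benchmark: str = "mt_bench_branch") -> dict:
    """Group the entries of one branch benchmark by outcome.

    `model` is the parsed JSON from the model get command. The values are
    reported exactly as they appear in the output; their scale is not
    interpreted here.
    """
    # model_metrics can be missing or null before training has completed.
    metrics = model.get("model_metrics") or {}
    branch = metrics.get(benchmark) or {}
    by_value = lambda item: item[1]
    return {
        # Largest reported value first.
        "improvements": sorted(branch.get("improvements", {}).items(),
                               key=by_value, reverse=True),
        # Most negative reported value first.
        "regressions": sorted(branch.get("regressions", {}).items(),
                              key=by_value),
        # Only the taxonomy paths matter for unchanged entries.
        "no_change": sorted(branch.get("no_change", {})),
    }


# Trimmed copy of the example output above.
EXAMPLE = {
    "id": "daef9836-631f-4686-ad18-e0e6a0910f5d",
    "state": "completed",
    "model_metrics": {
        "mt_bench_branch": {
            "error_rate": 0.01,
            "improvements": {
                "compositional_skills/STEM/math/time_series/qna.yaml": 8.67,
                "compositional_skills/extraction/invoice/csv/qna.yaml": 8.4,
            },
            "no_change": {
                "compositional_skills/roleplay/explain_like_i_am/primary_schooler/qna.yaml": 0,
                "compositional_skills/writing/freeform/technical/proposal/qna.yaml": 0,
            },
            "regressions": {
                "compositional_skills/extraction/inference/qualitative/sentiment/qna.yaml": -9,
                "compositional_skills/extraction/information/named_entities/places/qna.yaml": -9,
            },
        }
    },
}

summary = summarize_branch(EXAMPLE)
for path, value in summary["regressions"]:
    print(f"regression {value}: {path}")
```

Pass `benchmark="mmlu_branch"` to summarize the other branch benchmark when it is present in the output.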
    

When the state is completed, a trained_models directory that contains logs for troubleshooting is created in your Object Storage bucket.
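
Because training can take minutes or hours, the state check in step 3 is a natural candidate for a polling loop. The following Python sketch wraps the documented `ibmcloud ilab model get --id <model_id> --output json` command. The function names, the polling interval, and the timeout are assumptions for illustration. Only the queued, running, and completed states are described on this page, so the sketch treats any other value as unexpected and stops.

```python
import json
import subprocess
import time


def get_model_cli(model_id: str) -> dict:
    """Run the documented CLI command and parse its JSON output."""
    result = subprocess.run(
        ["ibmcloud", "ilab", "model", "get", "--id", model_id, "--output", "json"],
        capture_output=True,
        text=True,
        check=True,  # raise if the CLI exits with a non-zero status
    )
    return json.loads(result.stdout)


def wait_for_model(model_id, fetch=get_model_cli, poll_seconds=60.0,
                   timeout_seconds=6 * 3600.0, sleep=time.sleep):
    """Poll until the model state is completed, then return the model details.

    `fetch` and `sleep` are injectable so that the loop can be exercised
    without the CLI. The default interval and timeout are arbitrary choices.
    """
    waited = 0.0
    while True:
        model = fetch(model_id)
        # Some example outputs show an empty "state" with a populated
        # "status", so fall back to "status" when "state" is empty.
        state = model.get("state") or model.get("status")
        if state == "completed":
            return model
        if state not in ("queued", "running"):
            # Assumption: anything else is not a normal in-progress state.
            raise RuntimeError(f"unexpected model state: {state!r}")
        if waited >= timeout_seconds:
            raise TimeoutError(f"model {model_id} still {state} after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds


# Usage (requires the ibmcloud CLI with the ilab plug-in, already logged in):
# model = wait_for_model("<model_id>")
# print(model["state"])
```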

Training models by using the API

  1. Get the ID for the data to use.

    Example command.

    curl -X 'GET' \
      'https://us-east.instructlab.ibm.com/v1/data' \
      -H 'accept: application/json'
    

    Example output.

    {
      "data": [
        {
          "id": "add785e6-a8c3-4f5f-ab89-c506a3f115da",
          "name": "example-data-1",
          "state": "",
          "status": "queued",
          "created_at": "2024-10-23T02:58:50.000Z",
          "taxonomy_id": "202a03c4-dcf1-432a-82b7-abecb2e019f7"
        }
      ]
    }
    
  2. Run the following command to start training the model on the generated data. Note the model ID that is returned.

    Example command.

    curl -X 'POST' \
      'https://us-east.instructlab.ibm.com/v1/models' \
      -H 'accept: application/json' \
      -H 'Content-Type: application/json' \
      -d '{
      "name": "example-model-1",
      "data_id": "add785e6-a8c3-4f5f-ab89-c506a3f115da"
    }'
    

    Example output.

    {
      "id": "baa8cfb5-e306-4e15-869d-735b74b1919d",
      "name": "example-model-1",
      "state": "",
      "status": "queued",
      "created_at": "2024-10-23T02:58:50.000Z",
      "data_id": "add785e6-a8c3-4f5f-ab89-c506a3f115da",
      "base_model": "granite-7b",
      "taxonomy_id": "202a03c4-dcf1-432a-82b7-abecb2e019f7",
      "model_metrics": {
        "mmlu": {
          "overall_average": 0.3,
          "scores": {
            "additionalProp1": 1,
            "additionalProp2": 2,
            "additionalProp3": 3
          }
        },
        "mmlu_branch": {
          "error_rate": 0.4,
          "improvements": {
            "additionalProp1": 1,
            "additionalProp2": 2,
            "additionalProp3": 3
          },
          "regressions": {
            "additionalProp1": 1,
            "additionalProp2": 2,
            "additionalProp3": 3
          },
          "no_change": {
            "additionalProp1": 1,
            "additionalProp2": 2,
            "additionalProp3": 3
          }
        },
        "mt_bench": {
          "overall_average": 0.8,
          "error_rate": 0.6,
          "scores": {
            "additionalProp1": 1,
            "additionalProp2": 2,
            "additionalProp3": 3
          }
        },
        "mt_bench_branch": {
          "error_rate": 0.4,
          "improvements": {
            "additionalProp1": 1,
            "additionalProp2": 2,
            "additionalProp3": 3
          },
          "regressions": {
            "additionalProp1": 1,
            "additionalProp2": 2,
            "additionalProp3": 3
          },
          "no_change": {
            "additionalProp1": 1,
            "additionalProp2": 2,
            "additionalProp3": 3
          }
        }
      }
    }
    
  3. Check the details of your model training. Include the ID for the model. The state is queued at first, then running. Wait until the state is completed. This process can take minutes or hours.

    Example command.

    curl -X 'GET' \
      'https://us-east.instructlab.ibm.com/v1/models/baa8cfb5-e306-4e15-869d-735b74b1919d' \
      -H 'accept: application/json'
    

    Example output.

    {
      "id": "baa8cfb5-e306-4e15-869d-735b74b1919d",
      "name": "example-model-1",
      "state": "",
      "status": "queued",
      "created_at": "2024-10-23T02:58:50.000Z",
      "data_id": "add785e6-a8c3-4f5f-ab89-c506a3f115da",
      "base_model": "granite-7b",
      "taxonomy_id": "202a03c4-dcf1-432a-82b7-abecb2e019f7",
      "model_metrics": {
        "mmlu": {
          "overall_average": 0.3,
          "scores": {
            "additionalProp1": 1,
            "additionalProp2": 2,
            "additionalProp3": 3
          }
        },
        "mmlu_branch": {
          "error_rate": 0.4,
          "improvements": {
            "additionalProp1": 1,
            "additionalProp2": 2,
            "additionalProp3": 3
          },
          "regressions": {
            "additionalProp1": 1,
            "additionalProp2": 2,
            "additionalProp3": 3
          },
          "no_change": {
            "additionalProp1": 1,
            "additionalProp3": 3,
            "additionalProp2": 2
          }
        },
        "mt_bench": {
          "overall_average": 0.8,
          "error_rate": 0.6,
          "scores": {
            "additionalProp1": 1,
            "additionalProp2": 2,
            "additionalProp3": 3
          }
        },
        "mt_bench_branch": {
          "error_rate": 0.4,
          "improvements": {
            "additionalProp1": 1,
            "additionalProp2": 2,
            "additionalProp3": 3
          },
          "regressions": {
            "additionalProp1": 1,
            "additionalProp2": 2,
            "additionalProp3": 3
          },
          "no_change": {
            "additionalProp1": 1,
            "additionalProp2": 2,
            "additionalProp3": 3
          }
        }
      }
    }
    

When the state is completed, a trained_models directory that contains logs for troubleshooting is created in your Object Storage bucket.
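
The three API calls above can also be issued from a script. The following Python sketch builds the same requests with the standard library. The host, paths, headers, and body fields are copied from the curl examples, while the function names are hypothetical. Authentication is not shown in the curl examples and is likewise omitted here, so a real call might need additional headers.

```python
import json
import urllib.request

# Host and version prefix as used in the curl examples above.
BASE_URL = "https://us-east.instructlab.ibm.com/v1"


def list_data_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Step 1: GET /data to find the ID of the generated data."""
    return urllib.request.Request(
        f"{base_url}/data",
        headers={"accept": "application/json"},
        method="GET",
    )


def train_model_request(name: str, data_id: str,
                        base_url: str = BASE_URL) -> urllib.request.Request:
    """Step 2: POST /models to start training on the generated data."""
    body = json.dumps({"name": name, "data_id": data_id}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/models",
        data=body,
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def get_model_request(model_id: str,
                      base_url: str = BASE_URL) -> urllib.request.Request:
    """Step 3: GET /models/<model_id> to check the training state."""
    return urllib.request.Request(
        f"{base_url}/models/{model_id}",
        headers={"accept": "application/json"},
        method="GET",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a request and parse the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Usage (performs real network calls, so it is left commented out):
# data = send(list_data_request())["data"]
# model = send(train_model_request("example-model-1", data[0]["id"]))
# details = send(get_model_request(model["id"]))
# print(details["state"] or details["status"])
```

Keeping request construction separate from `send` makes it possible to inspect a request, or to add headers, before anything goes over the network.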

What's in my Object Storage bucket after training?

After you train the model, your Object Storage bucket contains a trained_models directory with the following contents.

Artifacts
These files contain the Phase 1 and Phase 2 checkpoint data and the model for each epoch.
Eval
These files contain the evaluation metrics for the mmlu, mmlu_branch, mt_bench, and mt_bench_branch benchmarks.
Logs
These files contain the Red Hat AI InstructLab execution logs and system details.
Model
These files contain the final output model in safetensors format. The contents of this directory are used by the model.

What's next?

Optional: You can deploy the model.