Failed to process invalid taxonomy files error during data generation in InstructLab

When you try to generate data, you get the following error:

Failed to process invalid taxonomy files.  See detailed logs in COS.

The taxonomy is in an invalid format.

When you upload a taxonomy, its file structure is validated. During data generation, additional validation checks the content of the taxonomy files, so a taxonomy that uploaded successfully can still fail at this stage.

Fix issues with the taxonomy.

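  1. In the Object Storage log files, look for ERROR entries. Each entry names the affected taxonomy file, followed by the line and column of the problem.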

    Example:

    ERROR 2025-04-16 12:47:57,890 instructlab.schema.taxonomy:131: taxonomy-small/knowledge/opensource/instructlab/qna.yaml:14:27 trailing spaces (trailing-spaces)
    ERROR 2025-04-16 12:47:57,890 instructlab.schema.taxonomy:131: taxonomy-small/knowledge/opensource/instructlab/qna.yaml:15:20 trailing spaces (trailing-spaces)
    
  2. Fix the errors locally.

  3. Run the ilab validation command to confirm that the taxonomy is now valid.

    ilab taxonomy diff
    
  4. Upload the taxonomy updates to Object Storage.

  5. Run data generation again.
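
When the logs contain many entries, steps 1 and 2 can be partly scripted. The following Python sketch is illustrative only and is not part of InstructLab: the function names parse_taxonomy_errors and strip_trailing_spaces are hypothetical, and the regular expression assumes that every error line follows the same layout as the example log lines in step 1. It groups the reported errors by file and removes trailing spaces, which is the only error type shown in that example.

```python
import re
from collections import defaultdict

# Assumption: error lines look like the example in step 1, that is
# "ERROR <date> <time> instructlab.schema.taxonomy:<n>: <path>:<line>:<col> <message>".
# Lines in any other layout are silently skipped.
LOG_LINE = re.compile(
    r"^ERROR\s+\S+\s+\S+\s+instructlab\.schema\.taxonomy:\d+:\s+"
    r"(?P<path>.+?):(?P<line>\d+):(?P<col>\d+)\s+(?P<message>.+)$"
)


def parse_taxonomy_errors(log_text):
    """Return {file path: [(line, column, message), ...]} for matching log lines."""
    errors = defaultdict(list)
    for raw in log_text.splitlines():
        match = LOG_LINE.match(raw.strip())
        if match:
            errors[match["path"]].append(
                (int(match["line"]), int(match["col"]), match["message"])
            )
    return dict(errors)


def strip_trailing_spaces(text):
    """Remove trailing spaces and tabs from every line, keeping the line endings."""
    return re.sub(r"[ \t]+(?=\r?\n|$)", "", text)


if __name__ == "__main__":
    # The two log lines from the example in step 1.
    sample_log = (
        "ERROR 2025-04-16 12:47:57,890 instructlab.schema.taxonomy:131: "
        "taxonomy-small/knowledge/opensource/instructlab/qna.yaml:14:27 "
        "trailing spaces (trailing-spaces)\n"
        "ERROR 2025-04-16 12:47:57,890 instructlab.schema.taxonomy:131: "
        "taxonomy-small/knowledge/opensource/instructlab/qna.yaml:15:20 "
        "trailing spaces (trailing-spaces)\n"
    )
    for path, problems in parse_taxonomy_errors(sample_log).items():
        print(path)
        for line, col, message in problems:
            print(f"  line {line}, column {col}: {message}")
```

To apply the cleanup to a local file, read the qna.yaml file, pass its text through strip_trailing_spaces, and write the result back. Other error types reported in the logs still need to be fixed by hand, and the result should still be checked with the ilab validation command in step 3.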