IBM watsonx Code Assistant model details
watsonx Code Assistant
IBM watsonx Code Assistant is designed to accelerate the software development lifecycle and is built on IBM Granite code models.
General Programming
The public data sources used to train the models include:
- GitHub
- StarCoder
- CommitPack
- Glaive-code-assistant
Java
Enhanced enterprise Java model for remediation, explanation, unit test generation, and fix. This model takes advantage of new enterprise Java data tailored to these use cases and of a code jam by hundreds of IBM WebSphere developers.
The public data sources used to train the models include:
- methods2test
- Jakarta EE source code and documentation
- MicroProfile source code and documentation
Jakarta EE Specifications
Copyright © 2016-2023, Eclipse Foundation and Copyright © 2016-2024, Eclipse Foundation AISBL. This software or document includes material copied from or derived from the Jakarta EE Specifications available from https://jakarta.ee/specifications/. This software may generate material that implements, is copied from or derived from the documents referenced herein. User is responsible for determining whether the requirements of the Eclipse Foundation Specification License (https://www.eclipse.org/legal/efsl/) apply to the generated materials.
MicroProfile Specifications and API documentation
Copyright © 2016-2023, Eclipse Foundation and Copyright © 2016-2024 Eclipse Foundation AISBL. This software or document includes material copied from or derived from Eclipse Foundation MicroProfile Specification documents and Eclipse Foundation MicroProfile API documentation available from https://microprofile.io/specifications/. This software may generate material that implements, is copied from or derived from the documents referenced herein. User is responsible for determining whether the requirements of the Eclipse Foundation Specification License (https://www.eclipse.org/legal/efsl/) apply to the generated materials.
Other (such as natural language and math)
The public data sources used to train the models include:
- MathInstruct
- OpenWebMath
- S2ORC: The Semantic Scholar Open Research Corpus
- RedPajama 1T
- Stack Exchange
- Wikimedia