This documentation is for IBM Watson® Knowledge Studio on IBM Cloud®. The previous version of Knowledge Studio, on IBM Marketplace, has separate documentation.
About
Use IBM Watson® Knowledge Studio to create a machine learning model that understands the linguistic nuances, meaning, and relationships specific to your industry, or to create a rule-based model that finds entities in documents based on rules that you define.
Identify custom entities and relations
An entities and relations workspace enables you to create your own entity type system and train a custom model that can recognize custom entities in text. With machine learning models, you can also define custom relation types and train the model to recognize when two entities are related. Consider the following example sentence.
ABC Motors has received great reviews for its new 2020 Lightning.
A custom entities and relations model could be trained to recognize "2020 Lightning" as a Vehicle entity, and "ABC Motors" as a Manufacturer entity. The model could also be trained to recognize that the two entities are connected by an isManufacturedBy relation.
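The example above can be sketched as a small data structure. This is only an illustration of what an entity mention and a relation carry (type, text, and character offsets). The class and field names are assumptions made for this sketch and are not the output format of Knowledge Studio.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityMention:
    """A span of text labeled with an entity type (hypothetical structure)."""
    text: str
    type: str
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive


@dataclass(frozen=True)
class Relation:
    """A typed, directed link between two entity mentions (hypothetical structure)."""
    type: str
    source: EntityMention
    target: EntityMention


def find_mention(sentence: str, surface: str, entity_type: str) -> EntityMention:
    # Locate the first occurrence of the surface form and record its offsets.
    start = sentence.index(surface)
    return EntityMention(surface, entity_type, start, start + len(surface))


sentence = "ABC Motors has received great reviews for its new 2020 Lightning."
manufacturer = find_mention(sentence, "ABC Motors", "Manufacturer")
vehicle = find_mention(sentence, "2020 Lightning", "Vehicle")

# The vehicle is the source of the relation: "2020 Lightning" isManufacturedBy "ABC Motors".
relation = Relation("isManufacturedBy", source=vehicle, target=manufacturer)

print(f"{relation.source.text} --{relation.type}--> {relation.target.text}")
```

A trained model would produce mentions like these automatically. Here they are located by exact string search only to keep the sketch self-contained.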
Build a machine learning model
Knowledge Studio provides easy-to-use tools for annotating unstructured domain literature, and uses those annotations to create a custom machine learning model that understands the language of the domain. The accuracy of the model improves through iterative testing, ultimately resulting in an algorithm that can learn from the patterns that it sees and recognize those patterns in large collections of new documents. You can deploy the finished machine learning model to other Watson cloud-based offerings and cognitive solutions to find and extract mentions of relations and entities, including entity coreferences.
Figure 1. Overview of the process to build a machine learning model
- Based on a set of domain-specific source documents, the team creates a type system that defines entity types and relation types for the information of interest to the application that will use the model.
- Two or more human annotators annotate a small set of source documents. They label words that represent entity types, identify relation types where the text expresses relationships between entity mentions, and define coreferences, which link different mentions that refer to the same entity. Any inconsistencies between annotators are resolved, and the result is a single set of optimally annotated documents, which forms the ground truth.
- Knowledge Studio uses the ground truth to train a model.
- The trained model is used to find entities, relations, and coreferences in new, never-seen-before documents.
See Creating a machine learning model for more details.
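The annotation step described above, where several annotators label the same documents and inconsistencies are resolved, can be illustrated with a minimal sketch. It compares two annotators' entity spans, separates the agreed spans from the disputed ones, and computes a simple F1-style agreement score. The `Span` type, the function names, and the scoring formula are assumptions chosen for illustration. They do not describe how Knowledge Studio itself measures agreement or builds ground truth.

```python
from typing import NamedTuple, Set, Tuple


class Span(NamedTuple):
    """An annotated entity span: character offsets plus entity type (hypothetical)."""
    start: int
    end: int
    type: str


def compare_annotations(a: Set[Span], b: Set[Span]) -> Tuple[Set[Span], Set[Span]]:
    """Return (agreed, disputed) spans for two annotators of the same document."""
    agreed = a & b
    disputed = (a | b) - agreed
    return agreed, disputed


def agreement_score(a: Set[Span], b: Set[Span]) -> float:
    """F1-style overlap: 2 * |agreed| / (|a| + |b|); 1.0 when both sets are empty."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Offsets refer to:
# "ABC Motors has received great reviews for its new 2020 Lightning."
annotator_1 = {Span(0, 10, "Manufacturer"), Span(50, 64, "Vehicle")}
# The second annotator marked only "Lightning" as the vehicle.
annotator_2 = {Span(0, 10, "Manufacturer"), Span(55, 64, "Vehicle")}

agreed, disputed = compare_annotations(annotator_1, annotator_2)
score = agreement_score(annotator_1, annotator_2)

print("agreed:", sorted(agreed))
print("disputed:", sorted(disputed))
print("agreement:", score)
```

In this sketch the disputed spans are the ones a reviewer would need to resolve before the document could be added to the ground truth.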
Build a rule-based model
Knowledge Studio provides a rules editor that simplifies the process of finding and capturing common patterns in your documents as rules. You can then create a model that recognizes the rule patterns, and deploy it for use in other services.
See Creating a rule-based model for more details.
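To make the idea of capturing patterns as rules concrete, here is a minimal sketch that uses plain regular expressions. It is not the Knowledge Studio rules editor or its rule format. The two patterns and the `apply_rules` helper are hypothetical and exist only to show how a rule maps a text pattern to an entity type.

```python
import re
from typing import List, Tuple

# Each rule pairs an entity type with a pattern (both are illustrative assumptions).
RULES = [
    # A four-digit model year followed by a capitalized model name.
    ("Vehicle", re.compile(r"\b(?:19|20)\d{2} [A-Z][a-z]+\b")),
    # An all-caps company prefix followed by the word "Motors".
    ("Manufacturer", re.compile(r"\b[A-Z]{2,} Motors\b")),
]


def apply_rules(text: str) -> List[Tuple[str, str, int, int]]:
    """Return (matched text, entity type, start, end) tuples ordered by position."""
    matches = []
    for entity_type, pattern in RULES:
        for m in pattern.finditer(text):
            matches.append((m.group(0), entity_type, m.start(), m.end()))
    return sorted(matches, key=lambda item: item[2])


sentence = "ABC Motors has received great reviews for its new 2020 Lightning."
found = apply_rules(sentence)
for text, entity_type, start, end in found:
    print(f"{entity_type}: {text!r} at [{start}, {end})")
```

Unlike a machine learning model, rules of this kind find only what the patterns describe, which makes them predictable but dependent on how well the patterns cover the documents.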
Identify custom categories
The custom categories feature is experimental. It is subject to change or discontinuation on short notice. Custom categories are not intended for use in production environments.
With a categories workspace, you can define a hierarchy of custom content categories and provide relevant key phrases that the service uses to categorize text content. You can deploy a custom categories model to use instead of the standard categories offered by Natural Language Understanding and Discovery.
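As a rough illustration of a category hierarchy with key phrases, the sketch below stores hierarchical labels as slash-separated paths and counts key-phrase hits in a piece of text. The labels, key phrases, and the naive substring matching are all assumptions made for this example. The service's actual categorization method is not described here and is certainly more sophisticated.

```python
from typing import Dict, List

# Hypothetical hierarchy: each slash-separated label maps to its key phrases.
CATEGORIES: Dict[str, List[str]] = {
    "/vehicles": ["car", "truck"],
    "/vehicles/electric": ["battery range", "charging station"],
    "/vehicles/reviews": ["test drive", "great reviews"],
}


def categorize(text: str) -> Dict[str, int]:
    """Count key-phrase hits per category using case-insensitive substring matching."""
    lowered = text.lower()
    scores: Dict[str, int] = {}
    for label, phrases in CATEGORIES.items():
        hits = sum(1 for phrase in phrases if phrase in lowered)
        if hits:
            scores[label] = hits
    return scores


sentence = "ABC Motors has received great reviews for its new 2020 Lightning."
result = categorize(sentence)
print(result)
```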
Watson services integration
Share domain artifacts and models between IBM Watson® Knowledge Studio and other Watson services.
Use Knowledge Studio to perform the following tasks:
- Bootstrap annotation by using the Natural Language Understanding service to automatically find and annotate entities in your documents. When human annotators begin to annotate the documents, they can see the annotations that were already made by the service and can review and add to them. See Pre-annotating documents with Natural Language Understanding for details.
- Upload analyzed documents that are in UIMA CAS XMI format. For example, you can upload UIMA CAS XMI files that were exported from IBM Watson Explorer content analytics collections or IBM Watson Explorer Content Analytics Studio.
- Deploy a machine learning or rule-based model to use with the Watson Discovery service.
- Deploy a machine learning or rule-based model to use with the Natural Language Understanding service.
- Export a machine learning model to use in IBM Watson Explorer.
- Export a rule-based model PEAR file to use in IBM Watson Explorer.
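After a model is deployed to Natural Language Understanding, an application typically references it by model ID when it requests entities or relations. The sketch below only builds the JSON body for such an analyze request and does not send it. The helper name and the `YOUR_MODEL_ID` placeholder are assumptions, and the field names should be checked against the current Natural Language Understanding API reference. Because relation types apply to machine learning models, the relations feature is optional here.

```python
import json
from typing import Any, Dict


def build_analyze_request(text: str, model_id: str,
                          include_relations: bool = True) -> Dict[str, Any]:
    """Build an analyze request body that points at a custom model (hypothetical helper)."""
    features: Dict[str, Any] = {"entities": {"model": model_id}}
    if include_relations:
        # Relations are available only when the deployed model is a machine learning model.
        features["relations"] = {"model": model_id}
    return {"text": text, "features": features}


payload = build_analyze_request(
    "ABC Motors has received great reviews for its new 2020 Lightning.",
    "YOUR_MODEL_ID",  # placeholder: the ID shown after the model is deployed
)
print(json.dumps(payload, indent=2))
```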
HIPAA support
US Health Insurance Portability and Accountability Act (HIPAA) support is available for Premium plan instances that were created in the Washington, DC location on or after 1 April 2019. See Enabling EU and HIPAA supported settings for more information.