About
Use IBM Watson® Knowledge Studio for IBM Cloud Pak for Data to create a machine learning model that understands the linguistic nuances, meaning, and relationships specific to your industry or to create a rule-based model that finds entities in documents based on rules that you define.
Identify custom entities and relations
An entities and relations workspace enables you to create your own entity type system and train a custom model that can recognize custom entities in text. With machine learning models, you can also define custom relation types and train the model to recognize when two entities are related. Consider the following example sentence.
ABC Motors has received great reviews for its new 2020 Lightning.
A custom entities and relations model could be trained to recognize "2020 Lightning" as a Vehicle entity, and "ABC Motors" as a Manufacturer entity. The model could also be trained to recognize that the two entities are connected by an isManufacturedBy relation.
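To make the example concrete, the following minimal Python sketch represents the two entity mentions and the relation between them as plain data. The class names, field names, and the `mention` helper are hypothetical and chosen only for illustration; they are not the Knowledge Studio export format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityMention:
    """A span of text labeled with an entity type (hypothetical structure)."""
    text: str
    entity_type: str
    begin: int  # character offset where the mention starts
    end: int    # character offset just past the mention


@dataclass(frozen=True)
class RelationMention:
    """A typed, directed link between two entity mentions (hypothetical structure)."""
    relation_type: str
    source: EntityMention
    target: EntityMention


def mention(sentence: str, text: str, entity_type: str) -> EntityMention:
    """Locate the first occurrence of `text` in `sentence` and label it."""
    begin = sentence.index(text)
    return EntityMention(text, entity_type, begin, begin + len(text))


sentence = "ABC Motors has received great reviews for its new 2020 Lightning."

manufacturer = mention(sentence, "ABC Motors", "Manufacturer")
vehicle = mention(sentence, "2020 Lightning", "Vehicle")

# The vehicle is the source of the relation; the manufacturer is the target.
relation = RelationMention("isManufacturedBy", vehicle, manufacturer)

print(relation.source.text, relation.relation_type, relation.target.text)
```

Storing character offsets alongside the text keeps each mention tied to its exact position in the sentence, which matters when the same string appears more than once.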
Build a machine learning model
Knowledge Studio provides easy-to-use tools for annotating unstructured domain literature, and uses those annotations to create a custom machine learning model that understands the language of the domain. The accuracy of the model improves through iterative testing, ultimately resulting in an algorithm that can learn from the patterns that it sees and recognize those patterns in large collections of new documents. You can deploy the finished machine learning model to other Watson cloud-based offerings and cognitive solutions to find and extract mentions of relations and entities, including entity coreferences.
Figure 1. Overview of the process to build a machine learning model
- Based on a set of domain-specific source documents, the team creates a type system that defines entity types and relation types for the information of interest to the application that will use the model.
- A group of two or more human annotators annotates a small set of source documents to label words that represent entity types, to identify relation types where the text identifies relationships between entity mentions, and to define coreferences, which identify different mentions that refer to the same thing, that is, the same entity. Any inconsistencies in annotation are resolved, and one set of optimally annotated documents is built, which forms the ground truth.
- Knowledge Studio uses the ground truth to train a model.
- The trained model is used to find entities, relations, and coreferences in new, never-seen-before documents.
See Creating a machine learning model for more details.
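The step in which inconsistencies between annotators are resolved can be illustrated with a small Python sketch. It assumes each annotator's work is reduced to a set of `(begin, end, entity_type)` tuples and separates the annotations both annotators agree on from those that still need a human decision. The function name and the tuple layout are assumptions made for this example; this is not how Knowledge Studio implements adjudication internally.

```python
def compare_annotations(first, second):
    """Split two annotators' entity annotations into agreed and conflicting sets.

    Each argument is a set of (begin, end, entity_type) tuples.
    Returns (agreed, conflicts), where conflicts is a sorted list of
    annotations made by only one of the two annotators.
    """
    agreed = first & second
    conflicts = sorted((first | second) - agreed)
    return agreed, conflicts


# Offsets refer to:
# "ABC Motors has received great reviews for its new 2020 Lightning."
annotator_1 = {(0, 10, "Manufacturer"), (50, 64, "Vehicle")}   # "2020 Lightning"
annotator_2 = {(0, 10, "Manufacturer"), (55, 64, "Vehicle")}   # "Lightning" only

agreed, conflicts = compare_annotations(annotator_1, annotator_2)

print("Agreed:", agreed)
print("Needs review:", conflicts)
```

In this sketch, the agreed annotations could go straight into the ground truth, while each conflicting span would be reviewed by a person before a final annotation is chosen.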
Build a rule-based model
Knowledge Studio provides a rules editor that simplifies the process of finding and capturing common patterns in your documents as rules. You can then create a model that recognizes the rule patterns, and deploy it for use in other services.
See Creating a rule-based model for more details.
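As a rough illustration of what capturing a pattern as a rule means, the following Python sketch uses a regular expression to label any four-digit model year followed by a capitalized word as a Vehicle entity. The pattern, the `VEHICLE_RULE` name, and the `find_vehicles` function are assumptions made for this example; the Knowledge Studio rules editor has its own way of defining rules, which this sketch does not reproduce.

```python
import re

# Hypothetical rule: a year (19xx or 20xx) followed by one capitalized word.
VEHICLE_RULE = re.compile(r"\b(?:19|20)\d{2} [A-Z][a-z]+\b")


def find_vehicles(text):
    """Return (begin, end, matched_text) for every span matching the rule."""
    return [(m.start(), m.end(), m.group()) for m in VEHICLE_RULE.finditer(text)]


sentence = "ABC Motors has received great reviews for its new 2020 Lightning."
print(find_vehicles(sentence))
```

Unlike a machine learning model, a rule such as this one matches only the exact pattern it describes, so it is predictable but does not generalize to vehicle names written in other forms.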
Analyze text with advanced rules
The advanced rules feature is in beta. It is in a trial stage of development and is not recommended for use in production environments.
The visual advanced rules editor lets you create text extractors with deeper customization than is available in the entities and relations rules editor. Several sample extractors are provided, such as Finance Actions extractors and Parts of Speech extractors, which you can edit and combine to create your own advanced rules model. You can analyze documents directly in the editor, or you can export your model to use with other services such as Natural Language Understanding.
To get started, see Creating an advanced rules model.
FISMA support
Federal Information Security Management Act (FISMA) support is available for IBM Watson® Knowledge Studio for IBM Cloud Pak for Data offerings purchased on or after August 30, 2019. FISMA support is also available to those who purchased the June 28, 2019 version and upgrade to the August 30, 2019 version. IBM Watson® Knowledge Studio for IBM Cloud Pak for Data is FISMA High Ready.
Watson services integration
Share domain artifacts and models between Knowledge Studio and other Watson services available on IBM Cloud Pak for Data.
Use Knowledge Studio to perform the following tasks:
- Upload analyzed documents that are in UIMA CAS XMI format. For example, you can upload UIMA CAS XMI files that were exported from IBM Watson Explorer content analytics collections or IBM Watson Explorer Content Analytics Studio.
- Export a machine learning or rule-based model to use with Watson Discovery for IBM Cloud Pak for Data.
- Export a machine learning model to use with Natural Language Understanding for IBM Cloud Pak for Data.
- Export a machine learning model to use in IBM Watson Explorer.
- Export a rule-based model PEAR file to use in IBM Watson Explorer.