IBM Cloud Docs
About

Use IBM Watson® Knowledge Studio to create a machine learning model that understands the linguistic nuances, meaning, and relationships specific to your industry or to create a rule-based model that finds entities in documents based on rules that you define.

Identify custom entities and relations

An entities and relations workspace enables you to create your own entity type system and train a custom model that can recognize custom entities in text. With machine learning models, you can also define custom relation types and train the model to recognize when two entities are related. Consider the following example sentence.

ABC Motors has received great reviews for its new 2020 Lightning.

A custom entities and relations model could be trained to recognize "2020 Lightning" as a Vehicle entity and "ABC Motors" as a Manufacturer entity. The model could also be trained to recognize that the two entities are connected by an isManufacturedBy relation.
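To make the example concrete, the following minimal sketch shows one way the recognized mentions and their relation could be represented as data. It is an illustration only, not the output format of Knowledge Studio: the class names, the field names, the `find_mention` helper, and the direction of the relation (vehicle as source, manufacturer as target) are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityMention:
    text: str         # surface text of the mention
    entity_type: str  # type taken from the custom type system
    start: int        # character offset where the mention begins
    end: int          # character offset just past the mention


@dataclass(frozen=True)
class RelationMention:
    relation_type: str
    source: EntityMention
    target: EntityMention


def find_mention(sentence: str, text: str, entity_type: str) -> EntityMention:
    """Hypothetical helper: locate `text` in `sentence` and wrap it as a mention."""
    start = sentence.index(text)  # raises ValueError if the text is absent
    return EntityMention(text, entity_type, start, start + len(text))


sentence = "ABC Motors has received great reviews for its new 2020 Lightning."
manufacturer = find_mention(sentence, "ABC Motors", "Manufacturer")
vehicle = find_mention(sentence, "2020 Lightning", "Vehicle")

# Assumed direction: the vehicle isManufacturedBy the manufacturer.
relation = RelationMention("isManufacturedBy", source=vehicle, target=manufacturer)
```

Storing character offsets alongside the text keeps each mention tied to an exact span of the sentence, which matters when the same string appears more than once in a document.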

Build a machine learning model

Knowledge Studio provides easy-to-use tools for annotating unstructured domain literature, and uses those annotations to create a custom machine learning model that understands the language of the domain. The accuracy of the model improves through iterative testing, ultimately resulting in an algorithm that can learn from the patterns that it sees and recognize those patterns in large collections of new documents. You can deploy the finished machine learning model to other Watson cloud-based offerings and cognitive solutions to find and extract mentions of relations and entities, including entity coreferences.

Figure 1. Overview of the process to build a machine learning model

  1. Based on a set of domain-specific source documents, the team creates a type system that defines entity types and relation types for the information of interest to the application that will use the model.
  2. A group of two or more human annotators annotates a small set of source documents. The annotators label words that represent entity types, identify relation types where the text expresses a relationship between entity mentions, and define coreferences, which link different mentions that refer to the same entity. Any inconsistencies in annotation are resolved, and a single set of optimally annotated documents is built, which forms the ground truth.
  3. Knowledge Studio uses the ground truth to train a model.
  4. The trained model is used to find entities, relations, and coreferences in new, never-seen-before documents.
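The annotation step in this process depends on finding where annotators disagree before ground truth is built. The following sketch illustrates that idea with plain Python sets. It is not how Knowledge Studio implements adjudication: the `Annotation` tuple, the `compare_annotators` function, and the sample offsets are assumptions chosen for illustration.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    start: int        # character offset where the labeled span begins
    end: int          # character offset just past the labeled span
    entity_type: str  # entity type from the type system


def compare_annotators(a: set, b: set):
    """Split two annotators' labels into agreed and disputed annotations."""
    agreed = a & b    # identical span and type from both annotators
    disputed = a ^ b  # labeled by only one annotator, or labeled differently
    return agreed, disputed


# Both annotators label "ABC Motors", but they disagree on the vehicle span:
# one marks "2020 Lightning" (50-64), the other only "Lightning" (55-64).
annotator_1 = {Annotation(0, 10, "Manufacturer"), Annotation(50, 64, "Vehicle")}
annotator_2 = {Annotation(0, 10, "Manufacturer"), Annotation(55, 64, "Vehicle")}

agreed, disputed = compare_annotators(annotator_1, annotator_2)

# Disputed annotations need a human decision before they join the ground truth.
for annotation in sorted(disputed):
    print("needs review:", annotation)
```

In this sketch, only exact matches count as agreement, so the two overlapping Vehicle spans are both flagged for review rather than merged automatically.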

See Creating a machine learning model for more details.

Build a rule-based model

Knowledge Studio provides a rules editor that simplifies the process of finding and capturing common patterns in your documents as rules. You can then create a model that recognizes the rule patterns, and deploy it for use in other services.
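As a rough illustration of what capturing a pattern as a rule means, the following sketch uses a Python regular expression. It does not reflect the rules editor or the rule format that Knowledge Studio uses. The pattern, the `apply_rule` function, and the choice of the Vehicle type are hypothetical.

```python
import re

# Hypothetical rule: a four-digit model year followed by a capitalized
# model name is treated as a Vehicle mention.
VEHICLE_RULE = re.compile(r"\b(?:19|20)\d{2} [A-Z][a-z]+\b")


def apply_rule(text: str) -> list:
    """Return (start, end, entity_type) for every span that matches the rule."""
    return [(m.start(), m.end(), "Vehicle") for m in VEHICLE_RULE.finditer(text)]


sentence = "ABC Motors has received great reviews for its new 2020 Lightning."
matches = apply_rule(sentence)
for start, end, entity_type in matches:
    print(entity_type, repr(sentence[start:end]))
```

Unlike a machine learning model, a rule like this finds only text that fits the pattern exactly, which makes its behavior predictable but limits it to patterns that you can describe in advance.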

See Creating a rule-based model for more details.

Analyze text with advanced rules

The advanced rules feature is a beta feature. It is in a trial stage of development and is not intended for use in production environments.

The visual advanced rules editor lets you create text extractors that can be customized more deeply than those in the entities and relations rules editor. A number of sample extractors, such as Finance Actions extractors and Parts of Speech, are provided. You can edit and combine them to create your own advanced rules model. You can analyze documents directly in the editor, or you can export your model to use with other services such as Natural Language Understanding.

To get started, see Creating an advanced rules model.

Identify custom categories

The custom categories feature is experimental. The feature is subject to change or discontinuation on short notice. Custom categories are not intended for use in production environments.

With a categories workspace, you can define a hierarchy of custom content categories and provide relevant key phrases that the service uses to categorize text content. You can deploy a custom categories model to use instead of the standard categories offered by Natural Language Understanding and Discovery.
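The following sketch shows one way to picture a category hierarchy with key phrases. It is a simplified assumption, not the format of a categories workspace and not the way the service categorizes text: the `CATEGORIES` structure, the sample phrases, and the naive substring matching in `categorize` are all hypothetical.

```python
# Hypothetical hierarchy: each category has key phrases and child categories.
CATEGORIES = {
    "vehicles": {
        "key_phrases": ["car", "truck"],
        "children": {
            "electric": {
                "key_phrases": ["battery range", "charging station"],
                "children": {},
            },
        },
    },
}


def categorize(text: str, tree: dict, path: tuple = ()) -> list:
    """Return the paths of all categories whose key phrases appear in the text."""
    lowered = text.lower()
    hits = []
    for name, node in tree.items():
        current = path + (name,)
        # Naive check: a category matches if any key phrase is a substring.
        if any(phrase in lowered for phrase in node["key_phrases"]):
            hits.append("/".join(current))
        # Child categories are checked regardless of whether the parent matched.
        hits.extend(categorize(text, node["children"], current))
    return hits


labels = categorize("The new model has a longer battery range.", CATEGORIES)
print(labels)
```

Substring matching is used here only to keep the sketch short. It would also match "car" inside "card", so a real matcher would need at least word-boundary checks.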

Watson services integration

Share domain artifacts and models between IBM Watson® Knowledge Studio and other Watson services.

Use Knowledge Studio to perform the following tasks:

HIPAA support

US Health Insurance Portability and Accountability Act (HIPAA) support is available for Premium plans that were created in the Washington, DC location on or after 1 April 2019. See Enabling EU and HIPAA supported settings for more information.