About Watson Discovery

IBM Watson® Discovery is an intelligent document processing engine that helps you to gain insights from complex business documents.

Use Discovery to visually train AI for deep understanding of your content, including tables and images, to help you find business value that is hidden in your enterprise data. Use natural language or structured queries to find relevant answers, surface insights, and build AI-enhanced business processes anywhere.

Start by connecting your data to Discovery. Next, teach Discovery to understand the language and concepts that are unique to your business and industry. Enrich your data with award-winning Watson Natural Language Processing (NLP) technologies so you can identify key information and patterns. Finally, build search solutions that find answers to queries, explore your data to uncover patterns and insights, and leverage search results in automated workflows.

Figure: How to use Watson Discovery. Connect and ingest data from existing content repositories. Teach Discovery to interpret your data. Enrich data with custom NLU to help it fit your domain. Next, choose whether to:
1. Use the query API to get search results by retrieving relevant data in your applications.
2. Explore your data in Watson Assistant, with the Discovery API, or from the Content Mining application.
3. Submit queries from the product user interface to ask a question in natural language.
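As a rough illustration of the query API option, the following Python sketch builds a natural language query request without sending it. It assumes the Discovery v2 REST query endpoint (`/v2/projects/{project_id}/query`) and the `2023-03-31` version date. The service URL, the project ID, and the `build_query_request` helper are hypothetical placeholders, and authentication is left out. Check the API reference for your deployment before you rely on any of these details.

```python
import json
import urllib.parse
import urllib.request


def build_query_request(service_url, project_id, question,
                        collection_ids=None, count=5, version="2023-03-31"):
    """Build (but do not send) a POST request for a natural language query.

    The endpoint path and the version date are assumptions based on the
    Discovery v2 API. Authentication headers are intentionally omitted.
    """
    url = "{}/v2/projects/{}/query?{}".format(
        service_url.rstrip("/"),
        urllib.parse.quote(project_id, safe=""),
        urllib.parse.urlencode({"version": version}),
    )
    # Request body: the question plus the maximum number of results.
    body = {"natural_language_query": question, "count": count}
    if collection_ids:
        # Optionally restrict the search to specific collections.
        body["collection_ids"] = list(collection_ids)
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_query_request(
    "https://discovery.example.com",  # hypothetical service URL
    "my-project-id",                  # hypothetical project ID
    "How do I reset my password?",
)
print(request.get_method(), request.full_url)
```

To send the request, you would add the credentials for your instance and pass the object to `urllib.request.urlopen`. The official SDKs wrap these steps and are usually the more convenient choice.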

Find out how Discovery is transforming data into artist insights at the 2023 GRAMMYs®. Read the IBM Business Operations blog post to learn more.

Overview video

Watch a video about how Discovery uses AI-powered search, retrieval, and content mining. This overview covers the key basics of projects, collections, fields, and enrichments. It explains how to upload your data and query for answers, find insights, and spot trends.
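The four concepts that the video covers (projects, collections, fields, and enrichments) relate to each other roughly as in the following Python sketch. The class names and attributes are hypothetical and exist only to illustrate the relationships. They are not part of the Discovery API or its SDKs.

```python
from dataclasses import dataclass, field


@dataclass
class Enrichment:
    """An AI capability that is applied to one or more fields."""
    name: str      # for example, "Entities"
    fields: list   # names of the fields that the enrichment is applied to


@dataclass
class Collection:
    """A set of documents that are uploaded or crawled from a data source."""
    name: str
    # Each document is modeled as a dict that maps a field name to a value.
    documents: list = field(default_factory=list)
    enrichments: list = field(default_factory=list)

    def field_names(self):
        """Return the sorted union of field names across all documents."""
        names = set()
        for document in self.documents:
            names.update(document)
        return sorted(names)


@dataclass
class Project:
    """A space that groups collections and is queried for answers."""
    name: str
    project_type: str  # for example, "Document Retrieval"
    collections: list = field(default_factory=list)


project = Project(
    name="Support search",
    project_type="Document Retrieval",
    collections=[
        Collection(
            name="Product manuals",
            documents=[
                {"author": "A. Writer", "file_type": "pdf", "text": "Reset steps"},
                {"file_type": "html", "text": "Warranty terms"},
            ],
            enrichments=[Enrichment(name="Entities", fields=["text"])],
        )
    ],
)
print(project.collections[0].field_names())  # ['author', 'file_type', 'text']
```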

Video transcript

Get started with Watson Discovery presented by David Williams - (Music intro) Welcome to Watson Discovery with AI.

In this video, we'll walk through some key concepts and show you how to get started.

Watson Discovery is made up of four main concepts: projects, collections, fields, and enrichments.

A project is a space where you can import different types of data from a variety of sources, and query for insights or answers.

A collection is a set of documents that you upload or crawl from a connected data source.

As documents are crawled, unstructured text is organized into fields such as author, file type, text, and more.

And enrichments are AI capabilities that you can apply to fields to identify and extract relevant information from your documents. This helps you find answers or insights from your data.

Let's dive into the different project types.

A document retrieval project is used to build an AI-powered search function that finds answers in your business data.

A conversational project is used to enhance your chatbot's question and answer ability.

A content mining project helps you spot trends across large volumes of text-heavy business data.

Watson Discovery supports a wide selection of data sources you can crawl, like webpages, Cloud Object Storage, Microsoft SharePoint, and more. You can even upload your own data from any data source.

After connecting and processing your data, you can apply enrichments to bring your data to life. Some commonly used enrichments are entities, contracts, and table understanding. Entities enrichment can be used to recognize people, organizations, and more. Contracts enrichment can be used to decompose contracts to fields, clauses, and relationships. The table understanding enrichment can be used to identify tables and return them as an answer to a query.

You can also create custom enrichments, such as a dictionary, so Discovery can understand your industry-specific terminology and support intelligent queries.

Now you know the basics.

To get started, take our step-by-step product tour to get familiar with the user interface and sample project.

Using Discovery

Discovery can be deployed as a managed cloud service or can be installed on premises. This documentation describes how to use the product regardless of how it is deployed. Information that applies exclusively to one deployment type is denoted by the appropriate icon:

IBM Cloud Pak for Data, IBM Software Hub: for installed instances, such as IBM Watson® Discovery Cartridge for IBM Cloud®.
IBM Cloud: for managed instances, such as Discovery Plus, Enterprise, and Premium plan instances that are hosted by IBM Cloud or instances that are provisioned with IBM Cloud Pak for Data as a Service.

Click the Help icon from the header of any page in the product user interface to open the Discovery documentation.

Browser support

IBM Cloud Pak for Data IBM Software Hub

The minimum required browser software for the product user interface includes the following browsers:

Google Chrome

Latest version -1 for your operating system

Mozilla Firefox

Latest regular -1 and Extended Support Release (ESR) version for your operating system

Microsoft Edge

Latest version -1 for Windows

Apple Safari

Latest version -1 for Mac

The IBM Cloud Pak for Data web client where you create service instances supports the IBM Cloud Pak for Data requirements. For more information, see Supported web browsers.

IBM Cloud

Deployments of Discovery that are managed by IBM Cloud follow the IBM Cloud requirements. For more information, see Prerequisites.
For more information about browser support for deployments that are provisioned with Cloud Pak for Data as a Service, see Which web browsers are supported for Cloud Pak for Data as a Service.

Language support

Language support by feature is detailed in the Supported languages topic.

Beta features

IBM releases services, features, and language support for your evaluation that are classified as beta. These features might be unstable, might change frequently, and might be discontinued with short notice. Beta features also might not provide the same level of performance or compatibility that generally available features provide and are not intended for use in a production environment.

Terms and notices

IBM Cloud

IBM Cloud Pak for Data

Security on Cloud Pak for Data

Trademarks are listed on the Trademarks page for all IBM Cloud services.