Getting the most from Discovery
Discovery was redesigned to introduce new features and a simpler way to build solutions.
The redesigned product is referred to as Discovery v2. When you create an instance on IBM Cloud or install and provision an instance on IBM Cloud Pak for Data, you get the new and improved version of Discovery.
Advantages of using the latest version
Discovery v2 offers the following features and enhancements:
- A project-based experience that supports many different use cases within a single environment.
- Built-in customization tools for adding dictionaries, patterns, and classifiers to help business users build projects that understand the language of their domain.
- Connectors to popular data sources that can quickly access valuable data where it resides.
- Smart Document Understanding that learns from the structure of human-readable documents, such as PDFs.
- Natural language query support across all document types, optimized with machine learning to find targeted answers.
- Advanced search capabilities, such as answer finding, curations, and table retrieval.
- An out-of-the-box contract understanding function that helps you search and interpret legal contracts.
- A full-featured Content Mining application that you can use to conduct in-depth analysis of unstructured text.
- Customizable user interface components that help you to deploy custom applications.
For more information, see Migrating to Discovery v2.
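The query-related features in the list above (natural language queries, answer finding, and table retrieval) can be sketched as a single request to the v2 query endpoint. The following is a minimal illustration rather than an official client: the service URL, project ID, and version date are placeholder assumptions, and the helper `build_query_request` is hypothetical. It only builds the request and does not send it.

```python
import json
from urllib.parse import urlencode


def build_query_request(service_url, project_id, question, version="2023-03-31"):
    """Build the URL and JSON body for a v2 project query (nothing is sent).

    service_url, project_id, and version are placeholders supplied by the caller.
    """
    base = service_url.rstrip("/")
    url = f"{base}/v2/projects/{project_id}/query?{urlencode({'version': version})}"
    body = {
        # Plain-language question instead of a structured query
        "natural_language_query": question,
        "count": 5,
        # Ask for passages and, within them, concise answers
        "passages": {"enabled": True, "find_answers": True},
        # Ask for matching tables as well
        "table_results": {"enabled": True},
    }
    return url, body


url, body = build_query_request(
    "https://api.example.test/instances/demo",  # hypothetical service URL
    "my-project-id",                            # hypothetical project ID
    "What is the termination notice period?",
)
print(url)
print(json.dumps(body, indent=2))
```

To run the query, send `body` as JSON in a POST request to `url` with your instance credentials. Which options are honored can depend on the project type and plan.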
Comparing v1 and v2 features
If you are already familiar with Discovery v1, learn more about how Discovery v2 compares.
Discovery v2 has new features that were previously unavailable. The following table describes feature support in both versions.
Feature | Product redesign (v2) | Earlier version (v1) |
---|---|---|
Use projects to organize your work | ||
Use the Smart Document Understanding (SDU) tool to annotate your documents | ||
Leverage intuitive user interface tools to add domain-specific artifacts, such as dictionaries and custom machine learning models | ||
Create a content mining project type and then use the built-in Content Mining application to do in-depth data analysis (IBM Cloud Pak for Data, Enterprise, and Premium plans only) | ||
Perform real-time NLP with the Analyze API (IBM Cloud Pak for Data and Enterprise plans only) | ||
Apply a pretrained Smart Document Understanding model to your collection for similar benefits with less effort | ||
Process text from scanned documents or other images | ||
Extract meaning from tables | ||
Get insights from contracts (IBM Cloud Pak for Data, Enterprise, and Premium plans only) | ||
Apply the Part of Speech enrichment to your data | ||
Use the Entity Extraction, Document and Phrase Sentiment Analysis, and Keyword Extraction enrichments | ||
Use the Category Classification, Concept Tagging, Relation Extraction, Emotion Analysis, Semantic Role Extraction, and Sentiment of Keywords and Entities enrichments, which are available with the Natural Language Understanding service | ||
Build a custom entity type system | ||
Apply Watson Knowledge Studio NLP models to your data | ||
Support for more connectors from an IBM Cloud Pak for Data deployment, including databases, file systems, FileNet P8, and HCL Notes | ||
Some connectors support document-level security from an IBM Cloud Pak for Data deployment | ||
Programmatically configure external data source crawls | ||
Configure the normalization processes of document segmentation and HTML file inclusion or exclusion rules during ingestion | ||
Configure the JSON normalization process during ingestion and after enrichment | ||
Configure dictionary tokenization | ||
Advanced question-answering capabilities, such as returning the exact answer | ||
Discovery Query Language (DQL) API support | ||
Retrieve passages from documents | ||
Perform relevancy training to improve query results | ||
Configure continuous relevancy training | ||
Retrieve tables | ||
Query result deduplication | ||
Identify document similarity in query results | ||
Indicate a preference (bias) in queries | ||
Review query logging and metrics | ||
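As a rough illustration of the Discovery Query Language (DQL) row in the table, the sketch below composes a DQL string for the `query` parameter. It assumes the DQL operators `:` (contains), `::` (exact match), and `,` (and). The helpers `dql_term` and `dql_and` are hypothetical, and the field names are examples that might not exist in your collection.

```python
def dql_term(field, value, exact=False):
    """Return one DQL condition; '::' requests an exact match, ':' a contains match."""
    operator = "::" if exact else ":"
    escaped = value.replace('"', '\\"')  # escape embedded double quotes
    return f'{field}{operator}"{escaped}"'


def dql_and(*terms):
    """Join conditions with ',' so that all of them must match."""
    return ",".join(terms)


query = dql_and(
    dql_term("enriched_text.entities.text", "IBM"),               # example field
    dql_term("extracted_metadata.file_type", "pdf", exact=True),  # example field
)
request_body = {"query": query, "count": 10}
print(request_body["query"])
```

Building the string through small helpers keeps quoting and escaping in one place, which makes longer queries easier to review.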
Limit details
For more information about artifact limits per plan, see the feature documentation:
- Advanced rules model limits
- Classifier limits
- Collection limits
- Dictionary limits
- Document limits
- Entity extractor limits
- Machine Learning model limits
- Pattern limits
- Project limits
- Query limits
- Regular expression limits
- SDU limits
- External enrichment limits
The following limits apply only to Content Mining project types:
To check the current status of the limits and usage for your plan type, you can open the Plan limits and usage page at any time.
- From the product page header, click the user icon.
  The Usage section shows a short summary.
- Click View all to see usage information for all of the plan limit categories.
To leave the page, click the web browser back button or the My Projects tab.