High availability and disaster recovery

IBM Watson® Discovery is highly available in all IBM Cloud® regions where Discovery is offered. However, recovering from potential disasters that affect an entire region requires planning and preparation.

IBM Cloud

This information applies only to managed deployments.

You are responsible for understanding your customization and usage of the service. You are also responsible for being ready to re-create an instance of the service in a new region and to restore your data in any region. See How do I ensure zero downtime? for more information.

High availability

IBM Watson® Discovery supports high availability with no single point of failure. The service achieves high availability automatically and transparently by using the multi-zone region (MZR) feature provided by IBM Cloud.

IBM Cloud enables multiple zones that do not share a single point of failure within a single location. It also provides automatic load balancing across the zones within a region.

Disaster recovery

Disaster recovery can become an issue if an IBM Cloud region experiences a significant failure that includes the potential loss of data. Because MZR is not available in all regions, wait for IBM to bring a region back online if it becomes unavailable. If underlying data services are compromised by the failure, also wait for IBM to restore those data services.

If a catastrophic failure occurs, IBM might not be able to recover data from database backups. In this case, you need to restore your data to return your service instance to its most recent state. You can restore the data to the same or to a different region.

Your disaster recovery plan includes knowing, preserving, and being prepared to restore all data that is maintained on IBM Cloud.

Backing up your data in Watson Discovery

You have several methods to back up the data that is stored in IBM Watson® Discovery. You can back up the following data in your disaster recovery plan:

Data as a copy of the source documents
Data that is imported and exported from Discovery

You can back up the resources that are listed in the following table to prevent data loss during disaster recovery.

Resource recovery support details
This table has row and column headers. The row headers identify resources. The column headers identify the different types of recovery support. To understand which recovery methods are supported for a resource, go to the row that describes the resource, and find the column for the recovery method that you are interested in.
Resource	Backup process	UI support for resource download	UI support for resource upload	API support
Uploaded and crawled files	Securely store backups of all source documents for the ingested data. For more information, see Ingested documents. Preserve the credentials of your data sources so that you can reconnect to the original data sources and recrawl them. For more information, see Restoring connections to external data sources. Also see the note that follows this table.	No	No	Download - No Upload - No
Relevancy training data	Download all training queries and examples and save them locally to back up the training data. For more information, see Training data.	No	No	Download - Yes Upload - Yes
Expansion list	Download the expansion lists through the API and store them locally so that you can restore them. For more information, see Expansion lists.	No	Yes	Download - Yes Upload - Yes
Stop words list	Download the stop words lists through the API and store them locally so that you can restore them. For more information, see Stopwords.	No	Yes	Download - Yes Upload - Yes
Smart Document Understanding user-trained model	Export your models and store them locally to back them up. For more information, see Restoring Smart Document Understanding models.	Yes	Yes	Download - No Upload - No
Text classifier enrichment	Back up the .csv files of the classifier and store them locally. For more information, see Text classifier.	No	Yes	Download - No Upload - No
Sentence classifier enrichment from sentence labeling	Download the .sc files of the sentence classifier models that you created from the sentence labeling UI. For more information, see Restoring sentence classifiers.	Yes	Yes	Download - No Upload - No
Document classifier enrichment	Back up the .csv files of the classifier and store them locally. For more information, see Content Mining application resources.	No	Yes	Download - No Upload - Yes
Dictionary enrichment	Download the .csv files of the dictionaries that you created by using the Discovery UI or Content Mining application. For more information, see Restoring dictionary enrichments. If you created a dictionary by uploading .csv files, you cannot download that dictionary as .csv files.	Yes	Yes	Download - No Upload - Yes
Entity extractor enrichment	Download the entity extractor models and store them locally to back up the models. For more information, see Restoring entity extractors.	Yes	Yes	Download - No Upload - No
Regular expression enrichment	Keep a record of the regular expressions that you specified in the Discovery UI. For more information, see Regular expressions enrichments. You can create the regular expression enrichment through the API, but you cannot download it.	No	Yes	Download - No Upload - Yes
Pattern enrichment	Download the pattern enrichments as .zip files and store them locally. For more information, see Pattern enrichments and Restoring pattern enrichments.	Yes	Yes	Download - No Upload - No
Advanced rules enrichment	Back up the original model files as .zip files and store them locally. For more information, see Advanced rules models enrichment.	No	Yes	Download - No Upload - No
Rule-based models created in Knowledge Studio	Back up the original model files as .pear files and store them locally. For more information, see Use imported ML models to find custom terms.	No	Yes	Download - No Upload - Yes
Machine learning models created in Knowledge Studio	Back up the original model files as .zip files and store them locally. For more information, see Use imported ML models to find custom terms.	No	Yes	Download - No Upload - Yes
Custom UIMA text analysis models (IBM Cloud Pak for Data and IBM Software Hub)	Back up the original model files as .pear files and store them locally. For more information, see Use imported ML models to find custom terms.	No	Yes	Download - No Upload - Yes

You cannot subsequently download files that you add to Discovery because the original files are not stored in Discovery. However, you can retrieve information from the file that is stored in the collection index when the original file is processed. Use the Query API to submit a query that will return a passage from the file of interest, and then check the response body for data from the file. For example, for some file types, text from the original file is stored in the text field.
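As a rough illustration of checking a query response for recovered text, the following Python sketch pulls the text field out of a response body that you already retrieved. The response shape used here (a top-level results list whose items carry document_id and, for some file types, text as a string or a list of strings) is an assumption for illustration, and the sample data is hypothetical. Check the Query API reference for the exact fields that your collection returns.

```python
from typing import Any


def extract_text(query_response: dict[str, Any]) -> dict[str, str]:
    """Map each document_id in a query response to the text stored for it.

    Assumed shape: {"results": [{"document_id": ..., "text": ...}, ...]},
    where "text" may be absent, a string, or a list of strings.
    """
    recovered: dict[str, str] = {}
    for result in query_response.get("results", []):
        text = result.get("text")
        if text is None:
            continue  # no text field was stored for this document
        if isinstance(text, list):
            text = "\n".join(str(part) for part in text)
        recovered[result["document_id"]] = str(text)
    return recovered


# Hypothetical response body, trimmed to the fields used above.
sample = {
    "matching_results": 2,
    "results": [
        {"document_id": "doc-1", "text": ["First page.", "Second page."]},
        {"document_id": "doc-2"},
    ],
}
print(extract_text(sample))
```

Text that is recovered this way is only what the index kept. It is not a substitute for a backup of the original file.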

For information about resources that are created with the Content Mining application, see Content Mining resources.

Sentence classifier enrichments

To back up sentence classifier models that were created from the sentence labeling UI, download the models and store them locally. A model must be fully trained before it can be downloaded. For more information, see Downloading the sentence classifier model.

Ingested documents

Your uploaded documents are converted, enriched, and stored in the search index. If a disaster occurs, the search index is not recoverable. Store a backup of all your source documents in a safe place.
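One possible way to confirm that a source document backup is complete and unchanged is to keep a checksum manifest next to it. The following Python sketch is not part of Discovery. The function names and the backup directory layout are hypothetical, and it uses only the Python standard library.

```python
import hashlib
from pathlib import Path


def build_manifest(backup_dir: Path) -> dict[str, str]:
    """Return {relative path: SHA-256 hex digest} for every file under backup_dir."""
    manifest: dict[str, str] = {}
    for path in sorted(backup_dir.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(backup_dir).as_posix()] = digest
    return manifest


def changed_files(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """List files from the old manifest that are missing or different in the new one."""
    return sorted(name for name, digest in old.items() if new.get(name) != digest)
```

For example, you might build a manifest when you create the backup, build another before you reingest the documents, and investigate anything that changed_files reports.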

If you also import documents by doing scheduled crawls of external data sources, you might want to retain your data source credentials externally so that you can reestablish the connection to your data sources quickly. For the list of available sources and the credentials that are needed for each one, see Configuring IBM Cloud data sources.

You can get some of the text that was stored in the index when the original document was ingested by using the Query API. For more information, see Recovering documents.

Training data

Training data is used for explicit training of your projects and is stored on a per-project basis. To back up the training data queries and examples for a trained project, use the API to download the queries and ratings from Discovery:

Download your training data by using the list training queries API.
Save your training queries and examples locally.
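The second step can be sketched in Python as follows. The sketch assumes that you already retrieved the response body from the list training queries API and that the body has a queries list whose items carry natural_language_query, an optional filter, and examples with document_id, collection_id, and relevance. Those field names are assumptions, so verify them against the API reference. The function name save_training_backup is hypothetical.

```python
import json
from pathlib import Path
from typing import Any


def save_training_backup(response_body: dict[str, Any], path: Path) -> int:
    """Write the training queries to a JSON file and return how many were saved.

    Keeps only the fields needed to re-create each query. Timestamps and
    server-assigned query IDs are dropped.
    """
    queries = []
    for query in response_body.get("queries", []):
        entry: dict[str, Any] = {
            "natural_language_query": query["natural_language_query"],
            "examples": [
                {
                    "document_id": example["document_id"],
                    "collection_id": example["collection_id"],
                    "relevance": example["relevance"],
                }
                for example in query.get("examples", [])
            ],
        }
        if query.get("filter"):
            entry["filter"] = query["filter"]
        queries.append(entry)
    path.write_text(json.dumps({"queries": queries}, indent=2), encoding="utf-8")
    return len(queries)
```

The document and collection IDs are kept in the file because the restored training data depends on them.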

The document IDs that you use in your training data point to the documents in your current project. Use the same IDs in your new projects to ensure that the correct documents are referenced. If the IDs do not match, your restored relevancy training will not work.

Expansion lists

If you are using synonyms (query expansions) for query modification, back up your .json expansion list, and store it locally. For more information, see Implementing synonyms.
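Before you rely on a stored expansion list, it can help to confirm that the file still has the expected structure. The following Python sketch assumes that the list is a JSON object with a top-level expansions array, where each entry has a non-empty expanded_terms list and an optional input_terms list. That shape is an assumption, so confirm it in Implementing synonyms. The function name validate_expansions is hypothetical.

```python
from typing import Any


def validate_expansions(payload: dict[str, Any]) -> list[str]:
    """Return a list of problems found in an expansion list (empty if none)."""
    expansions = payload.get("expansions")
    if not isinstance(expansions, list):
        return ['missing top-level "expansions" list']
    problems: list[str] = []
    for index, entry in enumerate(expansions):
        if not isinstance(entry, dict):
            problems.append(f"entry {index}: expected an object")
            continue
        expanded = entry.get("expanded_terms")
        if not isinstance(expanded, list) or not expanded:
            problems.append(f"entry {index}: expanded_terms must be a non-empty list")
        inputs = entry.get("input_terms")
        if inputs is not None and not isinstance(inputs, list):
            problems.append(f"entry {index}: input_terms must be a list when present")
    return problems
```

To check a backup, you might load the file with json.load and pass the result to validate_expansions. An empty result means that no structural problems were found, not that the synonyms themselves are correct.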

Stopwords

If you use custom stopwords, back up the stopwords text file, and store it locally. For more information about stopwords, see Defining stopwords.

Collection information

Although it is not required, it is a best practice to retrieve the status of each collection regularly and store the information locally. By retaining these statistics, you can later verify that your restoration processes were successful.
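As a minimal sketch of that verification, the following Python function compares two locally stored snapshots. The snapshot format, a mapping of collection name to document count, is a hypothetical simplification of whatever status information you choose to store. The function name and sample values are also hypothetical.

```python
from typing import Optional


def snapshot_diff(
    before: dict[str, int], after: dict[str, int]
) -> dict[str, tuple[Optional[int], Optional[int]]]:
    """Return {collection name: (old count, new count)} for every mismatch.

    A collection that is missing from one snapshot is reported with None
    in place of its count.
    """
    diff: dict[str, tuple[Optional[int], Optional[int]]] = {}
    for name in sorted(set(before) | set(after)):
        old, new = before.get(name), after.get(name)
        if old != new:
            diff[name] = (old, new)
    return diff


# Hypothetical snapshots taken before the failure and after the restore.
before = {"manuals": 1200, "faq": 85}
after = {"manuals": 1200, "faq": 80}
print(snapshot_diff(before, after))
```

An empty result suggests that the restored collections contain the same number of documents as before. It does not prove that the document contents match.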

Smart Document Understanding models

If you use Smart Document Understanding (SDU), you have models that are associated with your configuration. To avoid loss of this information, export your models, back them up, and store them locally. SDU models have the file extension of .sdumodel.

Dictionary enrichments

Open your project, and click Improve and customize.
On the Improvement tools panel, click Teach domain concepts and then Dictionaries.
Click the download icon next to your dictionary. Your dictionary then downloads as a .csv file.

Regular expressions enrichments

Back up your regular expressions as a .csv file, and store them locally. Note the regular expressions that you specified to create your enrichments so that you can re-create the enrichments from them. For more information, see Regular expressions.

Machine learning enrichments

Back up your machine learning model .zip or .pear files, and store them locally. For more information, see Machine learning enrichments and Watson Explorer Content Analytics Studio models.

Pattern enrichments

Open your project.
On the Improvement tools panel of the Improve and customize page, click Teach domain concepts and then Patterns (Beta).
Click the download icon next to your pattern. Your pattern model then downloads as a .zip file.

Advanced rules models enrichment

Back up your model files as .zip files, and store them locally. For more information, see Advanced rules models.

Classifier enrichments

Back up your classifier .csv files, and store them locally. For more information, see Classifier.

Entity extractor enrichments

To back up entity extractor models, download the models and store them locally. A model must be fully trained before it can be downloaded. For more information, see Exporting the entity extractor.

Content Mining application resources

You cannot back up certain data types and must manually re-create them. There are several Content Mining custom user resources that the application does not automatically back up. If data loss occurs, you must either manually re-create the following custom user resources in the Content Mining application or upload a locally saved file that contains the resource:

Custom map
Searched document export: You can export a searched document in the Documents view in the Content Mining application, but you cannot reupload it in the application.
Facet analysis result export: You can download the results of your facet analysis by clicking the Export icon, then Export results, and Export in the Analysis export options dialog box.
Collection: You can restore a Content Mining collection if you stored the collection locally as a .csv file and then upload it in the application. Otherwise, you must manually re-create the collection.
Document classifier: You can restore a document classifier if you stored the document classifier locally as a .csv file and then upload it in the application. Otherwise, you must manually re-create the document classifier.
Custom annotators
- Dictionary: You can restore a dictionary in the application if you stored the dictionary locally as a .csv file and upload it in the application.
- Regular expressions: You can restore a regular expression in the application if you stored the regular expression locally as a .csv file and upload it in the application.
- Machine learning models: You can restore a machine learning model if you stored the model locally as a .zip file and then upload it in the application.
- PEAR file: You can restore a PEAR file if you stored the .pear file locally and then upload it in the application.

You cannot back up the following resources locally and must re-create them in the Content Mining application:

Saved analysis
Report
Dashboard

Restoring your data to a new Watson Discovery instance

Consider using your backups to restore to a new Discovery instance in a different data center, also known as a region or location.

To begin restoration, review your list of collections and associated data sources, as well as your file backups. Then complete the following steps:

Create your projects and collections. Use the Discovery tooling, or the API. See Create a project and Create a collection.
Add your stopwords back to the collections. See Defining stopwords.
If you use custom query expansion, add your query expansions. See Implementing synonyms.
If you use any custom entity models from IBM Watson® Knowledge Studio for enrichment, reimport that model into your Discovery instance. For details, see Managing enrichments.

After you set up your projects and collections as they were before, begin ingesting your source documents. Depending upon how you ingested your documents previously, you can do so by using your own solution or one of the following methods:

The API
A connector

Restoring sentence classifiers

To restore a sentence classifier model that was created from the sentence labeling UI, import the exported .sc file to create a new machine learning model. You cannot open the exported model in the sentence labeling UI to continue working with it. However, you can import a finished model and apply it to collections as a sentence classifier enrichment.

For more information about how to import a sentence classifier model to create a machine learning enrichment, see Use imported ML models to find custom terms.

Restoring training data

After you restore your projects, you can begin re-creating your relevancy training models. To restore your training data queries and examples, either re-create each training query and its examples by using the create training query API, or re-create them in the Discovery UI. For more information about restoring your training data in the UI, see the instructions for accessing the Train page in Improving result relevance with training.

For the restore to work properly, the document IDs must match. The document IDs that you use in your training data point to the documents in your current project. Use the same IDs in your new projects to ensure that the correct documents are referenced. If the IDs do not match, your restored relevancy training will not work.
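One way to catch mismatched IDs before you re-create the training queries is to compare the document IDs in your training data backup with the IDs of the documents that you reingested. The following Python sketch assumes a backup that is shaped as a queries list whose examples each carry a document_id. That shape and the function name missing_document_ids are assumptions for illustration.

```python
from typing import Any, Iterable


def missing_document_ids(
    training_backup: dict[str, Any], restored_ids: Iterable[str]
) -> list[str]:
    """Return the document IDs that the training data references but that
    are not present in the restored project, sorted for stable output."""
    referenced = {
        example["document_id"]
        for query in training_backup.get("queries", [])
        for example in query.get("examples", [])
    }
    return sorted(referenced - set(restored_ids))


# Hypothetical backup and restored document IDs.
backup = {
    "queries": [
        {
            "natural_language_query": "reset password",
            "examples": [
                {"document_id": "doc-1", "relevance": 10},
                {"document_id": "doc-7", "relevance": 0},
            ],
        }
    ]
}
print(missing_document_ids(backup, ["doc-1", "doc-2"]))
```

If the function returns any IDs, reingest those documents with their original IDs before you restore the training data.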

Restoring connections to external data sources

In case of an unanticipated loss of data, you might lose your scheduled crawls of external data sources. See Configuring IBM Cloud data sources for the list of available sources.

To restore your external data, reestablish your connections to these data sources, and then recrawl them.

To find the data source credentials that you stored, follow the instructions for your chosen data source in Configuring IBM Cloud data sources. These instructions explain how you can reconnect to your data sources and get the data imported into Discovery.

Restoring Smart Document Understanding models

To import a previously exported Smart Document Understanding (SDU) model, see Importing and exporting models. SDU models have the file extension of .sdumodel.

When you import an existing SDU model into a new collection, it is a best practice to create the new collection and add one document, then import the model, and then upload the remainder of your documents.

Restoring dictionary enrichments

Open your project.
On the Improvement tools panel of the Improve and customize page, click Teach domain concepts, Dictionaries, and then Upload.
In the Apply dictionary dialog box, enter a name for your .csv file, select a language, specify the facet path, click Upload, and select your dictionary .csv file.
Click Create.

After you upload dictionary .csv files for recovery, you cannot use the dictionary editor to further edit the terms. If you want to use the dictionary editor, create a dictionary, and manually add the dictionary terms.

For information about uploading a dictionary enrichment .csv file, see Dictionary.

Restoring pattern enrichments

You can restore pattern enrichment .zip files as advanced rules models .zip files by completing the following steps:

Open your project.
On the Improvement tools panel of the Improve and customize page, click Teach domain concepts, Advanced rules models, and then Upload.
In the Apply advanced rules model dialog box, enter a name for your .zip file, select a language, specify a result field, click Upload, and select your advanced rules models .zip file.
Click Create.

After you upload pattern model .zip files for recovery, you cannot use the pattern editor to further edit the .zip files.

For more information about uploading an advanced rules models .zip file, see Advanced rules models.

Restoring entity extractors

To restore an entity extractor model, import the exported .ent file to create a new machine learning model. You cannot open the exported model in the entity extractor tool to continue working with it. However, you can import a finished model and apply it to collections as an entity extractor enrichment.

For more information about how to import an entity extractor model to create a machine learning enrichment, see Use imported ML models to find custom terms.