Information security
IBM is committed to providing our clients and partners with innovative data privacy, security, and governance solutions.
IBM Cloud only
This information applies only to managed deployments.
Notice: Clients are responsible for ensuring their own compliance with various laws and regulations, including the European Union General Data Protection Regulation. Clients are solely responsible for obtaining advice of competent legal counsel as to the identification and interpretation of any relevant laws and regulations that may affect the clients' business and any actions the clients may need to take to comply with such laws and regulations.
The products, services, and other capabilities that are described herein are not suitable for all client situations and might have restricted availability. IBM does not provide legal, accounting, or auditing advice or represent or warrant that its services or products ensure that clients are in compliance with any law or regulation.
To request GDPR support for IBM Cloud® Watson resources that you created, see GDPR Subject Access Request.
European Union General Data Protection Regulation (GDPR)
IBM is committed to providing our clients and partners with innovative data privacy, security, and governance solutions to assist them on their journey to GDPR compliance.
Learn more about IBM's own GDPR readiness journey and our GDPR capabilities and offerings to support your compliance journey here.
Labeling and deleting data in Discovery
Discovery includes an API to label data per call. For more information about how to label data by using either the API or from the Discovery product user interface, see Labeling data.
Customer data can be deleted by using the API. For more information about deleting customer data, see Deleting labeled data.
Experimental and beta features are not intended for use in a production environment and are not guaranteed to function as expected when you label and delete their associated data. Do not use experimental and beta features when you implement a solution that requires the labeling and deletion of data.
Methods that support labeling data
The following stored information can be deleted by using a customer_id if the customer_id was specified when the information was originally added by using the associated method:
- Curations (/v2/projects/{project_id}/curations) Only available for natural_language_query query types.
- Documents (/v2/projects/{project_id}/collections/{collection_id}/documents)
- Notices (/v2/projects/{project_id}/notices) Only ingestion notices are labeled.
- Training data (/v2/projects/{project_id}/training_data/queries)
- Dictionaries (Only when created in the Discovery product user interface)
- Exported documents (Only when created in the Content Mining application)
- Reports (Only when created in the Content Mining application)
Exported documents and reports can be viewed in the Repository and Report pages of the Content Mining application. They are not available by using the API.
Discovery does not log query request data.
For more information about the options for labeling data in Discovery, see Labeling data.
The following stored information is not explicitly labeled and cannot be deleted by specifying the customer_id. Personal Data is not supported in these fields.
Any string fields (including but not limited to name and description) of the following stored items:
- Collections
- Projects
Labeling data
Data can be labeled by using the API, or by using the Discovery product user interface. For more information about labeling with the product user interface, see Labeling data in the product user interface.
You cannot label data that is added by crawling external data sources.
Data is labeled by adding the customer_id of your choice to the optional X-Watson-Metadata header. Discovery can then delete it by customer_id.
You can label data with the API in different ways:
- When you ingest documents by using the POST /v2/projects/{project_id}/collections/{collection_id}/documents or POST /v2/projects/{project_id}/collections/{collection_id}/documents/ID operations, send an optional header X-Watson-Metadata. The X-Watson-Metadata header must include either of the following items:
  - Semicolon separated field=value pairs (for example: customer_id=123)
  - The customer_id field. By adding the customer_id in the X-Watson-Metadata header, the request indicates that it contains data that belongs to this customer_id.
- Optionally, you can include the customer_id field with the metadata multipart form part instead of including the X-Watson-Metadata header.
If you specify a customer_id in the metadata multipart form part and the X-Watson-Metadata header for the same document, then the customer_id in the X-Watson-Metadata header is used.
This example adds the customer_id to both the X-Watson-Metadata header and the metadata:
curl -k -u "apikey:$API_KEY" \
-H "x-watson-userinfo:instance-id=asdf" \
-H "x-watson-metadata:customer_id=customer_header_123" \
-H "x-watson-discovery-next:true" \
-F "file=@$FILENAME" \
-F "metadata={\"customer_id\": \"new123\"}" \
-X POST "$API_URL/v2/projects/$PROJECT_ID/collections/$COLLECTION_ID/documents?version=2020-03-08"
Example output:
{
  "document_id": "8b152926-e9f5-4f34-940a-c02da7ef3af4",
  "result_metadata": {
    "collection_id": "24265c0b-2a55-3ccf-0000-017334467b6e"
  },
  "metadata": {
    "date": 1594319812384,
    "parent_document_id": "8b152926-e9f5-4f34-940a-c02da7ef3af4",
    "customer_id": "customer_header_123"
  },
  "extracted_metadata": {
    "sha1": "CEC7C1D3423C7D4ED58FC448F52681ECA93CED8A",
    "numPages": "1",
    "filename": "Simple.pdf",
    "author": [
      "Simple Man"
    ],
    "subject": "Simple Metadata",
    "file_type": "pdf",
    "title": "Simple Title",
    "publicationdate": "2016-10-05"
  }
}
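To confirm which label was applied, you can inspect the metadata.customer_id field of the output. The following Python sketch assumes that the output has the shape shown in the example; the label_applied helper is hypothetical and is not part of the Discovery API, and the sample body is a trimmed copy of the example output.

```python
import json


def label_applied(response_body: str, expected_customer_id: str) -> bool:
    """Return True if the output reports the expected customer_id label.

    Hypothetical helper: it assumes the label is reported under
    metadata.customer_id, as in the example output.
    """
    body = json.loads(response_body)
    return body.get("metadata", {}).get("customer_id") == expected_customer_id


# Trimmed copy of the example output.
sample_output = """
{
  "document_id": "8b152926-e9f5-4f34-940a-c02da7ef3af4",
  "metadata": {
    "date": 1594319812384,
    "parent_document_id": "8b152926-e9f5-4f34-940a-c02da7ef3af4",
    "customer_id": "customer_header_123"
  }
}
"""

if __name__ == "__main__":
    # The header value is reported, not the value from the metadata form part.
    print(label_applied(sample_output, "customer_header_123"))
    print(label_applied(sample_output, "new123"))
```

In the example request, the header carried customer_header_123 and the metadata form part carried new123, so a check against the header value is the one that is expected to succeed.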
If your documents are already ingested, you must reingest them to add the X-Watson-Metadata header and customer_id.
Restrictions:
- The value of the X-Watson-Metadata header cannot exceed 4 KB of text.
- The X-Watson-Metadata header must contain a semicolon-separated list of field=value pairs. The field and value must not contain semicolons (;) or equals signs (=).
- customer_ids are unique within each Discovery instance. They are NOT unique per project or collection.
- A customer_id cannot be more than 256 characters in length.
- If a customer_id contains only white space or is empty, it is treated as though the customer_id was not provided at all, and no error messages are returned.
Labeling data in the product user interface
Data can be labeled by using the Discovery product user interface, or by using the API. For more information about labeling with the API, see Labeling data.
To label data with the product user interface:
- Open the Projects page by selecting My Projects.
- Select Data usage and GDPR.
- Choose the GDPR data label tab.
- Set the Label data with customer ID toggle to on. The Customer ID field appears.
- Enter a unique ID for the customer in the Customer ID field. Do not include personal data in a Customer ID.
- Click Save ID.
After the Customer ID (customer_id) field is set, all data that is uploaded during the current browser session is labeled with the specified Customer ID. (You cannot label data that is added by crawling external data sources.)
Adding a Customer ID labels the documents, notices, dictionaries, and training data within that URL domain from that point forward, including each instance under that domain. Any actions, including document uploads, that occurred in the Discovery product user interface before the Customer ID field was added are not labeled.
If you switch domains or browsers, empty the browser cache, or start an incognito session after you specify your Customer ID by using the Discovery product user interface, the Customer ID is not retained, and your data is not labeled. If you must switch domains or browsers, after the switch, open the GDPR data label tab, enter the Customer ID again, and then click Save ID.
If an existing Customer ID needs to be changed:
- Delete the data associated with that Customer ID. For instructions, see Deleting labeled data.
- Follow the instructions to label data with the Discovery product user interface, or by using the API.
- Upload or crawl the data.
Deleting labeled data
Customer data that is labeled with a customer_id can be deleted by using the API. For more information about how to label data by using either the API or from the Discovery product user interface, see Labeling data.
You cannot delete labeled data from the Discovery product user interface.
- Use the DELETE /v2/user_data operation and provide the customer_id of the data you want to delete. DELETE /v2/user_data deletes all data that is associated with a particular customer_id within that service instance, as specified in Methods that support labeling data. Also, see Delete labeled data in the API reference.
- To ensure all labeled content is correctly removed, run the DELETE /v2/user_data operation after the processing and pending counts for your collections return 0.
Notes on deleting labeled data:
- Deletions happen asynchronously. You cannot track the progress of deletions.
- If a nonexistent customer_id is provided, nothing is deleted, but a 202 - Accepted response is returned.
- Projects and collections are not labeled with a customer_id, even if an X-Watson-Metadata header is included in the request to create the project or collection. Only the individual documents within a collection are labeled. Therefore, when data is deleted, individual projects and collections are NOT deleted.
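As an illustration of the deletion flow, the following Python sketch builds a DELETE /v2/user_data request without sending it. It assumes that customer_id and version are passed as query parameters and that the same apikey basic authentication as in the curl example applies; check both against the API reference. The function names, the example URL, and the shape of the counts dictionaries are hypothetical.

```python
import base64
import urllib.parse
import urllib.request
from typing import Dict, List


def counts_are_settled(collection_counts: List[Dict[str, int]]) -> bool:
    """Return True when every collection reports 0 processing and 0 pending.

    Hypothetical input shape: one dict per collection with 'processing'
    and 'pending' keys, filled in from your own collection status checks.
    """
    return all(c["processing"] == 0 and c["pending"] == 0 for c in collection_counts)


def build_delete_user_data_request(
    api_url: str, api_key: str, customer_id: str, version: str
) -> urllib.request.Request:
    """Build (but do not send) a DELETE /v2/user_data request."""
    if not customer_id.strip():
        raise ValueError("customer_id must not be empty")
    # Assumption: customer_id and version are query parameters.
    query = urllib.parse.urlencode({"customer_id": customer_id, "version": version})
    url = f"{api_url.rstrip('/')}/v2/user_data?{query}"
    # Same credentials as `curl -u "apikey:$API_KEY"`.
    token = base64.b64encode(f"apikey:{api_key}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url, method="DELETE", headers={"Authorization": f"Basic {token}"}
    )


if __name__ == "__main__":
    counts = [{"processing": 0, "pending": 0}]
    if counts_are_settled(counts):
        request = build_delete_user_data_request(
            "https://discovery.example.invalid",  # placeholder, not a real endpoint
            "my-api-key",
            "customer_header_123",
            "2020-03-08",
        )
        print(request.get_method(), request.full_url)
        # To send it: urllib.request.urlopen(request). A 202 response only
        # means that the deletion was accepted, not that it has finished.
```

Because deletions are asynchronous and a nonexistent customer_id also returns 202, the response alone does not confirm that any data was removed.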