Document status webhook API
You can use the document status webhook feature to send a webhook event to your external application when the status of ingested documents becomes available or failed. The webhook event lets you take the next action on indexed documents without first having to get the document status through the Get document details API.
IBM Cloud Pak for Data: When you run Discovery in an air-gapped environment, you must connect to the external application through an HTTP proxy. For more information, see Setting up HTTP proxy in air-gapped environments.
To use the document status webhook feature, complete the following steps:
- Set up the external application that can receive webhook notifications from Discovery.
  To do so, you must register your external application as a webhook endpoint on a collection by using the create collection or update collection API methods. For more information, see Create collection or update collection in the API reference. The external application receives a webhook ping event, which notifies it that the webhook was successfully created. The external application must be accessible from IBM Cloud.
- Ingest the documents to the collection. When the status of the ingested documents becomes available or failed, the external application receives the document.status webhook event.
  You can verify the status of the ingested documents in the data object of the document.status webhook event. The document_ids and status parameters show the IDs of the ingested documents and their status. For more information, see Data model of the ping event and Data model of the document.status event.
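As a rough illustration of the receiving side, the following Node.js sketch routes an incoming webhook payload by its event field. The function names routeWebhookEvent and createWebhookServer, the port, and the 204, 400, and 405 response codes are assumptions made for this example, because this page does not state which response Discovery expects. Authentication is left out here and is covered under Webhook security.

```javascript
const http = require('node:http');

// Decide what to do with one parsed webhook payload.
// Returns a short description that the caller can log or act on.
function routeWebhookEvent(payload) {
  switch (payload.event) {
    case 'ping':
      // Sent after the webhook is registered on the collection.
      return `webhook registered for ${payload.data.url}`;
    case 'document.status': {
      const { document_ids: ids, status } = payload.data;
      return `${ids.length} document(s) are ${status}`;
    }
    default:
      return `ignored unknown event: ${payload.event}`;
  }
}

// Hypothetical receiver: accepts POST requests that carry a JSON body.
function createWebhookServer(onEvent) {
  return http.createServer((request, response) => {
    if (request.method !== 'POST') {
      response.writeHead(405).end();
      return;
    }
    let body = '';
    request.on('data', (chunk) => { body += chunk; });
    request.on('end', () => {
      try {
        onEvent(routeWebhookEvent(JSON.parse(body)));
        response.writeHead(204).end(); // assumed success status
      } catch (err) {
        // Malformed JSON or missing fields.
        response.writeHead(400).end();
      }
    });
  });
}

// Usage (not started here): createWebhookServer(console.log).listen(8080);
```

Keeping the routing logic in a plain function, separate from the HTTP server, makes it possible to exercise the event handling with sample payloads before the endpoint is registered on a collection.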
The following image shows the webhook configuration flow.
The following image shows the document status webhook feature process flow.
For more information about the query API, see Query a project API method in the API reference.
You can also refer to the webhook-doc-status-sample application for the document status webhook API feature. To view the sample application, you must have access to the Discovery doc-tutorial-downloads repository.
Webhook security
To authenticate the webhook request, verify the JSON Web Token (JWT) that is sent with the request. The webhook microservice automatically generates a JWT and sends it in the Authorization header with each webhook call. It is your responsibility to add code to the external service that verifies the JWT.
The system can generate a JWT based on the secret that you specify and pass this system-generated JWT to the external application in the Authorization header. If you specify your own value for the header, the webhook microservice sends that value to the external application instead of the JWT.
For example, if you specify sample secret in the Secret field of the Webhooks object in the Create collection or update collection APIs, you might add sample code such as the following in Node.js:
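const jwt = require('jsonwebtoken');
...
// The JWT is sent in the "Authorization" header (Node.js lowercases header names).
const token = request.headers.authorization;
try {
  // Throws if the token is malformed, expired, or not signed with this secret.
  const decoded = jwt.verify(token, 'sample secret');
} catch (err) {
  // The token is invalid: reject the request.
}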
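To show what the verification step involves, here is a dependency-free sketch that checks a JWT signature with the Node.js crypto module. It assumes that the token is signed with HS256, which this page does not state, so treat the algorithm check as a placeholder. The names verifyHs256Jwt and signHs256Jwt are hypothetical, and the signing helper exists only so that the sketch can be tried without a real webhook call.

```javascript
const crypto = require('node:crypto');

// Verify an HS256-signed JWT by using only the Node.js standard library.
// Returns the decoded claims, or null if the token is not valid.
function verifyHs256Jwt(token, secret) {
  const parts = String(token).split('.');
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts;
  try {
    const { alg } = JSON.parse(Buffer.from(header, 'base64url').toString('utf8'));
    if (alg !== 'HS256') return null; // assumption: only HS256 is accepted
    const expected = crypto.createHmac('sha256', secret)
      .update(`${header}.${payload}`)
      .digest();
    const actual = Buffer.from(signature, 'base64url');
    // timingSafeEqual requires buffers of equal length.
    if (actual.length !== expected.length ||
        !crypto.timingSafeEqual(actual, expected)) {
      return null;
    }
    const claims = JSON.parse(Buffer.from(payload, 'base64url').toString('utf8'));
    // Reject expired tokens when an exp claim (seconds since epoch) is present.
    if (typeof claims.exp === 'number' && claims.exp * 1000 <= Date.now()) {
      return null;
    }
    return claims;
  } catch (err) {
    return null; // malformed header or payload
  }
}

// Helper for local experiments only: builds a token to feed the verifier.
function signHs256Jwt(claims, secret) {
  const encode = (obj) => Buffer.from(JSON.stringify(obj)).toString('base64url');
  const head = encode({ alg: 'HS256', typ: 'JWT' });
  const body = encode(claims);
  const sig = crypto.createHmac('sha256', secret)
    .update(`${head}.${body}`)
    .digest('base64url');
  return `${head}.${body}.${sig}`;
}
```

In practice a maintained library such as jsonwebtoken is usually the safer choice. The sketch is mainly useful for seeing that verification recomputes the signature from the shared secret and compares it in constant time.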
Data model of the ping event
The ping event has the following parameters:
Parameter | Description |
---|---|
event | The event name is ping. |
instance_id | The Discovery instance ID. |
version | The Discovery API version in the format yyyy-mm-dd. |
data | An object with the event information, such as url, events, and metadata. |
created_at | The date and time the event was created. |
For example, the following ping event is sent to a webhook:
POST https://example.com/webhook
Authorization: Basic YWxhZGRpbjpvcGVuc2VzYW1l
X-Global-Transaction-ID: 5144bb45-dc81-402c-a045-249fd1318515
Content-Type: application/json

{
  "event": "ping",
  "version": "2023-03-31",
  "instance_id": "1a5d4916-6097-4150-977a-ca897226565c",
  "data": {
    "url": "https://example.com/webhook",
    "events": [
      "document.status"
    ],
    "metadata": {
      "project_id": "02a803f9-c814-4fcb-a764-e01e3d4dd002",
      "collection_id": "f41ae858-0ca9-d0ed-0000-01890118cc5b"
    }
  },
  "created_at": "2023-08-16T08:34:46.000Z"
}
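A receiver might use the ping event to confirm that the webhook was registered as intended. The following sketch checks the fields shown in the example above. The function name confirmsRegistration and the shape of the expected argument are hypothetical.

```javascript
// Check that a ping payload confirms the registration that we expect.
// `expected` is a hypothetical object: { url, collectionId }.
function confirmsRegistration(payload, expected) {
  if (!payload || payload.event !== 'ping' || !payload.data) return false;
  const { url, events = [], metadata = {} } = payload.data;
  return url === expected.url &&
    events.includes('document.status') &&
    metadata.collection_id === expected.collectionId;
}

// Example with the ping payload shown above.
const samplePing = {
  event: 'ping',
  version: '2023-03-31',
  instance_id: '1a5d4916-6097-4150-977a-ca897226565c',
  data: {
    url: 'https://example.com/webhook',
    events: ['document.status'],
    metadata: {
      project_id: '02a803f9-c814-4fcb-a764-e01e3d4dd002',
      collection_id: 'f41ae858-0ca9-d0ed-0000-01890118cc5b',
    },
  },
  created_at: '2023-08-16T08:34:46.000Z',
};

const pingIsValid = confirmsRegistration(samplePing, {
  url: 'https://example.com/webhook',
  collectionId: 'f41ae858-0ca9-d0ed-0000-01890118cc5b',
});
```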
Data model of the document.status event
The document.status event has the following parameters:
Parameter | Description |
---|---|
event | The event name is document.status. |
instance_id | The Discovery instance ID. |
version | The Discovery API version in the format yyyy-mm-dd. |
data | An object with the event-specific information: project_id, collection_id, and document_ids. |
status | The status of the documents: available or failed. In the event payload, this parameter is nested inside the data object. |
created_at | The date and time the event was created. |
For example, the following document.status event is sent to a webhook:
POST https://example.com/webhook
Authorization: Basic YWxhZGRpbjpvcGVuc2VzYW1l
X-Global-Transaction-ID: 5144bb45-dc81-402c-a045-249fd1318515
Content-Type: application/json

{
  "event": "document.status",
  "version": "2023-03-31",
  "instance_id": "1a5d4916-6097-4150-977a-ca897226565c",
  "data": {
    "project_id": "02a803f9-c814-4fcb-a764-e01e3d4dd002",
    "collection_id": "f41ae858-0ca9-d0ed-0000-01890118cc5b",
    "document_ids": [
      "1a5d4916-6097-4150-977a-ca897226565b",
      "2a5d4916-6097-4150-977a-ca897226565b"
    ],
    "status": "available"
  },
  "created_at": "2023-08-16T08:34:46.000Z"
}
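One way to act on a document.status event is to turn it into one work item per document. The sketch below assumes two hypothetical follow-up actions, process for available documents and retry-ingestion for failed ones. These action names and the function planNextActions are illustrative only and are not part of the Discovery API.

```javascript
// Turn a document.status payload into one work item per document.
// The action names are hypothetical placeholders for your own logic.
function planNextActions(payload) {
  if (!payload || payload.event !== 'document.status' || !payload.data) {
    return [];
  }
  const {
    project_id: projectId,
    collection_id: collectionId,
    document_ids: documentIds = [],
    status,
  } = payload.data;
  const action = status === 'available' ? 'process' : 'retry-ingestion';
  return documentIds.map((documentId) => ({
    projectId,
    collectionId,
    documentId,
    action,
  }));
}

// Example with the document.status payload shown above.
const sampleStatusEvent = {
  event: 'document.status',
  data: {
    project_id: '02a803f9-c814-4fcb-a764-e01e3d4dd002',
    collection_id: 'f41ae858-0ca9-d0ed-0000-01890118cc5b',
    document_ids: [
      '1a5d4916-6097-4150-977a-ca897226565b',
      '2a5d4916-6097-4150-977a-ca897226565b',
    ],
    status: 'available',
  },
};

const workItems = planNextActions(sampleStatusEvent);
```

Each work item carries the project and collection IDs along with the document ID, so a downstream worker has what it needs to look up or query the document without re-reading the original event.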