Monitoring an IBM Cloudant cluster

A key part of ensuring best performance, or troubleshooting any problems, is monitoring the affected system.

The monitoring API is only available to IBM® Cloudant® for IBM Cloud® Enterprise customers with dedicated clusters and not to IBM Cloud® Public customers.

You want to be able to answer the question:

In what way has the system behavior changed as a result of any configuration or application modifications?

To answer the question, you need data, and that data comes from monitoring the system. To monitor the system while it replicates, use the _active_tasks endpoint, which is described in more detail in the Active tasks documentation.

For more detailed system information, use the cluster monitoring API.

Monitoring metrics overview

When you monitor the cluster, you can obtain data about how it is performing. Details such as the number of HTTP requests processed per second, or the number of documents processed per second, are available through the monitoring API. Only an administrative user can invoke the API, and each request targets a specific monitoring endpoint.

For example, if you wanted to monitor the number of documents processed by a map function each second, you would direct the request to the map_doc endpoint.

For more information, see a full list of the available endpoints.

The data is returned in JSON format by default. You can specify a raw format if you prefer.

Syntax of the monitoring request

All requests to the monitoring API have the following form:

curl -u $ADMIN_USER "https://$ADMIN_USER.cloudant.com/_api/v2/monitoring/$END_POINT?cluster=$CLUSTER[&format=(json|raw)]"

The fields are described in the following table:

Monitoring API request fields
Field Meaning
ADMIN_USER The account name. The account must have administrative privileges.
CLUSTER The cluster that you are interested in.
DURATION The duration of the time series query. Select one of the following time intervals: ["5min", "30min", "1h", "12h", "24h", "1d", "3d", "7d", "1w", "1m", "3m", "6m", "12m", "1y"]. DURATION must be paired with either START or END.
END UTC timestamp, in ISO-8601 format or in UTC epoch seconds, that specifies the end of a time series query. START and END can't be the same, and END can't be earlier than START.
END_POINT The aspect of the cluster that you want to monitor.
START UTC timestamp, in ISO-8601 format or in integer epoch seconds, that specifies the starting point of a time series query that is mutually exclusive with END.

Several of the fields have default values:

Default values for monitoring API request fields
Field Default value
DURATION 5 minutes
END No default value
START The current time
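
As a minimal sketch of how these fields might be combined into a request URL, the following Python helper builds the query string and checks the constraints from the request fields table. The function name build_monitoring_url and the lowercase query parameter names start, end, and duration are assumptions made for illustration (only cluster, format, and node appear in the example requests on this page). The START and END comparison is sketched for epoch-second values only.

```python
from urllib.parse import urlencode

# Durations listed in the request fields table.
ALLOWED_DURATIONS = {
    "5min", "30min", "1h", "12h", "24h", "1d", "3d",
    "7d", "1w", "1m", "3m", "6m", "12m", "1y",
}


def build_monitoring_url(account, endpoint, cluster, fmt="json",
                         start=None, end=None, duration=None):
    """Build a monitoring URL (hypothetical helper, not part of the API).

    ASSUMPTION: the time fields are sent as lowercase query parameters
    named start, end, and duration. start and end are epoch seconds here.
    """
    if fmt not in ("json", "raw"):
        raise ValueError("format must be 'json' or 'raw'")
    if duration is not None:
        if duration not in ALLOWED_DURATIONS:
            raise ValueError(f"unsupported duration: {duration}")
        if start is None and end is None:
            raise ValueError("duration must be paired with start or end")
    if start is not None and end is not None and end <= start:
        raise ValueError("end must be later than start")

    # Always send cluster and format; add time fields only when given.
    params = {"cluster": cluster, "format": fmt}
    for name, value in (("start", start), ("end", end), ("duration", duration)):
        if value is not None:
            params[name] = value
    return (f"https://{account}.cloudant.com/_api/v2/monitoring/"
            f"{endpoint}?{urlencode(params)}")


print(build_monitoring_url("myaccount", "disk_use", "myclustername"))
```

Rejecting invalid combinations before the request is sent keeps the error close to the code that built the query.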

Results format

By default, the monitoring results are returned in JSON format. If you prefer, you can choose to receive the results in raw format.

The results include a text string that identifies the metric that is stored on the server that provides the API capability, for example:

sumSeries(net.cloudant.mycustomer001.db*.df.srv.used)

The results include cluster-level data.

IBM Cloudant stores the queried data at the following resolutions: 10 seconds for the past 24 hours; 1 minute for the past 7 days; and 1 hour for the past 2 years. As a result, and to ensure that IBM Cloudant always stores the higher resolution interval length, deltas on the boundary of these resolutions are trimmed by one interval's length.
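
The resolution tiers can be expressed as a small lookup. The following Python sketch returns the interval length that you might expect for data of a given age. The function name expected_resolution is hypothetical, the boundaries are treated as inclusive, and a year is assumed to be 365 days. The trimming at resolution boundaries is not modeled.

```python
DAY = 24 * 60 * 60

# (maximum age in seconds, resolution in seconds), finest resolution first.
RESOLUTIONS = [
    (1 * DAY, 10),          # past 24 hours: 10-second intervals
    (7 * DAY, 60),          # past 7 days: 1-minute intervals
    (2 * 365 * DAY, 3600),  # past 2 years: 1-hour intervals (ASSUMPTION: 365-day years)
]


def expected_resolution(age_seconds):
    """Return the stored interval length for data of this age, or None."""
    for max_age, step in RESOLUTIONS:
        if age_seconds <= max_age:
            return step
    return None  # older than the longest retention tier


print(expected_resolution(3 * DAY))
```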

With format=json (default)

Unless you specify otherwise, the metric data is returned in JSON format. Each entry in a datapoints array is a [datapoint, timestamp] pair, with the timestamp expressed in UTC epoch seconds.

See an example monitoring request for disk use data returned in JSON format:

curl "https://$ACCOUNT.cloudant.com/_api/v2/monitoring/disk_use?cluster=myclustername&format=json"

See an example result after you request disk use data in JSON format:

[
	{
		"target": "sumSeries(net.cloudant.mycustomer001.db*.df.srv.used)",
		"datapoints": [
			[523562172416.0, 1391019360],
			[524413976576.0, 1391019420],
			[519036682240.0, 1391019480],
			[518762102784.0, 1391019540],
			[523719393280.0, 1391019600]
		]
	},
	{
		"target": "sumSeries(net.cloudant.mycustomer001.db*.df.srv.free)",
		"datapoints": [
			[6488926978048.0, 1391019360],
			[6487768301568.0, 1391019420],
			[6493145661440.0, 1391019480],
			[6493420257280.0, 1391019540],
			[4330660167680.0, 1391019600]
		]
	}
]
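
To illustrate how the [datapoint, timestamp] pairs might be consumed, the following Python sketch parses an abbreviated copy of the example result and reports the most recent non-null value for each target. The function name latest_datapoints is hypothetical, and the sketch assumes the top-level array layout shown in the example.

```python
import json
from datetime import datetime, timezone

# Abbreviated copy of the example result shown above.
SAMPLE = """
[
  {"target": "sumSeries(net.cloudant.mycustomer001.db*.df.srv.used)",
   "datapoints": [[523562172416.0, 1391019360], [523719393280.0, 1391019600]]},
  {"target": "sumSeries(net.cloudant.mycustomer001.db*.df.srv.free)",
   "datapoints": [[6488926978048.0, 1391019360], [4330660167680.0, 1391019600]]}
]
"""


def latest_datapoints(payload):
    """Map each target to its most recent non-null (value, UTC datetime)."""
    latest = {}
    for series in json.loads(payload):
        points = [(value, ts) for value, ts in series["datapoints"]
                  if value is not None]
        if points:
            # Pick the pair with the largest timestamp.
            value, ts = max(points, key=lambda point: point[1])
            latest[series["target"]] = (
                value, datetime.fromtimestamp(ts, tz=timezone.utc))
    return latest


for target, (value, when) in latest_datapoints(SAMPLE).items():
    print(target, value, when.isoformat())
```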

With format=raw

The raw format data contains a series of text strings, identifying the name of the metric and associated values.

Each line starts with a text string (for example sumSeries(net.cloudant.mycustomer001.db*.df.srv.used)) that is the name of the metric. The next two numbers are the start and end times, expressed as UTC epoch seconds. The last number before the | character is the step size in seconds.

The numbers after the | character are the metric data that is obtained from your chosen endpoint. For example, requesting metric data from the disk_use endpoint returns the output from a df command, with the disk use expressed as bytes stored.

See an example monitoring request for disk use data returned in raw format:

curl "https://$ACCOUNT.cloudant.com/_api/v2/monitoring/disk_use?cluster=myclustername&format=raw"

See an example result after you request disk use data in raw format:

sumSeries(net.cloudant.mycustomer001.db*.df.srv.used),1391019780,1391020080,60|344708448256.0,345318227968.0,346120126464.0,346716471296.0,175483256832.0
sumSeries(net.cloudant.mycustomer001.db*.df.srv.free),1391019780,1391020080,60|6.49070326579e+12,6.4896982057e+12,6.48884414054e+12,6.48801589658e+12,4.32277107507e+12
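
The raw layout described above can be split apart with a few string operations. The following Python sketch parses one line of the example result. The function name parse_raw_line is hypothetical. The sketch assumes that the value at position i belongs to the time start + i * step, and that a missing value would appear as the text None. Neither assumption is confirmed on this page.

```python
def parse_raw_line(line):
    """Split one raw-format line into name, start, end, step, and points."""
    header, _, data = line.partition("|")
    # Split from the right so that commas inside the metric name are kept.
    name, start, end, step = header.rsplit(",", 3)
    start, end, step = int(start), int(end), int(step)
    points = []
    for index, token in enumerate(data.split(",")):
        # ASSUMPTION: a missing value is written as the text "None".
        value = None if token.strip() == "None" else float(token)
        # ASSUMPTION: values are evenly spaced, beginning at the start time.
        points.append((start + index * step, value))
    return name, start, end, step, points


RAW = ("sumSeries(net.cloudant.mycustomer001.db*.df.srv.used),"
       "1391019780,1391020080,60|344708448256.0,345318227968.0,"
       "346120126464.0,346716471296.0,175483256832.0")

name, start, end, step, points = parse_raw_line(RAW)
print(name, step, points[0], points[-1])
```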

Monitoring endpoints

To list all of the currently supported monitoring endpoints, make a request to the base monitoring path, /_api/v2/monitoring, without naming a specific endpoint.

The following table lists the supported monitoring endpoints that are provided by the API:

Monitoring API endpoints
Endpoint Description
connections The number of load balancer connections, grouped by connection state.
disk_use The disk use, as measured by a df command.
kv_emits The number of key:value emits per second.
map_doc The number of documents processed by a map function, per second.
network The octets that are received and transmitted.
rate/status_code The rate of requests, which are grouped by status code.
rate/verb The rate of requests, which are grouped by HTTP verb.
rps The number of reads per second.
wps The number of writes per second.

See an example showing how to obtain a list of the currently supported monitoring endpoints:

curl "https://$ACCOUNT.cloudant.com/_api/v2/monitoring"

See an example response that lists the available monitoring endpoints:

{
    "targets": [
        "node_disk_free_srv",
        "rps",
        "kv_emits",
        "smoosh_channels/slack_dbs",
        "smoosh_channels/upgrade_dbs",
        "smoosh_channels/ratio_dbs",
        "smoosh_channels/ratio_views",
        "smoosh_channels/slack_views",
        "smoosh_channels/upgrade_views",
        "uptime",
        "map_doc",
        "wps",
        "node_peak_cpu",
        "rate/status_code",
        "rate/verb",
        "disk_use",
        "node_mean_cpu",
        "memory",
        "os_proc_count",
        "run_queue",
        "node_disk_use_srv",
        "process_count",
        "response_time"
    ]
}
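
Because the list of targets can differ from cluster to cluster, it can be useful to check the listing before you request a metric. The following Python sketch compares a list of wanted endpoints against an abbreviated copy of the example response. The function name unsupported_endpoints and the name not_an_endpoint are hypothetical and used only for illustration.

```python
import json

# Abbreviated copy of the example listing response shown above.
LISTING = """
{"targets": ["rps", "kv_emits", "uptime", "map_doc", "wps",
             "rate/status_code", "rate/verb", "disk_use", "response_time"]}
"""


def unsupported_endpoints(listing_json, wanted):
    """Return the wanted endpoint names that the listing does not contain."""
    targets = set(json.loads(listing_json)["targets"])
    return [name for name in wanted if name not in targets]


# "not_an_endpoint" is a made-up name that demonstrates a miss.
print(unsupported_endpoints(LISTING, ["disk_use", "rate/verb", "not_an_endpoint"]))
```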

Examples of monitoring requests

connections

See an example of a connections monitoring request:

curl "https://$ACCOUNT.cloudant.com/_api/v2/monitoring/connections?cluster=myclustername&node=myloadbalancername&format=json"

The response includes a data series for the following connection states:

{
  "end": 1512989500,
  "start": 1512989170,
  "target_responses": [
     {
       "datapoints": [
            [
              0,
              1512989170
            ]
        ],
        "target": "myclustername.myloadbalancername Connections CLOSED"
      },
      { 
        "datapoints": [
            [
              19,
              1512989170
            ]
        ],
        "target": "myclustername.myloadbalancername Connections CLOSE_WAIT"
      },
      {
        "datapoints": [
            [
              2,
              1512989170
            ]
        ],
        "target": "myclustername.myloadbalancername Connections CLOSING"
      },
      {
        "datapoints": [
            [
              280,
              1512989170
            ]
        ],
        "target": "myclustername.myloadbalancername Connections ESTABLISHED"
      },
      {
        "datapoints": [
            [
              7,
              1512989170
            ]
        ],
        "target": "myclustername.myloadbalancername Connections FIN_WAIT1"
      },
      { 
        "datapoints": [
            [
              0,
              1512989170
            ]
        ],
        "target": "myclustername.myloadbalancername Connections FIN_WAIT2"
      },
      {
        "datapoints": [
            [
              0,
              1512989170
            ]
        ],
        "target": "myclustername.myloadbalancername Connections LAST_ACK"
      },
      {
        "datapoints": [
            [
              4,
              1512989170
            ]
        ],
        "target": "myclustername.myloadbalancername Connections LISTEN"
      },
      {
        "datapoints": [
            [
              1,
              1512989170
            ]
        ],
        "target": "myclustername.myloadbalancername Connections SYN_RECV"
      },
      {
        "datapoints": [
            [
              0,
              1512989170
            ]
        ],
        "target": "myclustername.myloadbalancername Connections SYN_SENT"
      },
      {
        "datapoints": [
            [
              28,
              1512989170
            ]
        ],
        "target": "myclustername.myloadbalancername Connections TIME_WAIT"
      }
   ]
}

You must explicitly specify the load balancer in the request by using the node parameter, as shown in the example request.
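
To read the connection counts more easily, the per-state series can be collapsed into a single mapping. The following Python sketch does this for an abbreviated copy of the example response. The function name connections_by_state is hypothetical, and the sketch assumes that the connection state is always the last word of the target string, as it is in the example.

```python
import json

# Abbreviated copy of the example connections response shown above.
RESPONSE = """
{"end": 1512989500, "start": 1512989170, "target_responses": [
  {"datapoints": [[19, 1512989170]],
   "target": "myclustername.myloadbalancername Connections CLOSE_WAIT"},
  {"datapoints": [[280, 1512989170]],
   "target": "myclustername.myloadbalancername Connections ESTABLISHED"},
  {"datapoints": [[28, 1512989170]],
   "target": "myclustername.myloadbalancername Connections TIME_WAIT"}
]}
"""


def connections_by_state(payload):
    """Map each connection state to its last non-null count."""
    counts = {}
    for series in json.loads(payload)["target_responses"]:
        # ASSUMPTION: the state is the last word of the target string.
        state = series["target"].rsplit(" ", 1)[-1]
        values = [value for value, _ in series["datapoints"]
                  if value is not None]
        if values:
            counts[state] = values[-1]
    return counts


print(connections_by_state(RESPONSE))
```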

disk_use

See an example of a disk_use monitoring request:

curl "https://$ACCOUNT.cloudant.com/_api/v2/monitoring/disk_use?cluster=myclustername&format=json"

See example results (abbreviated) from a disk_use monitoring request:

{
	"start": 1435935759,
	"end": 1435936059,
	"target_responses": [
		{
			"target": "myclustername Used disk space (bytes)",
			"datapoints": [
				[
					6855438336.0,
					1435935780
				],
				[
					null,
					1435935795
				],
				[
					null,
					1435935810
				],
				...
				[
					null,
					1435936065
				]
			]
		},
		{
			"target": "myclustername Free disk space (bytes)",
			"datapoints": [
				[
					7141069422592.0,
					1435935780
				],
				[
					null,
					1435935795
				],
				[
					null,
					1435935810
				],
				...
				[
					null,
					1435936065
				]
			]
		}
	]
}
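
The used and free series can be combined into a single percentage. The following Python sketch computes the share of disk space in use from an abbreviated copy of the example results, and skips null datapoints. The function names last_value and percent_used are hypothetical, and the sketch assumes that the two series can be recognized by the target labels shown in the example.

```python
import json

# Abbreviated copy of the example disk_use results shown above.
RESPONSE = """
{"start": 1435935759, "end": 1435936059, "target_responses": [
  {"target": "myclustername Used disk space (bytes)",
   "datapoints": [[6855438336.0, 1435935780], [null, 1435935795]]},
  {"target": "myclustername Free disk space (bytes)",
   "datapoints": [[7141069422592.0, 1435935780], [null, 1435935795]]}
]}
"""


def last_value(series):
    """Return the last non-null value of a series, or None."""
    values = [value for value, _ in series["datapoints"] if value is not None]
    return values[-1] if values else None


def percent_used(payload):
    """Return used / (used + free) as a percentage, or None if unknown."""
    used = free = None
    for series in json.loads(payload)["target_responses"]:
        # ASSUMPTION: the series are identified by these label fragments.
        if "Used disk space" in series["target"]:
            used = last_value(series)
        elif "Free disk space" in series["target"]:
            free = last_value(series)
    if used is None or free is None or used + free == 0:
        return None
    return 100.0 * used / (used + free)


print(round(percent_used(RESPONSE), 4))
```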

kv_emits

See an example of a kv_emits monitoring request:

curl "https://$ACCOUNT.cloudant.com/_api/v2/monitoring/kv_emits?cluster=myclustername&format=json"

See example results (abbreviated) from a kv_emits monitoring request:

{
	"start": 1436194248,
	"end": 1436194548,
	"target_responses": [
		{
			"target": "myclustername Key:value pairs emitted per second from map functions",
			"datapoints": [
				[
					0.0,
					1436194230
				],
				[
					0.0,
					1436194245
				],
				[
					0.8000000000001819,
					1436194260
				],
				...
				[
					null,
					1436194515
				]
			]
		}
	]
}

map_doc

See an example of a map_doc monitoring request:

curl "https://$ACCOUNT.cloudant.com/_api/v2/monitoring/map_doc?cluster=myclustername&format=json"

See example results (abbreviated) from a map_doc monitoring request:

{
	"start": 1436194475,
	"end": 1436194775,
	"target_responses": [
		{
			"target": "myclustername Documents per second through map functions",
			"datapoints": [
				[
					0.0,
					1436194480
				],
				[
					0.5,
					1436194495
				],
				[
					0.4000000000005457,
					1436194510
				],
				...
				[
					0.0,
					1436194765
				]
			]
		}
	]
}

network

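See an example of a network monitoring request:

curl "https://$ACCOUNT.cloudant.com/_api/v2/monitoring/network?cluster=myclustername&node=myloadbalancername&format=json"

See an example result from a network monitoring request:

{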
  "end": 1512989748,
  "start": 1512989450,
  "target_responses": [
      {
        "datapoints": [
          [
           20247725.5,
           1512989450
          ]
        ],
        "target": "myclustername Octets tx Per Second"
      },
      {
         "datapoints": [
             [
               17697329.3046875,
               1512989450
             ]
         ],
         "target": "myclustername Octets rx Per Second" 
       }
    ]
}

You must explicitly specify the load balancer in the request by using the node parameter, as shown in the example request.

rate/status_code

See an example of a rate/status_code monitoring request:

curl "https://$ACCOUNT.cloudant.com/_api/v2/monitoring/rate/status_code?cluster=myclustername&format=json"

See example results (abbreviated) from a rate/status_code monitoring request:

{
	"start": 1436194902,
	"end": 1436195202,
	"target_responses": [
		{
			"target": "myclustername 2xx",
			"datapoints": [
				[
					null,
					1436194910
				],
				[
					36.0,
					1436194920
				],
				...
				[
					0.0,
					1436195200
				]
			]
		},
		{
			"target": "myclustername 3xx",
			"datapoints": [
				[
					null,
					1436194910
				],
				[
					0.0,
					1436194920
				],
				...
				[
					null,
					1436195200
				]
			]
		},
		{
			"target": "myclustername 4xx",
			"datapoints": [
				[
					null,
					1436194910
				],
				[
					0.0,
					1436194920
				],
				...
				[
					0.0,
					1436195200
				]
			]
		},
		{
			"target": "myclustername 5xx",
			"datapoints": [
				[
					null,
					1436194910
				],
				[
					0.0,
					1436194920
				],
				...
				[
					null,
					1436195200
				]
			]
		}
	]
}
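
One way to use the per-class rates is to work out what share of requests failed with a server error at a given time. The following Python sketch reads the rates for one timestamp from an abbreviated copy of the example results. The function names rates_at and error_share are hypothetical, and the sketch assumes that the status class is the last word of the target string.

```python
import json

# Abbreviated copy of the example rate/status_code results shown above.
RESPONSE = """
{"start": 1436194902, "end": 1436195202, "target_responses": [
  {"target": "myclustername 2xx",
   "datapoints": [[null, 1436194910], [36.0, 1436194920]]},
  {"target": "myclustername 3xx",
   "datapoints": [[null, 1436194910], [0.0, 1436194920]]},
  {"target": "myclustername 4xx",
   "datapoints": [[null, 1436194910], [0.0, 1436194920]]},
  {"target": "myclustername 5xx",
   "datapoints": [[null, 1436194910], [0.0, 1436194920]]}
]}
"""


def rates_at(payload, timestamp):
    """Map each status class to its non-null rate at the given timestamp."""
    rates = {}
    for series in json.loads(payload)["target_responses"]:
        # ASSUMPTION: the status class is the last word of the target string.
        status_class = series["target"].rsplit(" ", 1)[-1]
        for value, ts in series["datapoints"]:
            if ts == timestamp and value is not None:
                rates[status_class] = value
    return rates


def error_share(rates):
    """Return the 5xx fraction of all requests, or None with no traffic."""
    total = sum(rates.values())
    if total == 0:
        return None
    return rates.get("5xx", 0.0) / total


rates = rates_at(RESPONSE, 1436194920)
print(rates, error_share(rates))
```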

rate/verb

See an example of a rate/verb monitoring request:

curl "https://$ACCOUNT.cloudant.com/_api/v2/monitoring/rate/verb?cluster=myclustername&format=json"

See example results (abbreviated) from a rate/verb monitoring request:

{
	"start": 1436195497,
	"end": 1436195797,
	"target_responses": [
		{
			"target": "myclustername GET", 
			"datapoints": [
				[
					null, 
					1436195500
				], 
				[
					36.0, 
					1436195510
				], 
				...
				[
					49.5, 
					1436195790
				]
			]
		}, 
		{
			"target": "myclustername POST", 
			"datapoints": [
				[
					null, 
					1436195500
				], 
				[
					0.0, 
					1436195510
				], 
				...
				[
					0.0, 
					1436195790
				]
			]
		}, 
		{
			"target": "myclustername PUT", 
			"datapoints": [
				[
					null, 
					1436195500
				], 
				[
					0.0, 
					1436195510
				], 
				...
				[
					0.0, 
					1436195790
				]
			]
		}, 
		{
			"target": "myclustername DELETE", 
			"datapoints": [
				[
					null, 
					1436195500
				], 
				[
					0.0, 
					1436195510
				], 
				...
				[
					0.0, 
					1436195790
				]
			]
		}, 
		{
			"target": "myclustername COPY", 
			"datapoints": [
				[
					null, 
					1436195500
				], 
				[
					0.0, 
					1436195510
				], 
				...
				[
					0.0, 
					1436195790
				]
			]
		}, 
		{
			"target": "myclustername HEAD", 
			"datapoints": [
				[
					null, 
					1436195500
				], 
				[
					0.0,
					1436195510
				], 
				...
				[
					0.0, 
					1436195790
				]
			]
		}
	]
}

response_time

See an example of a response_time monitoring request:

curl "https://$ACCOUNT.cloudant.com/_api/v2/monitoring/response_time?cluster=myclustername&format=json"

See example results (abbreviated) from a response_time monitoring request:

{
  "start": 1523984559,
  "end": 1523984859,
  "target_responses": [
    {
      "datapoints": [
        [
          118.1668472290039,
          1523984540
        ],
        [
          90.57628631591797,
          1523984550
        ],
        [
          142.6778106689453,
          1523984560
        ],
        [
          118.42487335205078,
          1523984570
        ],
        [
          120.38044738769531,
          1523984580
        ],
        [
          103.94148254394531,
          1523984590
        ],
        [
          126.64134979248047,
          1523984600
        ],
        [
          113.03324127197266,
          1523984610
        ],
        [
          136.9058074951172,
          1523984620
        ],
        [
          148.68711853027344,
          1523984630
        ],
        [
          121.22771453857422,
          1523984640
        ],
        [
          142.86459350585938,
          1523984650
        ],
        [
          103.75953674316406,
          1523984660
        ],
        [
          139.1707763671875,
          1523984670
        ],
        [
          118.29866027832031,
          1523984680
        ],
        [
          126.3541259765625,
          1523984690
        ],
        [
          115.5962905883789,
          1523984700
        ],
        [
          106.68751525878906,
          1523984710
        ],
        [
          144.12387084960938,
          1523984720
        ],
        [
          103.8598861694336,
          1523984730
        ],
        [
          136.84429931640625,
          1523984740
        ],
        [
          110.58084106445312,
          1523984750
        ],
        [
          94.69702911376953,
          1523984760
        ],
        [
          126.85747528076172,
          1523984770
        ],
        [
          100.8759994506836,
          1523984780
        ],
        [
          145.0876922607422,
          1523984790
        ],
        [
          100.77622985839844,
          1523984800
        ],
        [
          null,
          1523984810
        ],
        [
          null,
          1523984820
        ],
        [
          null,
          1523984830
        ]
      ],
      "target": "myclustername Response Time (ms)"
    }
  ]
}
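
A series such as this one is often easier to read as a summary. The following Python sketch computes the count, mean, and maximum of the non-null response times from an abbreviated copy of the example results. The function name summarize_response_time is hypothetical, and the sketch assumes that the response contains a single series.

```python
import json
from statistics import mean

# Abbreviated copy of the example response_time results shown above.
RESPONSE = """
{"start": 1523984559, "end": 1523984859, "target_responses": [
  {"datapoints": [[118.1668472290039, 1523984540],
                  [90.57628631591797, 1523984550],
                  [142.6778106689453, 1523984560],
                  [null, 1523984810]],
   "target": "myclustername Response Time (ms)"}
]}
"""


def summarize_response_time(payload):
    """Summarize the non-null values of the first series, or return None."""
    # ASSUMPTION: the first entry of target_responses is the only series.
    series = json.loads(payload)["target_responses"][0]
    values = [value for value, _ in series["datapoints"] if value is not None]
    if not values:
        return None
    return {"count": len(values), "mean_ms": mean(values), "max_ms": max(values)}


print(summarize_response_time(RESPONSE))
```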

rps

See an example of an rps monitoring request:

curl "https://$ACCOUNT.cloudant.com/_api/v2/monitoring/rps?cluster=myclustername&format=json"

See example results (abbreviated) from an rps monitoring request:

{
	"start": 1436195908,
	"end": 1436196208,
	"target_responses": [
		{
			"target": "myclustername Document Reads Per Second",
			"datapoints": [
				[
					0.20000000001164153,
					1436195910
				],
				[
					0.10000000000582077,
					1436195925
				],
				...
				[
					null,
					1436196195
				]
			]
		}
	]
}

wps

See an example of a wps monitoring request:

curl "https://$ACCOUNT.cloudant.com/_api/v2/monitoring/wps?cluster=myclustername&format=json"

See example results (abbreviated) from a wps monitoring request:

{
	"start": 1436195999,
	"end": 1436196299,
	"target_responses": [
		{
			"target": "myclustername Document Writes Per Second",
			"datapoints": [
				[
					1.2999999999992724,
					1436196000
				],
				[
					0.5,
					1436196015
				],
				...
				[
					null,
					1436196285
				]
			]
		}
	]
}