Overview

Documents are JSON objects. They are the containers for your data, and are the basis of the IBM® Cloudant® for IBM Cloud® database.

If you're using an IBM Cloudant service on IBM Cloud®, documents are limited to a maximum size of 1 MB. Exceeding this limit causes a 413 error.
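As a rough pre-flight check, an application can measure the encoded size of a document before sending it. The following sketch makes several assumptions. The helper names `encoded_size` and `fits_size_limit` are hypothetical. The limit is interpreted as 1,048,576 bytes, because the text above says only "1 MB". The size that the service measures might differ slightly from the locally encoded size.

```python
import json

# Assumption: "1 MB" is interpreted as 1,048,576 bytes. The documentation
# does not say whether the limit is binary or decimal.
MAX_DOC_BYTES = 1024 * 1024


def encoded_size(doc: dict) -> int:
    """Return the size in bytes of the document as compact UTF-8 JSON."""
    text = json.dumps(doc, separators=(",", ":"), ensure_ascii=False)
    return len(text.encode("utf-8"))


def fits_size_limit(doc: dict, limit: int = MAX_DOC_BYTES) -> bool:
    """Hypothetical helper: True if the encoded document is within the limit."""
    return encoded_size(doc) <= limit


small = {"_id": "example", "note": "hello"}
large = {"_id": "too-big", "blob": "x" * (2 * 1024 * 1024)}

print(fits_size_limit(small))
print(fits_size_limit(large))
```

A check like this only helps avoid an obviously oversized request. The service response remains the authority on whether a document is accepted.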

IBM Cloudant uses an eventually consistent model for data. Under this model, it's possible, in some conditions, to retrieve older document content. For example, older content might be retrieved when your application writes or updates a document and then immediately reads the same document.

In other words, your application would see the document content as it was before the write or update occurred. For more information about this model, see the topic on Consistency.

Document fields

All documents must have two fields:

A unique _id field. The _id field is detailed in the next section.
A _rev field. The _rev field is a revision identifier, and is essential to the IBM Cloudant replication protocol.

In addition to these two mandatory fields, documents can generally contain any other content that can be described by using JSON, subject to some caveats detailed in the following sections.

Document IDs

The format of a document ID differs depending on whether a database is partitioned or not. When a database is partitioned, the partition key for each document is defined as part of the document ID as detailed in the next section.

IDs in partitioned databases

When you use a partitioned database, the document ID specifies both the partition key and the document key. These keys are specified by splitting the document ID into two parts that are separated by a colon:

$PARTITION_KEY:$DOCUMENT_KEY

Several documents can share the same $PARTITION_KEY. The $DOCUMENT_KEY must be unique within each partition. As a result, the complete document ID is unique within a database. A document key might contain further colon characters.
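Because a document key can itself contain colons, only the first colon separates the two parts. The following sketch illustrates that split on the client side. The function name `split_partitioned_id` is hypothetical, and rejecting an empty partition key or an empty document key is an assumption that is not stated in the text above.

```python
from typing import Tuple


def split_partitioned_id(doc_id: str) -> Tuple[str, str]:
    """Split a partitioned document ID into (partition_key, document_key).

    Only the first colon is treated as the separator, so any further
    colons stay in the document key.
    """
    partition_key, separator, document_key = doc_id.partition(":")
    # Assumption: both parts must be non-empty.
    if not separator or not partition_key or not document_key:
        raise ValueError(f"not a partitioned document ID: {doc_id!r}")
    return partition_key, document_key


print(split_partitioned_id("sensor-17:reading:2024-01-01"))
```

In this example, `sensor-17` is the partition key and `reading:2024-01-01` is the document key.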

IDs in non-partitioned databases

For non-partitioned databases, the _id field is either created by you, or generated automatically as a UUID by IBM Cloudant.

If you choose to specify the document _id field, it must be limited to no more than 7168 characters (7k).

As with partitioned databases, the document ID must be unique within a database.
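If you create the _id value yourself, a client-side check along the following lines can catch an overlong ID before the request is sent. This is a minimal sketch. The names `new_doc_id` and `check_doc_id` are hypothetical, and the hexadecimal UUID format is an illustrative choice that is not necessarily the format IBM Cloudant generates.

```python
import uuid

# Limit taken from the text above: no more than 7168 characters.
MAX_ID_CHARS = 7168


def new_doc_id() -> str:
    """Return a random UUID rendered as 32 hexadecimal characters.

    Assumption: this format is only an example of a client-generated ID.
    """
    return uuid.uuid4().hex


def check_doc_id(doc_id: str) -> str:
    """Return doc_id unchanged, or raise ValueError if it is empty or too long."""
    if not doc_id:
        raise ValueError("document ID must not be empty")
    if len(doc_id) > MAX_ID_CHARS:
        raise ValueError(f"document ID exceeds {MAX_ID_CHARS} characters")
    return doc_id


doc = {"_id": check_doc_id(new_doc_id()), "name": "example"}
print(len(doc["_id"]))
```

The check does not test uniqueness. Only the database can confirm that an ID is not already in use.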

Field name restrictions

Field names that begin with the underscore character (_) are reserved in IBM Cloudant. This rule means that you can't normally have your own field names that begin with an underscore. For example, the field example would be accepted, but the field _example would result in a doc_validation error message.

See an example of a JSON document that attempts to create a field with an underscore prefix:

{
	"_top_level_field_name": "some data"
}

See an error message that is returned when you attempt to create a field with an underscore prefix:

{
	"error": "doc_validation",
	"reason": "Bad special document member: _top_level_field_name"
}

However, if the field name is for an object that is nested within the document, you can use an underscore prefix for the field name.

See an example of a JSON document that creates a field with an underscore prefix, nested within an object:

{
	"another_top_level_field_name": "some data",
	"another_field": {
		"_lower_level_field_name": "some more data"
	}
}

See an example success message (abbreviated) returned when a nested field with an underscore prefix is created:

{
	"ok": true,
	"id": "2",
	"rev": "1-9ce...8d4"
}
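The rule described in this section can be checked on the client before a document is written. The following sketch inspects top-level field names only, which mirrors the behavior shown in the examples: nested fields with an underscore prefix are not flagged. The function name `invalid_top_level_fields` is hypothetical, and the set of permitted reserved names is an assumption that covers only a few well-known fields and is not a complete list.

```python
# Assumption: a partial set of reserved field names that a document can
# legitimately carry at the top level. The full set is not listed here.
ALLOWED_RESERVED = {"_id", "_rev", "_deleted", "_attachments"}


def invalid_top_level_fields(doc: dict) -> list:
    """Return top-level field names that start with "_" and are not permitted.

    Nested objects are deliberately not inspected, because the restriction
    applies to top-level fields only.
    """
    return sorted(
        name for name in doc
        if name.startswith("_") and name not in ALLOWED_RESERVED
    )


rejected = {"_top_level_field_name": "some data"}
accepted = {
    "another_top_level_field_name": "some data",
    "another_field": {"_lower_level_field_name": "some more data"},
}

print(invalid_top_level_fields(rejected))
print(invalid_top_level_fields(accepted))
```

An empty result means that this check found no problem. It does not guarantee that the service accepts the document.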

Quorum - writing and reading data

In a distributed system, it's possible that a request might take some time to complete. A "quorum" mechanism is used to help determine when a request, such as a write or read, completes successfully.

For more information about quorum settings and their implications on dedicated IBM Cloudant systems, contact IBM Cloudant support.

Time to live

Time to live (TTL) is a property of data that marks the data as expired after a relative amount of time, or at an absolute time. The expired data might be deleted or moved to an alternative (archive) location.

IBM Cloudant does not support Time to Live functions. The reason is that IBM Cloudant documents are only "soft" deleted, not removed outright. Soft deletion replaces the original document with a smaller record. This small record or "tombstone" is required for replication purposes. It helps ensure that the correct revision to use can be identified during replication.

If the TTL capability were available in IBM Cloudant, the resulting potential increase in short-lived documents and soft deletion records would mean that the database size might grow without bound.