Getting Started with Databases for DataStax

IBM Cloud® Databases for DataStax is deprecated and no longer supported as of 30 June 2024. For more information, see the deprecation details.

This tutorial is a short introduction to using an IBM Cloud® Databases for DataStax deployment. It uses the IBM Cloud Dashboard to create a service instance and covers the essential information that your application needs to work with the database.

Before you begin

Create a service instance

  1. Navigate to the IBM Cloud® Databases for DataStax service in the IBM Cloud Catalog.

  2. Give your service a memorable name that will later appear in your account's Resource list.

  3. Choose a resource group.

  4. Choose a location.

  5. Choose your resource allocation, using one of the provided templates or a custom allocation.

  6. Configure your service.

    • Database version
    • Encryption
    • Endpoints: Select which endpoints to enable initially, either public or private.

  7. Click Create.

    After you click Create, a message indicates that the instance is being provisioned, and you are returned to the Resource list. There, the status for your instance shows as Provision in progress.

  8. When the status changes to Active, select the instance.

Set your admin password

Review the Getting to production documentation for general guidance on setting up a basic IBM Cloud® Databases for DataStax deployment.

Connect with DataStax drivers

Drivers are a key component to connecting external applications to your Databases for DataStax deployment. Review the information on Connecting an external application for details on compatible drivers. Further details on specific drivers, including upgrade guides, can be found at Developing applications with Apache Cassandra and DataStax Enterprise.

Only DataStax drivers that are explicitly listed in the Connecting an external application table function correctly for connecting to Databases for DataStax.

Node repairs with NodeSync

NodeSync is a continuous repair service that runs automatically in the background to validate that your data is in sync on all replicas, which reduces the need for manual repairs. For operational simplicity and performance, Databases for DataStax enables the NodeSync service on all keyspaces and tables to handle node repairs. Traditional repairs are unnecessary, because the automation ensures that NodeSync handles the repairs for you.

While the automation enables NodeSync, you can also manually create tables with NodeSync enabled so that the service can repair the tables' data when necessary:

CREATE TABLE myTable (...) WITH nodesync = { 'enabled': 'true'};

For more information, see enabling the NodeSync service.
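For a table that already exists, the same nodesync option can be set with an ALTER TABLE statement. The following is a minimal Python sketch that only builds the CQL text. The helper name nodesync_statement and the placeholder names shop and orders are hypothetical and are not part of any DataStax driver. How you run the resulting statement depends on the driver or client that you use.

```python
import re

# Hypothetical helper for illustration only: it builds CQL text and does not
# connect to a database or call any driver.
_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def nodesync_statement(keyspace: str, table: str) -> str:
    """Return an ALTER TABLE statement that enables NodeSync on one table."""
    for name in (keyspace, table):
        # Accept only plain unquoted CQL identifiers to avoid malformed CQL.
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a simple CQL identifier: {name!r}")
    return f"ALTER TABLE {keyspace}.{table} WITH nodesync = {{'enabled': 'true'}};"


if __name__ == "__main__":
    # "shop" and "orders" are placeholder names.
    print(nodesync_statement("shop", "orders"))
```

The identifier check is a deliberately strict design choice: names that need quoting are rejected rather than escaped, which keeps the sketch short.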

Nodetool is unsupported. Manual repairs issued against any table that has the NodeSync service enabled are ignored, as the following warning describes:

WARNING: A manual nodetool repair or a repair operation from the OpsCenter node administration menu fails to run if a NodeSync-enabled table is targeted.

Recommendations

Benchmark before production

  • Do not use cqlsh and COPY for benchmarking, because COPY does not mimic typical client behavior. Use nosqlbench for benchmarking instead.

Data migrations

Resource configurations

  • The recommended configuration for a node is:
    • 16 CPUs
    • 32 GB to 64 GB RAM
    • 16,000 disk IOPS (16,000 IOPS corresponds to a 1.6 TB disk)
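The relationship between disk IOPS and disk size in the recommendation can be sketched as simple arithmetic. This Python example assumes a ratio of 10 IOPS per GB, which is inferred from the figures in the list (16,000 IOPS paired with a 1.6 TB disk) and takes 1 TB as 1000 GB. The function name disk_gb_for_iops is hypothetical. Check the actual ratio for your deployment before relying on it.

```python
# Assumption: 10 IOPS per GB, inferred from the recommendation that pairs
# 16,000 IOPS with a 1.6 TB disk (1 TB taken as 1000 GB).
IOPS_PER_GB = 10


def disk_gb_for_iops(target_iops: int, iops_per_gb: int = IOPS_PER_GB) -> int:
    """Return the smallest whole-GB disk size that reaches target_iops."""
    if target_iops <= 0 or iops_per_gb <= 0:
        raise ValueError("target_iops and iops_per_gb must be positive")
    # Ceiling division so the result never falls short of the target.
    return -(-target_iops // iops_per_gb)


if __name__ == "__main__":
    gb = disk_gb_for_iops(16_000)
    print(f"{gb} GB ({gb / 1000} TB)")  # prints "1600 GB (1.6 TB)"
```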

Next steps

For detailed information on CQL, the Cassandra Query Language, see the CQL for DSE documentation.

Looking to administer your deployment? Consult DataStax's documentation on using the stand-alone CQLSH client.

You can manage your deployment with IBM Cloud CLI, the Cloud Databases CLI plug-in, or by using the Cloud Databases API.

If you are planning to use Databases for DataStax for your applications, check out Connecting an external application and Connecting an IBM Cloud application.

To ensure the stability of your applications and your database, check out High-Availability and Performance.