
Apache Kafka

Apache Kafka is a distributed event streaming platform. Connect to an Apache Kafka real-time processing server to write and read streams of events to and from topics.
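
As a minimal illustration of what such a connection does (this is not part of the registration flow), the following Java sketch writes a single event to a topic and reads it back with the standard Kafka client. The broker address and topic name are placeholder values.

import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class KafkaRoundTrip {
    public static void main(String[] args) {
        String bootstrap = "localhost:9092";   // placeholder broker address
        String topic = "customer_orders";      // placeholder topic name

        // Write one event into the topic.
        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", bootstrap);
        producerProps.put("key.serializer", StringSerializer.class.getName());
        producerProps.put("value.serializer", StringSerializer.class.getName());
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {
            producer.send(new ProducerRecord<>(topic, "order-1", "{\"order_id\": 1}"));
        }

        // Read events back from the same topic.
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", bootstrap);
        consumerProps.put("group.id", "doc-sample-reader");
        consumerProps.put("auto.offset.reset", "earliest");
        consumerProps.put("key.deserializer", StringDeserializer.class.getName());
        consumerProps.put("value.deserializer", StringDeserializer.class.getName());
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps)) {
            consumer.subscribe(List.of(topic));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
            for (ConsumerRecord<String, String> record : records) {
                System.out.println(record.key() + " -> " + record.value());
            }
        }
    }
}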

Configure the following details for the Apache Kafka data source:

Register data source

• Display name: Enter the data source name to be displayed on the screen.
• Hostname: Enter the hostname. You can add multiple hosts. To add one, click the Add icon; a new row appears for entering the additional hostname and port.
• Port: Enter the port number.
• SASL connection: Use the toggle switch to enable or disable Simple Authentication and Security Layer (SASL) authentication. If enabled:
  1. Upload the SSL certificate:
    i. The Upload SSL certificate (.pem, .crt, .cert, or .cer) link is enabled.
    ii. Click the Upload SSL certificate (.pem, .crt, .cert, or .cer) link.
    iii. Browse to the SSL certificate and upload it.
  2. Select one of the following SASL mechanisms:
  • PLAIN
  • SCRAM SHA-256
  • SCRAM SHA-512
  3. Enter the Username and API key/Password. (A client-side sketch of these connection settings follows this list.)
• Test connection: Click the Test connection link to test the data source connection. If the connection is successful, a success message appears.
• Catalog name: Enter the name of the catalog. This catalog is automatically associated with your data source.
• Add topics: You can add topics after you create the data source:
  i. Go to the Infrastructure manager.
  ii. Click the Apache Kafka data source.
  iii. Click the Add topics option.
  iv. Upload .json definition files. You can either drag the files or use the Click to upload option. Topic names are determined from the definition files.
  v. Use the Edit option to view and edit the topic files.
• Create: Click Create to create the data source.
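
When SASL is enabled, a connecting client must present matching settings. The following Java sketch shows how the SASL connection fields above typically map to standard Kafka client properties, assuming a client version recent enough to accept PEM truststores; the hostnames, certificate path, and credentials are placeholder values.

import java.util.Properties;

public final class KafkaSaslSettings {

    // A minimal sketch of the client-side equivalent of the SASL connection
    // fields in the registration form. All values below are placeholders.
    public static Properties saslSslProperties() {
        Properties props = new Properties();

        // Hostname and Port rows; list multiple hosts comma-separated.
        props.put("bootstrap.servers",
                "broker-1.example.com:9093,broker-2.example.com:9093");

        // SASL over TLS; the uploaded certificate serves as the trust anchor.
        props.put("security.protocol", "SASL_SSL");
        props.put("ssl.truststore.type", "PEM");
        props.put("ssl.truststore.location", "/path/to/certificate.pem");

        // One of the mechanisms offered in the form:
        // PLAIN, SCRAM-SHA-256, or SCRAM-SHA-512.
        props.put("sasl.mechanism", "SCRAM-SHA-512");

        // Username and API key/Password fields.
        // (For the PLAIN mechanism, use PlainLoginModule instead.)
        props.put("sasl.jaas.config",
                "org.apache.kafka.common.security.scram.ScramLoginModule required "
                + "username=\"myuser\" password=\"myapikey\";");

        return props;
    }
}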

Sample .json definition file

The following is a sample .json definition file that can be uploaded for Kafka topics in the Kafka data source configuration section:

{
    "topicName": "customer_orders",
    "tableName": "orders",
    "fileContent": {
        "tableName": "orders",
        "columns": [
            {
                "name": "order_id",
                "type": "INTEGER",
                "primaryKey": true
            }
        ],
        "partitionKey": "customer_id",
        "retentionPeriod": "7 days"
    },
    "contents": {
        "tableName": "orders",
        "topicConfig": {
            "partitions": 1,
            "replicationFactor": 1,
            "retentionMs": 604800000,
            "cleanupPolicy": "delete"
        },
        "schema": {
            "type": "struct",
            "fields": [
                {
                    "name": "order_id",
                    "type": "int64"
                }
            ]
        }
    }
}
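
For context, the topicConfig block in the sample corresponds to standard Kafka topic settings (retention.ms and cleanup.policy). As an illustration only, and not part of the registration flow, the following sketch creates an equivalent topic directly with the Kafka AdminClient; the broker address is a placeholder.

import java.util.Map;
import java.util.Properties;
import java.util.Set;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateOrdersTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address

        try (AdminClient admin = AdminClient.create(props)) {
            // Mirror the sample's topicConfig: 1 partition, replication factor 1,
            // 7-day retention (604800000 ms), and the delete cleanup policy.
            NewTopic topic = new NewTopic("customer_orders", 1, (short) 1)
                    .configs(Map.of(
                            "retention.ms", "604800000",
                            "cleanup.policy", "delete"));
            admin.createTopics(Set.of(topic)).all().get();
        }
    }
}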

Limitations for SQL statements

  • For data source-based catalogs, the CREATE SCHEMA, CREATE TABLE, DROP SCHEMA, DROP TABLE, DELETE, DROP VIEW, ALTER TABLE, and ALTER SCHEMA statements are not available in the Data Manager UI.

Limitations for data types

  • When fields of data type REAL have 6 digits or more in the decimal part and those digits are predominantly zero, the values are rounded off when queried. The rounding occurs differently depending on the precision of the value: for example, 10.890000 is rounded to 10.89, whereas 10.890009 is not rounded off. This is an inherent issue caused by the representational limitations of binary floating-point formats, and it can have a significant impact when a query involves sorting.
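
To make the limitation concrete, the following sketch (illustrative only) shows that a 32-bit binary float, the underlying representation of REAL, cannot store a value such as 10.89 exactly:

import java.math.BigDecimal;

public class RealPrecision {
    public static void main(String[] args) {
        float value = 10.890000f;

        // Shortest decimal string that round-trips to the same float: 10.89
        System.out.println(value);

        // Exact binary value actually stored (BigDecimal prints it without rounding):
        // 10.89000034332275390625
        System.out.println(new BigDecimal((double) value));
    }
}

Because nearby decimal inputs map to distinct but inexact binary values, rounding on output is value-dependent, which is why sort order over REAL columns can be affected.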