Data Engine is IBM Cloud's central service for data lakes. It provides stream ingestion, data preparation, ETL, and data query from Object Storage and Kafka. It also manages tables and views in a catalog that is compatible with Hive metastore, so other big data engines and services can connect to it. Data Engine supports full standard ANSI SQL, and you submit work as serverless jobs, so there is no infrastructure to manage. The service is highly available, offers a Multi-AZ deployment, and autoscales based on your workload.
Transform data to prepare it for all disciplines of AI by changing data content, granularity, format, and layout on disk.
Query server logs, click-stream data, IoT messages, or raw analytics events using SQL directly on the files where you store them in Cloud Object Storage. Run ad-hoc SQL queries against fresh data before the data has moved through the pipeline and into the reporting stack.
Use SQL to query structured data that was archived to Cloud Object Storage from legacy systems, databases (for example, NoSQL), or data warehouses.
Run SQL queries from your application using the rich API ecosystem, which comprises the REST API, Python SDK, Node SDK, and IBM Cloud Functions interface.
Use SQL to stream data from Kafka into Cloud Object Storage using a permanently running job.
Use an external Hive metastore catalog to organize your tables and views centrally within IBM Cloud.
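As a rough illustration of the data preparation and API use cases above, the following minimal sketch builds a SQL statement that reads CSV objects and writes them back as Parquet, then wraps it in a JSON body an application could submit. It uses only the Python standard library and does not call the service. The `cos://` locations, the bucket and column names, the `STORED AS` and `PARTITIONED BY` clauses, and the `statement` field name are all assumptions made for illustration; check them against the Data Engine SQL and REST API references before relying on them.

```python
import json


def build_conversion_statement(source_url, target_url, columns, partition_by=None):
    """Return a SQL statement that reads CSV objects and writes Parquet.

    The clause layout is an assumed form, not a verified grammar.
    """
    if not columns:
        raise ValueError("at least one column is required")
    select_list = ", ".join(columns)
    statement = (
        f"SELECT {select_list} "
        f"FROM {source_url} STORED AS CSV "
        f"INTO {target_url} STORED AS PARQUET"
    )
    # Optionally change the layout on disk by partitioning the result.
    if partition_by:
        statement += f" PARTITIONED BY ({', '.join(partition_by)})"
    return statement


def build_job_payload(statement):
    """Wrap a statement in a JSON body; the 'statement' field name is an assumption."""
    return json.dumps({"statement": statement})


# Hypothetical bucket names and columns, for illustration only.
stmt = build_conversion_statement(
    "cos://us-geo/my-raw-bucket/logs/",
    "cos://us-geo/my-curated-bucket/logs_parquet/",
    ["request_time", "status", "url"],
    partition_by=["status"],
)
payload = build_job_payload(stmt)
print(stmt)
```

Keeping statement construction separate from submission lets the same string be passed to whichever interface the application uses, whether the REST API or one of the SDKs.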