Release notes for IBM Analytics Engine serverless instances

Use these release notes to learn about the latest updates to IBM Analytics Engine serverless instances. The updates are grouped by date.

February 2025

You can now use Spark version 3.5.4 to run applications in IBM Analytics Engine. Apache Spark 3.4.4 and Apache Spark 3.5.4 are the supported versions.

September 2024

Deprecating the support for IBM Log Analysis: IBM Log Analysis is deprecated and will no longer be supported as of 30 March 2025. You must migrate to IBM Cloud Logs. For information about how to migrate, see Migrating to IBM Cloud Logs.

August 2024

Deprecating the support for Spark 3.3 runtimes: Support for the Spark 3.3 runtime in IBM Analytics Engine will be deprecated on 17 September 2024, and the default version will be changed to the Spark 3.4 runtime. To ensure a seamless experience and to leverage the latest features and improvements, switch to Spark 3.4. To upgrade your instance to Spark 3.4, see Replace Instance Default Runtime.

February 2024

02 February 2024

Application logs are now available in the instance home: The application logs are now forwarded to Analytics Engine instance home by default. You can access the log information from the IBM Cloud Object Storage (COS) bucket. You can download the log file for any specific application from the path <instance_id>/logs/<app_id> for recording, sharing, and debugging purposes. For more information, see Forwarding logs to instance home.
New location for logging Spark application events: Starting 7 February 2024, the Spark application events are logged on a new path (/<instance_id>/spark-events) in the instance-home bucket. To view older applications on the Spark history interface, copy their Spark application events to the new location. For more information about copying the events, see Spark history server.
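The two locations described above can be sketched as small helpers that build the object prefixes inside the instance-home bucket. This is only an illustration: the function names and the sample IDs are hypothetical, and the sketch assumes that the prefixes are relative to the bucket root, so the leading slash shown for the spark-events path is dropped.

```python
def application_log_prefix(instance_id: str, app_id: str) -> str:
    """Prefix under which the logs of one application are stored."""
    return f"{instance_id}/logs/{app_id}"


def spark_events_prefix(instance_id: str) -> str:
    """Prefix under which Spark application events are logged.

    Assumption: object keys are bucket-relative, so no leading slash.
    """
    return f"{instance_id}/spark-events"


if __name__ == "__main__":
    # Hypothetical IDs, used only to show the shape of the prefixes.
    print(application_log_prefix("my-instance-id", "my-app-id"))
    print(spark_events_prefix("my-instance-id"))
```

You can pass such a prefix to whichever Object Storage client you already use to list or download the objects stored under it.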

December 2023

01 December 2023

Encrypting internal network data for Spark workload: You can now enable data encryption for the internal network data in transit (internal communication between components of the Spark application) by configuring the IBM Analytics Engine properties at instance level or job level. For more information about encrypting internal network data for Spark workload, see Encrypting internal network data for Spark workload.

November 2023

22 November 2023

Configuring Spark log level information: The default log level in IBM Analytics Engine serverless Spark applications will be changed to 'ERROR' by January 3, 2023. You can change the existing log configuration, which logs at the 'INFO' level, to display relevant and concise messages. For more information on changing the log level, see Configuring Spark log level information.

October 2023

19 October 2023

Removal of development (*-devel) packages: For security reasons, the *-devel packages (operating system development packages) are no longer pre-installed on the Spark runtime. If you already use the development packages, programs that depend on them can no longer be compiled. For any queries, contact IBM Support.

09 October 2023

Removal of Spark 3.1 support: IBM Analytics Engine no longer supports Spark 3.1. Upgrade your existing IBM Analytics Engine instances to Spark 3.3 for the latest features and enhancements. For more information about the upgrade, see Replace Instance Default Runtime.

From the current release onwards, Spark 3.3 is the default runtime version for IBM Analytics Engine instances.

September 2023

29 September 2023

Integration with watsonx.data: IBM Analytics Engine now integrates with IBM® watsonx.data to leverage the functional capabilities of watsonx.data. For more information about the integration and to work with watsonx.data, see Working with watsonx.data.

06 September 2023

Support for Spark 3.4: You can now provision IBM Analytics Engine serverless plan instances with the default Spark runtime set to Spark 3.4, which enables you to run Spark applications on Spark 3.4.

August 2023

23 August 2023

Deprecating the support of R v3.6 from Spark 3.1 and Spark 3.3 runtimes: IBM Analytics Engine deprecates the support for R v3.6 from the Spark 3.1 and Spark 3.3 runtimes by September 6, 2023. Support for R v4.2 is already deployed for the Spark 3.1 and Spark 3.3 runtimes. Make sure that you test your Spark application with R v4.2 for any failures before September 6, 2023. Contact IBM Support for any issues. To test the Spark application, see Spark application REST API.

09 August 2023

Deprecating the support for Spark 3.1: Support for Spark 3.1 on IBM Analytics Engine is deprecated and will be removed by 09 October 2023. To ensure a seamless experience and to leverage the latest features and improvements, upgrade your existing IBM Analytics Engine instances to Spark 3.3. To upgrade your instance to Spark 3.3, see Replace Instance Default Runtime. From this release onwards, Spark 3.3 is the default runtime version for all new IBM Analytics Engine instances. This change enables you to benefit from the enhanced capabilities and optimizations available in the latest version.

July 2023

07 July 2023

Spark maintenance release version update for 3.3: Spark applications with runtime set to Spark 3.3 will run internally using Spark 3.3.2 from now on. The patch version is now upgraded from 3.3.0 to 3.3.2.

May 2023

29 May 2023

Removal of Python v3.9 support from Spark 3.1 and Spark 3.3 runtimes: The IBM Analytics Engine serverless Spark application plans to discontinue Python v3.9 support from the Spark 3.1 and Spark 3.3 runtimes by June 15, 2023. Support for Python v3.10 is already deployed for the Spark 3.1 and Spark 3.3 runtimes. Based on your workload, make sure that you test your Spark application with Python v3.10 for any failures before June 15, 2023. Contact IBM Support for any issues. To test the Spark application, see Run a Spark application with nondefault language version.

25 May 2023

Pagination of application list in the REST API and CLI: You can now limit the number of applications returned by the Analytics Engine serverless REST API endpoint, SDK method and CLI command for listing applications. Use the query parameter limit to specify the number of applications to be returned and specify the value of next.start or previous.start from the API response as the value of the start query parameter to fetch the next or previous page of the results. The applications are listed in descending order based on the submission time, with the newest application being the first.

The pagination is an optional feature in this release. From the next release of the service, the results will be paginated by default.
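The paging flow described above can be sketched as a loop that requests one page at a time and follows next.start until no further page is advertised. This is a minimal sketch rather than the service's SDK: iter_applications, make_fake_fetch_page, and the "applications" key of the response are assumptions made for illustration, and the in-memory fetcher stands in for a real call that would pass limit and start as query parameters.

```python
from typing import Callable, Iterator, List, Optional

# A page fetcher takes (limit, start) and returns one parsed API response.
FetchPage = Callable[[int, Optional[str]], dict]


def iter_applications(fetch_page: FetchPage, limit: int = 2) -> Iterator[dict]:
    """Yield every application by following next.start from page to page."""
    start: Optional[str] = None
    while True:
        page = fetch_page(limit, start)
        # Assumption: the applications of a page are under "applications".
        yield from page.get("applications", [])
        start = (page.get("next") or {}).get("start")
        if not start:
            break  # no next.start means this was the last page


def make_fake_fetch_page(apps: List[dict]) -> FetchPage:
    """In-memory stand-in for the list call (hypothetical, for the demo only)."""

    def fetch_page(limit: int, start: Optional[str]) -> dict:
        offset = int(start) if start else 0
        page: dict = {"applications": apps[offset:offset + limit]}
        if offset + limit < len(apps):
            page["next"] = {"start": str(offset + limit)}
        return page

    return fetch_page


if __name__ == "__main__":
    sample = [{"id": f"app-{i}"} for i in range(5)]  # newest first
    for app in iter_applications(make_fake_fetch_page(sample), limit=2):
        print(app["id"])
```

Treating the start value as an opaque token, as the loop does, keeps the caller independent of how the service encodes page positions.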

January 2023

05 January 2023

Analyze application runs on the Spark history server

You can now run a Spark history server on your IBM Analytics Engine serverless instance.

The Spark history server provides a Web UI to view Spark events that were forwarded to the Object Storage bucket that was defined as the instance home. The Web UI helps you analyze how your Spark applications ran by displaying useful information like:

A list of the stages that the application goes through when it is run
The number of tasks in each stage
The configuration details such as the running executors and memory usage

You are charged for the CPU cores and memory consumed by the Spark history server while it is running. The rate is $0.1475 USD per virtual processor core hour and $0.014 USD per gigabyte hour.
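As a rough illustration of how the two rates combine, the following sketch estimates the charge for a history server session. The function name and the sample sizing (1 core, 4 GB, 2 hours) are hypothetical; only the two rates come from the note above, and actual billing might be metered or rounded differently.

```python
CORE_HOUR_RATE_USD = 0.1475  # per virtual processor core hour
GB_HOUR_RATE_USD = 0.014     # per gigabyte hour


def history_server_cost(cores: float, memory_gb: float, hours: float) -> float:
    """Estimate the cost in USD of running the Spark history server."""
    hourly = cores * CORE_HOUR_RATE_USD + memory_gb * GB_HOUR_RATE_USD
    return hours * hourly


if __name__ == "__main__":
    # Hypothetical sizing: 1 core and 4 GB of memory, running for 2 hours.
    print(f"{history_server_cost(1, 4, 2):.4f} USD")
```

Because the charge accrues only while the server is running, stopping it when you finish analyzing your applications keeps the hours term small.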

See Use the Spark history server.

September 2022

21 September 2022

Support for Spark 3.3: You can now provision IBM® Analytics Engine serverless plan instances with the default Spark runtime set to Spark 3.3, which enables you to run Spark applications on Spark 3.3.

09 September 2022

You can now use Hive metastore to manage the metadata related to your application's tables, columns, and partition information when working with Spark SQL. You can choose to externalize this metastore database to an external data store, such as an IBM Cloud Data Engine (previously SQL Query) or an IBM Cloud Databases for PostgreSQL instance. For details, see Working with Spark SQL and an external metastore.

July 2022

12 July 2022

You can now provision IBM® Analytics Engine serverless instances in a new region. In addition to the IBM Cloud® us-south (Dallas) region, you can now also provision serverless instances in the eu-de (Frankfurt) region.

08 July 2022

New API for platform logging: Start using the log_forwarding_config API to forward platform logs from an IBM Analytics Engine instance to IBM Log Analysis. Although you can still use the logging API, it is deprecated and will be removed in the near future. For details on how to use the log_forwarding_config API, see Configuring and viewing logs.

May 2022

13 May 2022

Support for Python 3.9: You can now run Spark applications using Python 3.9 on your IBM Analytics Engine serverless instances.

April 2022

04 April 2022

Limitation on how long Spark applications can run: Spark applications can run for a maximum period of 3 days (72 hours). Any applications that run beyond this period will be auto-cleaned in order to adhere to the security and compliance patch management processes for applications in Analytics Engine.

March 2022

30 March 2022

Start using the Analytics Engine serverless CLI: Use this tutorial to help you get started quickly and simply with provisioning an Analytics Engine serverless instance, and submitting and monitoring Spark applications. See Create service instances and submit applications using the CLI.

September 2021

09 September 2021

Introducing IBM Analytics Engine Standard serverless plan for Apache Spark

The IBM Analytics Engine Standard serverless plan for Apache Spark offers the ability to spin up IBM Analytics Engine serverless instances within seconds, customize them with library packages of your choice, and run your Spark workloads.

New: The IBM Analytics Engine Standard serverless plan for Apache Spark is now GA in the Dallas IBM Cloud service region.

This plan offers a new consumption model using Apache Spark whereby resources are allocated and consumed only when Spark workloads are running.

Capabilities available in the IBM Analytics Engine Standard serverless plan for Apache Spark include:

Running Spark batch and streaming applications
Creating and working with Jupyter kernels for interactive use cases
Running Spark batch applications through an Apache Livy-like interface
Customizing instances with your own libraries
Autoscaling Spark workloads
Aggregating logs of your Spark workloads to the Log Analysis server

To get started using the serverless plan, see Getting started using serverless IBM Analytics Engine instances.