Known issues (Limitations)
The following limitations and known issues apply to IBM® watsonx.data.
ADLS and GCS are currently not supported as target locations for data ingestion within the platform.
WHERE/IN/DROP/RENAME clauses do not work on Hive tables with more than one partition if the partition column is of type VARCHAR.
Syncing data fails for schema and tables with same name
For users still on versions earlier than 2.1: In HMS, if you use the Register Bucket (sync) API and the bucket contains a schema and table that already exist in watsonx.data, the sync fails for schemas and tables with the same name.
Error during SELECT after syncing
For users still on versions earlier than 2.1: In HMS, if you use the Register Bucket (sync) API and the bucket contains a schema that already exists in watsonx.data, the sync can target an incorrect catalog. After syncing, SELECT cannot be run on the tables because of a credentials mismatch between the catalogs.
Activity tracker support in MDS
The Activity Tracker feature is not supported with MDS.
Hive catalog does not support INT type columns in CREATE TABLE with CSV format
The Hive catalog does not support creating a table in CSV format with an INT type column. The following error is displayed:
presto> create table hive_data.hive_schema.intcsv ( type int ) with ( format = 'CSV' ) ;
Query 20241017_021409_00059_fmcyt failed: Hive CSV storage format only supports VARCHAR (unbounded). Unsupported columns: type integer
Workaround: Use the following options for the Hive catalog (see the example that follows this list):
- Create the table with VARCHAR columns.
- Create a view that casts the columns to their original data types.
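A minimal example sketch of this workaround, assuming the same hive_data catalog and hive_schema schema as the error example above; the table and view names are illustrative:
-- Create the table with a VARCHAR column because CSV storage supports only VARCHAR
create table hive_data.hive_schema.intcsv_raw ( type varchar ) with ( format = 'CSV' );
-- Expose the original data type through a view that casts the column
create view hive_data.hive_schema.intcsv as select cast(type as int) as type from hive_data.hive_schema.intcsv_raw;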
Presto (C++) engine does not support Service Principal authorization mode for Azure Data Lake Storage Gen2
The Presto (C++) engine does not support the Service Principal authorization mode for Azure Data Lake Storage Gen2.
Workaround: Use ADLS Gen1 Access_Key method for ADLS Gen1 and ADLS Gen2 from the UI.
Test connection may fail for IBM Db2 data source
You might face issues while uploading the certificate obtained by following the steps provided in the topic IBM Db2 for the SSL connection of the IBM Db2 data source.
Workaround: Convert the SSL certificate to .pem format by the following steps:
1. Open a terminal or command prompt.
2. Navigate to the directory where the certificate file is located.
3. Run the following OpenSSL command:
openssl x509 -inform DER -in DigiCertGlobalRootCA.crt -out DigiCertGlobalRootCA.pem
4. Use the .pem certificate generated in step 3 to retest the connection.
Inconsistent CSV and Parquet file ingestion behaviour
Despite the design specification that CSV files should be ingested only into tables created from CSV files, and Parquet files only into tables created from Parquet files, users are able to ingest CSV files into Parquet tables. This can result in unexpected results, data quality issues, or performance problems if the schema or formatting of the CSV or Parquet file does not align with the expected structure of the target table.
Invalid file associations in Presto resource group through UI and engine restart issues
When an invalid file is associated with an engine in the Presto resource group through the watsonx.data UI, the engine restarts. However, the user interface might incorrectly display that the engine is using the newly assigned file.
Workaround: If you find that the new file is not associated with the watsonx.data environment, contact IBM Support for further assistance.
Time data type support in Hive and Iceberg
Hive: The Hive catalog does not natively support the time data type.
Iceberg: Iceberg does support the time data type.
Workaround: To enable correct handling of time data in Iceberg tables, the hive.parquet-batch-read-optimization-enabled property must be set to false.
Files with different schemas result in null values
watsonx.data now supports ingesting supported file types with varying schemas. However, when columns within these files have distinct schemas, the values in those columns are set to null.
Unsupported special characters in schema and table creation
The following special characters are not supported while creating schemas and tables:
- Schemas (Hive and Iceberg): {, [, (, ), and /
- Tables (Hive): {, (, [, ), and /. Creating a table within a schema whose name starts with the special character @ results in an error.
- Tables (Iceberg): $, @, {, [, ), (, and /
ALTER TABLE operation fails in Spark job submission
Spark jobs that create a schema and a table and then attempt an ALTER TABLE operation may encounter an authz.AccessControlException due to insufficient permissions.
This occurs because, even though the schema and table creation are successful, the job tries to execute the ALTER TABLE operation before the metastore data is updated with the newly created schema and table details.
Workaround: To prevent access denied errors, add a delay between operations that create new schemas or tables within the same Python script.
Attempting to read Parquet v2 tables through Presto (C++) results in an error
When you attempt to read Parquet v2 tables through Presto (C++) that were created via Data manager in watsonx.data, it gives the following error:
Error in ZlibDecompressionStream::Next
Workaround: Presto (C++) currently does not support reading Parquet v2 tables. You must copy the data to a new table in Parquet v1 format so that it can be read by Presto (C++).
- Set the session property to PARQUET_1_0:
set session <catalog_name>.parquet_writer_version = 'PARQUET_1_0';
- Run the following command to copy the data to a new table:
create table <catalog name>.<schema name>.<table name> as (select * from <originaltablename>);
Spark ingestion currently does not support special characters such as quotation marks, backticks, and parentheses in partitioned table column names.
Attempting to query Query History and Monitoring Management (QHMM) related tables using Presto (C++) engines might encounter errors
When you attempt to query QHMM-related tables using Presto (C++) engines, you might encounter errors due to unsupported file formats. Presto (C++) supports only the Parquet v1 format. You cannot use Presto (C++) to query data or tables in other formats.
Workaround: You can switch to use Presto (Java) engines to query QHMM related tables.
Server concurrency limit reached error in flight server
You might encounter a Server concurrency limit reached error when using the flight server to run queries. This occurs when the server experiences high memory usage due to a large number of concurrent requests.
Workaround: Increase the number of flight pods or restructure the queries to reduce the number of subqueries. Adjust the number of replicas based on your system load and available resources.
Use the following command to scale the number of pods for the wdp-connect-flight deployment:
oc scale deployment wdp-connect-flight --replicas=<number of replicas>
For example, if you need to scale the number of pods to 36, run the following command:
oc scale deployment wdp-connect-flight --replicas=36
Incorrect recognition of Gregorian dates in Presto with Hive Parquet tables
Presto exhibits issues when processing historical dates prior to 0200-01-01, specifically when they are stored in Hive tables formatted as Parquet. This issue occurs because of the conversion between the Gregorian and Julian calendars; the Gregorian calendar was introduced on 1582-10-15. Dates before this cutoff are misinterpreted by Presto.
Incomplete information on column length in SHOW COLUMNS output
The SHOW COLUMNS query in Presto currently provides information about columns, including name, data type, additional details (extra), and comments. The existing functionality lacks details about the length of character-based data types (CHAR and VARCHAR). While some connectors return the actual length defined during table creation, others might provide a default value or no information at all.
To address this limitation, three new columns have been added to the SHOW COLUMNS output:
- Scale: Applicable to the DECIMAL data type, indicating the number of digits after the decimal point.
- Precision: Applicable to numerical data types, specifying the total number of digits. (Default: 10)
- Length: Intended for CHAR and VARCHAR data types, representing the maximum number of characters allowed.
Current limitations:
- The reported length in the Length column might not always reflect the actual size defined in the table schema due to connector limitations.
- Connectors that do not provide length information display a default value or null, depending on the connector.
Calculation error for OPT_SORTHEAP in Query Optimizer
Due to a calculation error in the configuration setting of Query Optimizer for the value of OPT_SORTHEAP, the performance of Query Optimizer might be affected.
Workaround: To resolve the calculation error for OPT_SORTHEAP in Query Optimizer, complete the following steps to update the configuration from OPT_SORTHEAP = <initial_value> to OPT_SORTHEAP = <initial_value>/20.
- Set up the PROJECT_CPD_INSTANCE environment variable pointing to the namespace where watsonx.data is installed.
export PROJECT_CPD_INSTANCE=<wxd_namespace>
- Edit the value of OPT_SORTHEAP to OPT_SORTHEAP <initial_value>/20 by running the following command.
oc edit db2uinstance lakehouse-oaas -n $PROJECT_CPD_INSTANCE
- Wait for some time for the STATE to change to Ready for lakehouse-oaas and run the following command.
watch "oc get db2uinstance -n $PROJECT_CPD_INSTANCE"
Limitations - Presto (C++)
- Presto (C++) engine currently does not support database catalogs.
- Parquet is the only supported file format.
- Hive connector is supported.
- Default Iceberg table has only read support with Parquet v1 format.
- TPC-H/TPC-DS queries are supported.
- DELETE FROM and CALL SQL statements are not supported.
- START, COMMIT, and ROLLBACK transactions are not supported.
- Data types CHAR, TIME, and TIME WITH TIMEZONE are not supported. These data types are subsumed by VARCHAR, TIMESTAMP, and TIMESTAMP WITH TIMEZONE.
- IPADDRESS, IPPREFIX, UUID, KHYPERLOGLOG, P4HYPERLOGLOG, QDIGEST, and TDIGEST are not supported.
- VARCHAR supports only a limited length. Varchar(n) with a maximum length bound is not supported.
- TIME and TIME WITH TIMEZONE are supported in community development.
- TIMESTAMP columns in Parquet files cannot be read.
- Scalar functions: IPFunctions, QDigest, HyperLogLog, and Geospatial internationalization are not supported.
- Aggregate functions: QDigest, classification metrics, and differential entropy are not supported.
- S3 and S3 compatible file systems (both read and write) are supported.
Presto (C++) fails to query an external partitioned table
When you query an external table with CHAR data type columns, the query fails to run. This issue occurs because Presto (C++) does not support CHAR data types.
Workaround: Change the CHAR data type column to the VARCHAR data type.
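A minimal sketch of the workaround, with hypothetical catalog, schema, table, and location names; the partition column is declared as VARCHAR instead of CHAR:
create table hive_data.sales.customer_ext (
  name varchar,
  country varchar
) with (
  format = 'PARQUET',
  external_location = 's3a://my-bucket/customer_ext',
  partitioned_by = ARRAY['country']
);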
Accessing Hive and Iceberg tables in the same glue metastore catalog
When using the AWS Glue Data Catalog to manage a bucket or storage location containing both Iceberg and Hive tables, attempting to access Iceberg tables from the Hive catalog gives a Not a Hive table error, and attempting to access Hive tables from the Iceberg catalog gives a Not an Iceberg table error.
Spark application submission fails when DAS (Data Access Service) is enabled for default MinIO buckets
DAS does not currently support default MinIO buckets or object storage that use HTTP endpoints.
Workaround: You can disable DAS or make sure that your default MinIO buckets or object storage are configured with HTTPS endpoints.
Default MinIO bucket access through S3 proxy is unavailable
Currently, it is not possible to access buckets stored in the default MinIO object storages created during instance provisioning using an S3 proxy functionality.
Presto SQL operations with Spark 3.3 and Iceberg timestamp data
When data containing timestampz is ingested using Spark, Presto queries on these tables fail with the following error: Iceberg column type timestamptz is not supported.
Workaround: To ensure interoperability between Spark and Presto for datasets containing timestampz, you must use Spark 3.4 applications with the configuration spark.sql.timestampType set to TIMESTAMP_NTZ.
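For example, in a Spark 3.4 SQL session the configuration can be set before the table is written; the catalog, schema, and table names here are hypothetical and assume an Iceberg catalog is configured in the session:
-- Write new timestamp values as TIMESTAMP_NTZ so that Presto can query them
SET spark.sql.timestampType=TIMESTAMP_NTZ;
CREATE TABLE lakehouse.sales.events (event_id BIGINT, event_ts TIMESTAMP) USING iceberg;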
Using ID as a column name in Cassandra CREATE TABLE
In Cassandra, you cannot create a table with a column named ID while using a Cassandra connector through Presto. This is because ID is a reserved keyword for the Cassandra driver that is used by Presto, which automatically generates a UUID for each row. Attempting to create a table with a column named ID results in an error message indicating a duplicate column declaration, as follows: Duplicate column id declaration for table tm_lakehouse_engine_ks.testtable12
Workaround: Avoid using ID as a column name when creating Cassandra tables through Presto.
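A minimal sketch of the workaround, with a hypothetical catalog and table name; the column is simply given a name other than ID:
-- Use a column name such as row_id instead of the reserved name id
create table cassandra.tm_lakehouse_engine_ks.testtable13 (row_id varchar, name varchar);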
User role with CreateCollection L3 policy fails to create collection in Milvus
Users with the User role can fail to create collections in Milvus with pymilvus when using the ORM Connection and MilvusClient Connection methods.
Workaround: Follow these instructions:
ORM Connection: The user requires both DescribeCollection and CreateCollection privileges granted in the L3 policy page. You must select all collections in a database while granting the DescribeCollection privilege in the L3 policy through the web console.
MilvusClient Connection: Only the CreateCollection privilege is necessary in the L3 policy page. However, the first attempt to create a collection fails:
- Run the create_collection function once.
- Re-run the create_collection function again. This allows the policies to synchronise and the collection creation succeeds.
Special characters and mixed case impacting data synchronization
When synchronizing data between buckets containing tables or schemas with special characters or mixed-case letters in their names, you might encounter the following unexpected behaviors:
- Tables or schemas with certain special characters (%, ,, {, ), (, @, $, [, :) have their data entirely skipped during synchronization.
- Tables or schemas with mixed-case or uppercase letters are converted to lowercase before synchronization.
Workaround: Avoid using special characters and mixed case in table and schema names. Rename existing tables and schemas to use only the supported characters.
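If your connector supports renaming through the engine, a rename might look like the following sketch; the catalog, schema, and table names are hypothetical, and the old name is quoted because it contains an unsupported character:
ALTER TABLE hive_data.sales."orders%2024" RENAME TO hive_data.sales.orders_2024;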
Missing data validation for Amazon S3 storage endpoints
Currently, the user interface (UI) does not perform data validation for endpoints associated with the Amazon S3 storage type.
Incorrect alias usage in WITH clause and USE catalog.schema
WITH clause: When referencing data within the WITH clause, use the exact alias name assigned during its definition. Using an incorrect alias triggers the following error message:
Schema must be specified when session schema is not set
USE catalog.schema usage along with WITH clause: When tables are specified using WITH and USE catalog.schema, queries with incorrect alias names result in the following error:
Table does not exist
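A minimal sketch with a hypothetical catalog and table; the alias defined in the WITH clause (t) must be reused exactly in the outer query:
-- Correct: the outer query references the alias t defined in the WITH clause
WITH t AS (SELECT id, name FROM iceberg_data.sales.customers)
SELECT t.id, t.name FROM t;
-- Incorrect: referencing any other alias (for example, customers) triggers the errors above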
String literal interpretation in Presto (Java)
Presto (Java), by default, interprets string literals as VARCHAR, unlike many other database systems that treat them as CHAR.
In Presto (Java), string comparisons are performed on the actual characters present in the string, excluding trailing spaces. This can cause queries to return incorrect results when working with strings that may contain trailing spaces, as these spaces are not considered during comparison.
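A minimal sketch illustrating the behaviour; because literals are treated as VARCHAR, a trailing space makes the values unequal:
SELECT 'abc' = 'abc ';          -- returns false: the trailing space is part of the literal
SELECT 'abc' = rtrim('abc ');   -- returns true: trailing spaces are removed before comparison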
Table names with multiple dots
Presto (Java) does not support creating or querying tables whose names contain three or more consecutive dots. Attempts to reference such tables in queries might result in errors.
User is still visible in the Access control page of an engine after removing the user from IAM.
LDAP authentication is not supported for Teradata connector.
The watsonx.data Teradata connector does not currently support LDAP (Lightweight Directory Access Protocol) for user authentication.
The Spark history UI might display a 502 error while the history server is starting.
Workaround: If you encounter the 502 error, reload the Spark history UI page after waiting 1-5 seconds. This should allow enough time for the server to become operational.
Cross catalog schema creation anomaly in Presto.
An anomaly exists in schema creation for Hive and Iceberg catalogs managed by Presto. When using a common Hive Metastore Service for multiple catalogs (for example, an Iceberg catalog and a Hive catalog, or two Iceberg or Hive catalogs), creating a schema in one catalog might create it in a wrong catalog. This occurs if the location specified during schema creation belongs to a different catalog than intended.
Workaround: You must always explicitly provide the correct storage path associated with the target catalog when using CREATE SCHEMA statements in Presto. This ensures that the schema is created in the desired location.
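A minimal sketch of the workaround, with hypothetical catalog, schema, and bucket names; the location points to storage that belongs to the target catalog:
CREATE SCHEMA iceberg_data.sales WITH (location = 's3a://iceberg-bucket/sales/');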
Presto (Java) queries with many columns and size exceeding default limit.
Presto (Java) queries involving multiple tables with a large number of columns (for example, 1000 columns per table or more) in the SELECT clause might encounter performance issues across all deployment environments.
The iterative optimizer times out when max_reorder_joins is set to 5 or higher (the default timeout is 3 minutes) and gives the following error:
The optimizer exhausted the time limit of 180000 ms
For queries exceeding the default max-task-update-size limit (16MB in Presto (Java)), you might observe a TaskUpdate size exceeding this limit error (the specific value of the limit depends on the actual query).
Workaround:
- You can improve query performance by temporarily disabling the reorder_joins rule using the following session property:
set session reorder_joins = false;
- Increase the max-task-update-size value in the config.properties file if the issue involves a TaskUpdate size exceeding the limit error, and restart Presto (Java). Example:
experimental.internal-communication.max-task-update-size=64MB
Limitation: Transactions not supported in unlogged Informix databases.
In watsonx.data, when attempting to execute queries with transactional implications on unlogged Informix databases, queries will fail. This is because unlogged Informix databases, by design, do not support transactions.
Limitation: Netezza Performance Server INSERT statement limitation.
Netezza Performance Server currently does not support inserting multiple rows directly into a table by using the VALUES clause. This functionality is limited to single-row insertions. Refer to the official Netezza Performance Server documentation for details on the INSERT statement.
The following example using VALUES for multiple rows is not supported:
INSERT INTO EMPLOYEE VALUES (3,'Roy',45,'IT','CityB'),(2,'Joe',45,'IT','CityC');
Workaround: Use a subquery with SELECT and UNION ALL to construct a temporary result set and insert it into the target table.
INSERT INTO EMPLOYEE SELECT * FROM (SELECT 4,'Steve',35,'FIN','CityC' UNION ALL SELECT 5,'Paul',37,'OP','CityA') AS temp;
Issue: Milvus unresponsive to queries.
Milvus may not respond to queries when attempting to load collections or partitions that exceed available memory capacity. This occurs because all search and query operations within Milvus are executed in memory, requiring the entire collection or partition to be loaded before querying.
Workaround:
- Consider the memory limitations of your Milvus deployment and avoid loading excessively large collections or partitions.
- If Milvus becomes unresponsive to queries, use the appropriate Milvus API to unload or release some collections from memory. An example using the Python SDK:
collection.release()
Issue: Inaccurate row count after deletions in Milvus.
The collection.num_entities
property might not reflect the actual number of rows in a Milvus collection after deletion operations. This property provides an estimate and may not account for deleted entities.
To get an accurate count of rows, execute a count(*)
query on the collection. This provides an accurate count even after deletions.
Pymilvus syntax:
collection = pymilvus.Collection(...)
collection.query(expr='', output_fields=['count(*)'])
Limitations: Unsupported Db2 operations.
watsonx.data currently does not support the ALTER TABLE DROP COLUMN operation for Db2 column-organized tables.
By default, Db2 instances create tables in column-organized format.
watsonx.data does not support creating row-organized tables in Db2.
Limitations: Handling Null Values in Elasticsearch.
Elasticsearch connector requires explicit definition of index mappings for fields to handle null values when loading data.
Limitations: Loading Nested JSON with Elasticsearch.
Elasticsearch connector requires users to explicitly specify nested JSON structures as arrays of type ROW for proper loading and querying. To process such structures, use the UNNEST operation.
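For example, if an index has a hypothetical field addresses mapped as an array of ROW(city VARCHAR, zip VARCHAR), the nested values can be flattened with UNNEST; the catalog, schema, and index names are illustrative:
SELECT e.name, a.city, a.zip
FROM elasticsearch.default.employees AS e
CROSS JOIN UNNEST(e.addresses) AS a (city, zip);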
Limitation: Users can create only 3 instances of the Milvus service for a single instance of watsonx.data in IBM Cloud.
Issue: Unable to create views in Presto.
Presto describes a view in a mapped database as a TABLE rather than a VIEW. This is apparent to JDBC programs connecting to the Presto engine.
Issue: Using special characters in schema, table, or column names.
It is recommended not to use special characters such as the question mark (?), hyphen (-), or asterisk (*) in table, column, and schema names. Though these special characters are supported and tables, columns, and schemas can be created with them, using these special characters might cause issues when running the INSERT command.
Issue: User is not removed from the catalog access control on revoking data access.
When you grant access to a user by adding them to the data control policies on the Access Control screen, the user is successfully listed against the catalog. On revoking access from the Access Control page, the user stays listed against the catalog and continues to have access.
Issue: Unable to view expected catalogs from Presto (Java).
Users with administrator privileges are unable to view the expected Hive and PostgreSQL catalogs from Presto (Java).
Issue: Console UI lists invalid users.
watsonx.data user (user1) invites a new user (user2) to the account by using the Manage access and users screen (Manage > Access (IAM) > Manage access and users) and grants access to a role (MetastoreAccess, Viewer, Operator, Editor, Administrator). User2 gets access to resources in the watsonx.data instance through user1's account. Additionally, user2 is granted data access at the resource level by adding to the data control policies by using the Access Control screen. When user1 removes user2 from the user1's account, user2 is still listed in the Access Control tab at resource level.
Issue: Unable to view created schema.
When a user with the User role and only Create access is added to an external database, they cannot see the schemas that they create. Though the user can create schemas, they cannot view them. The following is the system response:
presto:default> show schemas;
Schema
--------
(0 rows)
Workaround: Provide select privilege for the schema the user created.
Issue: Access denied when querying an external database.
When a user with the User role and only Create access is added to an external database, they cannot run a SELECT query on the table they have created. Though the user can connect to the Presto (Java) engine and create tables and schemas, they cannot query the table. The system displays an Access Denied message.
Query 20230608_132213_00042_wpmk2 failed: Access Denied: Cannot select from columns [id] in table or view tab_appiduser_01
Workaround: Provide select privilege for the table the user created.
Issue: Schema created under different catalog.
Schemas are available across Iceberg and Hive catalogs. When a schema is created under Iceberg catalog, it is listed under Hive catalog and vice versa.
Issue: Presto (Java) does not support deletion of Iceberg tables.
Issue: DROP SCHEMA in Db2.
In Db2, the schema can be dropped only if it is empty. Initiating a DROP SCHEMA statement against a non-empty schema might result in Db2 SQL Error SQLCODE=-478 and SQLSTATE=42893.
Issue: CREATE VIEW statement that is partially supported by Db2.
The Db2 connector partially supports the CREATE VIEW statement. The Presto (Java) supported SQL syntax does not include creating views with custom column names (different than the table column names).
Issue: CREATE VIEW statement that is partially supported by NPSaaS.
The NPSaaS connector partially supports the CREATE VIEW statement. The Presto (Java) supported SQL syntax does not include creating views with custom column names (different than the table column names).
Issue: Presto (Java) does not recognize the path as a directory.
When you create a new table with a Presto (Java) Hive connector that uses an S3 folder from an external location, Presto (Java) does not recognize the path as a directory and an error might occur.
For example, when creating a customer table in the target directory DBCERT/tbint in a bucket that is called dqmdbcertpq by using the IBM Cloud UX and Aspera S3 console, the following error is encountered: External location must be a directory.
CREATE TABLE "hive-beta"."dbcert"."tbint" (
RNUM int , CBINT bigint
) WITH (
format='PARQUET', external_location = 's3a://dqmdbcertpq/DBCERT/tbint'
);
Query 20230509_113537_00355_cn58z failed: External location must be a directory
Objects in a file system are stored as objects with their paths. The object and path must have associated metadata. If the path is not associated with the metadata, Presto (Java) fails to recognize the object and responds that the path is not a directory.
Issue: Assigning Grant or Revoke privilege.
Assigning Grant or Revoke privilege to a user through access policy does not work as expected in the following scenarios:
- User_A adds a bucket and a Hive catalog (for example, useracat02).
- User_A creates a schema and a table.
- User_B and User_C are assigned User roles to the catalog.
- User_A adds an allow grant policy to User_B.
- User_B connects to the catalog and runs grant select to User_C.
presto:default> grant select on useracat02.schema_test_01.tab_1 to "6ff74bf7-b71b-42f2-88d9-a98fdbaed304";
- When User_C connects to the catalog and runs a select command on the table, the command fails with an access denied message.
presto:default> select * from useracat02.schema_test_01.tab_1;
Query 20230612_073938_00132_hthnz failed: Access Denied: Cannot select from columns [name, id, salary, age] in table or view tab_1
Issue: Creating schema without a location.
When you create a schema without a location, it is not listed in the schema list of any catalog. For example, if you create a schema without specifying the location of the bucket, the schema is created in HMS and not in the bucket. When you try to create a new schema with the same name, it fails and responds that the schema already exists.
Workaround: Specify the location of the bucket when creating a schema.
Issue: Unique names for schema and bucket.
A schema and a bucket cannot be created with the same name. For example, if you create a schema that is named “sales” in one catalog, the same name cannot be used for another schema in another catalog. Similarly, if you register a bucket with the name “salesbucket”, another bucket with the same name cannot be registered, even if the bucket is located in a different object store.
Workaround: Use unique names when creating schemas and buckets.
Issue: Creating schema for target table.
You must create schema for the target table if the schema does not exist.
Issue: Ingestion fails if CSV file contains bad record.
The ibm-lh tool does not support skipping the maximum number of bad records for CSV files if the number of mismatched fields is greater than the table definition.
Issue: Creating schema location with path.
Use one of the following location options when creating a schema:
- Location pointing to a bucket/subpath without a trailing /.
- Location pointing to a bucket/subpath with a trailing / – recommended for better structuring.
Though you can use a location pointing to a bucket only (with or without a trailing /), it might lead to failure. Therefore, it is recommended to use a subpath.
Issue: Presto (Java) does not support AS OF with Iceberg tables.
Presto (Java) does not support the AS OF <time stamp> command in a SELECT query.
Workaround: Invoke CALL iceberg_data_rollback_to_snapshot to move to the required timestamp.
If you use CALL iceberg_data_rollback_to_snapshot with a timestamp, you cannot call the stored procedure to move to a later timestamp. Use Spark SQL as an alternative.
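A minimal sketch of the rollback call, assuming the standard Iceberg connector procedure form and a catalog named iceberg_data; the schema, table, and snapshot ID are hypothetical:
CALL iceberg_data.system.rollback_to_snapshot('sales', 'orders', 8954597067493422955);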
Issue: Only the creator has DROP access on the table in Apache Hive (API).
Only the creator of a table can drop the table that is created in the Apache Hive catalog. Other users cannot drop the table even if they have explicit DROP access to the table. They get the Access Denied message.
Issue: User-provided certificates are not supported by watsonx.data.
Currently, user-provided certificates are not supported in watsonx.data when adding database connections, object store buckets, or when using ibm-lh utility.
Issue: No columns to parse from file error.
When you try to ingest a folder from AWS S3 by using the ibm-lh tool, the following error might be encountered if there are no empty files in the folder:
No columns to parse from file
Workaround: First list the folders inside the bucket by using the aws s3 ls command. If no empty files are listed, copy all the files to another folder by using the aws s3 cp command.
Special characters in target table names can cause ingestion failures.
Ingestion fails if a target table name contains special characters when ingesting through the web console.
Workaround: You can ingest data by using ingestion through the Spark CLI.
Limitation: Presto (Java) does not support the VARBINARY datatype.
The current version of Presto (Java) does not support binary strings with length. Execution of an ALTER TABLE statement on a database results in the following error:
Unknown type 'varbinary(n)' for column 'testcolumn'
This is a limitation in Presto and not a limitation in watsonx.data.
Limitation: Back up your data to prevent data loss while working with VS Code development environment - Spark Labs.
As Spark labs are ephemeral in nature, you must back up the stored data periodically to prevent potential data loss during upgrades or a Spark master crash.