Ingesting data from databases
You can ingest data from connected databases into IBM® watsonx.data by using the Spark ingestion UI. This flow enables you to browse and select tables from your connected database and ingest them into your data lakehouse.
Before you begin
- Review the prerequisites for using the Spark ingestion UI.
- The database must be connected to watsonx.data before it appears in the list. Contact your administrator if the database you need is not available.
- To ensure that the ingested data is accurate and consistent with the source data, all input schemas must be identical to or compatible with the target table schema.
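The schema requirement above can be checked before you start the ingestion flow. The following is a minimal sketch in Python, not part of watsonx.data: the function name `schema_mismatches`, the `WIDENING` table, and the type names are all hypothetical, and the compatibility rules that watsonx.data actually applies might differ. It treats a source column as compatible when the target has a column of the same name with the same type or a wider type.

```python
# Hypothetical widening rules for illustration only; the rules that
# watsonx.data applies during ingestion might differ.
WIDENING = {
    "int": {"int", "bigint"},
    "bigint": {"bigint"},
    "float": {"float", "double"},
    "double": {"double"},
    "varchar": {"varchar"},
    "date": {"date", "timestamp"},
    "timestamp": {"timestamp"},
}


def schema_mismatches(source, target):
    """Compare two {column_name: type_name} mappings.

    Returns a list of problem descriptions. An empty list means every
    source column has a same-named target column with an allowed type.
    """
    problems = []
    for column, src_type in source.items():
        if column not in target:
            problems.append(f"column '{column}' is missing from the target")
            continue
        tgt_type = target[column]
        # Unknown source types are only compatible with themselves.
        allowed = WIDENING.get(src_type, {src_type})
        if tgt_type not in allowed:
            problems.append(
                f"column '{column}': {src_type} cannot be stored as {tgt_type}"
            )
    return problems


if __name__ == "__main__":
    source = {"id": "int", "name": "varchar", "created": "date"}
    target = {"id": "bigint", "name": "varchar", "created": "timestamp"}
    for problem in schema_mismatches(source, target):
        print(problem)
```

In this sketch, extra columns in the target are ignored; whether that is acceptable depends on how the target table handles missing values.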
Procedure
- Log in to the watsonx.data console.
- From the navigation menu, select Data manager.
- Click Ingest data.
- Select Databases as the ingestion flow.
- From the Select database dropdown, choose the connected database you want to ingest from.
- If you need to add a new database connection, click Add next to the database selector.
- After you select a database, the available schemas are displayed in the Schemas panel.
- In the Database panel on the left:
- The Schemas section displays the number of schemas available.
- Use the Find schema search box to filter schemas by name.
- If no schemas are displayed, the message "No schemas in this database" appears, with guidance to try selecting a different database.
- In the Browse Table panel in the center:
- After you select a schema, the Selected table and Schema information is displayed at the top.
- Use the Find table search box to filter tables by name.
- When no schema is selected, the message "No schema selected" appears with guidance: "Files within the selected schema will appear here. Try selecting a schema."
- Click the tables that you want to ingest to select them.
- Configure the target table settings. See Configuring target table settings in the parent topic.
- Configure the job details. See Configuring job details in the parent topic.
- Click Done to submit the ingestion job, or Cancel to discard the configuration.
Results
After the ingestion job completes successfully, the data from the source database tables is loaded into the target table in watsonx.data. The source database tables remain unchanged.