Installing a custom Cloud Pak for Data connector

After you compile and package your custom connector, install it on your Discovery instance.

IBM Cloud Pak for Data only

This information applies only to installed deployments.

Discovery provides a script named manage_custom_crawler.sh for installing and uninstalling custom connectors. The script is in the scripts directory of the expanded custom-crawler-docs.zip file, as described in Understanding the custom-crawler-docs.zip file.

Installing a connector

To install your custom connector on your Discovery instance, complete the following steps:

  1. Ensure that you have completed all steps to create a custom connector up to and including the steps listed in Compiling and packaging the example connector.

  2. Run the following command from the directory on your local machine where you created and compiled your custom connector:

    bash scripts/manage_custom_crawler.sh --endpoint {endpoint} --token {access token} deploy -n {crawler name} -f {built_connector_zip_file}
    

    where you specify values for the following variables:

    • endpoint: URL for your service instance. You can get this value from the Access information section of the service instance overview page in the IBM Cloud Pak for Data administrative console.
    • access token: Bearer token that is required to access the endpoint. You can get this value from the same page as the endpoint.
    • crawler name: (Optional) Name that you specified for the crawler.
    • built_connector_zip_file: Name of the file that you created in Compiling and packaging the example connector.

    For example:

    bash scripts/manage_custom_crawler.sh --endpoint https://mycpd.wd40.example.com/discovery/zen40-wd/instances/1638165624521059/api --token eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCIsImtpZCI6ImVKcV9HY29NcHF5WUFJcVByZ0x0cERRZDNQcmRiTWo5TGg0X09WOEU4MlkifP.eyJ1aXQiOiIxMDAwMzMwOTk5IiwidXNlcm5hbWUiOiJhZG1pbiIsInJvbGUiOiJBZG1pbiIsInBlcm1pc3Npb25zIjpbImFkbWluaXN0cmF0b3IiLCJjYW5fcHJvdmlzaW9uIl0sImdyb3VwcyI6WzEwMDAwXSwic3ViIjoiYWRtaW4iLCJpc3MiOiJLTk9YU1NPIiwiYXVkIjoiRFNYIiwiaWF0IjoxNjQyNzAyMDA3fQ.5oymGw7pi6tAbTMW9rcdb62G95teR2-tKyznA_wjk_G698fbx1Zl73KZKyEWcTKtyX7IJ1Px5DPdophcqS9i3bPJowHy-ioVp6DML02mscZImhvZPra-e6gwUdhSB64KArmMClo1-kZG20EclNh6-oxR447Bjdsgp7IYpkmynmw0K6vPIqmzwEhr9gAK1vWLOoVd4EoiYNuxZaSFL5byJ0mnQxXzM14w3lKQHZ91WYVKc4JnuJiSVsdpGqVz1JNFmT8D9FBqJQ4uxtshnii0f1Yh-USKCbJmMPXicU8cDtJIfheBejwenfvejUTz5rgZgymYWrGvw3G2oOx_L1Yg-Q deploy -n awesome_crawler -f awesome_crawler.zip
    

    Instead of specifying an access token for authentication, you can specify username and password parameters. For more information, see Understanding the manage_custom_crawler.sh script.
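
    As a rough sketch of the username and password alternative, the following helper assembles a deploy command that uses the --user option from the script's help text in place of --token. It assumes that --user goes in the same position as --token. The function name deploy_with_user, the DRY_RUN variable, and all values passed to it are hypothetical placeholders. By default the command is only printed, not run.

```shell
#!/usr/bin/env bash
# Sketch: deploy with --user instead of --token.
# DRY_RUN defaults to 1, so the command is printed rather than executed.

deploy_with_user() {
  # Arguments: endpoint, user name, crawler name, connector ZIP file.
  # --password is omitted on purpose; per the script's help text, the
  # command line prompts for it when it is not specified.
  set -- bash scripts/manage_custom_crawler.sh \
    --endpoint "$1" --user "$2" deploy -n "$3" -f "$4"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Placeholder values (assumptions), not a real cluster.
deploy_with_user \
  "https://mycpd.example.com/discovery/zen-wd/instances/1234567890/api" \
  "admin" "awesome_crawler" "awesome_crawler.zip"
```

    Setting DRY_RUN=0 before the call runs the assembled command instead of printing it.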

When the custom crawler is deployed, a resource ID is assigned to the connector.

Verifying an installed connector

To verify that the connector was deployed to the Discovery instance, log in to the Discovery tooling and check that your connector is displayed as an option on the Configure collection page.

Using an installed connector on Discovery

To use the installed custom connector, follow the steps listed in Creating a collection. The custom connector appears in the list of connectors provided at Configuring Cloud Pak for Data data sources. For more information, see Using a custom connector with the Discovery tooling.

Uninstalling a connector

To uninstall a custom connector from a Discovery instance, complete the following steps:

  1. Optional: If you don't know the resource ID, run the following command to list the custom connectors. The resource IDs of the connectors are returned.

    scripts/manage_custom_crawler.sh --endpoint {endpoint} --token {token} list
    
  2. To uninstall the connector, run the following command from the directory where you extracted your custom connector ZIP file:

    scripts/manage_custom_crawler.sh --endpoint {endpoint} --token {token} undeploy --id {crawler_resource_id}
    

    where {crawler_resource_id} is the ID that is generated for the crawler when it is deployed.


Instead of specifying an access token for authentication, you can specify username and password parameters. For more information, see Understanding the manage_custom_crawler.sh script.
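
The uninstall step can be wrapped in a small helper that refuses to run without a resource ID. This is a minimal sketch, not part of the product: the function name undeploy_crawler, the DRY_RUN variable, and the sample values are hypothetical, and the resource ID shown is a placeholder rather than a real ID format. By default the command is only printed.

```shell
#!/usr/bin/env bash
# Sketch: guard the undeploy call against a missing resource ID.
# DRY_RUN defaults to 1, so the command is printed rather than executed.

undeploy_crawler() {
  # Arguments: endpoint, access token, crawler resource ID.
  if [ -z "${3:-}" ]; then
    echo "Missing resource ID; run the list subcommand to look it up." >&2
    return 1
  fi
  set -- bash scripts/manage_custom_crawler.sh \
    --endpoint "$1" --token "$2" undeploy --id "$3"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Placeholder values (assumptions); take the real ID from the list output.
undeploy_crawler \
  "https://mycpd.example.com/discovery/zen-wd/instances/1234567890/api" \
  "example-token" "example-resource-id"
```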

Understanding the manage_custom_crawler.sh script

The manage_custom_crawler.sh script has the following internal documentation:

Watson Discovery Custom Crawler Manager

This script will help you deploy, manage, and undeploy your custom crawler for
Watson Discovery.

Subcommands:
  deploy        Add a new Custom Crawler to your Watson Discovery instance.
  undeploy      Undeploy your Custom Crawler by name.
  list          List all Custom Crawlers for your Watson Discovery instance.

Options:
  -e --endpoint         The endpoint URL for your cluster and add-on service instance
                        (`https://{cpd_cluster_host}:{port}/discovery/{release}/instances/{instance_id}/api`)
  -t --token            The authorization token of your Cloud Pak instance
  -u --user             The user name of your Cloud Pak instance
  -p --password         The user password of your Cloud Pak instance
                        If the password is not specified, the command line prompts to input
  -n --name             The name of the custom crawler to upload (deploy only)
  -f --file             The path of the custom crawler package to upload (deploy only)
  -i --id               The crawler_resource_id value to delete the custom crawler (undeploy only)
  --help                Show this message.
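
The endpoint template in the help text can be filled in with a short helper. The following sketch assumes hypothetical function names (build_endpoint, looks_like_endpoint) and placeholder values for the host, port, release, and instance ID. The shape check is only a loose sanity test based on the template. It also accepts URLs without a port, such as the one in the installation example.

```shell
#!/usr/bin/env bash
# Sketch: assemble and sanity-check an endpoint URL from the template
# https://{cpd_cluster_host}:{port}/discovery/{release}/instances/{instance_id}/api

build_endpoint() {
  # Arguments: cluster host, port, release, instance ID.
  printf 'https://%s:%s/discovery/%s/instances/%s/api\n' "$1" "$2" "$3" "$4"
}

looks_like_endpoint() {
  # Loose shape check only; it does not contact the cluster.
  case "$1" in
    https://*/discovery/*/instances/*/api) return 0 ;;
    *) return 1 ;;
  esac
}

# Placeholder values (assumptions), not a real cluster.
endpoint=$(build_endpoint "mycpd.example.com" "443" "zen-wd" "1638165624521059")
if looks_like_endpoint "$endpoint"; then
  printf '%s\n' "$endpoint"
fi
```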

4.0.5 and earlier releases only

Installing a connector in 4.0.5 and earlier releases

To install your custom connector on your Discovery instance, complete the following steps:

  1. Ensure that you have completed all steps to create a custom connector up to and including the steps listed in Compiling and packaging the example connector.

  2. Run the following command from the directory on your local machine where you created and compiled your custom connector:

    bash scripts/manage_custom_crawler.sh deploy -z {built_connector_zip_file}
    

    where {built_connector_zip_file} is the name of the file you packaged in Compiling and packaging the example connector.

    If your Discovery instance is running on Red Hat OpenShift, specify the -o or --openshift parameter with the script.

    For example:

    bash scripts/manage_custom_crawler.sh deploy -z myCrawler.zip -o true
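
    If the same deploy step is used on both OpenShift and non-OpenShift clusters, the -o parameter can be added conditionally. The following is a minimal sketch: the function name legacy_deploy and the DRY_RUN variable are hypothetical, and the command is only printed unless DRY_RUN is set to 0.

```shell
#!/usr/bin/env bash
# Sketch: 4.0.5-and-earlier deploy with an optional OpenShift flag.
# DRY_RUN defaults to 1, so the command is printed rather than executed.

legacy_deploy() {
  # Arguments: connector ZIP file, "true" if the cluster is OpenShift.
  if [ "${2:-false}" = "true" ]; then
    set -- bash scripts/manage_custom_crawler.sh deploy -z "$1" -o true
  else
    set -- bash scripts/manage_custom_crawler.sh deploy -z "$1"
  fi
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Same values as the example above.
legacy_deploy "myCrawler.zip" "true"
```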
    

Uninstalling a connector in 4.0.5 and earlier releases

To uninstall a custom connector from a Discovery instance, run the following command at the root of the unzipped custom-crawler-docs.zip directory:

bash scripts/manage_custom_crawler.sh undeploy -n {built_connector_name}

where {built_connector_name} is the name, not the zip file, of the installed connector.

If your IBM Watson® Discovery instance is running on Red Hat OpenShift, specify the -o or --openshift parameter with the script.

bash scripts/manage_custom_crawler.sh undeploy -n {built_connector_name} -o true
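
Because the undeploy subcommand in these releases expects the connector name rather than the zip file, a wrapper can catch the most likely mix-up before the script is called. This is a sketch under stated assumptions: the function name legacy_undeploy, the DRY_RUN variable, and the connector name myCrawler are hypothetical, and rejecting values that end in .zip is a heuristic, not a rule enforced by the script.

```shell
#!/usr/bin/env bash
# Sketch: 4.0.5-and-earlier undeploy that rejects a ZIP file name.
# DRY_RUN defaults to 1, so the command is printed rather than executed.

legacy_undeploy() {
  # Arguments: connector name, "true" if the cluster is OpenShift.
  case "${1:-}" in
    "")
      echo "Missing connector name." >&2
      return 1 ;;
    *.zip)
      echo "Expected the connector name, not the ZIP file: $1" >&2
      return 1 ;;
  esac
  if [ "${2:-false}" = "true" ]; then
    set -- bash scripts/manage_custom_crawler.sh undeploy -n "$1" -o true
  else
    set -- bash scripts/manage_custom_crawler.sh undeploy -n "$1"
  fi
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Placeholder connector name (assumption).
legacy_undeploy "myCrawler" "true"
```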

Understanding the manage_custom_crawler.sh script in 4.0.5 and earlier releases

The manage_custom_crawler.sh script has the following internal documentation:

Usage: ${BASH_SOURCE[0]} [--pathToZip PATH] [--properties PROPERTIES] [--xml XML]

Watson Discovery Custom Crawler Manager

This script will help you deploy, manage, and undeploy your custom crawler for
Watson Discovery.

Subcommands:
  deploy        Add a new Custom Crawler to your Watson Discovery instance.
  properties    Generate the properties file for your crawler.
  undeploy      Undeploy your Custom Crawler by name.
  list          List all Custom Crawlers for your Watson Discovery instance.

Options:
  -d --discovery        The name of the Watson Discovery instance
  -z --zipfile          The path to the zip file to be uploaded.
                        For deploy only.
  -x --xml              The path to the XML file to be uploaded.
                        For deploy only.
  -n --name             The name of the Custom Crawler to undeploy.
  -m --messages         The path to the properties file, used when doing a two part deploy.
                        For properties only.
  -o --openshift        Set flag to true if this is an OpenShift Cluster
  --help                Show this message.