Developing and implementing a custom crawler plug-in
The crawler plug-in SDK includes an interface that is called com.ibm.es.ama.plugin.CrawlerPlugin. This initialization interface defines the methods that you use when you work with your crawler plug-in.
IBM Cloud Pak for Data and IBM Software Hub: This information applies only to installed deployments.
Interfaces and Javadoc
The interface library is stored in the lib/ama-zing-crawler-plugin-${build-version}.jar file in the SDK directory. The Javadoc for the JAR file is available in the lib/ama-zing-crawler-plugin-${build-version}-javadoc.jar file in the same directory.
Initialization interface
Use the com.ibm.es.ama.plugin.CrawlerPlugin
interface to manage the crawler
plug-in. The interface has the following methods:
| Method | Description |
|---|---|
| `init` | Start a crawler plug-in |
| `term` | Stop a crawler plug-in |
| `updateDocument` | Update crawled documents |
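To illustrate the life cycle that these three methods define, the following sketch shows a plug-in that adds a metadata field to each crawled document. The interface and method signatures here are simplified, illustrative assumptions (the real signatures are documented in the Javadoc JAR noted above), and the `SampleMetadataPlugin` class and `custom_tag` field are hypothetical names.

```java
import java.util.Map;

// Hypothetical, simplified shape of the CrawlerPlugin life cycle.
// The real com.ibm.es.ama.plugin.CrawlerPlugin signatures differ;
// check the Javadoc JAR in the SDK's lib directory.
interface SimpleCrawlerPlugin {
    void init(Map<String, Object> settings);  // called once when the crawler starts
    void term();                              // called once when the crawler stops
    void updateDocument(Map<String, Object> document); // called per crawled document
}

class SampleMetadataPlugin implements SimpleCrawlerPlugin {
    private boolean initialized;

    @Override
    public void init(Map<String, Object> settings) {
        // Acquire any resources (connections, caches) once per crawl here.
        initialized = true;
    }

    @Override
    public void term() {
        // Release resources when the crawler stops.
        initialized = false;
    }

    @Override
    public void updateDocument(Map<String, Object> document) {
        // Add or overwrite metadata on each crawled document.
        if (initialized) {
            document.put("custom_tag", "reviewed");
        }
    }
}
```

The split between `init` and `updateDocument` matters because `updateDocument` runs once per document: expensive setup belongs in `init` so it happens only once per crawl.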
Dependency management
The build.gradle file manages the Java dependencies.
Crawler plug-in sample
A sample crawler
plug-in is available that illustrates how to add, update, and delete metadata. The plug-in example also updates and deletes documents that are crawled by the local file system connector. The Java source code file
is named src/main/java/com/ibm/es/ama/plugin/sample/SampleCrawlerPlugin.java
.
Logging messages
The custom crawler
plug-in supports the java.util.logging.Logger
package for logging messages.
Any log messages that you add must meet the following requirements:
- The log level must be INFO or higher.
- The logger name must start with com.ibm.es.ama.
Messages are written to the log file of the crawler
pod where the plug-in is running. A logging sample is available in the crawler
plug-in sample.