Developing and implementing a custom crawler plug-in
The crawler plug-in SDK includes an interface that is called com.ibm.es.ama.plugin.CrawlerPlugin. This initialization interface defines the methods that you use when you work with your crawler plug-in.
IBM Cloud Pak for Data and IBM Software Hub: This information applies only to installed deployments.
Interfaces and Javadoc
The interface library is stored in the lib/ama-zing-crawler-plugin-${build-version}.jar file in the SDK directory. The Javadoc for the JAR file is available in the lib/ama-zing-crawler-plugin-${build-version}-javadoc.jar file in the same directory.
Initialization interface
Use the com.ibm.es.ama.plugin.CrawlerPlugin
interface to manage the crawler
plug-in. The interface has the following methods:
| Method | Description |
|---|---|
| `init` | Start a crawler plug-in |
| `term` | Stop a crawler plug-in |
| `updateDocument` | Update crawled documents |
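To illustrate the life cycle that these three methods define, the following sketch shows a plug-in that adds a metadata field to each crawled document. The interface and method signatures here are simplified, illustrative assumptions (the real signatures are documented in the Javadoc JAR noted above), and the `SampleMetadataPlugin` class and `custom_tag` field are hypothetical names.

```java
import java.util.Map;

// Hypothetical, simplified shape of the CrawlerPlugin life cycle.
// The real com.ibm.es.ama.plugin.CrawlerPlugin signatures differ;
// check the Javadoc JAR in the SDK's lib directory.
interface SimpleCrawlerPlugin {
    void init(Map<String, Object> settings);  // called once when the crawler starts
    void term();                              // called once when the crawler stops
    void updateDocument(Map<String, Object> document); // called per crawled document
}

class SampleMetadataPlugin implements SimpleCrawlerPlugin {
    private boolean initialized;

    @Override
    public void init(Map<String, Object> settings) {
        // Acquire any resources (connections, caches) once per crawl here.
        initialized = true;
    }

    @Override
    public void term() {
        // Release resources when the crawler stops.
        initialized = false;
    }

    @Override
    public void updateDocument(Map<String, Object> document) {
        // Add or overwrite metadata on each crawled document.
        if (initialized) {
            document.put("custom_tag", "reviewed");
        }
    }
}
```

The split between `init` and `updateDocument` matters because `updateDocument` runs once per document: expensive setup belongs in `init` so it happens only once per crawl.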
Dependency management
The build.gradle file manages the Java dependencies.
Crawler plug-in sample
A sample crawler
plug-in is available that illustrates how to add, update, and delete metadata. The plug-in example also updates and deletes documents that are crawled by the local file system connector. The Java source code file
is named src/main/java/com/ibm/es/ama/plugin/sample/SampleCrawlerPlugin.java
.
Logging messages
The custom crawler
plug-in supports the java.util.logging.Logger
package for logging messages.
Any log messages that you add must meet the following requirements:
- The log level must be INFO or higher.
- The logger name must start with com.ibm.es.ama.
Messages are written to the log file of the crawler
pod where the plug-in is running. A logging sample is available in the crawler
plug-in sample.