Build an Elasticsearch chatbot
Objectives
This tutorial shows how an IBM watsonx.ai model can be enhanced with knowledge gleaned by spidering content from your website to produce a chatbot that is capable of answering questions related to your knowledge base. This technique is known as Retrieval-Augmented Generation (RAG). Pre-trained large language models have good general knowledge because they are trained on a large corpus of public content, but they lack domain-specific knowledge about your business, such as:
- "Can I get a refund if the box is opened?"
- "What is the waiting list for treatment?"
- "Do you deliver on Saturdays?"
We can build a chatbot using the following IBM Cloud services:
- Databases for Elasticsearch runs the ELSER Natural Language Processing (NLP) model to enhance incoming data before it is stored in an Elasticsearch index. An ingest pipeline feeds each document through ELSER so that the enhanced version is what gets stored (see the pipeline sketch after this list).
- Elastic Enterprise Search is deployed on IBM Cloud® Code Engine and is used to spider your website to collect domain-specific data and feed it into Elasticsearch's ingest pipeline.
- Kibana is deployed on IBM Cloud Code Engine and becomes the web UI for Elasticsearch and Elastic Enterprise Search. It is used to configure and launch the web crawler.
- IBM watsonx.ai runs a pre-trained machine learning model to answer chatbot requests. The model's API is used to produce chatbot responses from the user prompt and the contextual data retrieved by running the prompt against the spidered and enhanced data in Elasticsearch.
- A simple Python app is deployed on IBM Cloud Code Engine to provide a chatbot web interface. It collects prompts from users, queries the Elasticsearch data and then uses IBM watsonx.ai to produce the response.
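The ingest pipeline mentioned above is created later in this tutorial through the Kibana UI, but for orientation, the sketch below shows roughly what an ELSER inference pipeline looks like when defined with the Elasticsearch Python client. The connection details, pipeline name, and field names are illustrative assumptions, not values produced by this tutorial's Terraform deployment.

```python
# A minimal sketch of an ELSER inference ingest pipeline, created with the
# official Elasticsearch Python client. The tutorial builds the equivalent
# pipeline through the Kibana UI, which stores the ELSER output under a
# differently named field; everything below is illustrative.
from elasticsearch import Elasticsearch

# Placeholder endpoint and credentials - substitute your own. Databases for
# Elasticsearch instances may also need their CA certificate for TLS.
es = Elasticsearch(
    "https://my-elasticsearch-host:31000",
    basic_auth=("admin", "MY_ELASTICSEARCH_PASSWORD"),
)

# An inference processor runs the crawled "title" field through ELSER and
# stores the resulting sparse-vector tokens alongside the original document.
es.ingest.put_pipeline(
    id="elser-title-expansion",  # illustrative pipeline name
    description="Expand page titles with ELSER before indexing",
    processors=[
        {
            "inference": {
                "model_id": ".elser_model_1",
                "target_field": "ml",
                "field_map": {"title": "text_field"},
                "inference_config": {"text_expansion": {"results_field": "tokens"}},
            }
        }
    ],
)
```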
Databases for Elasticsearch, IBM Cloud Code Engine, and IBM watsonx are paid products, so this tutorial incurs charges.
Prerequisites
- An IBM Cloud account.
- Terraform - to deploy infrastructure.
- A website containing public-facing text content that we will "spider" to bolster the chatbot's expertise.
- Docker running on your local machine.
Follow these steps to create an IBM Cloud API key that enables Terraform to provision infrastructure into your account. You can create up to 20 API keys.
For security reasons, the API key is only available to be copied or downloaded at the time of creation. If the API key is lost, you must create a new API key.
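If you prefer working from a terminal, the API key can also be created with the IBM Cloud CLI (assuming the CLI is installed and you are logged in); the key name and output file here are only examples:
ibmcloud iam api-key-create terraform-bot-key -d "API key for Terraform" --file terraform-bot-key.json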
Set up an IBM watsonx.ai project
Most of the infrastructure is deployed through Terraform, but IBM watsonx.ai must be set up manually.
IBM watsonx.ai is a studio of integrated tools for working with generative AI capabilities that are powered by foundation models, and for building machine learning applications. The IBM watsonx.ai component provides a secure and collaborative environment where you access your organization's trusted data, automate AI processes, and deliver AI in your applications.
Follow these steps to set up IBM watsonx.ai:
- Sign Up for IBM watsonx.ai as a Service and click on the Get Started link for watsonx.ai. Select a region and log in.
- Create a project within watsonx.ai. In the Projects box, click + and Create Project. Name your project and supply the IBM Cloud® Object Storage instance that will be used to store the project's state. (If you don't have an Object Storage instance, create one.)
- In the project's Manage tab, in the General page, make a note of the "Project ID".
- In the project's Manage tab, in the Services and Integrations page, click Associate Service. Then, click New Service and choose the Watson Machine Learning option. You will use the Lite plan, so just click Create.
Provision the infrastructure with Terraform
Clone the repo:
git clone https://github.com/IBM/icd-elastic-bot.git
cd icd-elastic-bot
cd terraform
In this directory, create a file called terraform.tfvars containing the following data, replacing the placeholder values (MY_*) with your own:
ibmcloud_api_key="MY_IBM_CLOUD_API_KEY"
region="eu-gb"
es_username="admin"
es_password="MY_ELASTICSEARCH_PASSWORD"
es_version="8.12"
wx_project_id="MY_WATSONX_PROJECT_ID"
Pick a secure Elasticsearch password; together with the Elasticsearch username, it forms the credentials required to access Elasticsearch and the Kibana web user interface.
Now deploy the infrastructure with:
terraform init
terraform apply --auto-approve
Terraform will output:
- the URL of your Kibana instance.
- the URL of the Python app.
Make a note of these values for the next steps.
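If you misplace these values, you can display them again at any time by running the following command from the terraform directory:
terraform output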
Feed the data
This step feeds your website's data into your Databases for Elasticsearch instance by using the Elastic web crawler, a feature accessed through Kibana that extracts data from any website.
Follow these steps to add your data:
- Navigate to your Kibana URL, which Terraform output in the previous section. Log in with the Elasticsearch username and password you chose.
- In the Kibana UI, in the Search section, choose the Overview option. Click on Crawl URL.
- Name your index search-bot (the search- prefix is already present). Click Create index.
- Add your website's URL in the Manage domain section of the index. Click Validate domain and then Add domain.
- Click Add Inference Pipeline in the Machine Learning Inference Pipeline section and follow the steps. Select .elser_model_1 for the trained ML Model and make sure the model is in a Started state. Select the title field in the Select field mappings step and click Add, then Continue, and Create pipeline.
- Click on Crawl all domains and then Crawl all domains on this index. Then, wait until the data is collected.
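Once the crawl finishes, you can optionally confirm that documents were indexed and enriched by fetching one of them. Below is a minimal sketch using the Elasticsearch Python client; the endpoint and credentials are placeholders, and the exact field that holds the ELSER output depends on how the inference pipeline was configured in Kibana.

```python
# Minimal sketch: inspect one crawled document in the search-bot index.
from elasticsearch import Elasticsearch

es = Elasticsearch(
    "https://my-elasticsearch-host:31000",          # placeholder endpoint
    basic_auth=("admin", "MY_ELASTICSEARCH_PASSWORD"),
)

resp = es.search(index="search-bot", size=1, query={"match_all": {}})
doc = resp["hits"]["hits"][0]["_source"]
print(doc.get("title"))         # a crawled page title
print(list(doc.get("ml", {})))  # ELSER output, if the inference pipeline ran
```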
Query the data
Navigate to the Python app's URL in your web browser - this can be found in the Terraform output from the previous steps as python_endpoint.
Start interacting with the model by asking questions, and you will receive answers generated by IBM watsonx.ai. The Python app takes your prompt and searches the Elasticsearch index populated with spidered and enhanced data from the web crawl. It then uses the IBM watsonx.ai API to produce a response from the context provided by the Elasticsearch result.
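The repository contains the full app, but the sketch below shows the general shape of that retrieve-then-generate flow, assuming an ELSER text_expansion query and the watsonx.ai text-generation REST endpoint. The hostnames, the expanded-field name, and the model choice are assumptions for illustration; consult the app's source code for the exact implementation.

```python
# Minimal RAG sketch: retrieve context from Elasticsearch with an ELSER
# text_expansion query, then ask watsonx.ai to answer using that context.
import requests
from elasticsearch import Elasticsearch

ES_URL = "https://my-elasticsearch-host:31000"   # placeholder endpoint
WX_URL = "https://eu-gb.ml.cloud.ibm.com"        # matches the eu-gb region used above
PROJECT_ID = "MY_WATSONX_PROJECT_ID"
API_KEY = "MY_IBM_CLOUD_API_KEY"

es = Elasticsearch(ES_URL, basic_auth=("admin", "MY_ELASTICSEARCH_PASSWORD"))

def retrieve_context(question: str) -> str:
    """Return the crawled passages that ELSER considers most relevant."""
    resp = es.search(
        index="search-bot",
        size=3,
        query={
            "text_expansion": {
                # Field written by the inference pipeline; the exact name
                # depends on the pipeline configuration chosen in Kibana.
                "ml.inference.title_expanded.predicted_value": {
                    "model_id": ".elser_model_1",
                    "model_text": question,
                }
            }
        },
    )
    return "\n".join(hit["_source"].get("title", "") for hit in resp["hits"]["hits"])

def iam_token(api_key: str) -> str:
    """Exchange an IBM Cloud API key for an IAM bearer token."""
    r = requests.post(
        "https://iam.cloud.ibm.com/identity/token",
        data={"grant_type": "urn:ibm:params:oauth:grant-type:apikey", "apikey": api_key},
    )
    return r.json()["access_token"]

def ask(question: str) -> str:
    """Send the prompt plus retrieved context to the watsonx.ai generation API."""
    context = retrieve_context(question)
    body = {
        "model_id": "ibm/granite-13b-chat-v2",   # illustrative model choice
        "project_id": PROJECT_ID,
        "input": f"Answer using only this context:\n{context}\n\nQuestion: {question}",
        "parameters": {"max_new_tokens": 200},
    }
    r = requests.post(
        f"{WX_URL}/ml/v1/text/generation?version=2023-05-29",
        json=body,
        headers={"Authorization": f"Bearer {iam_token(API_KEY)}"},
    )
    return r.json()["results"][0]["generated_text"]

print(ask("Do you deliver on Saturdays?"))
```

The prompt template and the number of retrieved passages are kept deliberately simple here; the deployed app remains the authoritative implementation.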
Conclusion
In this tutorial you created a Databases for Elasticsearch instance that is paired with Kibana and Elastic Enterprise Search hosted on IBM Cloud Code Engine. You then configured Elasticsearch to use the ELSER model in an ingest pipeline so that, when you spidered a website's contents, its JSON documents were augmented with sparse vector data. You also deployed a Python app that takes user prompts and queries the spidered data in Elasticsearch to gather domain-specific context before sending the prompt and the context to a watsonx.ai large language model to formulate a response.
Now you can create your own chat applications using watsonx.ai, enhanced with your own domain-specific data held in Databases for Elasticsearch.
Your Databases for Elasticsearch instance incurs charges. After you finish this tutorial, you can remove all the infrastructure by going to the terraform directory of the project and running:
terraform destroy