Advanced analysis and log-related tasks

Documentation for the classic Watson Assistant experience has moved. For the most up-to-date version, see Advanced analysis and log-related tasks.

Learn about APIs and other tools you can use to access and analyze log data.

Using Jupyter notebooks for analysis

IBM created Jupyter notebooks that you can use to analyze the behavior of your assistant. A Jupyter notebook is a web-based environment for interactive computing. You can run small pieces of code that process your data, and you can immediately view the results of your computation.

You can use the notebooks with English-language skills only.

Analysis notebooks

There is a set of analysis notebooks that you can use with standard Python tools and a set that is designed for optimal use with IBM Watson® Studio.

Watson Studio is a product that provides an environment in which you can pick and choose the tools you need to analyze and visualize data, to cleanse and shape data, to ingest streaming data, or to create, train, and deploy machine learning models. See the product documentation for more details.

The Watson Assistant Continuous Improvement Best Practices Guide describes how to get the most out of these notebooks.

Using the notebooks with Watson Studio

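The following notebooks are available:

  • Dialog skill analysis for Watson Assistant
  • Measure Watson Assistant Performance
  • Analyze Watson Assistant Effectiveness
  • Dialog Flow Analysis for Watson Assistant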

If you choose to use the notebooks that are designed for use with Watson Studio, the steps are roughly these:

  1. Create a Watson Studio account, create a project, and add a Cloud Object Storage account to it.

  2. From the Watson Studio community, choose a notebook.

    Early in the development process, use the Dialog skill analysis for Watson Assistant notebook to help you get started. It offers the following types of insights:

    • Examines the terms that are correlated with each intent in your training data to find anomalies that might identify problems that you can investigate further.
    • Uses a blind test set that you provide to calculate performance on statistical metrics such as accuracy, precision, recall, and F1.
    • Offers advanced features that you can use to find the causes of common issues such as why some sentences are often misidentified.

    To learn more about how this notebook can help you improve your dialog, read this Medium.com blog post.

  3. After you deploy a version of the assistant, and have some conversation log data collected, run the Measure Watson Assistant Performance notebook.

  4. Follow the step-by-step instructions provided with the notebooks to analyze a subset of the dialog exchanges from the logs.

    Run the following notebook first:

    • Measure: Gathers metrics that focus on coverage (how often the assistant is confident enough to respond to users) and effectiveness (when the assistant does respond, whether the responses are satisfying user needs).

    The insights are visualized in ways that make it easier to understand areas for improvement in your assistant.

  5. Export a sample set of the logs from ineffective conversations, and then analyze and annotate them.

    For example, indicate whether a response is correct. If it is correct, mark whether it is helpful. If it is incorrect, identify the root cause: for example, the wrong intent or entity was detected, or the wrong dialog node was triggered. After you identify the root cause, indicate what the correct choice would have been.

  6. Feed the annotated spreadsheet to the Analyze Watson Assistant Effectiveness notebook.

    • Effectiveness: Performs a deeper analysis of your logs to help you understand the steps you can take to improve your assistant.

  7. Use the Dialog Flow Analysis for Watson Assistant notebook to review your dialog. The notebook can help you pinpoint the dialog nodes where customers most frequently abandon the conversation.

    For more information about how this notebook can help you analyze and assess abandonment, read this Medium.com blog post.

This process helps you to understand the steps you can take to improve your assistant.
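The coverage and effectiveness metrics described in step 4 can be illustrated with a minimal sketch. This is not the Measure notebook's implementation; the "covered" and "helpful" keys and the sample data are hypothetical, chosen only to show how the two ratios relate.

```python
def coverage_and_effectiveness(exchanges):
    """Return (coverage, effectiveness) for a list of annotated exchanges.

    Each exchange is a dict with two hypothetical boolean keys:
      "covered": the assistant was confident enough to respond
      "helpful": a reviewer judged that the response met the user's need
    """
    total = len(exchanges)
    covered = [e for e in exchanges if e["covered"]]
    # Coverage: share of all exchanges that the assistant attempted to answer.
    coverage = len(covered) / total if total else 0.0
    # Effectiveness: share of the answered exchanges that were judged helpful.
    helpful = sum(1 for e in covered if e["helpful"])
    effectiveness = helpful / len(covered) if covered else 0.0
    return coverage, effectiveness


# Invented sample annotations, for illustration only.
sample = [
    {"covered": True, "helpful": True},
    {"covered": True, "helpful": False},
    {"covered": True, "helpful": True},
    {"covered": False, "helpful": False},
]
coverage, effectiveness = coverage_and_effectiveness(sample)
print(f"coverage={coverage:.2f} effectiveness={effectiveness:.2f}")
```

Note that effectiveness is computed only over the exchanges that the assistant answered, so a low-coverage assistant can still score well on effectiveness.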

Using the notebooks with standard Python tools

If you choose to use standard Python tools to run the notebooks, you can get the notebooks from GitHub.

Again, the Watson Assistant Continuous Improvement Best Practices Guide outlines which notebook to use at each stage of your improvement process.
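If you work with standard Python tools, the blind-test metrics mentioned earlier (accuracy, precision, recall, and F1) can be approximated with the standard library alone. The following is a rough sketch under stated assumptions, not the code from the IBM notebooks: the function name and the sample intent labels are hypothetical.

```python
from collections import Counter


def intent_metrics(expected, predicted):
    """Compute accuracy plus per-intent precision, recall, and F1."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, guess in zip(expected, predicted):
        if gold == guess:
            tp[gold] += 1
        else:
            fp[guess] += 1  # predicted intent gets a false positive
            fn[gold] += 1   # expected intent gets a false negative
    accuracy = sum(tp.values()) / len(expected) if expected else 0.0
    per_intent = {}
    for intent in sorted(set(expected) | set(predicted)):
        p_den = tp[intent] + fp[intent]
        r_den = tp[intent] + fn[intent]
        precision = tp[intent] / p_den if p_den else 0.0
        recall = tp[intent] / r_den if r_den else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        per_intent[intent] = {"precision": precision, "recall": recall, "f1": f1}
    return accuracy, per_intent


# Invented blind test set: expected labels versus the intents that were detected.
expected = ["greeting", "greeting", "billing", "billing", "goodbye"]
predicted = ["greeting", "billing", "billing", "billing", "goodbye"]
accuracy, per_intent = intent_metrics(expected, predicted)
print(accuracy, per_intent["billing"])
```

An intent with high recall but low precision, such as "billing" in this sample, is absorbing utterances that belong to other intents, which is the kind of anomaly worth investigating in the training data.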

Using the logs API

You can use the /logs API to list events from the transcripts of conversations that occurred between your users and your assistant. For conversations created by using the v2 /message API, use the instance-level endpoint to list log events in all workspaces, and then filter by Assistant ID. For more information about filtering logs, see Filter query reference.

The API logs messages that are exchanged in conversations that are defined by a dialog skill only.

The number of days that logs are stored differs by service plan type. See Log limits for details.
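As an illustration of the instance-level request with an Assistant ID filter, the following sketch only builds the request URL; it does not call the service. The /v1/logs path, the filter field name request.context.system.assistant_id, the language clause, and the page_limit parameter are assumptions to verify against the Filter query reference and the API reference. The base URL and the assistant ID are placeholders.

```python
from urllib.parse import urlencode


def build_logs_url(base_url, assistant_id, version="2019-02-28",
                   language="en", page_limit=100):
    """Build an instance-level logs URL filtered by Assistant ID.

    The filter syntax below is an assumption; check it against the
    Filter query reference before relying on it.
    """
    filter_query = (
        f"language::{language},"
        f"request.context.system.assistant_id::{assistant_id}"
    )
    params = {
        "version": version,
        "filter": filter_query,
        "page_limit": page_limit,
    }
    # urlencode percent-encodes the ':' and ',' characters in the filter.
    return f"{base_url.rstrip('/')}/v1/logs?{urlencode(params)}"


# Placeholder values; substitute the URL and Assistant ID of your instance.
url = build_logs_url("https://example.invalid/instances/demo",
                     "hypothetical-assistant-id")
print(url)
```

Keeping URL construction separate from the HTTP call makes the filter easy to inspect and test before any credentials are involved.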

For a Python script that you can run to export logs and convert them to CSV format, download the export_logs_py.py file from the Watson Assistant GitHub repository.
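To give a sense of what such a conversion involves, here is a minimal sketch that flattens log events into CSV text. It is not the export_logs_py.py script. The field names (log_id, request_timestamp, request.input.text, response.intents) are assumptions about the shape of a log event, and the sample events are invented.

```python
import csv
import io

FIELDS = ["log_id", "request_timestamp", "user_text", "top_intent", "confidence"]


def log_to_row(log):
    """Flatten one log event (assumed shape) into a flat dict."""
    intents = log.get("response", {}).get("intents") or []
    top = intents[0] if intents else {}
    return {
        "log_id": log.get("log_id", ""),
        "request_timestamp": log.get("request_timestamp", ""),
        "user_text": log.get("request", {}).get("input", {}).get("text", ""),
        "top_intent": top.get("intent", ""),
        "confidence": top.get("confidence", ""),
    }


def logs_to_csv(logs):
    """Return CSV text with a header row and one row per log event."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for log in logs:
        writer.writerow(log_to_row(log))
    return buffer.getvalue()


# Invented sample events, for illustration only.
sample_logs = [
    {"log_id": "log-1", "request_timestamp": "2019-02-28T10:00:00.000Z",
     "request": {"input": {"text": "hello"}},
     "response": {"intents": [{"intent": "greeting", "confidence": 0.97}]}},
    {"log_id": "log-2", "request_timestamp": "2019-02-28T10:01:00.000Z",
     "request": {"input": {"text": "asdf"}},
     "response": {"intents": []}},
]
csv_text = logs_to_csv(sample_logs)
print(csv_text)
```

Events with no detected intent produce empty intent and confidence cells rather than raising an error, which keeps the rows aligned for later annotation in a spreadsheet.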

Understanding logs-related terminology

First, review the definitions of terms that are associated with Watson Assistant logs:

  • Assistant: An application - sometimes referred to as a 'chat bot' - that implements your Watson Assistant content.
  • Assistant ID: The unique identifier of an assistant.
  • Conversation: The set of messages that an individual user sends to your assistant, together with the messages that your assistant sends back.
  • Conversation ID: Unique identifier that is added to individual message calls to link related message exchanges together. App developers using the V1 version of the Watson Assistant API add this value to the message calls in a conversation by including the ID in the metadata of the context object.
  • Customer ID: A unique ID that can be used to label customer data such that it can be subsequently deleted if the customer requests the removal of their data.
  • Deployment ID: A unique label that app developers using the V1 version of the Watson Assistant API pass with each user message to help identify the deployment environment that produced the message.
  • Instance: Your deployment of Watson Assistant, accessible with unique credentials. A Watson Assistant instance might contain multiple assistants.
  • Message: A message is a single utterance a user sends to the assistant.
  • Skill ID: The unique identifier of a skill.
  • User: A user is anyone who interacts with your assistant; often these are your customers.
  • User ID: A unique label that is used to track the level of service usage of a specific user.
  • Workspace ID: The unique identifier of a workspace. Although any workspaces that you created before November 9 are shown as skills in the product user interface, a skill and a workspace are not the same thing. A skill is effectively a wrapper for a V1 workspace.

Important: The User ID property is not equivalent to the Customer ID property, though both can be passed with the message. The User ID field is used to track levels of usage for billing purposes, whereas the Customer ID field is used to support the labeling and subsequent deletion of messages that are associated with end users. Customer ID is used consistently across all Watson services and is specified in the X-Watson-Metadata header. User ID is used exclusively by the Watson Assistant service and is passed in the context object of each /message API call.
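The distinction can be made concrete with a small sketch that places each identifier where the note above says it belongs: the Customer ID in the X-Watson-Metadata header and the User ID in the context object of the message body. The exact nesting (context.metadata.user_id and context.metadata.deployment) is an assumption based on the V1 descriptions in the term list; the function name and the sample values are hypothetical.

```python
import json


def build_v1_message_request(text, user_id, customer_id, deployment=None):
    """Return (headers, body) for a hypothetical V1 /message call.

    Assumption: User ID and Deployment ID live under context.metadata.
    """
    metadata = {"user_id": user_id}  # used to track usage levels
    if deployment is not None:
        metadata["deployment"] = deployment
    headers = {
        "Content-Type": "application/json",
        # Used to label data so that it can be deleted later.
        "X-Watson-Metadata": f"customer_id={customer_id}",
    }
    body = {"input": {"text": text}, "context": {"metadata": metadata}}
    return headers, json.dumps(body)


headers, body = build_v1_message_request(
    "hello", user_id="user-7", customer_id="customer-42", deployment="staging")
print(headers["X-Watson-Metadata"])
print(body)
```

The two values can be identical for a given end user, but because they travel in different parts of the request, setting one does not set the other.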

Associating message data with a user for deletion

There might come a time when you want to completely remove a set of your user's data from a Watson Assistant instance. After the data is deleted, the Overview metrics no longer reflect the deleted messages; for example, the Total Conversations count is lower.

Before you begin

To delete messages for one or more individuals, you first need to associate a message with a unique Customer ID for each individual. To specify the Customer ID for any message sent using the /message API, include the X-Watson-Metadata: customer_id property in your header. You can pass multiple Customer ID entries as semicolon-separated field=value pairs, each using the customer_id field, as in the following example:

curl -X POST -u "apikey:3Df... ...Y7Pc9" \
 --header "Content-Type: application/json" \
 --header "X-Watson-Metadata: customer_id={first-customer-ID};customer_id={second-customer-ID}" \
 --data "{\"input\":{\"text\":\"hello\"}}" \
 "{url}/v2/assistants/{assistant_id}/sessions/{session_id}/message?version=2019-02-28"

where {url} is the appropriate URL for your instance. For more details, see Service endpoint.

The customer_id string cannot include the semicolon (;) or equal sign (=) characters. You are responsible for ensuring that each Customer ID parameter is unique across your customers.
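A small helper can enforce the character restriction before a request is sent. The following is a sketch, not part of any Watson SDK: the function name is hypothetical, and rejecting empty IDs is an added assumption that the documentation does not state.

```python
def build_customer_metadata(customer_ids):
    """Build an X-Watson-Metadata header value from one or more Customer IDs.

    Raises ValueError if an ID contains ';' or '=', which the header
    format reserves as separators.
    """
    pairs = []
    for customer_id in customer_ids:
        if not customer_id:
            # Assumption: an empty ID is never useful, so reject it early.
            raise ValueError("customer_id must not be empty")
        if ";" in customer_id or "=" in customer_id:
            raise ValueError(
                f"customer_id {customer_id!r} contains ';' or '='")
        pairs.append(f"customer_id={customer_id}")
    return ";".join(pairs)


header_value = build_customer_metadata(["abc", "def"])
print(header_value)
```

Uniqueness across customers cannot be checked by a helper like this; it still has to be guaranteed by whatever system issues the IDs.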

To delete messages using customer_id values, see the Information security topic.