Sources

Extend your chatbot's knowledge by integrating external sources, such as websites and documents, through an external LLM (Large Language Model) engine. This section explains how to add those sources to your bot.

Whether you want your chatbot to access real-time data, reference academic documents, or stay up to date with the latest news, the step-by-step instructions below show how to connect these external resources to your chatbot so that it can draw on the LLM engine when answering users.

By the end of this guide, you will have a chatbot that not only understands user queries but can also answer them with accurate, up-to-date information from the web and from your documents, giving users more personalized and relevant responses with BotDistrikt.

To add a source:

  1. Navigate to Sources on the left-hand navigation panel

  2. Click OpenAI in "Link an LLM engine such as OpenAI to proceed."

  3. The Dashboard navigates to Integrations --> Artificial Intelligence --> OpenAI

  4. Add a valid OpenAI API key and click Save.

  5. Ensure Generate Embeddings for Text Responses is toggled on.

  6. Navigate back to Sources. The Sources dashboard now lets you add training websites/landing pages and document sources specific to your bot.

Websites

To add a website, click Websites --> New Source Website:

  1. Enter a Crawl Base URL (the base URL that the chatbot crawls to find responses).

  2. Click Crawl.

  3. Alternatively, enter your website's Sitemap.

  4. Click Load Sitemap. A list of the URLs available on the website appears:

     1. Click Add to add individual resources, or

     2. Click Bulk Add to add resources in bulk.
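To illustrate what loading a sitemap yields, the following is a minimal sketch of how the list of URLs can be read from a standard sitemap file. It is not BotDistrikt's implementation: the function name `list_sitemap_urls` and the sample sitemap are hypothetical, and the sketch only assumes the sitemap follows the public sitemaps.org XML format.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def list_sitemap_urls(sitemap_xml: str) -> list[str]:
    """Return every <loc> URL found in a sitemap <urlset> document."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sitemap content, used only for illustration.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/pricing</loc></url>
</urlset>"""

for url in list_sitemap_urls(SAMPLE_SITEMAP):
    print(url)
```

Each URL in such a list corresponds to one resource that you can add individually or in bulk.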

Click Train Added Resources at the end of the list to add the currently loaded URLs to your bot's knowledge base.

You will be redirected to your Sources dashboard. The added resources are viewable as a list under Websites.

For each resource, you can toggle it Active, view its current Training status, view its Responses, Update it, or Delete it.

Documents

To view the documents dashboard, click Sources --> Documents (tab).

Click New Source Document to add a training document and its title. BotDistrikt parses the document and shows the following information and settings:

  1. Document Type: the type of the uploaded document.

  2. The number of characters in the document.

  3. Chunk Size: the number of characters in each chunk. You can set the chunk size with the slider or with the Use Recommended button. A large chunk size retains more context, while a small chunk size captures more granular semantic data.

  4. Chunk Overlap: the number of characters shared between one chunk and the next. Overlap gives you improved context, better training for longer documents, and enhanced coherence. You can increase the chunk size if the document consists of numerous unrelated topics.

  5. The Output shows the number of responses that will be generated based on the Chunk Size and Chunk Overlap settings.

  6. Click Show Example Responses to view sample extracted responses.

  7. Click Train AI to train your bot with the document.
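The interaction between Chunk Size and Chunk Overlap can be pictured with a small sketch. This is only an illustration under stated assumptions, not BotDistrikt's actual chunking logic: it assumes a simple character-based sliding window and one response per chunk, and the function name `chunk_text` is hypothetical.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into character chunks where consecutive chunks
    share `chunk_overlap` characters (assumed sliding-window scheme)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
        start += step
    return chunks


document = "abcdefghij"  # stand-in for a 10-character document
print(chunk_text(document, chunk_size=4, chunk_overlap=1))
# ['abcd', 'defg', 'ghij']
print(len(chunk_text(document, chunk_size=4, chunk_overlap=2)))
# 4
```

In this sketch, raising the overlap from 1 to 2 characters increases the number of chunks from 3 to 4, which mirrors how the Output count changes as you adjust the two settings.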

The screen redirects to the main Sources (Documents) dashboard.

To confirm that the embeddings were created, go to Responses --> Text and check the Embeddings column.

The Embeddings column displays the LLM name (OpenAI, Vertex AI, etc.). A tick signifies that an embedding was created successfully, and a cross signifies that no embedding was created for that response. The Tags column indicates the source (document/website).

Troubleshooting:
