RAG Query
Overview
The Vector Search (RAG) Node enhances LLM-generated responses by retrieving relevant information from a specified RAG container before formulating an answer. This retrieval-augmented generation (RAG) approach allows AI models to generate factually accurate, up-to-date, and context-aware responses, making it ideal for knowledge-based applications such as:
✅ Customer Support Assistants – Retrieve documentation and past resolutions to provide accurate troubleshooting.
✅ AI-Powered Documentation Search – Enhance LLM responses by retrieving technical guides, user manuals, and FAQs.
✅ Enterprise Knowledge Management – Search through internal databases and return relevant company policies, reports, and guidelines.
✅ Personalized Recommendations – Retrieve historical user interactions to customize AI-generated responses.
How It Works
1️⃣ Retrieves relevant document embeddings from the RAG container.
2️⃣ Enriches the user query with retrieved data before passing it to an LLM.
3️⃣ Processes the combined input using the selected AI model.
4️⃣ Returns an AI-generated response based on both retrieved context and model reasoning.
🔹 Example Use-Case: A technical support chatbot retrieves documentation on network errors before generating a troubleshooting response for a user.
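Conceptually, these steps follow a retrieve-enrich-generate pattern, shown in the minimal Python sketch below. The function names and signatures (search_rag_container, call_llm, rag_query) are illustrative stand-ins, not the platform's actual API, and are stubbed so the example runs as-is:

```python
# Minimal sketch of the retrieve -> enrich -> generate flow described above.
# search_rag_container and call_llm are illustrative stubs, NOT the platform's API.

def search_rag_container(container: str, query: str, top_k: int = 3) -> list[str]:
    """Stub for the vector similarity search against the RAG container."""
    return [
        "A 502 Bad Gateway error indicates the upstream server returned an invalid response.",
        "Restart the upstream application and verify the reverse proxy target address.",
    ][:top_k]

def call_llm(system_prompt: str, prompt: str, temperature: float = 0.3) -> str:
    """Stub for the call to the selected AI model."""
    return "Grounded answer based on the retrieved snippets above."

def rag_query(container: str, system_prompt: str, user_query: str) -> str:
    # 1) Retrieve relevant context from the RAG container.
    snippets = search_rag_container(container, user_query)
    # 2) Enrich the user query with the retrieved snippets.
    enriched = "Context:\n- " + "\n- ".join(snippets) + f"\n\nQuestion: {user_query}"
    # 3) Process the combined input with the selected model.
    return call_llm(system_prompt, enriched)

print(rag_query("it_support_docs",
                "You are an IT support assistant.",
                "How do I fix a 502 Bad Gateway error on my web server?"))
```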
Configurations
RAG Container
Select the RAG container that stores relevant documents or indexed knowledge. This acts as the source of retrieved context.
System Prompt
Define instructions that guide the model’s behavior when generating a response. Similar to the Prompt Node, this ensures responses follow a specific format and tone.
Query
The user input that will be enriched with retrieved information before being processed by the LLM. It can be set dynamically using $agent.query.
Filters (Optional)
Apply metadata filters to narrow down retrieval results (e.g., filter documents by category, tag, or source). These filters must be configured in the RAG datastore as well.
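The exact filter syntax depends on how metadata fields are defined in your RAG datastore. As a purely illustrative example (field names and values are hypothetical), a filter restricting retrieval to tagged troubleshooting documents might look like:

```json
{
  "category": "networking",
  "tag": "troubleshooting",
  "source": "product-manual"
}
```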
Response Format
Choose between Plain Text (default) for natural-language responses and JSON for structured outputs. JSON is recommended when structured data needs to be extracted.
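When JSON is selected, it usually helps to describe the expected structure in the System Prompt. A structured response might then take a shape like the following (the field names are illustrative, not a fixed schema):

```json
{
  "answer": "Concise answer grounded in the retrieved documents.",
  "sources": ["doc-id-1", "doc-id-2"]
}
```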
Temperature
Adjusts the randomness of responses: lower values (e.g., 0.1) produce more predictable outputs, while higher values (e.g., 0.9) produce more creative outputs.
Number of Conversation Turns
Defines how much context from past interactions should be retained for better continuity.
Execution Flow:
1️⃣ The Vector Search Node queries the RAG container for relevant context.
2️⃣ The retrieved documents are used to enrich the user’s query before sending it to the LLM.
3️⃣ The LLM processes the query + retrieved information, ensuring the response is grounded in factual data.
4️⃣ The node returns a response along with source citations when applicable.
Output Format:
Plain Text Response (Default)
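For example, a plain-text response with a source citation might look like this (the wording, container name, and cited document are hypothetical):

```
Based on the retrieved documentation, the password policy requires at least 12 characters, including one number and one symbol, and passwords expire every 90 days.

Source: company_policies – "IT Security Policy"
```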
Example Use-Cases
Use-Case 1: AI-Powered Knowledge Base for IT Support
A technical support chatbot retrieves relevant troubleshooting guides from a RAG container before generating AI-powered responses.
Configuration:
| Setting | Value |
| --- | --- |
| RAG Container | it_support_docs |
| System Prompt | "You are an IT support assistant helping users troubleshoot common technical issues. Provide clear, step-by-step guidance based on retrieved documentation. If no relevant information is found, recommend escalating the issue to support." |
| Query | $agent.query (automatically retrieves the user’s question) |
| Response Format | Plain Text |
| Temperature | 0.3 |
| Number of Conversation Turns | 2 |
Example User Query:
💬 "How do I fix a 502 Bad Gateway error on my web server?"
Generated AI Response:
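The exact wording depends on the retrieved documents and the model; an illustrative response (with a hypothetical source citation) might read:

```
A 502 Bad Gateway error means your web server received an invalid response from an upstream server. Try the following steps:

1. Confirm that the upstream application server is running, and restart it if needed.
2. Check the reverse proxy configuration for an incorrect upstream address or port.
3. Review the proxy and application logs for timeout or connection-refused errors.
4. If the error persists, escalate the issue to support with the relevant log excerpts.

Source: it_support_docs – "Resolving 502 Bad Gateway Errors"
```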
Use-Case 2: AI-Powered Legal Document Search
A legal AI assistant retrieves relevant contract clauses before summarizing legal documents.
Configuration:
| Setting | Value |
| --- | --- |
| RAG Container | legal_documents |
| System Prompt | "You are an AI legal assistant. Retrieve and summarize relevant clauses from legal contracts. If no relevant clause is found, state so clearly." |
| Query | "What are the termination conditions for this contract?" |
| Response Format | JSON |
Generated AI Response:
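An illustrative JSON response (the clause text, field names, and source path are hypothetical) might look like:

```json
{
  "clause": "Termination",
  "summary": "Either party may terminate the agreement with 30 days' written notice. Immediate termination is permitted for a material breach that remains uncured for 15 days.",
  "source": "legal_documents/contract_2024.pdf"
}
```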
Use-Case 3: Personalized Financial Advisory
A financial AI assistant retrieves historical investment strategies before generating personalized recommendations.
Configuration:
| Setting | Value |
| --- | --- |
| RAG Container | investment_strategies |
| System Prompt | "You are an AI financial advisor. Retrieve past investment strategies based on the user's profile and suggest a personalized plan." |
| Query | "What’s the best investment plan for someone with a high-risk appetite?" |
| Response Format | Plain Text |
Generated AI Response:
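An illustrative plain-text response (the allocation details are hypothetical and not financial advice) might read:

```
Based on past strategies retrieved for high-risk profiles, a growth-oriented plan could allocate a larger share to equities and emerging markets, keep a smaller bond allocation, and hold a cash buffer for volatility. Past strategies do not guarantee future performance, so review any plan with a qualified advisor before acting on it.
```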
Key Takeaways for Developers
✅ Enhances LLM Accuracy with Contextual Retrieval – The Vector Search (RAG) Node ensures that AI-generated responses are grounded in real data by retrieving relevant documents from RAG containers before processing the query.
✅ Supports Knowledge-Based AI Applications – Ideal for customer support chatbots, documentation search, legal research, and financial advisory, where contextual accuracy is crucial.
✅ Retrieves and Enriches Information Before AI Processing – Unlike a standard Prompt Node, this node first retrieves relevant documents before sending the enriched query to the LLM, improving relevance and factual correctness.
✅ Flexible Configuration for Structured Responses – Developers can choose between Plain Text or JSON response formats, making it suitable for both conversational AI and structured data extraction.
✅ Includes Metadata Filtering for Targeted Retrieval – Supports filters on document metadata, allowing developers to fine-tune retrieval and avoid irrelevant results.
✅ Ensures Traceability with Source Citations – Responses include document sources, making it easier for users to verify where the information came from, increasing AI reliability and trust.
By leveraging the Vector Search (RAG) Node, developers can integrate knowledge-aware AI assistants that provide fact-based, personalized, and domain-specific responses, transforming workflows into intelligent, data-driven systems. 🚀