RAG Query
Overview
The Vector Search (RAG) Node enhances LLM-generated responses by retrieving relevant information from a specified RAG container before formulating an answer. This retrieval-augmented generation (RAG) approach allows AI models to generate factually accurate, up-to-date, and context-aware responses, making it ideal for knowledge-based applications such as:
✅ Customer Support Assistants – Retrieve documentation and past resolutions to provide accurate troubleshooting.
✅ AI-Powered Documentation Search – Enhance LLM responses by retrieving technical guides, user manuals, and FAQs.
✅ Enterprise Knowledge Management – Search through internal databases and return relevant company policies, reports, and guidelines.
✅ Personalized Recommendations – Retrieve historical user interactions to customize AI-generated responses.
How It Works
1️⃣ Retrieves relevant document embeddings from the RAG container.
2️⃣ Enriches the user query with retrieved data before passing it to an LLM.
3️⃣ Processes the combined input using the selected AI model.
4️⃣ Returns an AI-generated response based on both retrieved context and model reasoning.
🔹 Example Use-Case: A technical support chatbot retrieves documentation on network errors before generating a troubleshooting response for a user.
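Conceptually, these steps follow a retrieve-enrich-generate pattern, shown in the minimal Python sketch below. The function names and signatures (search_rag_container, call_llm, rag_query) are illustrative stand-ins, not the platform's actual API, and are stubbed so the example runs as-is:

```python
# Minimal sketch of the retrieve -> enrich -> generate flow described above.
# search_rag_container and call_llm are illustrative stubs, NOT the platform's API.

def search_rag_container(container: str, query: str, top_k: int = 3) -> list[str]:
    """Stub for the vector similarity search against the RAG container."""
    return [
        "A 502 Bad Gateway error indicates the upstream server returned an invalid response.",
        "Restart the upstream application and verify the reverse proxy target address.",
    ][:top_k]

def call_llm(system_prompt: str, prompt: str, temperature: float = 0.3) -> str:
    """Stub for the call to the selected AI model."""
    return "Grounded answer based on the retrieved snippets above."

def rag_query(container: str, system_prompt: str, user_query: str) -> str:
    # 1) Retrieve relevant context from the RAG container.
    snippets = search_rag_container(container, user_query)
    # 2) Enrich the user query with the retrieved snippets.
    enriched = "Context:\n- " + "\n- ".join(snippets) + f"\n\nQuestion: {user_query}"
    # 3) Process the combined input with the selected model.
    return call_llm(system_prompt, enriched)

print(rag_query("it_support_docs",
                "You are an IT support assistant.",
                "How do I fix a 502 Bad Gateway error on my web server?"))
```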
Configurations
RAG Container
Select the RAG container that stores relevant documents or indexed knowledge. This acts as the source of retrieved context.
System Prompt
Define instructions that guide the model’s behavior when generating a response. Similar to the Prompt Node, this ensures responses follow a specific format and tone.
Query
The user input that will be enriched with retrieved information before being processed by the LLM. It can be set dynamically using $agent.query.
Filters (Optional)
Apply metadata filters to narrow down retrieval results (e.g., filter documents by category, tag, or source). These filters must be configured in the RAG datastore as well.
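The exact filter syntax depends on how metadata fields are defined in your RAG datastore. As a purely illustrative example (field names and values are hypothetical), a filter restricting retrieval to tagged troubleshooting documents might look like:

```json
{
  "category": "networking",
  "tag": "troubleshooting",
  "source": "product-manual"
}
```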
Response Format
Choose between Plain Text (default) for natural-language responses and JSON for structured outputs. JSON is recommended when structured data needs to be extracted.
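When JSON is selected, it usually helps to describe the expected structure in the System Prompt. A structured response might then take a shape like the following (the field names are illustrative, not a fixed schema):

```json
{
  "answer": "Concise answer grounded in the retrieved documents.",
  "sources": ["doc-id-1", "doc-id-2"]
}
```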
Temperature
Adjusts the randomness of responses: lower values (e.g., 0.1) produce more predictable outputs, while higher values (e.g., 0.9) produce more creative outputs.
Number of Conversation Turns
Defines how much context from past interactions should be retained for better continuity.
Execution Flow:
1️⃣ The Vector Search Node queries the RAG container for relevant context.
2️⃣ The retrieved documents are used to enrich the user’s query before sending it to the LLM.
3️⃣ The LLM processes the query + retrieved information, ensuring the response is grounded in factual data.
4️⃣ The node returns a response along with source citations when applicable.
Output Format:
Plain Text Response (Default)
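For example, a plain-text response with a source citation might look like this (the wording, container name, and cited document are hypothetical):

```
Based on the retrieved documentation, the password policy requires at least 12 characters, including one number and one symbol, and passwords expire every 90 days.

Source: company_policies – "IT Security Policy"
```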
Example Use-Cases
Use-Case 1: AI-Powered Knowledge Base for IT Support
A technical support chatbot retrieves relevant troubleshooting guides from a RAG container before generating AI-powered responses.
Configuration:
| Setting | Value |
| --- | --- |
| RAG Container | it_support_docs |
| System Prompt | "You are an IT support assistant helping users troubleshoot common technical issues. Provide clear, step-by-step guidance based on retrieved documentation. If no relevant information is found, recommend escalating the issue to support." |
| Query | $agent.query (automatically retrieves the user’s question) |
| Response Format | Plain Text |
| Temperature | 0.3 |
| Number of Conversation Turns | 2 |
Example User Query:
💬 "How do I fix a 502 Bad Gateway error on my web server?"
Generated AI Response:
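The exact wording depends on the retrieved documents and the model; an illustrative response (with a hypothetical source citation) might read:

```
A 502 Bad Gateway error means your web server received an invalid response from an upstream server. Try the following steps:

1. Confirm that the upstream application server is running, and restart it if needed.
2. Check the reverse proxy configuration for an incorrect upstream address or port.
3. Review the proxy and application logs for timeout or connection-refused errors.
4. If the error persists, escalate the issue to support with the relevant log excerpts.

Source: it_support_docs – "Resolving 502 Bad Gateway Errors"
```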
Use-Case 2: AI-Powered Legal Document Search
A legal AI assistant retrieves relevant contract clauses before summarizing legal documents.
Configuration:
| Setting | Value |
| --- | --- |
| RAG Container | legal_documents |
| System Prompt | "You are an AI legal assistant. Retrieve and summarize relevant clauses from legal contracts. If no relevant clause is found, state so clearly." |
| Query | "What are the termination conditions for this contract?" |
| Response Format | JSON |
Generated AI Response:
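An illustrative JSON response (the clause text, field names, and source path are hypothetical) might look like:

```json
{
  "clause": "Termination",
  "summary": "Either party may terminate the agreement with 30 days' written notice. Immediate termination is permitted for a material breach that remains uncured for 15 days.",
  "source": "legal_documents/contract_2024.pdf"
}
```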
Use-Case 3: Personalized Financial Advisory
A financial AI assistant retrieves historical investment strategies before generating personalized recommendations.
Configuration:
| Setting | Value |
| --- | --- |
| RAG Container | investment_strategies |
| System Prompt | "You are an AI financial advisor. Retrieve past investment strategies based on the user's profile and suggest a personalized plan." |
| Query | "What’s the best investment plan for someone with a high-risk appetite?" |
| Response Format | Plain Text |
Generated AI Response:
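An illustrative plain-text response (the allocation details are hypothetical and not financial advice) might read:

```
Based on past strategies retrieved for high-risk profiles, a growth-oriented plan could allocate a larger share to equities and emerging markets, keep a smaller bond allocation, and hold a cash buffer for volatility. Past strategies do not guarantee future performance, so review any plan with a qualified advisor before acting on it.
```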
Key Takeaways for Developers
✅ Enhances LLM Accuracy with Contextual Retrieval – The Vector Search (RAG) Node ensures that AI-generated responses are grounded in real data by retrieving relevant documents from RAG containers before processing the query.
✅ Supports Knowledge-Based AI Applications – Ideal for customer support chatbots, documentation search, legal research, and financial advisory, where contextual accuracy is crucial.
✅ Retrieves and Enriches Information Before AI Processing – Unlike a standard Prompt Node, this node first retrieves relevant documents before sending the enriched query to the LLM, improving relevance and factual correctness.
✅ Flexible Configuration for Structured Responses – Developers can choose between Plain Text or JSON response formats, making it suitable for both conversational AI and structured data extraction.
✅ Includes Metadata Filtering for Targeted Retrieval – Supports filters on document metadata, allowing developers to fine-tune retrieval and avoid irrelevant results.
✅ Ensures Traceability with Source Citations – Responses include document sources, making it easier for users to verify where the information came from, increasing AI reliability and trust.
By leveraging the Vector Search (RAG) Node, developers can integrate knowledge-aware AI assistants that provide fact-based, personalized, and domain-specific responses, transforming workflows into intelligent, data-driven systems. 🚀