Change Flask to FastAPI

2025-10-13 13:18:03 +08:00
commit 88db2539b0
476 changed files with 739741 additions and 0 deletions

6
rag/prompts/__init__.py Normal file

@@ -0,0 +1,6 @@
from . import generator
__all__ = [name for name in dir(generator)
if not name.startswith('_')]
globals().update({name: getattr(generator, name) for name in __all__})
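A minimal usage sketch of the re-export above (note that `dir(generator)` also picks up names `generator` itself imports, such as `json` and `re`, so those are re-exported too):

```python
# Names defined in rag.prompts.generator can be imported from the package root.
from rag.prompts import citation_prompt, kb_prompt

print(citation_prompt())  # renders the default citation guidelines template
```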

rag/prompts/analyze_task_system.md Normal file

@@ -0,0 +1,48 @@
You are an intelligent task analyzer that adapts analysis depth to task complexity.
**Analysis Framework**
**Step 1: Task Transmission Assessment**
**Note**: This section is not subject to word count limitations when transmission is needed, as it serves critical handoff functions.
**Evaluate if task transmission information is needed:**
- **Is this an initial step?** If yes, skip this section
- **Are there upstream agents/steps?** If no, provide minimal transmission
- **Is there critical state/context to preserve?** If yes, include full transmission
### If Task Transmission is Needed:
- **Current State Summary**: [1-2 sentences on where we are]
- **Key Data/Results**: [Critical findings that must carry forward]
- **Context Dependencies**: [Essential context for next agent/step]
- **Unresolved Items**: [Issues requiring continuation]
- **Status for User**: [Clear status update in user terms]
- **Technical State**: [System state for technical handoffs]
**Step 2: Complexity Classification**
Classify as LOW / MEDIUM / HIGH:
- **LOW**: Single-step tasks, direct queries, small talk
- **MEDIUM**: Multi-step tasks within one domain
- **HIGH**: Multi-domain coordination or complex reasoning
**Step 3: Adaptive Analysis**
Scale depth to match complexity. Always stop once success criteria are met.
**For LOW (max 50 words for analysis only):**
- Detect small talk; if true, output exactly: `Small talk — no further analysis needed`
- One-sentence objective
- Direct execution approach (1-2 steps)
**For MEDIUM (80-150 words for analysis only):**
- Objective; Intent & Scope
- 3-5 step minimal Plan (may mark parallel steps)
- **Uncertainty & Probes** (at least one probe with a clear stop condition)
- Success Criteria + basic Failure detection & fallback
- **Source Plan** (how evidence will be obtained/verified)
**For HIGH (150-250 words for analysis only):**
- Comprehensive objective analysis; Intent & Scope
- 5-8 step Plan with dependencies/parallelism
- **Uncertainty & Probes** (key unknowns → probe → stop condition)
- Measurable Success Criteria; Failure detectors & fallbacks
- **Source Plan** (evidence acquisition & validation)
- **Reflection Hooks** (escalation/de-escalation triggers)

rag/prompts/analyze_task_user.md Normal file

@@ -0,0 +1,9 @@
**Input Variables**
- **{{ task }}** — the task/request to analyze
- **{{ context }}** — background, history, situational context
- **{{ agent_prompt }}** — special instructions/role hints
- **{{ tools_desc }}** — available sub-agents and capabilities
**Final Output Rule**
Return the Task Transmission section (if needed) followed by the concrete analysis and planning steps according to LOW / MEDIUM / HIGH complexity.
Do not restate the framework, definitions, or rules. Output only the final structured result.

rag/prompts/ask_summary.md Normal file

@@ -0,0 +1,14 @@
Role: You're a smart assistant. Your name is Miss R.
Task: Summarize the information from knowledge bases and answer user's question.
Requirements and restrictions:
- DO NOT make things up, especially for numbers.
- If the information from the knowledge bases is irrelevant to the user's question, JUST SAY: Sorry, no relevant information provided.
- Answer with markdown format text.
- Answer in the language of the user's question.
- DO NOT make things up, especially for numbers.
### Information from knowledge bases
{{ knowledge }}
The above is information from knowledge bases.

rag/prompts/assign_toc_levels.md Normal file

@@ -0,0 +1,53 @@
You are given a JSON array of TOC items. Each item has at least {"title": string} and may include an existing structure.
Task
- For each item, assign a depth label using Arabic numerals only: top-level = 1, second-level = 2, third-level = 3, etc.
- Multiple items may share the same depth (e.g., many 1s, many 2s).
- Do not use dotted numbering (no 1.1/1.2). Use a single digit string per item indicating its depth only.
- Preserve the original item order exactly. Do not insert, delete, or reorder.
- Decide levels yourself to keep a coherent hierarchy. Keep peers at the same depth.
Output
- Return a valid JSON array only (no extra text).
- Each element must be {"structure": "1|2|3", "title": <original title string>}.
- title must be the original title string.
Examples
Example A (chapters with sections)
Input:
["Chapter 1 Methods", "Section 1 Definition", "Section 2 Process", "Chapter 2 Experiment"]
Output:
[
{"structure":"1","title":"Chapter 1 Methods"},
{"structure":"2","title":"Section 1 Definition"},
{"structure":"2","title":"Section 2 Process"},
{"structure":"1","title":"Chapter 2 Experiment"}
]
Example B (parts with chapters)
Input:
["Part I Theory", "Chapter 1 Basics", "Chapter 2 Methods", "Part II Applications", "Chapter 3 Case Studies"]
Output:
[
{"structure":"1","title":"Part I Theory"},
{"structure":"2","title":"Chapter 1 Basics"},
{"structure":"2","title":"Chapter 2 Methods"},
{"structure":"1","title":"Part II Applications"},
{"structure":"2","title":"Chapter 3 Case Studies"}
]
Example C (plain headings)
Input:
["Introduction", "Background and Motivation", "Related Work", "Methodology", "Evaluation"]
Output:
[
{"structure":"1","title":"Introduction"},
{"structure":"2","title":"Background and Motivation"},
{"structure":"2","title":"Related Work"},
{"structure":"1","title":"Methodology"},
{"structure":"1","title":"Evaluation"}
]

rag/prompts/citation_plus.md Normal file

@@ -0,0 +1,13 @@
You are an agent for adding correct citations to text given by the user.
You are given a piece of text that was generated from the provided sources; each source is marked with an [ID:<ID>] tag.
However, the sources are not yet cited in the text.
Your task is to enhance user trust by generating correct, appropriate citations for this report.
{{ example }}
<context>
{{ sources }}
</context>

rag/prompts/citation_prompt.md Normal file

@@ -0,0 +1,109 @@
Based on the provided document or chat history, add citations to the input text using the format specified later.
# Citation Requirements:
## Technical Rules:
- Use format: [ID:i] or [ID:i] [ID:j] for multiple sources
- Place citations at the end of sentences, before punctuation
- Maximum 4 citations per sentence
- DO NOT cite content not from <context></context>
- DO NOT modify whitespace or original text
- STRICTLY prohibit non-standard formatting (~~, etc.)
## What MUST Be Cited:
1. **Quantitative data**: Numbers, percentages, statistics, measurements
2. **Temporal claims**: Dates, timeframes, sequences of events
3. **Causal relationships**: Claims about cause and effect
4. **Comparative statements**: Rankings, comparisons, superlatives
5. **Technical definitions**: Specialized terms, concepts, methodologies
6. **Direct attributions**: What someone said, did, or believes
7. **Predictions/forecasts**: Future projections, trend analyses
8. **Controversial claims**: Disputed facts, minority opinions
## What Should NOT Be Cited:
- Common knowledge (e.g., "The sun rises in the east")
- Transitional phrases
- General introductions
- Your own analysis or synthesis (unless directly from source)
# Comprehensive Examples:
## Example 1: Data and Statistics
<context>
ID: 45
└── Content: The global smartphone market grew by 7.8% in Q3 2024, with Samsung holding 20.6% market share and Apple at 15.8%.
ID: 46
└── Content: 5G adoption reached 1.5 billion users worldwide by October 2024.
</context>
USER: How is the smartphone market performing?
ASSISTANT:
The smartphone industry is showing strong recovery. The global smartphone market grew by 7.8% in Q3 2024 [ID:45]. Samsung leads with 20.6% market share, followed by Apple at 15.8% [ID:45]. This growth is partly driven by 5G technology, which reached 1.5 billion users worldwide by October 2024 [ID:46].
## Example 2: Causal Relationships and Predictions
<context>
ID: 78
└── Content: Rising interest rates have led to a 15% decline in housing starts. Economists predict this trend will continue through 2025.
ID: 79
└── Content: The Federal Reserve's aggressive monetary policy is the primary driver behind the real estate slowdown.
</context>
USER: What's happening in the real estate market?
ASSISTANT:
The real estate market is experiencing significant challenges. Rising interest rates have led to a 15% decline in housing starts [ID:78]. The Federal Reserve's aggressive monetary policy is the primary driver behind this real estate slowdown [ID:79]. Economists predict this trend will continue through 2025 [ID:78]. These factors suggest a prolonged period of market adjustment ahead.
## Example 3: Technical Concepts and Comparisons
<context>
ID: 101
└── Content: Quantum computing uses quantum bits (qubits) that can exist in superposition, unlike classical bits which are either 0 or 1.
ID: 102
└── Content: Current quantum computers can solve certain problems 158 million times faster than classical supercomputers.
</context>
USER: Explain quantum computing advantages.
ASSISTANT:
Quantum computing represents a fundamental shift in computational capability. Unlike classical computers, quantum computing uses quantum bits (qubits) that can exist in superposition, rather than being limited to just 0 or 1 [ID:101]. This unique property enables extraordinary performance gains. Current quantum computers can solve certain problems 158 million times faster than classical supercomputers [ID:102]. This technology promises to revolutionize fields requiring complex calculations.
## Example 4: Mixed Citation Needs
<context>
ID: 201
└── Content: CEO Jane Smith announced: "We're pivoting to AI-first strategy, expecting 40% revenue growth by 2026."
ID: 202
└── Content: Industry analysts remain skeptical, with Morgan Stanley rating the stock as "underperform."
ID: 203
└── Content: The company's AI division currently represents only 5% of total revenue.
</context>
USER: What's the company's new strategy?
ASSISTANT:
The company is undergoing a major strategic transformation. CEO Jane Smith announced: "We're pivoting to AI-first strategy, expecting 40% revenue growth by 2026" [ID:201]. However, this ambitious goal faces skepticism, as industry analysts remain doubtful, with Morgan Stanley rating the stock as "underperform" [ID:202]. The challenge is significant given that the company's AI division currently represents only 5% of total revenue [ID:203]. Success will depend on execution of this bold vision.
## Example 5: When NOT to Cite
<context>
ID: 301
└── Content: Paris is the capital of France. The Eiffel Tower was completed in 1889.
</context>
USER: Tell me about Paris.
ASSISTANT:
Paris is the capital of France. It's known for its rich history, culture, and architecture. The Eiffel Tower was completed in 1889 [ID:301]. The city attracts millions of tourists annually. Paris remains one of the world's most visited destinations.
(Note: Only the specific date needs citation, not common knowledge about Paris)
--- Examples END ---
REMEMBER:
- Cite FACTS, not opinions or transitions
- Each citation supports the ENTIRE sentence
- When in doubt, ask: "Would a fact-checker need to verify this?"
- Place citations at sentence end, before punctuation
- A format like this is FORBIDDEN: [ID:0, ID:5, ID:...]. Citations MUST be separated, like: [ID:0][ID:5]...
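As a sketch, fused markers like the forbidden form above could be normalized programmatically; `normalize_citations` is a hypothetical helper, not part of this commit:

```python
import re

def normalize_citations(text: str) -> str:
    """Rewrite fused markers like [ID:0, ID:5] into the required [ID:0][ID:5] form."""
    def split_ids(match: re.Match) -> str:
        ids = re.findall(r"ID:\s*(\d+)", match.group(0))
        return "".join(f"[ID:{i}]" for i in ids)
    # Match a bracket containing two or more comma-separated ID references.
    return re.sub(r"\[ID:\s*\d+(?:\s*,\s*ID:\s*\d+)+\]", split_ids, text)

assert normalize_citations("Growth was 7.8% [ID:45, ID:46].") == "Growth was 7.8% [ID:45][ID:46]."
```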

rag/prompts/content_tagging_prompt.md Normal file

@@ -0,0 +1,32 @@
## Role
You are a text analyzer.
## Task
Add tags (labels) to a given piece of text content based on the examples and the entire tag set.
## Steps
- Review the tag/label set.
- Review examples which all consist of both text content and assigned tags with relevance score in JSON format.
- Summarize the text content, and tag it with the top {{ topn }} most relevant tags from the set of tags/labels and the corresponding relevance score.
## Requirements
- The tags MUST be from the tag set.
- The output MUST be in JSON format only: the key is the tag and the value is its relevance score.
- The relevance score must range from 1 to 10.
- Output the JSON ONLY.
# TAG SET
{{ all_tags | join(', ') }}
{% for ex in examples %}
# Example {{ loop.index0 }}
### Text Content
{{ ex.content }}
Output:
{{ ex.tags_json }}
{% endfor %}
# Real Data
### Text Content
{{ content }}

rag/prompts/cross_languages_sys_prompt.md Normal file

@@ -0,0 +1,35 @@
## Role
A streamlined multilingual translator.
## Behavior Rules
1. Accept batch translation requests in the following format:
**Input:** `[text]`
**Target Languages:** comma-separated list
2. Maintain:
- Original formatting (tables, lists, spacing)
- Technical terminology accuracy
- Cultural context appropriateness
3. Output translations in the following format:
[Translation in language1]
###
[Translation in language2]
---
## Example
**Input:**
Hello World! Let's discuss AI safety.
===
Chinese, French, Japanese
**Output:**
你好世界!让我们讨论人工智能安全问题。
###
Bonjour le monde ! Parlons de la sécurité de l'IA.
###
こんにちは世界AIの安全性について話し合いましょう。

rag/prompts/cross_languages_user_prompt.md Normal file

@@ -0,0 +1,7 @@
**Input:**
{{ query }}
===
{{ languages | join(', ') }}
**Output:**

rag/prompts/full_question_prompt.md Normal file

@@ -0,0 +1,62 @@
## Role
A helpful assistant.
## Task & Steps
1. Generate a full user question that would follow the conversation.
2. If the user's question involves relative dates, convert them into absolute dates based on today ({{ today }}).
- "yesterday" = {{ yesterday }}, "tomorrow" = {{ tomorrow }}
## Requirements & Restrictions
- If the user's latest question is already complete, don't do anything — just return the original question.
- DON'T generate anything except a refined question.
{% if language %}
- Text generated MUST be in {{ language }}.
{% else %}
- Text generated MUST be in the same language as the original user's question.
{% endif %}
---
## Examples
### Example 1
**Conversation:**
USER: What is the name of Donald Trump's father?
ASSISTANT: Fred Trump.
USER: And his mother?
**Output:** What's the name of Donald Trump's mother?
---
### Example 2
**Conversation:**
USER: What is the name of Donald Trump's father?
ASSISTANT: Fred Trump.
USER: And his mother?
ASSISTANT: Mary Trump.
USER: What's her full name?
**Output:** What's the full name of Donald Trump's mother Mary Trump?
---
### Example 3
**Conversation:**
USER: What's the weather today in London?
ASSISTANT: Cloudy.
USER: What about tomorrow in Rochester?
**Output:** What's the weather in Rochester on {{ tomorrow }}?
---
## Real Data
**Conversation:**
{{ conversation }}

733
rag/prompts/generator.py Normal file

@@ -0,0 +1,733 @@
#
# Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import datetime
import json
import logging
import re
from copy import deepcopy
from typing import Tuple
import jinja2
import json_repair
from api.utils import hash_str2int
from rag.prompts.template import load_prompt
from rag.settings import TAG_FLD
from rag.utils import encoder, num_tokens_from_string
STOP_TOKEN = "<|STOP|>"
COMPLETE_TASK = "complete_task"
INPUT_UTILIZATION = 0.5
def get_value(d, k1, k2):
return d.get(k1, d.get(k2))
def chunks_format(reference):
return [
{
"id": get_value(chunk, "chunk_id", "id"),
"content": get_value(chunk, "content", "content_with_weight"),
"document_id": get_value(chunk, "doc_id", "document_id"),
"document_name": get_value(chunk, "docnm_kwd", "document_name"),
"dataset_id": get_value(chunk, "kb_id", "dataset_id"),
"image_id": get_value(chunk, "image_id", "img_id"),
"positions": get_value(chunk, "positions", "position_int"),
"url": chunk.get("url"),
"similarity": chunk.get("similarity"),
"vector_similarity": chunk.get("vector_similarity"),
"term_similarity": chunk.get("term_similarity"),
"doc_type": chunk.get("doc_type_kwd"),
}
for chunk in reference.get("chunks", [])
]
def message_fit_in(msg, max_length=4000):
def count():
nonlocal msg
tks_cnts = []
for m in msg:
tks_cnts.append({"role": m["role"], "count": num_tokens_from_string(m["content"])})
total = 0
for m in tks_cnts:
total += m["count"]
return total
c = count()
if c < max_length:
return c, msg
msg_ = [m for m in msg if m["role"] == "system"]
if len(msg) > 1:
msg_.append(msg[-1])
msg = msg_
c = count()
if c < max_length:
return c, msg
ll = num_tokens_from_string(msg_[0]["content"])
ll2 = num_tokens_from_string(msg_[-1]["content"])
if ll / (ll + ll2) > 0.8:
m = msg_[0]["content"]
m = encoder.decode(encoder.encode(m)[: max_length - ll2])
msg[0]["content"] = m
return max_length, msg
m = msg_[-1]["content"]
m = encoder.decode(encoder.encode(m)[: max_length - ll2])
msg[-1]["content"] = m
return max_length, msg
def kb_prompt(kbinfos, max_tokens, hash_id=False):
from api.db.services.document_service import DocumentService
knowledges = [get_value(ck, "content", "content_with_weight") for ck in kbinfos["chunks"]]
kwlg_len = len(knowledges)
used_token_count = 0
chunks_num = 0
for i, c in enumerate(knowledges):
if not c:
continue
used_token_count += num_tokens_from_string(c)
chunks_num += 1
if max_tokens * 0.97 < used_token_count:
knowledges = knowledges[:i]
logging.warning(f"Not all the retrieval into prompt: {len(knowledges)}/{kwlg_len}")
break
docs = DocumentService.get_by_ids([get_value(ck, "doc_id", "document_id") for ck in kbinfos["chunks"][:chunks_num]])
docs = {d.id: d.meta_fields for d in docs}
def draw_node(k, line):
if line is not None and not isinstance(line, str):
line = str(line)
if not line:
return ""
return f"\n├── {k}: " + re.sub(r"\n+", " ", line, flags=re.DOTALL)
knowledges = []
for i, ck in enumerate(kbinfos["chunks"][:chunks_num]):
cnt = "\nID: {}".format(i if not hash_id else hash_str2int(get_value(ck, "id", "chunk_id"), 100))
cnt += draw_node("Title", get_value(ck, "docnm_kwd", "document_name"))
cnt += draw_node("URL", ck['url']) if "url" in ck else ""
for k, v in docs.get(get_value(ck, "doc_id", "document_id"), {}).items():
cnt += draw_node(k, v)
cnt += "\n└── Content:\n"
cnt += get_value(ck, "content", "content_with_weight")
knowledges.append(cnt)
return knowledges
CITATION_PROMPT_TEMPLATE = load_prompt("citation_prompt")
CITATION_PLUS_TEMPLATE = load_prompt("citation_plus")
CONTENT_TAGGING_PROMPT_TEMPLATE = load_prompt("content_tagging_prompt")
CROSS_LANGUAGES_SYS_PROMPT_TEMPLATE = load_prompt("cross_languages_sys_prompt")
CROSS_LANGUAGES_USER_PROMPT_TEMPLATE = load_prompt("cross_languages_user_prompt")
FULL_QUESTION_PROMPT_TEMPLATE = load_prompt("full_question_prompt")
KEYWORD_PROMPT_TEMPLATE = load_prompt("keyword_prompt")
QUESTION_PROMPT_TEMPLATE = load_prompt("question_prompt")
VISION_LLM_DESCRIBE_PROMPT = load_prompt("vision_llm_describe_prompt")
VISION_LLM_FIGURE_DESCRIBE_PROMPT = load_prompt("vision_llm_figure_describe_prompt")
ANALYZE_TASK_SYSTEM = load_prompt("analyze_task_system")
ANALYZE_TASK_USER = load_prompt("analyze_task_user")
NEXT_STEP = load_prompt("next_step")
REFLECT = load_prompt("reflect")
SUMMARY4MEMORY = load_prompt("summary4memory")
RANK_MEMORY = load_prompt("rank_memory")
META_FILTER = load_prompt("meta_filter")
ASK_SUMMARY = load_prompt("ask_summary")
PROMPT_JINJA_ENV = jinja2.Environment(autoescape=False, trim_blocks=True, lstrip_blocks=True)
def citation_prompt(user_defined_prompts: dict={}) -> str:
template = PROMPT_JINJA_ENV.from_string(user_defined_prompts.get("citation_guidelines", CITATION_PROMPT_TEMPLATE))
return template.render()
def citation_plus(sources: str) -> str:
template = PROMPT_JINJA_ENV.from_string(CITATION_PLUS_TEMPLATE)
return template.render(example=citation_prompt(), sources=sources)
def keyword_extraction(chat_mdl, content, topn=3):
template = PROMPT_JINJA_ENV.from_string(KEYWORD_PROMPT_TEMPLATE)
rendered_prompt = template.render(content=content, topn=topn)
msg = [{"role": "system", "content": rendered_prompt}, {"role": "user", "content": "Output: "}]
_, msg = message_fit_in(msg, chat_mdl.max_length)
kwd = chat_mdl.chat(rendered_prompt, msg[1:], {"temperature": 0.2})
if isinstance(kwd, tuple):
kwd = kwd[0]
kwd = re.sub(r"^.*</think>", "", kwd, flags=re.DOTALL)
if kwd.find("**ERROR**") >= 0:
return ""
return kwd
def question_proposal(chat_mdl, content, topn=3):
template = PROMPT_JINJA_ENV.from_string(QUESTION_PROMPT_TEMPLATE)
rendered_prompt = template.render(content=content, topn=topn)
msg = [{"role": "system", "content": rendered_prompt}, {"role": "user", "content": "Output: "}]
_, msg = message_fit_in(msg, chat_mdl.max_length)
kwd = chat_mdl.chat(rendered_prompt, msg[1:], {"temperature": 0.2})
if isinstance(kwd, tuple):
kwd = kwd[0]
kwd = re.sub(r"^.*</think>", "", kwd, flags=re.DOTALL)
if kwd.find("**ERROR**") >= 0:
return ""
return kwd
def full_question(tenant_id=None, llm_id=None, messages=[], language=None, chat_mdl=None):
from api.db import LLMType
from api.db.services.llm_service import LLMBundle
from api.db.services.tenant_llm_service import TenantLLMService
if not chat_mdl:
if TenantLLMService.llm_id2llm_type(llm_id) == "image2text":
chat_mdl = LLMBundle(tenant_id, LLMType.IMAGE2TEXT, llm_id)
else:
chat_mdl = LLMBundle(tenant_id, LLMType.CHAT, llm_id)
conv = []
for m in messages:
if m["role"] not in ["user", "assistant"]:
continue
conv.append("{}: {}".format(m["role"].upper(), m["content"]))
conversation = "\n".join(conv)
today = datetime.date.today().isoformat()
yesterday = (datetime.date.today() - datetime.timedelta(days=1)).isoformat()
tomorrow = (datetime.date.today() + datetime.timedelta(days=1)).isoformat()
template = PROMPT_JINJA_ENV.from_string(FULL_QUESTION_PROMPT_TEMPLATE)
rendered_prompt = template.render(
today=today,
yesterday=yesterday,
tomorrow=tomorrow,
conversation=conversation,
language=language,
)
ans = chat_mdl.chat(rendered_prompt, [{"role": "user", "content": "Output: "}])
ans = re.sub(r"^.*</think>", "", ans, flags=re.DOTALL)
return ans if ans.find("**ERROR**") < 0 else messages[-1]["content"]
def cross_languages(tenant_id, llm_id, query, languages=[]):
from api.db import LLMType
from api.db.services.llm_service import LLMBundle
from api.db.services.tenant_llm_service import TenantLLMService
if llm_id and TenantLLMService.llm_id2llm_type(llm_id) == "image2text":
chat_mdl = LLMBundle(tenant_id, LLMType.IMAGE2TEXT, llm_id)
else:
chat_mdl = LLMBundle(tenant_id, LLMType.CHAT, llm_id)
rendered_sys_prompt = PROMPT_JINJA_ENV.from_string(CROSS_LANGUAGES_SYS_PROMPT_TEMPLATE).render()
rendered_user_prompt = PROMPT_JINJA_ENV.from_string(CROSS_LANGUAGES_USER_PROMPT_TEMPLATE).render(query=query, languages=languages)
ans = chat_mdl.chat(rendered_sys_prompt, [{"role": "user", "content": rendered_user_prompt}], {"temperature": 0.2})
ans = re.sub(r"^.*</think>", "", ans, flags=re.DOTALL)
if ans.find("**ERROR**") >= 0:
return query
return "\n".join([a for a in re.sub(r"(^Output:|\n+)", "", ans, flags=re.DOTALL).split("===") if a.strip()])
def content_tagging(chat_mdl, content, all_tags, examples, topn=3):
template = PROMPT_JINJA_ENV.from_string(CONTENT_TAGGING_PROMPT_TEMPLATE)
for ex in examples:
ex["tags_json"] = json.dumps(ex[TAG_FLD], indent=2, ensure_ascii=False)
rendered_prompt = template.render(
topn=topn,
all_tags=all_tags,
examples=examples,
content=content,
)
msg = [{"role": "system", "content": rendered_prompt}, {"role": "user", "content": "Output: "}]
_, msg = message_fit_in(msg, chat_mdl.max_length)
kwd = chat_mdl.chat(rendered_prompt, msg[1:], {"temperature": 0.5})
if isinstance(kwd, tuple):
kwd = kwd[0]
kwd = re.sub(r"^.*</think>", "", kwd, flags=re.DOTALL)
if kwd.find("**ERROR**") >= 0:
raise Exception(kwd)
try:
obj = json_repair.loads(kwd)
except json_repair.JSONDecodeError:
try:
result = kwd.replace(rendered_prompt[:-1], "").replace("user", "").replace("model", "").strip()
result = "{" + result.split("{")[1].split("}")[0] + "}"
obj = json_repair.loads(result)
except Exception as e:
logging.exception(f"JSON parsing error: {result} -> {e}")
raise e
res = {}
for k, v in obj.items():
try:
if int(v) > 0:
res[str(k)] = int(v)
except Exception:
pass
return res
def vision_llm_describe_prompt(page=None) -> str:
template = PROMPT_JINJA_ENV.from_string(VISION_LLM_DESCRIBE_PROMPT)
return template.render(page=page)
def vision_llm_figure_describe_prompt() -> str:
template = PROMPT_JINJA_ENV.from_string(VISION_LLM_FIGURE_DESCRIBE_PROMPT)
return template.render()
def tool_schema(tools_description: list[dict], complete_task=False):
if not tools_description:
return ""
desc = {}
if complete_task:
desc[COMPLETE_TASK] = {
"type": "function",
"function": {
"name": COMPLETE_TASK,
"description": "When you have the final answer and are ready to complete the task, call this function with your answer",
"parameters": {
"type": "object",
"properties": {"answer":{"type":"string", "description": "The final answer to the user's question"}},
"required": ["answer"]
}
}
}
for tool in tools_description:
desc[tool["function"]["name"]] = tool
return "\n\n".join([f"## {i+1}. {fnm}\n{json.dumps(des, ensure_ascii=False, indent=4)}" for i, (fnm, des) in enumerate(desc.items())])
def form_history(history, limit=-6):
context = ""
for h in history[limit:]:
if h["role"] == "system":
continue
role = "USER"
if h["role"].upper()!= role:
role = "AGENT"
context += f"\n{role}: {h['content'][:2048] + ('...' if len(h['content'])>2048 else '')}"
return context
def analyze_task(chat_mdl, prompt, task_name, tools_description: list[dict], user_defined_prompts: dict={}):
tools_desc = tool_schema(tools_description)
context = ""
if user_defined_prompts.get("task_analysis"):
template = PROMPT_JINJA_ENV.from_string(user_defined_prompts["task_analysis"])
else:
template = PROMPT_JINJA_ENV.from_string(ANALYZE_TASK_SYSTEM + "\n\n" + ANALYZE_TASK_USER)
context = template.render(task=task_name, context=context, agent_prompt=prompt, tools_desc=tools_desc)
kwd = chat_mdl.chat(context, [{"role": "user", "content": "Please analyze it."}])
if isinstance(kwd, tuple):
kwd = kwd[0]
kwd = re.sub(r"^.*</think>", "", kwd, flags=re.DOTALL)
if kwd.find("**ERROR**") >= 0:
return ""
return kwd
def next_step(chat_mdl, history:list, tools_description: list[dict], task_desc, user_defined_prompts: dict={}):
if not tools_description:
return ""
desc = tool_schema(tools_description)
template = PROMPT_JINJA_ENV.from_string(user_defined_prompts.get("plan_generation", NEXT_STEP))
user_prompt = "\nWhat's the next tool to call? If ready OR IMPOSSIBLE TO BE READY, then call `complete_task`."
hist = deepcopy(history)
if hist[-1]["role"] == "user":
hist[-1]["content"] += user_prompt
else:
hist.append({"role": "user", "content": user_prompt})
json_str = chat_mdl.chat(template.render(task_analysis=task_desc, desc=desc, today=datetime.datetime.now().strftime("%Y-%m-%d")),
hist[1:], stop=["<|stop|>"])
tk_cnt = num_tokens_from_string(json_str)
json_str = re.sub(r"^.*</think>", "", json_str, flags=re.DOTALL)
return json_str, tk_cnt
def reflect(chat_mdl, history: list[dict], tool_call_res: list[Tuple], user_defined_prompts: dict={}):
tool_calls = [{"name": p[0], "result": p[1]} for p in tool_call_res]
goal = history[1]["content"]
template = PROMPT_JINJA_ENV.from_string(user_defined_prompts.get("reflection", REFLECT))
user_prompt = template.render(goal=goal, tool_calls=tool_calls)
hist = deepcopy(history)
if hist[-1]["role"] == "user":
hist[-1]["content"] += user_prompt
else:
hist.append({"role": "user", "content": user_prompt})
_, msg = message_fit_in(hist, chat_mdl.max_length)
ans = chat_mdl.chat(msg[0]["content"], msg[1:])
ans = re.sub(r"^.*</think>", "", ans, flags=re.DOTALL)
return """
**Observation**
{}
**Reflection**
{}
""".format(json.dumps(tool_calls, ensure_ascii=False, indent=2), ans)
def form_message(system_prompt, user_prompt):
return [{"role": "system", "content": system_prompt},{"role": "user", "content": user_prompt}]
def tool_call_summary(chat_mdl, name: str, params: dict, result: str, user_defined_prompts: dict={}) -> str:
template = PROMPT_JINJA_ENV.from_string(SUMMARY4MEMORY)
system_prompt = template.render(name=name,
params=json.dumps(params, ensure_ascii=False, indent=2),
result=result)
user_prompt = "→ Summary: "
_, msg = message_fit_in(form_message(system_prompt, user_prompt), chat_mdl.max_length)
ans = chat_mdl.chat(msg[0]["content"], msg[1:])
return re.sub(r"^.*</think>", "", ans, flags=re.DOTALL)
def rank_memories(chat_mdl, goal:str, sub_goal:str, tool_call_summaries: list[str], user_defined_prompts: dict={}):
template = PROMPT_JINJA_ENV.from_string(RANK_MEMORY)
system_prompt = template.render(goal=goal, sub_goal=sub_goal, results=[{"i": i, "content": s} for i,s in enumerate(tool_call_summaries)])
user_prompt = " → rank: "
_, msg = message_fit_in(form_message(system_prompt, user_prompt), chat_mdl.max_length)
ans = chat_mdl.chat(msg[0]["content"], msg[1:], stop="<|stop|>")
return re.sub(r"^.*</think>", "", ans, flags=re.DOTALL)
def gen_meta_filter(chat_mdl, meta_data:dict, query: str) -> list:
sys_prompt = PROMPT_JINJA_ENV.from_string(META_FILTER).render(
current_date=datetime.datetime.today().strftime('%Y-%m-%d'),
metadata_keys=json.dumps(meta_data),
user_question=query
)
user_prompt = "Generate filters:"
ans = chat_mdl.chat(sys_prompt, [{"role": "user", "content": user_prompt}])
ans = re.sub(r"(^.*</think>|```json\n|```\n*$)", "", ans, flags=re.DOTALL)
try:
ans = json_repair.loads(ans)
assert isinstance(ans, list), ans
return ans
except Exception:
logging.exception(f"Loading json failure: {ans}")
return []
def gen_json(system_prompt:str, user_prompt:str, chat_mdl, gen_conf = None):
_, msg = message_fit_in(form_message(system_prompt, user_prompt), chat_mdl.max_length)
ans = chat_mdl.chat(msg[0]["content"], msg[1:],gen_conf=gen_conf)
ans = re.sub(r"(^.*</think>|```json\n|```\n*$)", "", ans, flags=re.DOTALL)
try:
return json_repair.loads(ans)
except Exception:
logging.exception(f"Loading json failure: {ans}")
TOC_DETECTION = load_prompt("toc_detection")
def detect_table_of_contents(page_1024:list[str], chat_mdl):
toc_secs = []
for i, sec in enumerate(page_1024[:22]):
ans = gen_json(PROMPT_JINJA_ENV.from_string(TOC_DETECTION).render(page_txt=sec), "Only JSON please.", chat_mdl)
if toc_secs and not ans["exists"]:
break
toc_secs.append(sec)
return toc_secs
TOC_EXTRACTION = load_prompt("toc_extraction")
TOC_EXTRACTION_CONTINUE = load_prompt("toc_extraction_continue")
def extract_table_of_contents(toc_pages, chat_mdl):
if not toc_pages:
return []
return gen_json(PROMPT_JINJA_ENV.from_string(TOC_EXTRACTION).render(toc_page="\n".join(toc_pages)), "Only JSON please.", chat_mdl)
def toc_index_extractor(toc:list[dict], content:str, chat_mdl):
tob_extractor_prompt = """
You are given a table of contents in a json format and several pages of a document, your job is to add the physical_index to the table of contents in the json format.
The provided pages contains tags like <physical_index_X> and <physical_index_X> to indicate the physical location of the page X.
The structure variable is the numeric system which represents the index of the hierarchy section in the table of contents. For example, the first section has structure index 1, the first subsection has structure index 1.1, the second subsection has structure index 1.2, etc.
The response should be in the following JSON format:
[
{
"structure": <structure index, "x.x.x" or None> (string),
"title": <title of the section>,
"physical_index": "<physical_index_X>" (keep the format)
},
...
]
Only add the physical_index to the sections that are in the provided pages.
If the title of the section are not in the provided pages, do not add the physical_index to it.
Directly return the final JSON structure. Do not output anything else."""
prompt = tob_extractor_prompt + '\nTable of contents:\n' + json.dumps(toc, ensure_ascii=False, indent=2) + '\nDocument pages:\n' + content
return gen_json(prompt, "Only JSON please.", chat_mdl)
TOC_INDEX = load_prompt("toc_index")
def table_of_contents_index(toc_arr: list[dict], sections: list[str], chat_mdl):
if not toc_arr or not sections:
return []
toc_map = {}
for i, it in enumerate(toc_arr):
k1 = (it["structure"]+it["title"]).replace(" ", "")
k2 = it["title"].strip()
if k1 not in toc_map:
toc_map[k1] = []
if k2 not in toc_map:
toc_map[k2] = []
toc_map[k1].append(i)
toc_map[k2].append(i)
for it in toc_arr:
it["indices"] = []
for i, sec in enumerate(sections):
sec = sec.strip()
if sec.replace(" ", "") in toc_map:
for j in toc_map[sec.replace(" ", "")]:
toc_arr[j]["indices"].append(i)
all_pathes = []
def dfs(start, path):
nonlocal all_pathes
if start >= len(toc_arr):
if path:
all_pathes.append(path)
return
if not toc_arr[start]["indices"]:
dfs(start+1, path)
return
added = False
for j in toc_arr[start]["indices"]:
if path and j < path[-1][0]:
continue
_path = deepcopy(path)
_path.append((j, start))
added = True
dfs(start+1, _path)
if not added and path:
all_pathes.append(path)
dfs(0, [])
path = max(all_pathes, key=len) if all_pathes else []
for it in toc_arr:
it["indices"] = []
for j, i in path:
toc_arr[i]["indices"] = [j]
logging.debug(json.dumps(toc_arr, ensure_ascii=False, indent=2))
i = 0
while i < len(toc_arr):
it = toc_arr[i]
if it["indices"]:
i += 1
continue
if i>0 and toc_arr[i-1]["indices"]:
st_i = toc_arr[i-1]["indices"][-1]
else:
st_i = 0
e = i + 1
while e <len(toc_arr) and not toc_arr[e]["indices"]:
e += 1
if e >= len(toc_arr):
e = len(sections)
else:
e = toc_arr[e]["indices"][0]
for j in range(st_i, min(e+1, len(sections))):
ans = gen_json(PROMPT_JINJA_ENV.from_string(TOC_INDEX).render(
structure=it["structure"],
title=it["title"],
text=sections[j]), "Only JSON please.", chat_mdl)
if ans["exist"] == "yes":
it["indices"].append(j)
break
i += 1
return toc_arr
def check_if_toc_transformation_is_complete(content, toc, chat_mdl):
prompt = """
You are given a raw table of contents and a table of contents.
Your job is to check if the table of contents is complete.
Reply format:
{{
"thinking": <why do you think the cleaned table of contents is complete or not>
"completed": "yes" or "no"
}}
Directly return the final JSON structure. Do not output anything else."""
prompt = prompt + '\n Raw Table of contents:\n' + content + '\n Cleaned Table of contents:\n' + toc
response = gen_json(prompt, "Only JSON please.", chat_mdl)
return response['completed']
def toc_transformer(toc_pages, chat_mdl):
init_prompt = """
You are given a table of contents, You job is to transform the whole table of content into a JSON format included table_of_contents.
The `structure` is the numeric system which represents the index of the hierarchy section in the table of contents. For example, the first section has structure index 1, the first subsection has structure index 1.1, the second subsection has structure index 1.2, etc.
The `title` is a short phrase or a several-words term.
The response should be in the following JSON format:
[
{
"structure": <structure index, "x.x.x" or None> (string),
"title": <title of the section>
},
...
],
You should transform the full table of contents in one go.
Directly return the final JSON structure, do not output anything else. """
toc_content = "\n".join(toc_pages)
prompt = init_prompt + '\n Given table of contents\n:' + toc_content
def clean_toc(arr):
for a in arr:
a["title"] = re.sub(r"[.·….]{2,}", "", a["title"])
last_complete = gen_json(prompt, "Only JSON please.", chat_mdl)
if_complete = check_if_toc_transformation_is_complete(toc_content, json.dumps(last_complete, ensure_ascii=False, indent=2), chat_mdl)
clean_toc(last_complete)
if if_complete == "yes":
return last_complete
while not (if_complete == "yes"):
prompt = f"""
Your task is to continue the table of contents json structure, directly output the remaining part of the json structure.
The response should be in the following JSON format:
The raw table of contents json structure is:
{toc_content}
The incomplete transformed table of contents json structure is:
{json.dumps(last_complete[-24:], ensure_ascii=False, indent=2)}
Please continue the json structure, directly output the remaining part of the json structure."""
new_complete = gen_json(prompt, "Only JSON please.", chat_mdl)
if not new_complete or str(last_complete).find(str(new_complete)) >= 0:
break
clean_toc(new_complete)
last_complete.extend(new_complete)
if_complete = check_if_toc_transformation_is_complete(toc_content, json.dumps(last_complete, ensure_ascii=False, indent=2), chat_mdl)
return last_complete
TOC_LEVELS = load_prompt("assign_toc_levels")
def assign_toc_levels(toc_secs, chat_mdl, gen_conf = {"temperature": 0.2}):
print("\nBegin TOC level assignment...\n")
ans = gen_json(
PROMPT_JINJA_ENV.from_string(TOC_LEVELS).render(),
str(toc_secs),
chat_mdl,
gen_conf
)
return ans
TOC_FROM_TEXT_SYSTEM = load_prompt("toc_from_text_system")
TOC_FROM_TEXT_USER = load_prompt("toc_from_text_user")
# Generate TOC from text chunks with text llms
def gen_toc_from_text(text, chat_mdl):
ans = gen_json(
PROMPT_JINJA_ENV.from_string(TOC_FROM_TEXT_SYSTEM).render(),
PROMPT_JINJA_ENV.from_string(TOC_FROM_TEXT_USER).render(text=text),
chat_mdl,
gen_conf={"temperature": 0.0, "top_p": 0.9, "enable_thinking": False, }
)
return ans
def split_chunks(chunks, max_length: int):
"""
Pack chunks into batches according to max_length, returning [{"id": idx, "text": chunk_text}, ...].
Do not split a single chunk, even if it exceeds max_length.
"""
result = []
batch, batch_tokens = [], 0
for idx, chunk in enumerate(chunks):
t = num_tokens_from_string(chunk)
if batch_tokens + t > max_length:
result.append(batch)
batch, batch_tokens = [], 0
batch.append({"id": idx, "text": chunk})
batch_tokens += t
if batch:
result.append(batch)
return result
def run_toc_from_text(chunks, chat_mdl):
input_budget = int(chat_mdl.max_length * INPUT_UTILIZATION) - num_tokens_from_string(
TOC_FROM_TEXT_USER + TOC_FROM_TEXT_SYSTEM
)
input_budget = min(input_budget, 2000)
chunk_sections = split_chunks(chunks, input_budget)
res = []
for chunk in chunk_sections:
ans = gen_toc_from_text(chunk, chat_mdl)
res.extend(ans)
# Filter out entries with title == -1
filtered = [x for x in res if x.get("title") and x.get("title") != "-1"]
print("\n\nFiltered TOC sections:\n", filtered)
# Generate initial structure (structure/title)
raw_structure = [{"structure": "0", "title": x.get("title", "")} for x in filtered]
# Assign hierarchy levels using LLM
toc_with_levels = assign_toc_levels(raw_structure, chat_mdl, {"temperature": 0.0, "top_p": 0.9, "enable_thinking": False})
# Merge structure and content (by index)
merged = []
for toc_item, src_item in zip(toc_with_levels, filtered):
merged.append({
"structure": toc_item.get("structure", "0"),
"title": toc_item.get("title", ""),
"content": src_item.get("content", ""),
})
return merged

rag/prompts/keyword_prompt.md Normal file

@@ -0,0 +1,16 @@
## Role
You are a text analyzer.
## Task
Extract the most important keywords/phrases of a given piece of text content.
## Requirements
- Summarize the text content, and give the top {{ topn }} important keywords/phrases.
- The keywords MUST be in the same language as the given piece of text content.
- The keywords are delimited by ENGLISH COMMA.
- Output keywords ONLY.
---
## Text Content
{{ content }}

rag/prompts/meta_filter.md Normal file

@@ -0,0 +1,53 @@
You are a metadata filtering condition generator. Analyze the user's question and available document metadata to output a JSON array of filter objects. Follow these rules:
1. **Metadata Structure**:
- Metadata is provided as JSON where keys are attribute names (e.g., "color"), and values are objects mapping attribute values to document IDs.
- Example:
{
"color": {"red": ["doc1"], "blue": ["doc2"]},
"listing_date": {"2025-07-11": ["doc1"], "2025-08-01": ["doc2"]}
}
2. **Output Requirements**:
- Always output a JSON array of filter objects
- Each object must have:
"key": (metadata attribute name),
"value": (string value to compare),
"op": (operator from allowed list)
3. **Operator Guide**:
- Use these operators only: ["contains", "not contains", "start with", "end with", "empty", "not empty", "=", "≠", ">", "<", "≥", "≤"]
- Date ranges: Break into two conditions (≥ start_date AND < next_month_start)
- Negations: Always use "≠" for exclusion terms ("not", "except", "exclude", "≠")
- Implicit logic: Derive unstated filters (e.g., "July" [≥ YYYY-07-01, < YYYY-08-01])
4. **Processing Steps**:
a) Identify ALL filterable attributes in the query (both explicit and implicit)
b) For dates:
- Infer missing year from current date if needed
- Always format dates as "YYYY-MM-DD"
- Convert ranges: [≥ start, < end]
c) For values: Match EXACTLY to metadata's value keys
d) Skip conditions if:
- Attribute doesn't exist in metadata
- Value has no match in metadata
5. **Example**:
- User query: "上市日期七月份的有哪些商品不要蓝色的"
- Metadata: { "color": {...}, "listing_date": {...} }
- Output:
[
{"key": "listing_date", "value": "2025-07-01", "op": "≥"},
{"key": "listing_date", "value": "2025-08-01", "op": "<"},
{"key": "color", "value": "blue", "op": "≠"}
]
6. **Final Output**:
- ONLY output valid JSON array
- NO additional text/explanations
**Current Task**:
- Today's date: {{current_date}}
- Available metadata keys: {{metadata_keys}}
- User query: "{{user_question}}"
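A sketch of how the emitted filter array could be applied to metadata of the shape shown in section 1 (`apply_filters` is a hypothetical helper covering only a subset of the operators; ISO date strings compare correctly as plain strings):

```python
def apply_filters(metadata: dict, filters: list[dict]) -> set[str]:
    """Return the doc IDs that satisfy every filter condition."""
    ops = {
        "=": lambda a, b: a == b,
        "≠": lambda a, b: a != b,
        ">": lambda a, b: a > b,
        "<": lambda a, b: a < b,
        "≥": lambda a, b: a >= b,
        "≤": lambda a, b: a <= b,
        "contains": lambda a, b: b in a,
    }
    result = None
    for f in filters:
        values = metadata.get(f["key"], {})
        matched = {doc for v, docs in values.items()
                   if ops[f["op"]](v, f["value"]) for doc in docs}
        result = matched if result is None else result & matched
    return result or set()

meta = {"color": {"red": ["doc1"], "blue": ["doc2"]},
        "listing_date": {"2025-07-11": ["doc1"], "2025-08-01": ["doc2"]}}
filters = [{"key": "listing_date", "value": "2025-07-01", "op": "≥"},
           {"key": "listing_date", "value": "2025-08-01", "op": "<"},
           {"key": "color", "value": "blue", "op": "≠"}]
print(apply_filters(meta, filters))  # {'doc1'}
```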

92
rag/prompts/next_step.md Normal file

@@ -0,0 +1,92 @@
You are an expert Planning Agent tasked with solving problems efficiently through structured plans.
Your job is:
1. Based on the task analysis, choose the right tools to execute.
2. Track progress and adapt plans (tool calls) when necessary.
3. Use `complete_task` when no further tool step is needed (all necessary steps are done, or there is little hope of completing them).
# ========== TASK ANALYSIS =============
{{ task_analysis }}
# ========== TOOLS (JSON-Schema) ==========
You may invoke only the tools listed below.
Return a JSON array of objects, where each item has exactly two top-level keys:
• "name": the tool to call
• "arguments": an object whose keys/values satisfy the schema
{{ desc }}
# ========== MULTI-STEP EXECUTION ==========
When tasks require multiple independent steps, you can execute them in parallel by returning multiple tool calls in a single JSON array.
**Data Collection**: Gathering information from multiple sources simultaneously
**Validation**: Cross-checking facts using different tools
**Comprehensive Analysis**: Analyzing different aspects of the same problem
**Efficiency**: Reducing total execution time when steps don't depend on each other
**Example Scenarios:**
- Searching multiple databases for the same query
- Checking weather in multiple cities
- Validating information through different APIs
- Performing calculations on different datasets
- Gathering user preferences from multiple sources
# ========== RESPONSE FORMAT ==========
**When you need a tool**
Return ONLY the JSON (no additional keys, no commentary; end with `<|stop|>`), as follows:
[{
"name": "<tool_name1>",
"arguments": { /* tool arguments matching its schema */ }
},{
"name": "<tool_name2>",
"arguments": { /* tool arguments matching its schema */ }
}...]<|stop|>
**When you need multiple tools:**
Return ONLY:
[{
"name": "<tool_name1>",
"arguments": { /* tool arguments matching its schema */ }
},{
"name": "<tool_name2>",
"arguments": { /* tool arguments matching its schema */ }
},{
"name": "<tool_name3>",
"arguments": { /* tool arguments matching its schema */ }
}...]<|stop|>
**When you are certain the task is solved OR no further information can be obtained**
Return ONLY:
[{
"name": "complete_task",
"arguments": { "answer": "<final answer text>" }
}]<|stop|>
<verification_steps>
Before providing a final answer:
1. Double-check all gathered information
2. Verify calculations and logic
3. Ensure answer matches exactly what was asked
4. Confirm answer format meets requirements
5. Run additional verification if confidence is not 100%
</verification_steps>
<error_handling>
If you encounter issues:
1. Try alternative approaches before giving up
2. Use different tools or combinations of tools
3. Break complex problems into simpler sub-tasks
4. Verify intermediate results frequently
5. Never return "I cannot answer" without exhausting all options
</error_handling>
⚠️ Any output that is not valid JSON or that contains extra fields will be rejected.
# ========== REASONING & REFLECTION ==========
You may think privately (not shown to the user) before producing each JSON object.
Internal guideline:
1. **Reason**: Analyse the user question; decide which tools (if any) are needed.
2. **Act**: Emit the JSON object to call the tool.
Today is {{ today }}. Remember that success in answering questions accurately is paramount - take all necessary steps to ensure your answer is correct.
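A sketch of how a caller could consume the planner's reply (`parse_tool_calls` is hypothetical; `json_repair` is the parser generator.py already uses):

```python
import json_repair

def parse_tool_calls(reply: str) -> list[dict]:
    """Parse the planner's JSON array, tolerating a trailing <|stop|> token."""
    calls = json_repair.loads(reply.split("<|stop|>")[0])
    assert isinstance(calls, list), calls
    return calls

calls = parse_tool_calls('[{"name": "complete_task", "arguments": {"answer": "42"}}]<|stop|>')
done = any(c.get("name") == "complete_task" for c in calls)
```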

rag/prompts/question_prompt.md Normal file

@@ -0,0 +1,19 @@
## Role
You are a text analyzer.
## Task
Propose {{ topn }} questions about a given piece of text content.
## Requirements
- Understand and summarize the text content, and propose the top {{ topn }} important questions.
- The questions SHOULD NOT have overlapping meanings.
- The questions SHOULD cover the main content of the text as much as possible.
- The questions MUST be in the same language as the given piece of text content.
- One question per line.
- Output questions ONLY.
---
## Text Content
{{ content }}

rag/prompts/rank_memory.md Normal file

@@ -0,0 +1,30 @@
**Task**: Sort the tool call results based on relevance to the overall goal and current sub-goal. Return ONLY a sorted list of indices (0-indexed).
**Rules**:
1. Analyze each result's contribution to both:
- The overall goal (primary priority)
- The current sub-goal (secondary priority)
2. Sort from MOST relevant (highest impact) to LEAST relevant
3. Output format: Strictly a Python-style list of integers. Example: [2, 0, 1]
🔹 Overall Goal: {{ goal }}
🔹 Sub-goal: {{ sub_goal }}
**Examples**:
🔹 Tool Response:
- index: 0
> Tokyo temperature is 78°F.
- index: 1
> Error: Authentication failed (expired API key).
- index: 2
> Available: 12 widgets in stock (max 5 per customer).
→ rank: [1,2,0]<|stop|>
**Your Turn**:
🔹 Tool Response:
{% for f in results %}
- index: {{ f.i }}
> {{ f.content }}
{% endfor %}
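A sketch for reading the expected `[1,2,0]<|stop|>` reply back into indices (`parse_rank` is a hypothetical helper, not part of this commit):

```python
import re

def parse_rank(reply: str, n_results: int) -> list[int]:
    """Extract the ranked index list, dropping anything out of range."""
    nums = [int(x) for x in re.findall(r"\d+", reply.split("<|stop|>")[0])]
    return [i for i in nums if 0 <= i < n_results]

assert parse_rank("rank: [1,2,0]<|stop|>", 3) == [1, 2, 0]
```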

75
rag/prompts/reflect.md Normal file

@@ -0,0 +1,75 @@
**Context**:
- To achieve the goal: {{ goal }}.
- You have executed following tool calls:
{% for call in tool_calls %}
Tool call: `{{ call.name }}`
Results: {{ call.result }}
{% endfor %}
## Task Complexity Analysis & Reflection Scope
**First, analyze the task complexity using these dimensions:**
### Complexity Assessment Matrix
- **Scope Breadth**: Single-step (1) | Multi-step (2) | Multi-domain (3)
- **Data Dependency**: Self-contained (1) | External inputs (2) | Multiple sources (3)
- **Decision Points**: Linear (1) | Few branches (2) | Complex logic (3)
- **Risk Level**: Low (1) | Medium (2) | High (3)
**Complexity Score**: Sum all dimensions (4-12 points)
---
## Task Transmission Assessment
**Note**: This section is not subject to word count limitations when transmission is needed, as it serves critical handoff functions.
**Evaluate if task transmission information is needed:**
- **Is this an initial step?** If yes, skip this section
- **Are there downstream agents/steps?** If no, provide minimal transmission
- **Is there critical state/context to preserve?** If yes, include full transmission
### If Task Transmission is Needed:
- **Current State Summary**: [1-2 sentences on where we are]
- **Key Data/Results**: [Critical findings that must carry forward]
- **Context Dependencies**: [Essential context for next agent/step]
- **Unresolved Items**: [Issues requiring continuation]
- **Status for User**: [Clear status update in user terms]
- **Technical State**: [System state for technical handoffs]
---
## Situational Reflection (Adjust Length Based on Complexity Score)
### Reflection Guidelines:
- **Simple Tasks (4-5 points)**: ~50-100 words, focus on completion status and immediate next step
- **Moderate Tasks (6-8 points)**: ~100-200 words, include core details and main risks
- **Complex Tasks (9-12 points)**: ~200-300 words, provide full analysis and alternatives
### 1. Goal Achievement Status
- Does the current outcome align with the original purpose of this task phase?
- If not, what critical gaps exist?
### 2. Step Completion Check
- Which planned steps were completed? (List verified items)
- Which steps are pending/incomplete? (Specify exactly what's missing)
### 3. Information Adequacy
- Is the collected data sufficient to proceed?
- What key information is still needed? (e.g., metrics, user input, external data)
### 4. Critical Observations
- Unexpected outcomes: [Flag anomalies/errors]
- Risks/blockers: [Identify immediate obstacles]
- Accuracy concerns: [Highlight unreliable results]
### 5. Next-Step Recommendations
- Proposed immediate action: [Concrete next step]
- Alternative strategies if blocked: [Workaround solution]
- Tools/inputs required for next phase: [Specify resources]
---
**Output Instructions:**
1. First determine your complexity score
2. Assess if task transmission section is needed using the evaluation questions
3. Provide situational reflection with length appropriate to complexity
4. Use clear headers for easy parsing by downstream systems


@@ -0,0 +1,55 @@
# Role
You are an AI language model assistant tasked with generating **5-10 related questions** based on a user's original query.
These questions should help **expand the search query scope** and **improve search relevance**.
---
## Instructions
**Input:**
You are provided with a **user's question**.
**Output:**
Generate **5-10 alternative questions** that are **related** to the original user question.
These alternatives should help retrieve a **broader range of relevant documents** from a vector database.
**Context:**
Focus on **rephrasing** the original question in different ways, ensuring the alternative questions are **diverse but still connected** to the topic of the original query.
Do **not** create overly obscure, irrelevant, or unrelated questions.
**Fallback:**
If you cannot generate any relevant alternatives, do **not** return any questions.
---
## Guidance
1. Each alternative should be **unique** but still **relevant** to the original query.
2. Keep the phrasing **clear, concise, and easy to understand**.
3. Avoid overly technical jargon or specialized terms **unless directly relevant**.
4. Ensure that each question **broadens** the search angle, **not narrows** it.
---
## Example
**Original Question:**
> What are the benefits of electric vehicles?
**Alternative Questions:**
1. How do electric vehicles impact the environment?
2. What are the advantages of owning an electric car?
3. What is the cost-effectiveness of electric vehicles?
4. How do electric vehicles compare to traditional cars in terms of fuel efficiency?
5. What are the environmental benefits of switching to electric cars?
6. How do electric vehicles help reduce carbon emissions?
7. Why are electric vehicles becoming more popular?
8. What are the long-term savings of using electric vehicles?
9. How do electric vehicles contribute to sustainability?
10. What are the key benefits of electric vehicles for consumers?
---
## Reason
Rephrasing the original query into multiple alternative questions helps the user explore **different aspects** of their search topic, improving the **quality of search results**.
These questions guide the search engine to provide a **more comprehensive set** of relevant documents.

rag/prompts/summary4memory.md Normal file

@@ -0,0 +1,35 @@
**Role**: AI Assistant
**Task**: Summarize tool call responses
**Rules**:
1. Context: You've executed a tool (API/function) and received a response.
2. Condense the response into 1-2 short sentences.
3. Never omit:
- Success/error status
- Core results (e.g., data points, decisions)
- Critical constraints (e.g., limits, conditions)
4. Exclude technical details like timestamps/request IDs unless crucial.
5. Use the same language as the main content of the tool response.
**Response Template**:
"[Status] + [Key Outcome] + [Critical Constraints]"
**Examples**:
🔹 Tool Response:
{"status": "success", "temperature": 78.2, "unit": "F", "location": "Tokyo", "timestamp": 16923456}
→ Summary: "Success: Tokyo temperature is 78°F."
🔹 Tool Response:
{"error": "invalid_api_key", "message": "Authentication failed: expired key"}
→ Summary: "Error: Authentication failed (expired API key)."
🔹 Tool Response:
{"available": true, "inventory": 12, "product": "widget", "limit": "max 5 per customer"}
→ Summary: "Available: 12 widgets in stock (max 5 per customer)."
**Your Turn**:
- Tool call: {{ name }}
- Tool inputs as following:
{{ params }}
- Tool Response:
{{ result }}

20
rag/prompts/template.py Normal file

@@ -0,0 +1,20 @@
import os
PROMPT_DIR = os.path.dirname(__file__)
_loaded_prompts = {}
def load_prompt(name: str) -> str:
if name in _loaded_prompts:
return _loaded_prompts[name]
path = os.path.join(PROMPT_DIR, f"{name}.md")
if not os.path.isfile(path):
raise FileNotFoundError(f"Prompt file '{name}.md' not found in prompts/ directory.")
with open(path, "r", encoding="utf-8") as f:
content = f.read().strip()
_loaded_prompts[name] = content
return content
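Usage sketch, mirroring how generator.py consumes this module:

```python
import jinja2
from rag.prompts.template import load_prompt

# Repeated calls hit the in-module cache instead of re-reading the file.
keyword_template = load_prompt("keyword_prompt")
env = jinja2.Environment(autoescape=False, trim_blocks=True, lstrip_blocks=True)
rendered = env.from_string(keyword_template).render(topn=3, content="some text")
```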

rag/prompts/toc_detection.md Normal file

@@ -0,0 +1,29 @@
You are an AI assistant designed to analyze text content and detect whether a table of contents (TOC) list exists on the given page. Follow these steps:
1. **Analyze the Input**: Carefully review the provided text content.
2. **Identify Key Features**: Look for common indicators of a TOC, such as:
- Section titles or headings paired with page numbers.
- Patterns like repeated formatting (e.g., bold/italicized text, dots/dashes between titles and numbers).
- Phrases like "Table of Contents," "Contents," or similar headings.
- Logical grouping of topics/subtopics with sequential page references.
3. **Discern Negative Features**:
- The text contains no numbers, or the numbers present are clearly not page references (e.g., dates, statistical figures, phone numbers, version numbers).
- The text consists of full, descriptive sentences and paragraphs that form a narrative, present arguments, or explain concepts, rather than succinctly listing topics.
- Contains citations with authors, publication years, journal titles, and page ranges (e.g., "Smith, J. (2020). Journal Title, 10(2), 45-67.").
- Lists keywords or terms followed by multiple page numbers, often in alphabetical order.
- Comprises terms followed by their definitions or explanations.
- Labeled with headers like "Appendix A," "Appendix B," etc.
- Contains expressive language thanking individuals or organizations for their support or contributions.
4. **Evaluate Evidence**: Weigh the presence/absence of these features to determine if the content resembles a TOC.
5. **Output Format**: Provide your response in the following JSON structure:
```json
{
"reasoning": "Step-by-step explanation of your analysis based on the features identified." ,
"exists": true/false
}
```
6. **DO NOT** output anything else except JSON structure.
**Input Text Content (Text-Only Extraction):**
{{ page_txt }}

rag/prompts/toc_extraction.md Normal file

@@ -0,0 +1,53 @@
You are an expert parser and data formatter. Your task is to analyze the provided table of contents (TOC) text and convert it into a valid JSON array of objects.
**Instructions:**
1. Analyze each line of the input TOC.
2. For each line, extract the following two pieces of information:
* `structure`: The hierarchical index/numbering (e.g., "1", "2.1", "3.2.5", "A.1"). If a line has no visible numbering or structure indicator (like a main "Chapter" title), use `null`.
* `title`: The textual title of the section or chapter. This should be the main descriptive text, clean and without the page number.
3. Output **only** a valid JSON array. Do not include any other text, explanations, or markdown code block fences (like ```json) in your response.
**JSON Format:**
The output must be a list of objects following this exact schema:
```json
[
{
"structure": <structure index, "x.x.x" or None> (string,
"title": <title of the section>
},
...
]
```
**Input Example:**
```
Contents
1 Introduction to the System ... 1
1.1 Overview .... 2
1.2 Key Features .... 5
2 Installation Guide ....8
2.1 Prerequisites ........ 9
2.2 Step-by-Step Process ........ 12
Appendix A: Specifications ..... 45
References ... 47
```
**Expected Output For The Example:**
```json
[
{"structure": null, "title": "Contents"},
{"structure": "1", "title": "Introduction to the System"},
{"structure": "1.1", "title": "Overview"},
{"structure": "1.2", "title": "Key Features"},
{"structure": "2", "title": "Installation Guide"},
{"structure": "2.1", "title": "Prerequisites"},
{"structure": "2.2", "title": "Step-by-Step Process"},
{"structure": "A", "title": "Specifications"},
{"structure": null, "title": "References"}
]
```
**Now, process the following TOC input:**
```
{{ toc_page }}
```

View File

@@ -0,0 +1,60 @@
You are an expert parser and data formatter, currently in the process of building a JSON array from a multi-page table of contents (TOC). Your task is to analyze the new page of content and **append** the new entries to the existing JSON array.
**Instructions:**
1. You will be given two inputs:
* `current_page_text`: The text content from the new page of the TOC.
* `existing_json`: The valid JSON array you have generated from the previous pages.
2. Analyze each line of the `current_page_text` input.
3. For each new line, extract the following three pieces of information:
* `structure`: The hierarchical index/numbering (e.g., "1", "2.1", "3.2.5"). Use `null` if none exists.
* `title`: The clean textual title of the section or chapter.
* `page`: The page number on which the section starts. Extract only the number. Use `null` if not present.
4. **Append these new entries** to the `existing_json` array. Do not modify, reorder, or delete any of the existing entries.
5. Output **only** the complete, updated JSON array. Do not include any other text, explanations, or markdown code block fences (like ```json).
**JSON Format:**
The output must be a valid JSON array following this schema:
```json
[
{
"structure": <string or null>,
"title": <string>,
"page": <number or null>
},
...
]
```
**Input Example:**
`current_page_text`:
```
3.2 Advanced Configuration ........... 25
3.3 Troubleshooting .................. 28
4 User Management .................... 30
```
`existing_json`:
```json
[
{"structure": "1", "title": "Introduction", "page": 1},
{"structure": "2", "title": "Installation", "page": 5},
{"structure": "3", "title": "Configuration", "page": 12},
{"structure": "3.1", "title": "Basic Setup", "page": 15}
]
```
**Expected Output For The Example:**
```json
[
  {"structure": "1", "title": "Introduction", "page": 1},
  {"structure": "2", "title": "Installation", "page": 5},
  {"structure": "3", "title": "Configuration", "page": 12},
  {"structure": "3.1", "title": "Basic Setup", "page": 15},
  {"structure": "3.2", "title": "Advanced Configuration", "page": 25},
  {"structure": "3.3", "title": "Troubleshooting", "page": 28},
  {"structure": "4", "title": "User Management", "page": 30}
]
```
**Now, process the following inputs:**
`current_page_text`:
{{ toc_page }}
`existing_json`:
{{ toc_json }}

View File

@@ -0,0 +1,113 @@
You are a robust Table-of-Contents (TOC) extractor.
GOAL
Given a dictionary of chunks {chunk_id: chunk_text}, extract TOC-like headings and return a strict JSON array of objects:
[
{"title": , "content": ""},
...
]
FIELDS
- "title": the heading text (clean, no page numbers or leader dots).
- If any part of a chunk has no valid heading, output that part as {"title":"-1", ...}.
- "content": the chunk_id (string).
- One chunk can yield multiple JSON objects in order (unmatched text + one or more headings).
RULES
1) Preserve input chunk order strictly.
2) If a chunk contains multiple headings, expand them in order:
- Pre-heading narrative → {"title":"-1","content":chunk_id}
- Then each heading → {"title":"...","content":chunk_id}
3) Do not merge outputs across chunks; each object refers to exactly one chunk_id.
4) "title" must be non-empty (or exactly "-1"). "content" must be a string (chunk_id).
5) When ambiguous, prefer "-1" unless the text strongly looks like a heading.
HEADING DETECTION (cues, not hard rules)
- Appears near line start, short isolated phrase, often followed by content.
- May contain separators: — —— - : · •
- Numbering styles:
• 第[一二三四五六七八九十百]+(篇|章|节|条)
• [(]?[一二三四五六七八九十]+[)]?
• [(]?[①②③④⑤⑥⑦⑧⑨⑩][)]?
• ^\d+(\.\d+)*[).]?\s*
• ^[IVXLCDM]+[).]
• ^[A-Z][).]
- Canonical section cues (general only):
Common heading indicators include words such as:
"Overview", "Introduction", "Background", "Purpose", "Scope", "Definition",
"Method", "Procedure", "Result", "Discussion", "Summary", "Conclusion",
"Appendix", "Reference", "Annex", "Acknowledgment", "Disclaimer".
These are soft cues, not strict requirements.
- Length restriction:
• Chinese heading: ≤25 characters
• English heading: ≤80 characters
- Exclude long narrative sentences, continuous prose, or bullet-style lists → output as "-1".
OUTPUT FORMAT
- Return ONLY a valid JSON array of {"title","content"} objects.
- No reasoning or commentary.
EXAMPLES
Example 1 — No heading
Input:
{0: "Copyright page · Publication info (ISBN 123-456). All rights reserved."}
Output:
[
{"title":"-1","content":"0"}
]
Example 2 — One heading
Input:
{1: "Chapter 1: General Provisions This chapter defines the overall rules…"}
Output:
[
{"title":"Chapter 1: General Provisions","content":"1"}
]
Example 3 — Narrative + heading
Input:
{2: "This paragraph introduces the background and goals. Section 2: Definitions Key terms are explained…"}
Output:
[
{"title":"-1","content":"2"},
{"title":"Section 2: Definitions","content":"2"}
]
Example 4 — Multiple headings in one chunk
Input:
{3: "Declarations and Commitments (I) Party B commits… (II) Party C commits… Appendix A Data Specification"}
Output:
[
{"title":"Declarations and Commitments (I)","content":"3"},
{"title":"(II)","content":"3"},
{"title":"Appendix A","content":"3"}
]
Example 5 — Numbering styles
Input:
{4: "1. Scope: Defines boundaries. 2) Definitions: Terms used. III) Methods Overview."}
Output:
[
{"title":"1. Scope","content":"4"},
{"title":"2) Definitions","content":"4"},
{"title":"III) Methods","content":"4"}
]
Example 6 — Long list (NOT headings)
Input:
{5: "Item list: apples, bananas, strawberries, blueberries, mangos, peaches"}
Output:
[
{"title":"-1","content":"5"}
]
Example 7 — Mixed Chinese/English
Input:
{6: "出版信息略This standard follows industry practices. Chapter 1: Overview 摘要… 第2节术语与缩略语"}
Output:
[
{"title":"-1","content":"6"},
{"title":"Chapter 1: Overview","content":"6"},
{"title":"第2节术语与缩略语","content":"6"}
]

View File

@@ -0,0 +1,8 @@
OUTPUT FORMAT
- Return ONLY the JSON array.
- Use double quotes.
- No extra commentary.
- Keep language of "title" the same as the input.
INPUT
{{ text }}

20
rag/prompts/toc_index.md Normal file
View File

@@ -0,0 +1,20 @@
You are an expert analyst tasked with matching a section title to the given text content.
**Instructions:**
1. Analyze the given title with its numeric structure index and the provided text.
2. Determine whether the title is mentioned as a section title in the given text.
3. Provide a concise, step-by-step reasoning for your decision.
4. Output **only** the complete JSON object. Do not include any other text, explanations, or markdown code block fences (like ```json).
**Output Format:**
Your output must be a valid JSON object with the following keys:
{
"reasoning": "Step-by-step explanation of your analysis.",
"exist": "<yes or no>",
}
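**Example Output** (illustrative; the reasoning will vary with the input):
{
"reasoning": "The given text contains the line '2.1 Overview', which matches the provided structure index and title.",
"exist": "yes"
}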
**The title:**
{{ structure }} {{ title }}
**Given text:**
{{ text }}

View File

@@ -0,0 +1,19 @@
**Task Instruction:**
You are tasked with reading and analyzing tool call results based on two inputs: **Inputs for current call** and **Results**. Your objective is to extract information from the **Results** that is relevant and helpful to the **Inputs for current call**, and to integrate it seamlessly into the previous reasoning steps so that reasoning on the original question can continue.
**Guidelines:**
1. **Analyze the Results:**
- Carefully review the content of each tool call result.
- Identify factual information that is relevant to the **Inputs for current call** and can aid in the reasoning process for the original question.
2. **Extract Relevant Information:**
- Select the information from the **Results** that directly contributes to advancing the previous reasoning steps.
- Ensure that the extracted information is accurate and relevant.
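**Example** (hypothetical): if the inputs ask for the founding year of a company and the results contain the sentence "The company was founded in 1987," carry forward that single fact so the next reasoning step can use it.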
**Inputs for current call:**
{{ inputs }}
**Results:**
{{ results }}

View File

@@ -0,0 +1,23 @@
## INSTRUCTION
Transcribe the content from the provided PDF page image into clean Markdown format.
- Only output the content transcribed from the image.
- Do NOT output this instruction or any other explanation.
- If the content is missing or you do not understand the input, return an empty string.
## RULES
1. Do NOT generate examples, demonstrations, or templates.
2. Do NOT output any extra text such as 'Example', 'Example Output', or similar.
3. Do NOT generate any tables, headings, or content that is not explicitly present in the image.
4. Transcribe content word-for-word. Do NOT modify, translate, or omit any content.
5. Do NOT explain Markdown or mention that you are using Markdown.
6. Do NOT wrap the output in ```markdown or ``` blocks.
7. Only apply Markdown structure to headings, paragraphs, lists, and tables, strictly based on the layout of the image. Do NOT create tables unless an actual table exists in the image.
8. Preserve the original language, information, and order exactly as shown in the image.
{% if page %}
At the end of the transcription, add the page divider: `--- Page {{ page }} ---`.
{% endif %}
> If you do not detect valid content in the image, return an empty string.

View File

@@ -0,0 +1,24 @@
## ROLE
You are an expert visual data analyst.
## GOAL
Analyze the image and provide a comprehensive description of its content. Focus on identifying the type of visual data representation (e.g., bar chart, pie chart, line graph, table, flowchart), its structure, and any text captions or labels included in the image.
## TASKS
1. Describe the overall structure of the visual representation. Specify if it is a chart, graph, table, or diagram.
2. Identify and extract any axes, legends, titles, or labels present in the image. Provide the exact text where available.
3. Extract the data points from the visual elements (e.g., bar heights, line graph coordinates, pie chart segments, table rows and columns).
4. Analyze and explain any trends, comparisons, or patterns shown in the data.
5. Capture any annotations, captions, or footnotes, and explain their relevance to the image.
6. Only include details that are explicitly present in the image. If an element (e.g., axis, legend, or caption) does not exist or is not visible, do not mention it.
## OUTPUT FORMAT (Include only sections relevant to the image content)
- Visual Type: [Type]
- Title: [Title text, if available]
- Axes / Legends / Labels: [Details, if available]
- Data Points: [Extracted data]
- Trends / Insights: [Analysis and interpretation]
- Captions / Annotations: [Text and relevance, if available]
> Ensure high accuracy, clarity, and completeness in your analysis, and include only the information present in the image. Avoid unnecessary statements about missing elements.
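## EXAMPLE (Illustrative Only)
A hypothetical bar chart might yield:
- Visual Type: Bar chart
- Title: Quarterly Revenue
- Axes / Legends / Labels: X-axis: Quarter (Q1-Q4); Y-axis: Revenue (USD millions)
- Data Points: Q1: 10, Q2: 12, Q3: 9, Q4: 15
- Trends / Insights: Revenue dips in Q3 and reaches its yearly high in Q4.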