more sophisticated chat-like retrieval for LlamaIndex

This commit is contained in:
2026-02-26 19:02:05 +03:00
parent 468d5fb572
commit 6b3fa1cfaa
3 changed files with 390 additions and 2 deletions


@@ -69,3 +69,27 @@ Chosen data folder: relative ./../../../data - from the current folder
- [x] Create file `server.py`, with web framework fastapi, for example
- [x] Add POST endpoint "/api/test-query" which will use agent, and retrieve response for query, sent in JSON format, field "query"
# Phase 12 (upgrade from simple retrieval to agent-like chat in LlamaIndex)
- [x] Revisit Phase 5 assumption ("simple retrieval only") and explicitly allow agent/chat orchestration in LlamaIndex for QA over documents.
- [x] Create a new module for chat orchestration (for example `agent.py` or `chat_engine.py`) that separates:
1) retrieval of source nodes
2) answer synthesis with explicit prompt
3) response formatting with sources/metadata
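The three-stage split above can be sketched as plain functions, so each stage is testable on its own. This is a minimal skeleton with hypothetical names (`retrieve`, `synthesize`, `format_response`, `Snippet`); it duck-types the retriever and LLM as callables rather than committing to a specific LlamaIndex API.

```python
from dataclasses import dataclass

@dataclass
class Snippet:
    """One retrieved chunk with its metadata and relevance score."""
    text: str
    metadata: dict
    score: float

def retrieve(query: str, retriever) -> list[Snippet]:
    """Stage 1: fetch source nodes; `retriever` is any callable query -> list[dict]."""
    return [Snippet(n["text"], n.get("metadata", {}), n.get("score", 0.0))
            for n in retriever(query)]

def synthesize(query: str, snippets: list[Snippet], llm) -> str:
    """Stage 2: build an explicit grounded prompt and call the LLM callable."""
    context = "\n\n".join(s.text for s in snippets)
    prompt = (f"Answer only from the context below.\n\n"
              f"Context:\n{context}\n\nQuestion: {query}")
    return llm(prompt)

def format_response(answer: str, snippets: list[Snippet]) -> dict:
    """Stage 3: attach sources/metadata so the API layer can return them."""
    return {
        "answer": answer,
        "sources": [{"metadata": s.metadata, "score": s.score,
                     "preview": s.text[:200]} for s in snippets],
    }
```

Keeping the stages separate means the prompt and the source formatting can change independently of the retrieval backend.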
- [x] Implement a LlamaIndex-based chat feature (agent-like behavior) using framework-native primitives (chat engine / agent workflow / tool-calling approach supported by the installed version), so the model can iteratively query retrieval tools when needed.
- [x] Add a retrieval tool wrapper for document search that returns structured snippets (`filename`, `file_path`, `page_label/page`, `chunk_number`, content preview, score) instead of raw text only.
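The structured-snippet conversion can be a small pure function; the function name and the metadata-key fallbacks below are assumptions (different loaders write `file_name` vs `filename`, `page_label` vs `page`). The resulting callable could then be wrapped as a LlamaIndex tool (e.g. via `FunctionTool.from_defaults`) for the agent path.

```python
def node_to_snippet(node_text: str, metadata: dict, score: float,
                    preview_chars: int = 200) -> dict:
    """Convert one retrieved node into the structured snippet shape
    (filename, file_path, page, chunk_number, content preview, score)."""
    md = metadata or {}
    return {
        "filename": md.get("filename") or md.get("file_name"),
        "file_path": md.get("file_path"),
        "page": md.get("page_label") or md.get("page"),
        "chunk_number": md.get("chunk_number"),
        # collapse whitespace so the preview is a single readable line
        "preview": " ".join(node_text.split())[:preview_chars],
        "score": round(score, 4) if score is not None else None,
    }
```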
- [x] Add a grounded answer prompt/template for the LlamaIndex chat path with rules:
- answer only from retrieved context
- if information is missing, say so directly
- prefer exact dates/years and quote filenames/pages where possible
- avoid generic claims not supported by sources
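The four rules above can be encoded as a single prompt string. The wording below is one hypothetical phrasing; the `{context_str}`/`{query_str}` placeholders follow the variable names LlamaIndex's QA prompt templates conventionally use.

```python
GROUNDED_ANSWER_PROMPT = """\
You are answering strictly from the retrieved document context.

Rules:
1. Use ONLY the context below; do not add outside knowledge.
2. If the context does not contain the answer, say so directly. Do not guess.
3. Prefer exact dates and years; cite filenames and pages, e.g. (report.pdf, p. 4).
4. Avoid generic claims that no source snippet supports.

Context:
{context_str}

Question: {query_str}
Answer:"""
```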
- [x] Add response mode that returns both:
- final answer text
- list of retrieved sources (content snippet + metadata + score)
- [x] Add post-processing for retrieved nodes before synthesis:
- deduplicate near-identical chunks
- drop empty / near-empty chunks
- optionally filter low-information chunks (headers/footers)
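The node post-processing can be done on plain dicts before synthesis. A minimal sketch, assuming chunks carry `text` and `score` keys: near-identical chunks are detected by a normalized text prefix (whitespace- and case-insensitive), higher-scored duplicates win, and near-empty chunks (likely headers/footers) are dropped by a length threshold.

```python
import re

def postprocess_chunks(chunks: list[dict], min_chars: int = 40) -> list[dict]:
    """Deduplicate near-identical chunks and drop empty/near-empty ones."""
    seen: set[str] = set()
    kept: list[dict] = []
    # process highest-scored first so the best copy of a duplicate survives
    for c in sorted(chunks, key=lambda c: c.get("score", 0.0), reverse=True):
        text = re.sub(r"\s+", " ", c.get("text", "")).strip()
        if len(text) < min_chars:
            continue  # empty / near-empty (header, footer, page number)
        key = text.lower()[:120]
        if key in seen:
            continue  # near-duplicate of an already-kept chunk
        seen.add(key)
        kept.append(c)
    return kept
```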
- [x] Add optional metadata-aware retrieval improvements (years/events/keywords) for parity with the LangChain approach (in the folder next to the current one), if feasible with the chosen LlamaIndex primitives.
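For the years part of the metadata-aware path, a small helper can pull candidate years out of the user query before retrieval, to be matched against document metadata. A sketch (function name is hypothetical; the 1900-2099 range is an assumption):

```python
import re

def extract_year_filters(query: str) -> list[int]:
    """Pull distinct 4-digit years (1900-2099) out of a query for
    metadata filtering, sorted ascending."""
    return sorted({int(y) for y in re.findall(r"\b(19\d{2}|20\d{2})\b", query)})
```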
- [x] Update `server.py` endpoint to use the new agent-like chat path (keep simple retrieval endpoint available as fallback or debug mode).