# RAG Solution with LlamaIndex and Qdrant
## Project Overview
This is a Retrieval-Augmented Generation (RAG) solution built with LlamaIndex as the primary framework and Qdrant as the vector store. The project loads documents from a shared data directory, stores them in a vector database, and enables semantic search and chat using local Ollama models.
### Key Technologies
- **RAG Framework**: LlamaIndex
- **Vector Storage**: Qdrant
- **Embedding Models**: Ollama (configurable via environment variables)
- **Chat Models**: Ollama (configurable via environment variables)
- **Data Directory**: `./../../../data` (relative to project root)
- **Logging**: loguru with file rotation and stdout logging
### Architecture Components
- CLI entry point (`cli.py`)
- Document enrichment module (`enrichment.py`)
- Vector storage configuration (`vector_storage.py`)
- Retrieval module (`retrieval.py`)
- Chat agent (`agent.py`)
## Building and Running
### Prerequisites
1. Python virtual environment (already created in the `venv` folder)
2. Ollama running locally on default port 11434
3. Qdrant running locally (REST API on port 6333, gRPC on port 6334)
4. Data files in the `./../../../data` directory
### Setup Process
1. Activate the virtual environment:
```bash
source venv/bin/activate
```
2. Install required packages based on the document extensions found in the data directory (see EXTENSIONS.md for details)
3. Configure environment variables in `.env` file (copy from `.env.dist`)
4. Run the CLI to initialize the system:
```bash
python cli.py ping # Should return "pong"
```
### Available Commands
- `ping`: Basic connectivity test
- `enrich`: Load and process documents from the data directory into vector storage
- `chat`: Start an interactive chat session with the RAG system
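
This command set maps naturally onto a `click` group. A minimal sketch of how `cli.py` might wire the commands together (the `enrich` and `chat` bodies are placeholders, not the project's actual implementation):

```python
# cli.py -- minimal sketch of the command layout (bodies are placeholders)
import click


@click.group()
def cli():
    """RAG solution CLI."""


@cli.command()
def ping():
    """Basic connectivity test."""
    click.echo("pong")


@cli.command()
def enrich():
    """Load documents from the data directory into vector storage."""
    click.echo("enrich: not implemented yet")


@cli.command()
def chat():
    """Start an interactive chat session with the RAG system."""
    click.echo("chat: not implemented yet")


if __name__ == "__main__":
    cli()
```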
## Development Conventions
### Logging
- Use `loguru` for all logging
- Log both to a rotating file (`logs/dev.log`) and to stdout
- Use appropriate log levels (DEBUG, INFO, WARNING, ERROR)
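
A minimal sketch of this loguru setup with both sinks; the rotation size and per-sink levels are assumptions, since the project does not pin them:

```python
# logging setup sketch -- loguru with a stdout sink plus a rotating file sink
import sys
from loguru import logger

logger.remove()  # drop the default handler so we control both sinks
logger.add(sys.stdout, level="INFO")
logger.add("logs/dev.log", rotation="10 MB", level="DEBUG")  # rotation size is an assumption

logger.debug("debug details go to the file sink only")
logger.info("info and above reach both sinks")
```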
### Environment Variables
- `OLLAMA_EMBEDDING_MODEL`: Name of the Ollama model to use for embeddings
- `OLLAMA_CHAT_MODEL`: Name of the Ollama model to use for chat functionality
- API keys for external services (an OpenRouter option exists but is commented out)
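
A sketch of reading these variables with `python-dotenv` (an assumed dependency); the `OPENROUTER_API_KEY` name is illustrative, not confirmed by the project:

```python
# settings sketch -- reading the documented variables from .env
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current directory

EMBEDDING_MODEL = os.environ["OLLAMA_EMBEDDING_MODEL"]
CHAT_MODEL = os.environ["OLLAMA_CHAT_MODEL"]
# Hypothetical key name; only needed if the commented-out OpenRouter path is enabled
OPENROUTER_API_KEY = os.getenv("OPENROUTER_API_KEY")
```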
### Document Processing
- Support multiple file formats based on EXTENSIONS.md
- Use text splitters appropriate for each document type
- Store metadata (filename, page, section, paragraph) with embeddings
- Track processed documents to avoid re-processing (using SQLite if needed)
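
A sketch of the tracking idea using the standard-library `sqlite3`; the schema and the content-hash approach are assumptions, not part of the project spec:

```python
# tracking sketch -- remember processed files in SQLite so enrichment is idempotent
import hashlib
import sqlite3
from pathlib import Path

conn = sqlite3.connect("processed.db")
conn.execute("CREATE TABLE IF NOT EXISTS processed (path TEXT PRIMARY KEY, sha256 TEXT)")


def is_processed(path: Path) -> bool:
    """Return True if this exact file content was already enriched."""
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    row = conn.execute("SELECT sha256 FROM processed WHERE path = ?", (str(path),)).fetchone()
    return row is not None and row[0] == digest


def mark_processed(path: Path) -> None:
    """Record the file's current content hash after successful enrichment."""
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    conn.execute("INSERT OR REPLACE INTO processed VALUES (?, ?)", (str(path), digest))
    conn.commit()
```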
### Vector Storage
- Collection name: "documents_llamaindex"
- Create the collection automatically if it does not exist
- Support for Ollama embeddings by default
- Optional OpenAI embedding support via OpenRouter (commented out)
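
A sketch of wiring Qdrant into LlamaIndex with Ollama embeddings; the package paths follow current `llama-index` distributions (verify against your installed version) and the embedding model name is only an example:

```python
# vector_storage.py sketch -- Qdrant-backed index with Ollama embeddings
from qdrant_client import QdrantClient
from llama_index.core import StorageContext, VectorStoreIndex
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.vector_stores.qdrant import QdrantVectorStore

client = QdrantClient(host="localhost", port=6333)
vector_store = QdrantVectorStore(client=client, collection_name="documents_llamaindex")
storage_context = StorageContext.from_defaults(vector_store=vector_store)
embed_model = OllamaEmbedding(model_name="nomic-embed-text")  # model name is an example

# Build the index once documents are loaded (see the enrichment sketch below):
# index = VectorStoreIndex.from_documents(
#     documents, storage_context=storage_context, embed_model=embed_model
# )
```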
## Project Phases
### Phase 1: CLI Entry Point
- [x] Virtual environment setup
- [x] CLI creation with `click` library
- [x] Basic "ping" command implementation
### Phase 2: Framework Installation
- [x] LlamaIndex installation
- [ ] Data folder analysis and EXTENSIONS.md creation
- [ ] Required loader libraries installation
### Phase 3: Vector Storage Setup
- [ ] Qdrant library installation
- [ ] Vector storage initialization module
- [ ] Embedding model configuration with Ollama
- [ ] Collection creation strategy
### Phase 4: Document Enrichment
- [ ] Document loading module with appropriate loaders
- [ ] Text splitting strategies implementation
- [ ] Document tracking mechanism
- [ ] CLI command for enrichment
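
A sketch of what the enrichment pipeline could look like; the reader and splitter choices (and chunk sizes) are assumptions, since the real loaders depend on EXTENSIONS.md:

```python
# enrichment sketch -- load files, split into nodes with metadata, index them
from llama_index.core import SimpleDirectoryReader
from llama_index.core.node_parser import SentenceSplitter

documents = SimpleDirectoryReader("./../../../data", recursive=True).load_data()
splitter = SentenceSplitter(chunk_size=512, chunk_overlap=64)  # sizes are examples
nodes = splitter.get_nodes_from_documents(documents)
# Each node keeps the file metadata (e.g. file_name) that SimpleDirectoryReader attaches.
# index = VectorStoreIndex(nodes, storage_context=storage_context, embed_model=embed_model)
```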
### Phase 5: Retrieval Feature
- [ ] Retrieval module configuration
- [ ] Query processing with metadata retrieval
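
A sketch of retrieval over the existing collection, reusing `vector_store` and `embed_model` from the vector-storage sketch above; the metadata keys shown are those `SimpleDirectoryReader` typically attaches:

```python
# retrieval sketch -- reconnect to the existing collection and query it
from llama_index.core import VectorStoreIndex

index = VectorStoreIndex.from_vector_store(vector_store, embed_model=embed_model)
retriever = index.as_retriever(similarity_top_k=5)
for hit in retriever.retrieve("example question about the documents"):
    meta = hit.node.metadata  # e.g. file_name, page_label when available
    print(f"{hit.score:.3f}  {meta.get('file_name')}: {hit.node.get_content()[:80]}")
```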
### Phase 6: Chat Agent
- [ ] Agent module with Ollama integration
- [ ] Integration with retrieval module
- [ ] CLI command for chat functionality
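
A sketch of an interactive chat loop over the same `index`; the `chat_mode`, model name, and timeout are examples, not project decisions, and the `llama-index-llms-ollama` import path should be verified against the installed version:

```python
# agent sketch -- context-aware chat over the index using a local Ollama model
from llama_index.llms.ollama import Ollama

llm = Ollama(model="llama3", request_timeout=120.0)  # model name is an example
chat_engine = index.as_chat_engine(chat_mode="context", llm=llm)

while True:
    question = input("you> ")
    if question in {"exit", "quit"}:
        break
    print(chat_engine.chat(question))
```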
## File Structure
```
llamaindex/
├── venv/               # Python virtual environment
├── cli.py              # CLI entry point
├── vector_storage.py   # Vector storage configuration (to be created)
├── enrichment.py       # Document loading and processing (to be created)
├── retrieval.py        # Search and retrieval functionality (to be created)
├── agent.py            # Chat agent implementation (to be created)
├── EXTENSIONS.md       # Supported file extensions and loaders (to be created)
├── .env.dist           # Environment variable template
├── .env                # Local environment variables (git-ignored)
├── logs/               # Log files directory
│   └── dev.log         # Main log file with rotation
└── PLANNING.md         # Project planning document
```
## Data Directory
The system expects documents to be placed in `./../../../data` relative to the project root. It analyzes this directory to determine supported file types and the appropriate loaders.
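
A small sketch of that analysis step, counting extensions so the matching loader packages can be chosen:

```python
# analysis sketch -- count file extensions in the data directory
from collections import Counter
from pathlib import Path

data_dir = Path("./../../../data")
extensions = Counter(p.suffix.lower() for p in data_dir.rglob("*") if p.is_file())
for ext, count in extensions.most_common():
    print(f"{ext or '(no extension)'}: {count}")
```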
## Testing
- Unit tests for individual modules
- Integration tests for end-to-end functionality
- CLI command tests
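
For CLI command tests, `click` ships a test runner. A sketch, assuming `cli.py` exposes the `cli` group from the earlier sketch:

```python
# test sketch -- exercising the ping command with click's test runner
from click.testing import CliRunner

from cli import cli


def test_ping_returns_pong():
    result = CliRunner().invoke(cli, ["ping"])
    assert result.exit_code == 0
    assert result.output.strip() == "pong"
```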
## Troubleshooting
- Ensure Ollama is running on port 11434
- Verify Qdrant is accessible on ports 6333 (REST) and 6334 (gRPC)
- Check that the data directory contains supported file types
- Review logs in `logs/dev.log` for detailed error information
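
A quick reachability check covering the first two points; the endpoints are the documented defaults, and `requests` is an assumed dependency:

```python
# troubleshooting sketch -- verify the local services answer on their default ports
import requests

for name, url in [
    ("Ollama", "http://localhost:11434/api/tags"),
    ("Qdrant", "http://localhost:6333/collections"),
]:
    try:
        response = requests.get(url, timeout=5)
        print(f"{name}: HTTP {response.status_code}")
    except requests.ConnectionError:
        print(f"{name}: unreachable at {url}")
```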