langchain loading documents into vector storage

2026-02-03 20:52:08 +03:00
parent 762ed89843
commit 8d7e39a603
5 changed files with 299 additions and 42 deletions

@@ -24,6 +24,7 @@ rag-solution/services/rag/langchain/
 ├── app.py # Main application file (currently empty)
 ├── cli.py # CLI entrypoint with click library
 ├── EXTENSIONS.md # Supported file extensions and LangChain loaders
+├── enrichment.py # Document enrichment module for loading documents to vector storage
 ├── PLANNING.md # Development roadmap and phases
 ├── QWEN.md # Current file - project context
 ├── requirements.txt # Python dependencies
@@ -64,10 +65,10 @@ The project is organized into 6 development phases as outlined in `PLANNING.md`:
 - [x] Prepare OpenAI fallback (commented)
 ### Phase 4: Document Loading Module
-- [ ] Create `enrichment.py` for loading documents to vector storage
-- [ ] Implement text splitting strategies
-- [ ] Add document tracking to prevent re-processing
-- [ ] Integrate with CLI
+- [x] Create `enrichment.py` for loading documents to vector storage
+- [x] Implement text splitting strategies
+- [x] Add document tracking to prevent re-processing
+- [x] Integrate with CLI
 ### Phase 5: Retrieval Feature
 - [ ] Create `retrieval.py` for querying vector storage