36 Commits

Author SHA1 Message Date
93d538ecc6 Properly checking the file's source for metadata with instanceof 2026-02-11 16:23:27 +03:00
f5659675ec Main feat: adaptation for async enrichment; added file_type, which holds types such as "таблица" (table) and "презентация" (presentation); file source metadata is now taken from either the local source or Yandex Disk. 2026-02-11 15:46:54 +03:00
7b52887558 Enrichment now processed via chunks. 2 documents -> into the vector storage. Also guessing source from the file extension 2026-02-11 11:23:50 +03:00
1e6ab247b9 Phase 12 done... loading via adaptive collection, yadisk or local 2026-02-10 22:19:27 +03:00
e9dd28ad55 Prep for Phase 12 of loading files for enrichment through the adaptive collections 2026-02-10 21:42:59 +03:00
06a3155b6b Working Yandex Disk integration for loading files. Tests for local and Yandex 2026-02-10 20:42:07 +03:00
63c3e2c5c7 Adaptive Collection, and Phase 11 WIP 2026-02-10 20:12:43 +03:00
447ecaba39 enrichment with years, events 2026-02-10 13:20:19 +03:00
ce62fd50ed Created this MD file to store things we need to look out for 2026-02-09 21:33:03 +03:00
2cb9b39bf2 removed test retrieval feature. off you go 2026-02-09 21:17:42 +03:00
f9c47c772f llamaindex update + unpacking archives in data 2026-02-09 19:00:23 +03:00
0adbc29692 env step for llamaindex 2026-02-05 22:48:39 +03:00
effbc7d00f proper usage of embedding models if defined in .env 2026-02-05 01:07:25 +03:00
31d198afb8 properly loading .env file with dotenv 2026-02-05 00:08:59 +03:00
833aad317a quick fix to use openai instead of ollama, in vetor_storage.py 2026-02-05 00:04:10 +03:00
f87f3c0cdd moved demo.html into demo-ui folder and renamed to index.html for ease of server serving... lol 2026-02-04 23:36:23 +03:00
a6320985dd resolved conflicts in requirements.txt 2026-02-04 23:34:37 +03:00
69e7ecee62 Updated requirements.txt file 2026-02-04 23:13:27 +03:00
8c57921b7f Working demo.html with connection to the api endpoint 2026-02-04 23:13:00 +03:00
9188b672c2 preparations for demo html page 2026-02-04 22:50:24 +03:00
bf3a3735cb openai compatible integration done 2026-02-04 22:30:57 +03:00
ae8c00316e Langchain plan phases for openai integration (openai compatible endpoint), server for retrieving data 2026-02-04 21:34:22 +03:00
ea4ce23cd9 Retrieval and also update on Russian language 2026-02-04 16:51:50 +03:00
3dea3605ad Enrichment for llamaindex. It takes a long time with a local model, so it is better to use an external model for EMBEDDING 2026-02-04 16:06:01 +03:00
f36108d652 Vector storage Qdrant initialization and configuration 2026-02-04 01:10:07 +03:00
c37aec1d99 File extensions and libraries for llamaindex 2026-02-04 01:02:21 +03:00
fa26d77520 Cli with ping for llamaindex 2026-02-04 00:59:01 +03:00
86fd643e66 Start of work on Llamaindex framework 2026-02-04 00:49:45 +03:00
d354d3dcca Working chat with AI agent with retrieving data 2026-02-04 00:02:53 +03:00
299ee0acb5 Working retrieval with the cli 2026-02-03 23:25:24 +03:00
4cbd5313d2 Working enrichment 2026-02-03 22:55:12 +03:00
8d7e39a603 langchain loading documents into vector storage 2026-02-03 20:52:08 +03:00
762ed89843 langchain vector storage connection and configuration 2026-02-03 20:42:09 +03:00
cd7c96e022 langchain extensions for data files and their libraries 2026-02-03 20:17:13 +03:00
d99433d087 langchain done cli 2026-02-03 19:51:35 +03:00
351fe27cca Initial commit 2026-02-03 19:24:41 +03:00