Compare commits


2 Commits

5 changed files with 122 additions and 49 deletions

View File

@@ -1,2 +1,5 @@
 OLLAMA_EMBEDDING_MODEL=MODEL
 OLLAMA_CHAT_MODEL=MODEL
+OPENAI_CHAT_URL=URL
+OPENAI_CHAT_KEY=KEY
+CHAT_MODEL_STRATEGY=ollama
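For review context, a minimal sketch of how these variables are expected to be consumed (the `load_dotenv` call is an assumption not shown in this diff; the `CHAT_MODEL_STRATEGY` lookup and its "ollama" default mirror the logic added in `agent.py` below):

```python
# Sketch only: load .env and resolve the chat-model strategy.
import os

from dotenv import load_dotenv  # assumed helper; dotenv usage is not shown in this diff

load_dotenv()  # read variables from .env into the process environment

# "ollama" is the documented default when CHAT_MODEL_STRATEGY is unset
strategy = os.getenv("CHAT_MODEL_STRATEGY", "ollama").lower()
if strategy == "openai":
    url = os.getenv("OPENAI_CHAT_URL")  # OpenAI-compatible base URL
    key = os.getenv("OPENAI_CHAT_KEY")  # Authorization Bearer token
    if not url or not key:
        raise ValueError("OPENAI_CHAT_URL and OPENAI_CHAT_KEY must be set")
```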

View File

@@ -39,3 +39,12 @@ Chosen data folder: relative ./../../../data - from the current folder
 - [x] Create file `agent.py`, which will incorporate an agent powered by the chat model. It should use the Ollama integration, with the model specified in .env in the property OLLAMA_CHAT_MODEL
 - [x] Integrate this agent with the existing retrieval solution in retrieval.py
 - [x] Integrate this agent with the CLI, as a command to start chatting with the agent. If there is a built-in solution for console communication with the agent, initiate it on the CLI command.
+# Phase 7 (openai integration for chat model)
+- [x] Create the OpenAI integration, using the .env variables `OPENAI_CHAT_URL` and `OPENAI_CHAT_KEY`: the first for an OpenAI-compatible URL, the second for the Authorization Bearer token.
+- [x] Make this integration optional via the .env variable `CHAT_MODEL_STRATEGY`, which has two options: "ollama" and "openai". Ollama is already done and working, so the code should check which option is chosen in .env, with ollama as the default.
+# Phase 8 (http endpoint to retrieve data from the vector storage by query)
+- [ ] Create file `server.py`, using a web framework such as FastAPI (a sketch follows below)
+- [ ] Add a POST endpoint "/api/test-query" which uses the agent and retrieves a response for a query sent in JSON format in the field "query"
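Phase 8 is still unchecked in these commits. For reference while reviewing, a hypothetical `server.py` might look like the sketch below; the `chat_with_agent` name and its return shape come from `agent.py` in this diff, while the FastAPI wiring, the import path, and the "documents" collection name are assumptions.

```python
# Hypothetical Phase 8 sketch, not part of these commits.
from fastapi import FastAPI
from pydantic import BaseModel

from agent import chat_with_agent  # assumed import path

app = FastAPI()


class QueryRequest(BaseModel):
    query: str  # JSON body: {"query": "..."}


@app.post("/api/test-query")
def test_query(req: QueryRequest) -> dict:
    # "documents" is a placeholder collection name, not taken from this diff
    result = chat_with_agent(query=req.query, collection_name="documents")
    return {"response": result.get("response"), "success": result.get("success")}
```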

View File

@@ -2,7 +2,7 @@
 ## Project Overview
-This is a Retrieval-Augmented Generation (RAG) solution built using the Langchain framework. The project is designed to load documents from a data directory, store them in a vector database (Qdrant), and enable semantic search and chat capabilities using local LLMs via Ollama.
+This is a Retrieval-Augmented Generation (RAG) solution built using the Langchain framework. The project is designed to load documents from a data directory, store them in a vector database (Qdrant), and enable semantic search and chat capabilities using local LLMs via Ollama or OpenAI-compatible APIs.
 The project follows a phased development approach with CLI entry points for different functionalities like document loading, retrieval, and chat.
@@ -10,7 +10,7 @@ The project follows a phased development approach with CLI entry points for diff
 - **Framework**: Langchain
 - **Vector Storage**: Qdrant
 - **Embeddings**: Ollama (with fallback option for OpenAI via OpenRouter)
-- **Chat Models**: Ollama
+- **Chat Models**: Ollama and OpenAI-compatible APIs
 - **Data Directory**: `./../../../data` (relative to project root)
 - **Virtual Environment**: Python venv in `venv/` directory
@@ -35,7 +35,7 @@ rag-solution/services/rag/langchain/
 ## Dependencies
 The project relies on several key libraries:
-- `langchain` and related ecosystem (`langchain-community`, `langchain-core`, `langchain-ollama`)
+- `langchain` and related ecosystem (`langchain-community`, `langchain-core`, `langchain-ollama`, `langchain-openai`)
 - `langgraph` for workflow management
 - `qdrant-client` for vector storage (to be installed)
 - `ollama` for local LLM interaction
@@ -45,7 +45,7 @@ The project relies on several key libraries:
 ## Development Phases
-The project is organized into 6 development phases as outlined in `PLANNING.md`:
+The project is organized into 8 development phases as outlined in `PLANNING.md`:
 ### Phase 1: CLI Entrypoint
 - [x] Virtual environment setup
@@ -79,13 +79,25 @@ The project is organized into 6 development phases as outlined in `PLANNING.md`:
 - [x] Integrate with retrieval functionality
 - [x] Add CLI command for chat interaction
+### Phase 7: OpenAI Integration for Chat Model
+- [x] Create OpenAI-compatible integration using `.env` variables `OPENAI_CHAT_URL` and `OPENAI_CHAT_KEY`
+- [x] Make this integration optional using `.env` variable `CHAT_MODEL_STRATEGY`, with "ollama" as the default
+- [x] Allow switching between the "ollama" and "openai" strategies
+### Phase 8: HTTP Endpoint
+- [ ] Create a web server (FastAPI, for example) with a POST endpoint `/api/test-query` for agent queries
 ## Environment Configuration
 The project uses environment variables for configuration:
 ```env
 OLLAMA_EMBEDDING_MODEL=MODEL   # Name of the Ollama model for embeddings
 OLLAMA_CHAT_MODEL=MODEL        # Name of the Ollama model for chat
+OPENAI_CHAT_URL=URL            # OpenAI-compatible API URL
+OPENAI_CHAT_KEY=KEY            # Authorization token for the OpenAI-compatible API
+OPENAI_CHAT_MODEL=MODEL        # Name of the OpenAI-compatible model to use
+CHAT_MODEL_STRATEGY=ollama     # Strategy to use: "ollama" (default) or "openai"
 ```

 ## Building and Running
@@ -176,6 +188,13 @@ The project is in early development phase. The virtual environment is set up and
 - Agent uses document retrieval tool to fetch relevant information based on user queries
 - Implemented proper error handling and conversation history management
+### Phase 7 Implementation Notes
+- Enhanced `agent.py` to support both Ollama and OpenAI-compatible chat models
+- Added conditional logic to select chat model based on `CHAT_MODEL_STRATEGY` environment variable
+- When strategy is "openai", uses `ChatOpenAI` with `OPENAI_CHAT_URL` and `OPENAI_CHAT_KEY` from environment
+- When strategy is "ollama" (default), uses existing `ChatOllama` implementation
+- Updated CLI chat command to show which model strategy is being used
 ### Issue Fix Notes
 - Fixed DocumentRetrievalTool class to properly declare and initialize the retriever field
 - Resolved Pydantic field declaration issue that caused "object has no field" error
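To try the new strategy end to end, a reviewer could point the project at an OpenAI-compatible endpoint with a `.env` along these lines (placeholders as in the block above; only `CHAT_MODEL_STRATEGY=openai` differs from the defaults):

```env
CHAT_MODEL_STRATEGY=openai   # switch from the default "ollama"
OPENAI_CHAT_URL=URL          # base URL of any OpenAI-compatible API
OPENAI_CHAT_KEY=KEY          # Authorization Bearer token for that API
OPENAI_CHAT_MODEL=MODEL      # model name exposed by that endpoint
```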

View File

@@ -8,6 +8,7 @@ from langchain_core.messages import HumanMessage, AIMessage, BaseMessage
 from langchain_core.agents import AgentFinish
 from langgraph.prebuilt import create_react_agent
 from langchain_ollama import ChatOllama
+from langchain_openai import ChatOpenAI
 from langchain_core.prompts import ChatPromptTemplate
 from loguru import logger
@@ -69,36 +70,64 @@ def create_chat_agent(
 ) -> Any:
     """
     Create a chat agent with document retrieval capabilities.

     Args:
         collection_name: Name of the Qdrant collection to use
-        llm_model: Name of the Ollama model to use (defaults to OLLAMA_CHAT_MODEL env var)
+        llm_model: Name of the model to use (defaults to environment variable based on strategy)

     Returns:
         Configured chat agent
     """
     logger.info("Creating chat agent with document retrieval capabilities")

-    # Get the model name from environment if not provided
-    if llm_model is None:
-        llm_model = os.getenv("OLLAMA_CHAT_MODEL", "llama3.1")
-
-    # Initialize the Ollama chat model
-    llm = ChatOllama(
-        model=llm_model,
-        base_url="http://localhost:11434",  # Default Ollama URL
-        temperature=0.1,
-    )
+    # Determine which model strategy to use
+    chat_model_strategy = os.getenv("CHAT_MODEL_STRATEGY", "ollama").lower()
+
+    if chat_model_strategy == "openai":
+        # Use OpenAI-compatible API
+        openai_chat_url = os.getenv("OPENAI_CHAT_URL")
+        openai_chat_key = os.getenv("OPENAI_CHAT_KEY")
+
+        if not openai_chat_url or not openai_chat_key:
+            raise ValueError("OPENAI_CHAT_URL and OPENAI_CHAT_KEY must be set when using OpenAI strategy")
+
+        # Get the model name from environment if not provided
+        if llm_model is None:
+            llm_model = os.getenv("OPENAI_CHAT_MODEL", "gpt-3.5-turbo")  # Default to a common model
+
+        # Initialize the OpenAI-compatible chat model
+        llm = ChatOpenAI(
+            model=llm_model,
+            openai_api_base=openai_chat_url,
+            openai_api_key=openai_chat_key,
+            temperature=0.1,
+        )
+        logger.info(f"Using OpenAI-compatible model: {llm_model} via {openai_chat_url}")
+    else:  # Default to ollama
+        # Get the model name from environment if not provided
+        if llm_model is None:
+            llm_model = os.getenv("OLLAMA_CHAT_MODEL", "llama3.1")
+
+        # Initialize the Ollama chat model
+        llm = ChatOllama(
+            model=llm_model,
+            base_url="http://localhost:11434",  # Default Ollama URL
+            temperature=0.1,
+        )
+        logger.info(f"Using Ollama model: {llm_model}")

     # Create the document retrieval tool
     retrieval_tool = DocumentRetrievalTool()

     # Create the agent with the LLM and tools
     tools = [retrieval_tool]
     agent = create_react_agent(llm, tools)

     logger.info("Chat agent created successfully")
     return agent
@@ -110,47 +139,47 @@ def chat_with_agent(
 ) -> Dict[str, Any]:
     """
     Chat with the agent and get a response based on the query and document retrieval.

     Args:
         query: The user's query
         collection_name: Name of the Qdrant collection to use
-        llm_model: Name of the Ollama model to use
+        llm_model: Name of the model to use (defaults to environment variable based on strategy)
         history: Conversation history (list of messages)

     Returns:
         Dictionary containing the agent's response and metadata
     """
     logger.info(f"Starting chat with query: {query}")

     # Create the agent
     agent = create_chat_agent(collection_name, llm_model)

     # Prepare the input for the agent
     if history is None:
         history = []

     # Add the user's query to the history
     history.append(HumanMessage(content=query))

     # Prepare the input for the agent executor
     agent_input = {
         "messages": history
     }

     try:
         # Invoke the agent
         result = agent.invoke(agent_input)

         # Extract the agent's response
         messages = result.get("messages", [])
         ai_message = None

         # Find the AI message in the results
         for msg in reversed(messages):
             if isinstance(msg, AIMessage):
                 ai_message = msg
                 break

         if ai_message is None:
             # If no AI message was found, return the last message content
             if messages:
@@ -160,7 +189,7 @@ def chat_with_agent(
                 response_content = "I couldn't generate a response to your query."
         else:
             response_content = ai_message.content

         # Create the response dictionary
         response = {
             "response": response_content,
@@ -168,10 +197,10 @@
             "history": messages,  # Return updated history
             "success": True
         }

         logger.info("Chat completed successfully")
         return response

     except Exception as e:
         logger.error(f"Error during chat: {str(e)}")
         return {
@@ -188,29 +217,39 @@ def run_chat_loop(
 ):
     """
     Run an interactive chat loop with the agent.

     Args:
         collection_name: Name of the Qdrant collection to use
-        llm_model: Name of the Ollama model to use
+        llm_model: Name of the model to use (defaults to environment variable based on strategy)
     """
     logger.info("Starting interactive chat loop")

-    print("Chat Agent initialized. Type 'quit' or 'exit' to end the conversation.\n")
+    # Determine which model strategy is being used and inform the user
+    chat_model_strategy = os.getenv("CHAT_MODEL_STRATEGY", "ollama").lower()
+    if chat_model_strategy == "openai":
+        model_info = os.getenv("OPENAI_CHAT_MODEL", "gpt-3.5-turbo")
+        print(f"Chat Agent initialized with OpenAI-compatible model: {model_info}")
+    else:
+        model_info = os.getenv("OLLAMA_CHAT_MODEL", "llama3.1")
+        print(f"Chat Agent initialized with Ollama model: {model_info}")
+    print("Type 'quit' or 'exit' to end the conversation.\n")

     history = []

     while True:
         try:
             # Get user input
             user_input = input("You: ").strip()

             # Check for exit commands
             if user_input.lower() in ['quit', 'exit', 'q']:
                 print("Ending chat session. Goodbye!")
                 break

             if not user_input:
                 continue

             # Get response from the agent
             response_data = chat_with_agent(
                 query=user_input,
@@ -218,13 +257,13 @@ def run_chat_loop(
                 llm_model=llm_model,
                 history=history
             )

             # Update history with the new messages
             history = response_data.get("history", [])

             # Print the agent's response
             print(f"Agent: {response_data.get('response', 'No response generated')}\n")

         except KeyboardInterrupt:
             print("\nEnding chat session. Goodbye!")
             break
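A quick way to exercise this code path end to end (sketch only; the "documents" collection name and the import path are assumptions, not taken from this diff):

```python
# Sketch: one-shot query against the agent using the default Ollama strategy.
import os

os.environ.setdefault("CHAT_MODEL_STRATEGY", "ollama")  # explicit for clarity

from agent import chat_with_agent  # assumed import path

result = chat_with_agent(
    query="What documents are in the collection?",
    collection_name="documents",  # placeholder collection name
    history=[],
)
if result.get("success"):
    print(result.get("response"))
```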

View File

@@ -28,7 +28,10 @@ langgraph==1.0.5
 langgraph-checkpoint==3.0.1
 langgraph-prebuilt==1.0.5
 langgraph-sdk==0.3.1
+langgraph-tools==1.0.20
 langsmith==0.5.2
+langserve==0.3.0
+langchain-openai==0.2.0
 marshmallow==3.26.2
 multidict==6.7.0
 mypy_extensions==1.1.0