# GraphRAG API

This README provides a detailed guide to the `api.py` file, which serves as the API interface for the GraphRAG (Graph Retrieval-Augmented Generation) system. GraphRAG combines graph-based knowledge representation with retrieval-augmented generation techniques to provide context-aware responses to queries.

## Table of Contents

1. [Overview](#overview)
2. [Setup](#setup)
3. [API Endpoints](#api-endpoints)
4. [Data Models](#data-models)
5. [Core Functionality](#core-functionality)
6. [Usage Examples](#usage-examples)
7. [Configuration](#configuration)
8. [Troubleshooting](#troubleshooting)

## Overview

The `api.py` file implements a FastAPI-based server that provides various endpoints for interacting with the GraphRAG system. It supports different types of queries, including direct chat, GraphRAG-specific queries, DuckDuckGo searches, and a combined full-model search.

Key features:

- Multiple query types (local and global searches)
- Context caching for improved performance
- Background tasks for long-running operations
- Customizable settings through environment variables and config files
- Integration with external services (e.g., Ollama for LLM interactions)

## Setup

1. Install dependencies:

   ```
   pip install -r requirements.txt
   ```

2. Set up environment variables. Create a `.env` file in the `indexing` directory with the following variables:

   ```
   LLM_API_BASE=<your_llm_api_base_url>
   LLM_MODEL=<your_llm_model>
   LLM_PROVIDER=<llm_provider>
   EMBEDDINGS_API_BASE=<your_embeddings_api_base_url>
   EMBEDDINGS_MODEL=<your_embeddings_model>
   EMBEDDINGS_PROVIDER=<embeddings_provider>
   INPUT_DIR=./indexing/output
   ROOT_DIR=indexing
   API_PORT=8012
   ```

3. Run the API server:

   ```
   python api.py --host 0.0.0.0 --port 8012
   ```

## API Endpoints

### `/v1/chat/completions` (POST)

Main endpoint for chat completions. Supports the following models:

- `direct-chat`: Direct interaction with the LLM
- `graphrag-local-search:latest`: Local search using GraphRAG
- `graphrag-global-search:latest`: Global search using GraphRAG
- `duckduckgo-search:latest`: Web search using DuckDuckGo
- `full-model:latest`: Combined search using all available models

### `/v1/prompt_tune` (POST)

Initiates the prompt tuning process in the background.

### `/v1/prompt_tune_status` (GET)

Retrieves the status and logs of the prompt tuning process.

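A minimal sketch of starting a tuning run and polling it, assuming the default port from the setup above; the request body shown is hypothetical, so check the `PromptTuneRequest` model in `api.py` for the accepted fields:

```python
import requests

base_url = "http://localhost:8012"

# Kick off prompt tuning in the background. The "root" field is an
# illustrative guess; see the PromptTuneRequest model for the real fields.
response = requests.post(f"{base_url}/v1/prompt_tune", json={"root": "./indexing"})
print(response.json())

# Poll for the status and logs of the tuning run.
print(requests.get(f"{base_url}/v1/prompt_tune_status").json())
```
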
### `/v1/index` (POST)

Starts the indexing process for GraphRAG in the background.

### `/v1/index_status` (GET)

Retrieves the status and logs of the indexing process.

### `/health` (GET)

Health check endpoint.

### `/v1/models` (GET)

Lists available models.

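The last two endpoints are simple GETs. For example (the exact response shapes may differ):

```python
import requests

base_url = "http://localhost:8012"

# Verify the server is up and responding.
print(requests.get(f"{base_url}/health").json())

# List the model identifiers accepted by /v1/chat/completions.
print(requests.get(f"{base_url}/v1/models").json())
```
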
## Data Models

The API uses several Pydantic models for request and response handling:

- `Message`: Represents a chat message with role and content.
- `QueryOptions`: Options for GraphRAG queries, including query type, preset, and community level.
- `ChatCompletionRequest`: Request model for chat completions.
- `ChatCompletionResponse`: Response model for chat completions.
- `PromptTuneRequest`: Request model for prompt tuning.
- `IndexingRequest`: Request model for indexing.

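As a rough sketch, the first two models might look like the following; the field names are inferred from the usage examples later in this README and may not match `api.py` exactly:

```python
from typing import Optional

from pydantic import BaseModel

class Message(BaseModel):
    role: str      # "system", "user", or "assistant"
    content: str

class QueryOptions(BaseModel):
    query_type: str                   # e.g. "local-search" or "global-search"
    preset: Optional[str] = None
    selected_folder: Optional[str] = None
    community_level: int = 2
    response_type: str = "Multiple Paragraphs"
```
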
## Core Functionality

### Context Loading

The `load_context` function loads the necessary data for GraphRAG queries, including entities, relationships, reports, text units, and covariates.

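Conceptually, this amounts to reading the artifacts that indexing writes to `INPUT_DIR`. A minimal sketch using pandas, with illustrative parquet file names rather than the exact ones `api.py` expects:

```python
from pathlib import Path

import pandas as pd  # requires pyarrow or fastparquet for read_parquet

INPUT_DIR = Path("./indexing/output")

def load_context(input_dir: Path = INPUT_DIR) -> dict[str, pd.DataFrame]:
    """Load indexed artifacts into DataFrames; file names are illustrative."""
    names = ["entities", "relationships", "reports", "text_units", "covariates"]
    context = {}
    for name in names:
        path = input_dir / f"{name}.parquet"
        if path.exists():  # some artifacts, e.g. covariates, may be absent
            context[name] = pd.read_parquet(path)
    return context
```
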
### Search Engine Setup

`setup_search_engines` initializes both local and global search engines using the loaded context data.

### Query Execution

Different query types are handled by separate functions:

- `run_direct_chat`: Sends queries directly to the LLM.
- `run_graphrag_query`: Executes GraphRAG queries (local or global).
- `run_duckduckgo_search`: Performs web searches using DuckDuckGo.
- `run_full_model_search`: Combines results from all search types.

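The routing is driven by the `model` field of the request. A simplified sketch of the idea, with stub handlers standing in for the functions above (not the actual dispatch code in `api.py`):

```python
from typing import Any, Awaitable, Callable

# Stubs standing in for the handlers described above.
async def run_direct_chat(request: Any) -> str: ...
async def run_graphrag_query(request: Any) -> str: ...
async def run_duckduckgo_search(request: Any) -> str: ...
async def run_full_model_search(request: Any) -> str: ...

MODEL_HANDLERS: dict[str, Callable[[Any], Awaitable[str]]] = {
    "direct-chat": run_direct_chat,
    "graphrag-local-search:latest": run_graphrag_query,
    "graphrag-global-search:latest": run_graphrag_query,
    "duckduckgo-search:latest": run_duckduckgo_search,
    "full-model:latest": run_full_model_search,
}

async def dispatch(model: str, request: Any) -> str:
    handler = MODEL_HANDLERS.get(model)
    if handler is None:
        raise ValueError(f"Unknown model: {model}")
    return await handler(request)
```
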
### Background Tasks

Long-running tasks like prompt tuning and indexing are executed as background tasks to prevent blocking the API.

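FastAPI's `BackgroundTasks` makes this pattern straightforward. A minimal sketch, assuming a simple in-memory status store (the actual implementation in `api.py` may track status and logs differently):

```python
from fastapi import BackgroundTasks, FastAPI

app = FastAPI()
index_status = {"status": "idle", "logs": []}  # simplistic in-memory store

def run_indexing_job() -> None:
    # Runs after the HTTP response has been sent, so the API stays responsive.
    index_status["status"] = "running"
    # ... perform the indexing work, appending progress to index_status["logs"] ...
    index_status["status"] = "completed"

@app.post("/v1/index")
async def start_indexing(background_tasks: BackgroundTasks):
    background_tasks.add_task(run_indexing_job)
    return {"status": "indexing started"}

@app.get("/v1/index_status")
async def get_index_status():
    return index_status
```
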
## Usage Examples

### Sending a GraphRAG Query

```python
import requests

url = "http://localhost:8012/v1/chat/completions"
payload = {
    "model": "graphrag-local-search:latest",
    "messages": [{"role": "user", "content": "What is GraphRAG?"}],
    "query_options": {
        "query_type": "local-search",
        "selected_folder": "your_indexed_folder",
        "community_level": 2,
        "response_type": "Multiple Paragraphs"
    }
}

response = requests.post(url, json=payload)
print(response.json())
```

### Starting the Indexing Process

```python
import requests

url = "http://localhost:8012/v1/index"
payload = {
    "llm_model": "your_llm_model",
    "embed_model": "your_embed_model",
    "root": "./indexing",
    "verbose": True,
    "emit": ["parquet", "csv"]
}

response = requests.post(url, json=payload)
print(response.json())
```

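Because indexing runs in the background, the request above returns immediately. Progress can then be polled via `/v1/index_status`; the status values checked below are placeholders, so match them to what your server actually returns:

```python
import time

import requests

status_url = "http://localhost:8012/v1/index_status"

while True:
    status = requests.get(status_url).json()
    print(status)
    # "completed" and "failed" are placeholder values.
    if status.get("status") in ("completed", "failed"):
        break
    time.sleep(5)
```
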
## Configuration

The API can be configured through:

1. Environment variables
2. A `config.yaml` file (path specified by the `GRAPHRAG_CONFIG` environment variable)
3. Command-line arguments when starting the server

Key configuration options:

- `llm_model`: The language model to use
- `embedding_model`: The embedding model for vector representations
- `community_level`: Depth of community analysis in GraphRAG
- `token_limit`: Maximum tokens for context
- `api_key`: API key for the LLM service
- `api_base`: Base URL for the LLM API
- `api_type`: Type of API (e.g., "openai")

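One plausible way such layered configuration could be resolved is sketched below; the resolution order and key names here are assumptions for illustration, not taken from `api.py`:

```python
import os

import yaml  # pip install pyyaml

CONFIG_KEYS = ("llm_model", "embedding_model", "community_level",
               "token_limit", "api_key", "api_base", "api_type")

def load_config() -> dict:
    """Merge config.yaml (if present) with environment-variable overrides."""
    config: dict = {}
    config_path = os.getenv("GRAPHRAG_CONFIG", "config.yaml")
    if os.path.exists(config_path):
        with open(config_path) as f:
            config.update(yaml.safe_load(f) or {})
    # Let environment variables override file values (an assumed precedence).
    for key in CONFIG_KEYS:
        env_value = os.getenv(key.upper())
        if env_value is not None:
            config[key] = env_value
    return config
```
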
## Troubleshooting

1. If you encounter connection errors with Ollama, ensure the service is running and accessible.
2. For "context loading failed" errors, check that the indexed data is present in the specified output folder.
3. If prompt tuning or indexing processes fail, review the logs using the respective status endpoints.
4. For performance issues, consider adjusting the `community_level` and `token_limit` settings.

For more detailed information on GraphRAG's indexing and querying processes, refer to the official GraphRAG documentation.