retrieval

Retrieval, RAG configuration, and web search models.

Classes

CollectionNameForm

Bases: BaseModel

Form for specifying a collection name.

Attributes

collection_name
collection_name: Optional[str] = None

The name of the collection.

ProcessUrlForm

Bases: CollectionNameForm

Form for processing a URL.

Attributes

url
url: str

The URL to process.

SearchForm

Bases: BaseModel

Form for search queries.

Attributes

queries
queries: List[str]

List of search queries.

OpenAIConfigForm

Bases: BaseModel

Configuration for OpenAI embedding model.

Attributes

url
url: str

The base URL for the OpenAI API.

key
key: str

The API key for the OpenAI API.

OllamaConfigForm

Bases: BaseModel

Configuration for Ollama embedding model.

Attributes

url
url: str

The base URL for the Ollama API.

key
key: str

The API key for the Ollama API.

AzureOpenAIConfigForm

Bases: BaseModel

Configuration for Azure OpenAI embedding model.

Attributes

url
url: str

The base URL for the Azure OpenAI API.

key
key: str

The API key for the Azure OpenAI API.

version
version: str

The API version for the Azure OpenAI API.

EmbeddingModelUpdateForm

Bases: BaseModel

Form for updating the embedding model configuration.

Attributes

openai_config
openai_config: Optional[OpenAIConfigForm] = None

Configuration for OpenAI embedding model.

ollama_config
ollama_config: Optional[OllamaConfigForm] = None

Configuration for Ollama embedding model.

azure_openai_config
azure_openai_config: Optional[AzureOpenAIConfigForm] = None

Configuration for Azure OpenAI embedding model.

RAG_EMBEDDING_ENGINE
RAG_EMBEDDING_ENGINE: str

The embedding engine to use (e.g., 'ollama', 'openai').

RAG_EMBEDDING_MODEL
RAG_EMBEDDING_MODEL: str

The embedding model to use.

RAG_EMBEDDING_BATCH_SIZE
RAG_EMBEDDING_BATCH_SIZE: Optional[int] = 1

The batch size for embedding generation.

ENABLE_ASYNC_EMBEDDING
ENABLE_ASYNC_EMBEDDING: Optional[bool] = True

Whether to enable asynchronous embedding generation.

RAG_EMBEDDING_CONCURRENT_REQUESTS
RAG_EMBEDDING_CONCURRENT_REQUESTS: Optional[int] = 0

The number of concurrent embedding requests.
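
As a rough illustration, a request body for this form could be assembled as below. Only RAG_EMBEDDING_ENGINE and RAG_EMBEDDING_MODEL lack defaults in the signatures above; the engine, model name, URL, and key used here are placeholder values chosen for the sketch, not defaults taken from the source.

```python
import json

# Hypothetical payload for EmbeddingModelUpdateForm. All values are
# placeholders for illustration; substitute your own engine, model, and key.
embedding_payload = {
    "RAG_EMBEDDING_ENGINE": "openai",
    "RAG_EMBEDDING_MODEL": "text-embedding-3-small",
    "RAG_EMBEDDING_BATCH_SIZE": 16,
    "ENABLE_ASYNC_EMBEDDING": True,
    # Only the config block matching the chosen engine is supplied here.
    "openai_config": {
        "url": "https://api.openai.com/v1",
        "key": "sk-placeholder",
    },
}

# Serialize to the JSON string that would be sent as the request body.
embedding_body = json.dumps(embedding_payload)
```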

WebConfig

Bases: BaseModel

Configuration for web search and retrieval.

Attributes

ENABLE_WEB_SEARCH
ENABLE_WEB_SEARCH: Optional[bool] = None

Whether to enable web search.

ENABLE_WEB_SEARCH_CONFIRMATION
ENABLE_WEB_SEARCH_CONFIRMATION: Optional[bool] = None

Whether users must confirm before a web search runs. When enabled, the client UI shows the WEB_SEARCH_CONFIRMATION_CONTENT message and requires acknowledgement before the search proceeds.

WEB_SEARCH_CONFIRMATION_CONTENT
WEB_SEARCH_CONFIRMATION_CONTENT: Optional[str] = None

Confirmation message shown to users before a web search runs, when ENABLE_WEB_SEARCH_CONFIRMATION is enabled. Defaults to 'Your query will be sent to the configured web search provider.'.
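
The interaction between the two confirmation settings can be sketched as follows. The helper name confirmation_prompt and the use of a plain dict for the configuration are assumptions made for this example; only the field names and the default message come from the descriptions above.

```python
from typing import Optional

# Default message quoted from the WEB_SEARCH_CONFIRMATION_CONTENT description.
DEFAULT_CONFIRMATION = "Your query will be sent to the configured web search provider."


def confirmation_prompt(web_config: dict) -> Optional[str]:
    """Return the message to show before a search, or None if none is required.

    Hypothetical helper: `web_config` stands in for a WebConfig-like mapping.
    """
    if not web_config.get("ENABLE_WEB_SEARCH_CONFIRMATION"):
        return None
    # Fall back to the documented default when no custom message is set.
    return web_config.get("WEB_SEARCH_CONFIRMATION_CONTENT") or DEFAULT_CONFIRMATION
```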

WEB_SEARCH_ENGINE
WEB_SEARCH_ENGINE: Optional[str] = None

The web search engine to use.

WEB_SEARCH_TRUST_ENV
WEB_SEARCH_TRUST_ENV: Optional[bool] = None

Whether web search requests should honor settings taken from environment variables (typically proxy settings such as HTTP_PROXY and HTTPS_PROXY).

WEB_SEARCH_RESULT_COUNT
WEB_SEARCH_RESULT_COUNT: Optional[int] = None

The number of web search results to retrieve.

WEB_SEARCH_CONCURRENT_REQUESTS
WEB_SEARCH_CONCURRENT_REQUESTS: Optional[int] = None

The number of concurrent web search requests.

WEB_LOADER_CONCURRENT_REQUESTS
WEB_LOADER_CONCURRENT_REQUESTS: Optional[int] = None

The number of concurrent web loader requests.

WEB_SEARCH_DOMAIN_FILTER_LIST
WEB_SEARCH_DOMAIN_FILTER_LIST: Optional[List[str]] = []

List of domains to filter from web search results.

WEB_FETCH_MAX_CONTENT_LENGTH
WEB_FETCH_MAX_CONTENT_LENGTH: Optional[int] = None

Maximum content length in characters for web fetch results. Content exceeding this is truncated.
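
A minimal sketch of character-based truncation is shown below. The function name is hypothetical, and treating None or a non-positive value as "no limit" is an assumption of this example rather than documented behavior.

```python
from typing import Optional


def truncate_content(text: str, max_length: Optional[int]) -> str:
    """Cut fetched content to at most `max_length` characters."""
    # Assumption: an unset or non-positive limit disables truncation.
    if max_length is None or max_length <= 0:
        return text
    return text[:max_length]
```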

BYPASS_WEB_SEARCH_EMBEDDING_AND_RETRIEVAL
BYPASS_WEB_SEARCH_EMBEDDING_AND_RETRIEVAL: Optional[bool] = None

Whether to bypass embedding and retrieval for web search results.

BYPASS_WEB_SEARCH_WEB_LOADER
BYPASS_WEB_SEARCH_WEB_LOADER: Optional[bool] = None

Whether to bypass the web loader for web search results.

OLLAMA_CLOUD_WEB_SEARCH_API_KEY
OLLAMA_CLOUD_WEB_SEARCH_API_KEY: Optional[str] = None

API key for Ollama Cloud web search.

SEARXNG_QUERY_URL
SEARXNG_QUERY_URL: Optional[str] = None

The query URL for SearXNG.

SEARXNG_LANGUAGE
SEARXNG_LANGUAGE: Optional[str] = None

The language for SearXNG.

YACY_QUERY_URL
YACY_QUERY_URL: Optional[str] = None

The query URL for YaCy.

YACY_USERNAME
YACY_USERNAME: Optional[str] = None

The username for YaCy.

YACY_PASSWORD
YACY_PASSWORD: Optional[str] = None

The password for YaCy.

GOOGLE_PSE_API_KEY
GOOGLE_PSE_API_KEY: Optional[str] = None

API key for Google Programmable Search Engine.

GOOGLE_PSE_ENGINE_ID
GOOGLE_PSE_ENGINE_ID: Optional[str] = None

Engine ID for Google Programmable Search Engine.

BRAVE_SEARCH_API_KEY
BRAVE_SEARCH_API_KEY: Optional[str] = None

API key for Brave Search.

BRAVE_SEARCH_CONTEXT_TOKENS
BRAVE_SEARCH_CONTEXT_TOKENS: Optional[int] = None

Maximum number of context tokens returned by Brave Search. Defaults to 8192.

KAGI_SEARCH_API_KEY
KAGI_SEARCH_API_KEY: Optional[str] = None

API key for Kagi Search.

MOJEEK_SEARCH_API_KEY
MOJEEK_SEARCH_API_KEY: Optional[str] = None

API key for Mojeek Search.

YOUCOM_API_KEY
YOUCOM_API_KEY: Optional[str] = None

API key for You.com Search.

BOCHA_SEARCH_API_KEY
BOCHA_SEARCH_API_KEY: Optional[str] = None

API key for Bocha Search.

SERPSTACK_API_KEY
SERPSTACK_API_KEY: Optional[str] = None

API key for Serpstack.

SERPSTACK_HTTPS
SERPSTACK_HTTPS: Optional[bool] = None

Whether to use HTTPS for Serpstack.

SERPER_API_KEY
SERPER_API_KEY: Optional[str] = None

API key for Serper.

SERPHOUSE_API_KEY
SERPHOUSE_API_KEY: Optional[str] = None

API key for SERPHouse Search (used when WEB_SEARCH_ENGINE == 'serphouse').

SERPHOUSE_DOMAIN
SERPHOUSE_DOMAIN: Optional[str] = None

Search domain passed to SERPHouse as the domain query parameter, e.g. 'google.com' or 'bing.com'. Defaults to 'google.com' if empty.

SERPLY_API_KEY
SERPLY_API_KEY: Optional[str] = None

API key for Serply.

TAVILY_API_KEY
TAVILY_API_KEY: Optional[str] = None

API key for Tavily.

SEARCHAPI_API_KEY
SEARCHAPI_API_KEY: Optional[str] = None

API key for SearchAPI.

SEARCHAPI_ENGINE
SEARCHAPI_ENGINE: Optional[str] = None

The engine to use for SearchAPI.

SERPAPI_API_KEY
SERPAPI_API_KEY: Optional[str] = None

API key for SerpAPI.

SERPAPI_ENGINE
SERPAPI_ENGINE: Optional[str] = None

The engine to use for SerpAPI.

JINA_API_KEY
JINA_API_KEY: Optional[str] = None

API key for Jina.

JINA_API_BASE_URL
JINA_API_BASE_URL: Optional[str] = None

Base URL for Jina API.

BING_SEARCH_V7_ENDPOINT
BING_SEARCH_V7_ENDPOINT: Optional[str] = None

The endpoint for Bing Search V7.

BING_SEARCH_V7_SUBSCRIPTION_KEY
BING_SEARCH_V7_SUBSCRIPTION_KEY: Optional[str] = None

The subscription key for Bing Search V7.

EXA_API_KEY
EXA_API_KEY: Optional[str] = None

API key for Exa.

PERPLEXITY_API_KEY
PERPLEXITY_API_KEY: Optional[str] = None

API key for Perplexity.

PERPLEXITY_MODEL
PERPLEXITY_MODEL: Optional[str] = None

The model to use for Perplexity.

PERPLEXITY_SEARCH_CONTEXT_USAGE
PERPLEXITY_SEARCH_CONTEXT_USAGE: Optional[str] = None

The search context usage for Perplexity.

PERPLEXITY_SEARCH_API_URL
PERPLEXITY_SEARCH_API_URL: Optional[str] = None

The search API URL for Perplexity.

MICROSOFT_WEB_IQ_API_BASE_URL
MICROSOFT_WEB_IQ_API_BASE_URL: Optional[str] = None

Base URL for the Microsoft Web IQ API (used for both search and page browsing), selected when WEB_SEARCH_ENGINE == 'microsoft_web_iq' or WEB_LOADER_ENGINE == 'microsoft_web_iq'. Defaults to 'https://api.microsoft.ai/v3'.

MICROSOFT_WEB_IQ_API_KEY
MICROSOFT_WEB_IQ_API_KEY: Optional[str] = None

API key for the Microsoft Web IQ API (sent as the x-apikey header).

MICROSOFT_WEB_IQ_LANGUAGE
MICROSOFT_WEB_IQ_LANGUAGE: Optional[str] = None

Language code forwarded to the Microsoft Web IQ API as the language field, e.g. 'en'. Defaults to 'en'.

SOUGOU_API_SID
SOUGOU_API_SID: Optional[str] = None

The SID for Sougou API.

SOUGOU_API_SK
SOUGOU_API_SK: Optional[str] = None

The SK for Sougou API.

WEB_LOADER_ENGINE
WEB_LOADER_ENGINE: Optional[str] = None

The web loader engine to use.

WEB_LOADER_TIMEOUT
WEB_LOADER_TIMEOUT: Optional[str] = None

The timeout for the web loader.

ENABLE_WEB_LOADER_SSL_VERIFICATION
ENABLE_WEB_LOADER_SSL_VERIFICATION: Optional[bool] = None

Whether to enable SSL verification for the web loader.

PLAYWRIGHT_WS_URL
PLAYWRIGHT_WS_URL: Optional[str] = None

The WebSocket URL for Playwright.

PLAYWRIGHT_TIMEOUT
PLAYWRIGHT_TIMEOUT: Optional[int] = None

The timeout for Playwright.

FIRECRAWL_API_KEY
FIRECRAWL_API_KEY: Optional[str] = None

API key for Firecrawl.

FIRECRAWL_API_BASE_URL
FIRECRAWL_API_BASE_URL: Optional[str] = None

The base URL for Firecrawl.

FIRECRAWL_TIMEOUT
FIRECRAWL_TIMEOUT: Optional[int] = None

The timeout for Firecrawl.

TAVILY_EXTRACT_DEPTH
TAVILY_EXTRACT_DEPTH: Optional[str] = None

The extract depth for Tavily.

EXTERNAL_WEB_SEARCH_URL
EXTERNAL_WEB_SEARCH_URL: Optional[str] = None

The URL for external web search.

EXTERNAL_WEB_SEARCH_API_KEY
EXTERNAL_WEB_SEARCH_API_KEY: Optional[str] = None

The API key for external web search.

EXTERNAL_WEB_LOADER_URL
EXTERNAL_WEB_LOADER_URL: Optional[str] = None

The URL for external web loader.

EXTERNAL_WEB_LOADER_API_KEY
EXTERNAL_WEB_LOADER_API_KEY: Optional[str] = None

The API key for external web loader.

YOUTUBE_LOADER_LANGUAGE
YOUTUBE_LOADER_LANGUAGE: Optional[List[str]] = None

List of languages for YouTube loader.

YOUTUBE_LOADER_PROXY_URL
YOUTUBE_LOADER_PROXY_URL: Optional[str] = None

The proxy URL for YouTube loader.

YOUTUBE_LOADER_TRANSLATION
YOUTUBE_LOADER_TRANSLATION: Optional[str] = None

The translation language for YouTube loader.

DDGS_BACKEND
DDGS_BACKEND: Optional[str] = None

The backend for DDGS.

YANDEX_WEB_SEARCH_URL
YANDEX_WEB_SEARCH_URL: Optional[str] = None

The URL for Yandex web search.

YANDEX_WEB_SEARCH_API_KEY
YANDEX_WEB_SEARCH_API_KEY: Optional[str] = None

API key for Yandex Search.

YANDEX_WEB_SEARCH_CONFIG
YANDEX_WEB_SEARCH_CONFIG: Optional[str] = None

JSON configuration string for Yandex search.

Dict Fields (when parsed as JSON)
  • query (dict, optional): Query configuration options.
  • searchType (str, optional): Search type, set inside query, e.g., 'SEARCH_TYPE_COM'.
  • Additional Yandex API parameters may be included.

Defaults to '{"query": {"searchType": "SEARCH_TYPE_COM"}}' if not specified.
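
Because this setting is a JSON string rather than a dictionary, it has to be decoded before use. One possible way to do that with the documented default is sketched here; parse_yandex_config is a hypothetical helper, and rejecting non-object JSON is an assumption of the example.

```python
import json
from typing import Optional

# Default value quoted from the field description.
DEFAULT_YANDEX_CONFIG = '{"query": {"searchType": "SEARCH_TYPE_COM"}}'


def parse_yandex_config(raw: Optional[str]) -> dict:
    """Decode YANDEX_WEB_SEARCH_CONFIG, using the default when unset or blank."""
    config = json.loads(raw if raw and raw.strip() else DEFAULT_YANDEX_CONFIG)
    # Assumption: only a JSON object is meaningful as a configuration.
    if not isinstance(config, dict):
        raise ValueError("YANDEX_WEB_SEARCH_CONFIG must decode to a JSON object")
    return config
```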

LINKUP_API_KEY
LINKUP_API_KEY: Optional[str] = None

API key for Linkup Search.

LINKUP_SEARCH_PARAMS
LINKUP_SEARCH_PARAMS: Optional[Dict] = None

Parameters for Linkup search.

Dict Fields
  • url (str, optional): Override endpoint URL. Defaults to 'https://api.linkup.so/v1/search'.
  • depth (str, optional): Search depth. Typical values: 'standard', 'deep'. Defaults to 'standard'.
  • outputType (str, optional): Output type. Typical values: 'sourcedAnswer', 'searchResults'. Defaults to 'sourcedAnswer'.
  • Additional Linkup API parameters may be included.

The dictionary is forwarded to the Linkup Search API as the JSON body (with q and maxResults injected automatically). The special url key, if present, is popped and used as the request endpoint instead of the default. See the Linkup API documentation for additional supported parameters.
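
The handling described above (pop url, apply defaults, inject q and maxResults) might look roughly like this. The function name and the tuple return shape are assumptions for illustration; the actual backend code may differ.

```python
from typing import Optional, Tuple

# Default endpoint quoted from the Dict Fields list.
DEFAULT_LINKUP_URL = "https://api.linkup.so/v1/search"


def build_linkup_request(params: Optional[dict], query: str, count: int) -> Tuple[str, dict]:
    """Return (endpoint, JSON body) for a Linkup search request."""
    body = dict(params or {})  # copy so the stored configuration is not mutated
    # The special `url` key is removed from the body and used as the endpoint.
    url = body.pop("url", None) or DEFAULT_LINKUP_URL
    body.setdefault("depth", "standard")
    body.setdefault("outputType", "sourcedAnswer")
    # `q` and `maxResults` are injected automatically.
    body["q"] = query
    body["maxResults"] = count
    return url, body
```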

ConfigForm

Bases: BaseModel

Configuration form for retrieval settings.

Attributes

RAG_TEMPLATE
RAG_TEMPLATE: Optional[str] = None

Template for RAG.

TOP_K
TOP_K: Optional[int] = None

Top K results to retrieve.

BYPASS_EMBEDDING_AND_RETRIEVAL
BYPASS_EMBEDDING_AND_RETRIEVAL: Optional[bool] = None

Whether to bypass embedding and retrieval.

RAG_FULL_CONTEXT
RAG_FULL_CONTEXT: Optional[bool] = None

Whether to use full context for RAG.

ENABLE_RAG_HYBRID_SEARCH
ENABLE_RAG_HYBRID_SEARCH: Optional[bool] = None

Whether to enable hybrid search.

ENABLE_RAG_HYBRID_SEARCH_ENRICHED_TEXTS
ENABLE_RAG_HYBRID_SEARCH_ENRICHED_TEXTS: Optional[bool] = None

Whether to enable enriched texts for hybrid search.

TOP_K_RERANKER
TOP_K_RERANKER: Optional[int] = None

Top K results for reranker.

RELEVANCE_THRESHOLD
RELEVANCE_THRESHOLD: Optional[float] = None

Relevance threshold for search results.

HYBRID_BM25_WEIGHT
HYBRID_BM25_WEIGHT: Optional[float] = None

Weight for BM25 in hybrid search.

CONTENT_EXTRACTION_ENGINE
CONTENT_EXTRACTION_ENGINE: Optional[str] = None

Engine for content extraction.

PDF_EXTRACT_IMAGES
PDF_EXTRACT_IMAGES: Optional[bool] = None

Whether to extract images from PDFs.

PDF_LOADER_MODE
PDF_LOADER_MODE: Optional[str] = None

Mode for PDF loading. 'page' creates one document per page, 'single' combines all pages into one document for better chunking across page boundaries.
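
The difference between the two modes can be illustrated with a small sketch that works on already-extracted page texts. The function name and the blank-line separator used when joining pages are assumptions of this example.

```python
from typing import List


def build_documents(pages: List[str], mode: str = "page") -> List[str]:
    """Group extracted page texts according to a PDF_LOADER_MODE-like setting."""
    if mode == "single":
        # One combined document, so chunks can span page boundaries.
        return ["\n\n".join(pages)]
    if mode == "page":
        # One document per page.
        return list(pages)
    raise ValueError(f"unsupported PDF_LOADER_MODE: {mode!r}")
```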

DATALAB_MARKER_API_KEY
DATALAB_MARKER_API_KEY: Optional[str] = None

API key for DataLab Marker.

DATALAB_MARKER_API_BASE_URL
DATALAB_MARKER_API_BASE_URL: Optional[str] = None

Base URL for DataLab Marker API.

DATALAB_MARKER_ADDITIONAL_CONFIG
DATALAB_MARKER_ADDITIONAL_CONFIG: Optional[str] = None

Additional configuration for DataLab Marker.

DATALAB_MARKER_SKIP_CACHE
DATALAB_MARKER_SKIP_CACHE: Optional[bool] = None

Whether to skip cache for DataLab Marker.

DATALAB_MARKER_FORCE_OCR
DATALAB_MARKER_FORCE_OCR: Optional[bool] = None

Whether to force OCR for DataLab Marker.

DATALAB_MARKER_PAGINATE
DATALAB_MARKER_PAGINATE: Optional[bool] = None

Whether to paginate results for DataLab Marker.

DATALAB_MARKER_STRIP_EXISTING_OCR
DATALAB_MARKER_STRIP_EXISTING_OCR: Optional[bool] = None

Whether to strip existing OCR for DataLab Marker.

DATALAB_MARKER_DISABLE_IMAGE_EXTRACTION
DATALAB_MARKER_DISABLE_IMAGE_EXTRACTION: Optional[bool] = None

Whether to disable image extraction for DataLab Marker.

DATALAB_MARKER_FORMAT_LINES
DATALAB_MARKER_FORMAT_LINES: Optional[bool] = None

Whether to format lines for DataLab Marker.

DATALAB_MARKER_USE_LLM
DATALAB_MARKER_USE_LLM: Optional[bool] = None

Whether to use LLM for DataLab Marker.

DATALAB_MARKER_OUTPUT_FORMAT
DATALAB_MARKER_OUTPUT_FORMAT: Optional[str] = None

Output format for DataLab Marker.

EXTERNAL_DOCUMENT_LOADER_URL
EXTERNAL_DOCUMENT_LOADER_URL: Optional[str] = None

URL for external document loader.

EXTERNAL_DOCUMENT_LOADER_API_KEY
EXTERNAL_DOCUMENT_LOADER_API_KEY: Optional[str] = None

API key for external document loader.

EXTERNAL_DOCUMENT_LOADER_HEADERS
EXTERNAL_DOCUMENT_LOADER_HEADERS: Optional[Dict[str, str]] = None

Extra HTTP headers appended to requests sent to the external document loader server, in addition to the auto-injected Content-Type, Authorization (built from EXTERNAL_DOCUMENT_LOADER_API_KEY), and X-Filename. Only used when CONTENT_EXTRACTION_ENGINE == 'external'.

Values are strings (any non-string value is coerced to a string at request time) and may contain template tokens that are substituted per uploaded file before the request is sent.

Dict Fields
  • <header-name> (str, optional): Any HTTP header name mapped to a string value. Values support case-sensitive template tokens that are replaced at request time, including {{FILE_ID}}, {{FILE_NAME}}, {{FILE_CONTENT_TYPE}}, {{CHAT_ID}}, {{MESSAGE_ID}}, {{USER_MESSAGE_ID}}, {{USER_MESSAGE_PARENT_ID}}, {{USER_ID}}, {{USER_NAME}}, {{USER_EMAIL}}, {{USER_ROLE}}, {{USER_AGENT}}, and {{TASK}}.

Defaults to {} when unset. Example: {"X-OpenWebUI-File-Id": "{{FILE_ID}}"}.
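
A minimal sketch of per-file template substitution follows. The helper render_headers and the context dictionary are hypothetical; leaving unknown tokens untouched is an assumption of this example, while string coercion and case-sensitive matching follow the description above.

```python
import re
from typing import Any, Dict

# Matches case-sensitive tokens such as {{FILE_ID}} or {{USER_EMAIL}}.
TOKEN_PATTERN = re.compile(r"\{\{([A-Z_]+)\}\}")


def render_headers(headers: Dict[str, Any], context: Dict[str, str]) -> Dict[str, str]:
    """Substitute template tokens in header values using per-file context."""

    def substitute(value: Any) -> str:
        # Non-string values are coerced to strings before substitution.
        # Assumption: tokens missing from `context` are left unchanged.
        return TOKEN_PATTERN.sub(lambda m: context.get(m.group(1), m.group(0)), str(value))

    return {name: substitute(value) for name, value in (headers or {}).items()}
```

For example, render_headers({"X-OpenWebUI-File-Id": "{{FILE_ID}}"}, {"FILE_ID": "abc123"}) yields a header value of "abc123", while a lowercase {{file_id}} would pass through unchanged.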

TIKA_SERVER_URL
TIKA_SERVER_URL: Optional[str] = None

URL for Tika server.

DOCLING_SERVER_URL
DOCLING_SERVER_URL: Optional[str] = None

URL for Docling server.

DOCLING_API_KEY
DOCLING_API_KEY: Optional[str] = None

API key for Docling.

DOCLING_PARAMS
DOCLING_PARAMS: Optional[Dict] = None

Parameters for Docling.

Dict Fields
  • image_export_mode (str, optional): How images should be exported. Defaults to "placeholder" if not specified.
  • Additional VLM (Vision Language Model) pipeline parameters may be supported by the Docling API.

This dictionary is passed directly to the Docling API's /v1/convert/file endpoint. See the Docling API documentation for additional supported parameters.

DOCUMENT_INTELLIGENCE_ENDPOINT
DOCUMENT_INTELLIGENCE_ENDPOINT: Optional[str] = None

Endpoint for Document Intelligence.

DOCUMENT_INTELLIGENCE_KEY
DOCUMENT_INTELLIGENCE_KEY: Optional[str] = None

Key for Document Intelligence.

DOCUMENT_INTELLIGENCE_MODEL
DOCUMENT_INTELLIGENCE_MODEL: Optional[str] = None

Model for Document Intelligence.

MISTRAL_OCR_API_BASE_URL
MISTRAL_OCR_API_BASE_URL: Optional[str] = None

Base URL for Mistral OCR API.

MISTRAL_OCR_API_KEY
MISTRAL_OCR_API_KEY: Optional[str] = None

API key for Mistral OCR.

MISTRAL_OCR_USE_BASE64
MISTRAL_OCR_USE_BASE64: Optional[bool] = None

When True (and CONTENT_EXTRACTION_ENGINE == 'mistral_ocr'), send the PDF as a base64 data URL inline instead of first uploading it to Mistral and referencing the uploaded file. Defaults to False.
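
For reference, a base64 data URL for a PDF can be built as in the sketch below. The function name is hypothetical, and the exact request field that carries this URL is not specified here.

```python
import base64


def pdf_to_data_url(pdf_bytes: bytes) -> str:
    """Encode raw PDF bytes as an inline base64 data URL."""
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    return f"data:application/pdf;base64,{encoded}"
```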

PADDLEOCR_VL_BASE_URL
PADDLEOCR_VL_BASE_URL: Optional[str] = None

Base URL for PaddleOCR VL service. Defaults to 'http://localhost:8080'.

PADDLEOCR_VL_TOKEN
PADDLEOCR_VL_TOKEN: Optional[str] = None

Authentication token for PaddleOCR VL service.

MINERU_API_MODE
MINERU_API_MODE: Optional[str] = None

API mode for MinerU.

MINERU_API_URL
MINERU_API_URL: Optional[str] = None

URL for MinerU API.

MINERU_API_KEY
MINERU_API_KEY: Optional[str] = None

API key for MinerU.

MINERU_API_TIMEOUT
MINERU_API_TIMEOUT: Optional[str] = None

The timeout for the MinerU API.

MINERU_PARAMS
MINERU_PARAMS: Optional[Dict] = None

Parameters for MinerU.

Dict Fields
  • enable_ocr (bool, optional): Enable OCR processing. Defaults to False.
  • enable_formula (bool, optional): Enable formula processing. Defaults to True.
  • enable_table (bool, optional): Enable table processing. Defaults to True.
  • language (str, optional): Language code for processing. Defaults to "en".
  • model_version (str, optional): Model version to use. Defaults to "pipeline".
  • page_ranges (str, optional): Page ranges to process. Defaults to empty string.

This dictionary is passed directly to the MinerU API for document parsing configuration.
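
The documented defaults can be combined with user-supplied values as sketched below. The helper name resolve_mineru_params is an assumption; whether the backend merges defaults in exactly this way is not confirmed by the descriptions above.

```python
from typing import Optional

# Defaults quoted from the Dict Fields list.
MINERU_DEFAULTS = {
    "enable_ocr": False,
    "enable_formula": True,
    "enable_table": True,
    "language": "en",
    "model_version": "pipeline",
    "page_ranges": "",
}


def resolve_mineru_params(params: Optional[dict]) -> dict:
    """Overlay user-supplied MINERU_PARAMS on top of the documented defaults."""
    return {**MINERU_DEFAULTS, **(params or {})}
```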

MINERU_FILE_EXTENSIONS
MINERU_FILE_EXTENSIONS: Optional[List[str]] = None

List of file extensions that MinerU is allowed to process (e.g., ['pdf']).

Uploaded files with extensions in this list are routed through the MinerU content extraction engine when CONTENT_EXTRACTION_ENGINE is set to 'mineru'. The frontend typically accepts a comma-separated string (e.g., 'pdf') and splits it into a list. Defaults to ['pdf'] if not specified.
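
The splitting and routing described here could be sketched as follows. Both helper names are hypothetical, and the normalization (trimming, lowercasing, dropping a leading dot) is an assumption of this example.

```python
from typing import List, Optional


def parse_extensions(raw: str) -> List[str]:
    """Split a comma-separated string such as 'pdf, docx' into a list."""
    # Assumption: entries are trimmed, lowercased, and stripped of a leading dot.
    return [part.strip().lower().lstrip(".") for part in raw.split(",") if part.strip()]


def uses_mineru(filename: str, engine: str, extensions: Optional[List[str]]) -> bool:
    """Decide whether a file should be routed through MinerU."""
    allowed = extensions or ["pdf"]  # documented default
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    return engine == "mineru" and ext in allowed
```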

RAG_RERANKING_MODEL
RAG_RERANKING_MODEL: Optional[str] = None

Model for RAG reranking.

RAG_RERANKING_ENGINE
RAG_RERANKING_ENGINE: Optional[str] = None

Engine for RAG reranking.

RAG_RERANKING_BATCH_SIZE
RAG_RERANKING_BATCH_SIZE: Optional[int] = None

Batch size for reranking operations. Defaults to 32.

RAG_EXTERNAL_RERANKER_URL
RAG_EXTERNAL_RERANKER_URL: Optional[str] = None

URL for external reranker.

RAG_EXTERNAL_RERANKER_API_KEY
RAG_EXTERNAL_RERANKER_API_KEY: Optional[str] = None

API key for external reranker.

RAG_EXTERNAL_RERANKER_TIMEOUT
RAG_EXTERNAL_RERANKER_TIMEOUT: Optional[str] = None

The timeout for the external reranker.

TEXT_SPLITTER
TEXT_SPLITTER: Optional[str] = None

Text splitter to use.

RAG_TOKENIZER_MODEL
RAG_TOKENIZER_MODEL: Optional[str] = None

HuggingFace tokenizer model id (or local path) used by the 'token_transformers' text splitter (TEXT_SPLITTER == 'token_transformers') for token-based chunking. A bare name with no path/slash is prefixed with 'sentence-transformers/'. When empty, the backend falls back to the configured embedding model's tokenizer and raises if none is available.
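
The name resolution rules can be approximated with the sketch below. The function is hypothetical: it returns the embedding model name unchanged as a stand-in for "the embedding model's tokenizer", which is an assumption of this example.

```python
from typing import Optional


def resolve_tokenizer_id(tokenizer_model: Optional[str], embedding_model: Optional[str]) -> str:
    """Pick the tokenizer identifier used for token-based chunking."""
    name = (tokenizer_model or "").strip()
    if name:
        # A bare name with no path separator gets the sentence-transformers prefix.
        if "/" not in name and "\\" not in name:
            return f"sentence-transformers/{name}"
        return name
    # Assumption: the fallback is represented by the embedding model name as-is.
    fallback = (embedding_model or "").strip()
    if not fallback:
        raise ValueError("no tokenizer model available for token-based chunking")
    return fallback
```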

ENABLE_MARKDOWN_HEADER_TEXT_SPLITTER
ENABLE_MARKDOWN_HEADER_TEXT_SPLITTER: Optional[bool] = None

Whether to enable markdown header text splitter.

CHUNK_SIZE
CHUNK_SIZE: Optional[int] = None

Size of text chunks.

CHUNK_MIN_SIZE_TARGET
CHUNK_MIN_SIZE_TARGET: Optional[int] = None

Minimum target size for text chunks.

CHUNK_OVERLAP
CHUNK_OVERLAP: Optional[int] = None

Overlap between text chunks.

FILE_MAX_SIZE
FILE_MAX_SIZE: Optional[int] = None

Maximum size of uploaded files.

FILE_MAX_COUNT
FILE_MAX_COUNT: Optional[int] = None

Maximum count of uploaded files.

FILE_IMAGE_COMPRESSION_WIDTH
FILE_IMAGE_COMPRESSION_WIDTH: Optional[int] = None

Width for image compression.

FILE_IMAGE_COMPRESSION_HEIGHT
FILE_IMAGE_COMPRESSION_HEIGHT: Optional[int] = None

Height for image compression.

ALLOWED_FILE_EXTENSIONS
ALLOWED_FILE_EXTENSIONS: Optional[List[str]] = None

List of allowed file extensions.

ENABLE_GOOGLE_DRIVE_INTEGRATION
ENABLE_GOOGLE_DRIVE_INTEGRATION: Optional[bool] = None

Whether to enable Google Drive integration.

ENABLE_ONEDRIVE_INTEGRATION
ENABLE_ONEDRIVE_INTEGRATION: Optional[bool] = None

Whether to enable OneDrive integration.

web
web: Optional[WebConfig] = None

Web search configuration.

ProcessFileForm

Bases: BaseModel

Form for processing a file.

Attributes

file_id
file_id: str

The ID of the file to process.

content
content: Optional[str] = None

The content of the file.

collection_name
collection_name: Optional[str] = None

The name of the collection.

ProcessTextForm

Bases: BaseModel

Form for processing text.

Attributes

name
name: str

The name of the text.

content
content: str

The text content.

collection_name
collection_name: Optional[str] = None

The name of the collection.

QueryDocForm

Bases: BaseModel

Form for querying a document.

Attributes

collection_name
collection_name: str

The name of the collection to query.

query
query: str

The search query.

k
k: Optional[int] = None

Number of results to retrieve.

k_reranker
k_reranker: Optional[int] = None

Number of results to rerank.

r
r: Optional[float] = None

Relevance threshold.

hybrid
hybrid: Optional[bool] = None

Whether to use hybrid search.

hybrid_bm25_weight
hybrid_bm25_weight: Optional[float] = None

Weight for BM25 in hybrid search.

QueryCollectionsForm

Bases: BaseModel

Form for querying multiple collections.

Attributes

collection_names
collection_names: List[str]

List of collection names to query.

query
query: str

The search query.

k
k: Optional[int] = None

Number of results to retrieve.

k_reranker
k_reranker: Optional[int] = None

Number of results to rerank.

r
r: Optional[float] = None

Relevance threshold.

hybrid
hybrid: Optional[bool] = None

Whether to use hybrid search.

hybrid_bm25_weight
hybrid_bm25_weight: Optional[float] = None

Weight for BM25 in hybrid search.

enable_enriched_texts
enable_enriched_texts: Optional[bool] = None

Whether to enable enriched texts.
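
Since only collection_names and query are required, a client might build the request body by dropping unset options, as in this sketch. The helper name is hypothetical, and omitting None values (rather than sending explicit nulls) is a design choice of the example.

```python
from typing import Any, Dict, List


def build_query_payload(collection_names: List[str], query: str, **options: Any) -> Dict[str, Any]:
    """Assemble a QueryCollectionsForm-style body, omitting unset options."""
    payload: Dict[str, Any] = {"collection_names": collection_names, "query": query}
    # Optional fields (k, k_reranker, r, hybrid, ...) are included only when set.
    payload.update({key: value for key, value in options.items() if value is not None})
    return payload
```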

DeleteForm

Bases: BaseModel

Form for deleting a file from a collection.

Attributes

collection_name
collection_name: str

The name of the collection.

file_id
file_id: str

The ID of the file to delete.

BatchProcessFilesForm

Bases: BaseModel

Form for batch processing files.

Attributes

files
files: List[FileModel]

List of files to process.

collection_name
collection_name: str

The name of the collection.

BatchProcessFilesResult

Bases: BaseModel

Result of a batch file processing operation.

Attributes

file_id
file_id: str

The ID of the file.

status
status: str

The status of the processing.

error
error: Optional[str] = None

The error message if processing failed.

BatchProcessFilesResponse

Bases: BaseModel

Response for a batch file processing request.

Attributes

results
results: List[BatchProcessFilesResult]

List of successful results.

errors
errors: List[BatchProcessFilesResult]

List of failed results.
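
One way to picture the response shape is to partition per-file outcomes into the two lists. In the sketch below, the items are plain dicts mirroring BatchProcessFilesResult, and both the status strings and the rule "a non-empty error means failure" are assumptions for illustration.

```python
from typing import Dict, List, Optional


def split_batch_results(items: List[Dict[str, Optional[str]]]) -> Dict[str, list]:
    """Partition per-file outcomes into `results` and `errors` lists."""
    response: Dict[str, list] = {"results": [], "errors": []}
    for item in items:
        # Assumption: an item with a non-empty `error` counts as failed.
        key = "errors" if item.get("error") else "results"
        response[key].append(item)
    return response
```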