Documentation Overview
Welcome to the OpenWebUI documentation landing page. This parent README links to and summarizes two key guides:
- Document Creation & Batch Optimization: Best practices for authoring and preparing high-quality documents and uploading them in batches.
- Content Retrieval Management: Guidelines for configuring and tuning ingestion, chunking, embedding, and extraction pipelines.
Use this page as your starting point to navigate to detailed instructions tailored to your workflow.
1. Document Creation & Batch Optimization
A guide to authoring and preparing documents so they ingest efficiently:
- Focus on Essential Content: Techniques to keep documents concise, structured, and retrieval-friendly.
- Minimize Historical Noise: Strategies to prune outdated background and avoid retrieval inaccuracies.
- Pre-Upload & Batch Preparation: Supported formats, naming conventions, manifests, file sizing, and splitting for batch uploads.
- Quality Checks: Pre-chunk simulations, search testing, and peer reviews to validate content accuracy.
- Maintenance & Cleanup: Archival, temp file cleanup, and version retention policies.
See full guide: Document Creation & Batch Optimization Best Practices
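As a rough illustration of the pre-upload and batch preparation step, the sketch below builds a simple manifest that flags unsupported formats and files that should be split before upload. The extension list, the size limit, and the names `SUPPORTED_EXTENSIONS`, `MAX_BYTES`, `ManifestEntry`, and `build_manifest` are all assumptions made for this example, not OpenWebUI settings or APIs; take the real formats and limits from the full guide.

```python
import os
from dataclasses import dataclass

# Assumed values for illustration only; they are not OpenWebUI defaults.
SUPPORTED_EXTENSIONS = {".md", ".txt", ".pdf", ".docx"}
MAX_BYTES = 5 * 1024 * 1024  # hypothetical per-file size limit


@dataclass
class ManifestEntry:
    name: str
    size_bytes: int
    status: str  # "ok", "unsupported", or "split"


def build_manifest(files):
    """Classify (filename, size_bytes) pairs for a batch upload."""
    entries = []
    for name, size in sorted(files):
        ext = os.path.splitext(name)[1].lower()
        if ext not in SUPPORTED_EXTENSIONS:
            status = "unsupported"  # convert or drop before uploading
        elif size > MAX_BYTES:
            status = "split"  # too large: split into smaller files first
        else:
            status = "ok"
        entries.append(ManifestEntry(name, size, status))
    return entries
```

Calling `build_manifest([("notes.md", 2048), ("scan.pdf", 10 * 1024 * 1024)])` would mark the first file as `ok` and the second as `split` under these assumed limits.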
2. Content Retrieval Management
Detailed recommendations for configuring OpenWebUI’s extraction, chunking, and embedding subsystems:
- Chunk Size & Overlap: Character- and token-based guidelines and tuning tips for different document types.
- Model Limits & Context Windows: Typical context limits for OpenAI and other models, to inform your chunking strategy.
- Content Extraction Engines: Overview of Default, Tika, Mistral OCR, Document Intelligence, Docling, and External backends.
- Embedding Engines & Models: Supported engines (SentenceTransformers, OpenAI, Ollama), model dimensions, and batch size best practices.
See full guide: Content Retrieval Configuration Best Practices
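To make the chunk size and overlap idea concrete, here is a minimal character-based sketch. It is not OpenWebUI's implementation; the function name `chunk_text` and the default values are assumptions chosen for illustration, and the full guide covers the recommended settings for each document type.

```python
def chunk_text(text, chunk_size=1000, overlap=100):
    """Split text into character chunks where each chunk repeats the
    last `overlap` characters of the previous one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks
```

With `chunk_size=4` and `overlap=1`, the string `"abcdefghij"` yields `["abcd", "defg", "ghij"]`: each chunk starts on the last character of the previous one, so a sentence that straddles a boundary still appears intact in at least one chunk when the overlap is large enough.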
3. How to Use This Documentation
- Determine Your Workflow: Identify whether you’re authoring new documents or tuning ingestion pipelines.
- Follow the Relevant Guide: Use the "See full guide" link in the matching section above to reach the detailed steps.
- Implement & Validate: Apply best practices, then test with sample documents and queries.
- Iterate & Monitor: Use monitoring tools and logs to refine configurations over time.