
n0nuser/LocalRAG


localrag

Offline-first RAG system. Your documents, your models, your machine.

What It Is

LocalRAG ingests your local documents, stores embeddings in a local ChromaDB database, and answers questions using Ollama models. No cloud services required.

Documentation

Quick Start (uv + local Ollama)

  1. Install Ollama on your system. See docs/ollama.md (official guide: ollama.com, downloads: ollama.com/download).

  2. Install dependencies:

     uv sync

  3. Start Ollama:

     ollama serve

  4. Pull models (or let localrag setup do it):

     ollama pull nomic-embed-text
     ollama pull llama3.2

  5. Ingest docs and ask a question:

     uv run localrag ingest ./docs
     uv run localrag query "What are the key topics in these documents?"
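Before the first ingest, it can help to confirm that Ollama is up and both models from step 4 are present. A small sketch using Ollama's standard /api/tags model-listing endpoint (the helper names here are illustrative, not part of localrag):

```python
import json
from urllib.request import urlopen

OLLAMA_BASE_URL = "http://localhost:11434"  # Ollama's default address

def missing_models(tags: dict, required: list[str]) -> list[str]:
    """Return the required model names absent from an /api/tags response.

    Ollama lists names with a tag suffix (e.g. "llama3.2:latest"), so
    match on the bare name before the colon as well as the full name.
    """
    installed = {m["name"] for m in tags.get("models", [])}
    installed |= {name.split(":", 1)[0] for name in installed}
    return [r for r in required if r not in installed]

def check() -> None:
    """Query the local Ollama server and print a pull command per missing model."""
    with urlopen(f"{OLLAMA_BASE_URL}/api/tags") as resp:
        todo = missing_models(json.load(resp), ["nomic-embed-text", "llama3.2"])
    for name in todo:
        print(f"ollama pull {name}")
```

Calling `check()` with `ollama serve` running prints nothing once both models are pulled.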

API

Run the API:

uv run uvicorn localrag.api.main:app --reload

Then open:

  • http://127.0.0.1:8000/docs for interactive docs
  • GET /health
  • POST /ingest
  • POST /ingest/directory
  • POST /query (SSE streaming)
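Since POST /query streams its answer over SSE, a client needs to read the response incrementally and pick out the `data:` lines of the text/event-stream format. A minimal stdlib-only sketch; note the `{"question": ...}` request body is an assumption, so check the interactive docs at /docs for the actual schema:

```python
import json
from urllib.request import Request, urlopen

API = "http://127.0.0.1:8000"  # default uvicorn address from the command above

def sse_data_lines(lines):
    """Yield the payload of each `data:` line from a text/event-stream body."""
    for line in lines:
        if line.startswith("data:"):
            yield line[len("data:"):].strip()

def ask(question: str) -> None:
    """POST a question to /query and print streamed chunks as they arrive.

    NOTE: the request body shape is assumed, not taken from the API's schema.
    """
    req = Request(
        f"{API}/query",
        data=json.dumps({"question": question}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(req) as resp:
        for data in sse_data_lines(line.decode("utf-8") for line in resp):
            print(data, flush=True)
```

With the API running, `ask("What are the key topics in these documents?")` prints the answer chunk by chunk.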

Configuration

Copy .env.example to .env and tweak values:

cp .env.example .env

Main keys:

  • OLLAMA_BASE_URL
  • OLLAMA_EMBED_MODEL
  • OLLAMA_LLM_MODEL
  • CHROMA_PERSIST_PATH
  • CHROMA_COLLECTION_NAME
  • CHUNK_CHARS
  • CHUNK_OVERLAP_CHARS
  • INGEST_RECURSIVE
  • RAG_TOP_K
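CHUNK_CHARS and CHUNK_OVERLAP_CHARS suggest fixed-size character chunking with a sliding overlap, so that sentences cut at a boundary still appear whole in one chunk. An illustrative sketch of that behavior (not LocalRAG's actual chunker, and the default values here are invented for the example):

```python
def chunk_text(text: str, chunk_chars: int = 1000, overlap_chars: int = 200) -> list[str]:
    """Split text into windows of chunk_chars characters, each window
    starting overlap_chars before the previous one ended."""
    if overlap_chars >= chunk_chars:
        raise ValueError("overlap_chars must be smaller than chunk_chars")
    step = chunk_chars - overlap_chars  # how far each window advances
    return [text[i:i + chunk_chars] for i in range(0, len(text), step)]
```

A larger overlap improves recall at chunk boundaries at the cost of more embeddings to store and search.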

Docker

With Compose, Ollama runs in a container, so you can skip installing Ollama on the host for that workflow. For background on Ollama itself, see docs/ollama.md.

docker compose up --build

After startup, pull models in the Ollama container:

docker exec -it <ollama_container_name> ollama pull nomic-embed-text
docker exec -it <ollama_container_name> ollama pull llama3.2
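The repository's docker-compose.yml is authoritative; as a rough picture of the two-service layout it implies, such a file typically looks like the sketch below (service names, ports, and volume paths here are assumptions, not copied from the repo):

```yaml
services:
  ollama:
    image: ollama/ollama            # official Ollama image
    volumes:
      - ollama_data:/root/.ollama   # persist pulled models across restarts
  api:
    build: .
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434  # reach Ollama by service name
    ports:
      - "8000:8000"
    depends_on:
      - ollama
volumes:
  ollama_data:
```

Pointing OLLAMA_BASE_URL at the service name is what lets the API container find Ollama on the Compose network instead of localhost.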
