🚀 Semantic Knowledge Engine (RAG) – Spring Boot + pgvector + Spring AI

A production-ready Retrieval-Augmented Generation (RAG) system built using Java 17, Spring Boot, PostgreSQL (pgvector), and Spring AI.

This project demonstrates how to build an end-to-end AI-powered semantic search and question-answering system using modern enterprise Java technologies.

Built with Spring AI, OpenAI embeddings, and pgvector for scalable semantic retrieval.

🧠 What This Project Does

Ingests large documents
Splits them into semantic chunks
Generates vector embeddings
Stores them in PostgreSQL using pgvector
Performs similarity search
Uses an LLM to generate contextual answers
Returns structured responses with source attribution

🏗️ Architecture Overview

                ┌────────────────────┐
                │   Client / UI      │
                └─────────┬──────────┘
                          │
                          ▼
                ┌────────────────────┐
                │  Query Controller  │
                └─────────┬──────────┘
                          │
                          ▼
                ┌────────────────────┐
                │   Query Service    │
                └─────────┬──────────┘
                          │
        ┌─────────────────┼─────────────────┐
        ▼                                   ▼
┌────────────────────┐            ┌────────────────────┐
│ Embedding Service  │            │  Search Service    │
│ (OpenAI / SpringAI)│            │ (pgvector SQL)     │
└─────────┬──────────┘            └─────────┬──────────┘
          │                                 │
          ▼                                 ▼
   ┌───────────────┐                ┌──────────────────┐
   │ Vector Query  │                │ Knowledge Chunks │
   │ (Question)    │                │ (PostgreSQL)     │
   └───────────────┘                └──────────────────┘

                          │
                          ▼
                ┌────────────────────┐
                │    ChatClient      │
                │   (Spring AI)      │
                └─────────┬──────────┘
                          ▼
                ┌────────────────────┐
                │   Final Response   │
                │ Answer + Sources   │
                └────────────────────┘

🔄 RAG Pipeline (Implemented)

1. Ingestion Pipeline

Document → Chunking → Embeddings → PostgreSQL (pgvector)

Steps:

Accept raw text/document
Split into chunks (with overlap)
Generate embeddings
Store:
- documentId
- chunkIndex
- content
- embedding
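The chunking step above can be sketched as a sliding window over the text. This is a minimal illustration, not the project's DocumentChunkService: the class name `OverlapChunker`, the `Chunk` record, and the character-based window are all assumptions, and a real splitter may cut on sentence or token boundaries instead.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: fixed-size character chunking with overlap. */
final class OverlapChunker {

    /** One chunk before its embedding is attached (field names mirror the list above). */
    record Chunk(String documentId, int chunkIndex, String content) {}

    static List<Chunk> split(String documentId, String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("require 0 <= overlap < chunkSize");
        }
        List<Chunk> chunks = new ArrayList<>();
        int step = chunkSize - overlap; // how far the window advances each iteration
        int index = 0;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + chunkSize, text.length());
            chunks.add(new Chunk(documentId, index++, text.substring(start, end)));
            if (end == text.length()) {
                break; // the window reached the end of the document
            }
        }
        return chunks;
    }
}
```

For example, `OverlapChunker.split("doc-1", "abcdefghij", 4, 1)` yields the chunks `abcd`, `defg`, and `ghij`, each sharing one character with its neighbor.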

2. Search Pipeline

User Question → Embedding → Vector Search → Top K Chunks

Convert question → embedding
Perform similarity search using pgvector
Retrieve top relevant chunks
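To make the ranking step concrete, here is an in-memory sketch of a top-K search by cosine distance, the metric pgvector's `<=>` operator computes. Whether this project uses cosine distance or another pgvector operator is an assumption; the names `CosineTopK`, `StoredChunk`, and `ScoredChunk` are hypothetical, and the real search runs as SQL inside PostgreSQL rather than in Java.

```java
import java.util.Comparator;
import java.util.List;

/** Hypothetical in-memory stand-in for an "ORDER BY distance LIMIT k" vector query. */
final class CosineTopK {

    record StoredChunk(String documentId, int chunkIndex, float[] embedding) {}

    record ScoredChunk(String documentId, int chunkIndex, double distance) {}

    /** Cosine distance = 1 - cosine similarity; undefined (NaN) for zero vectors. */
    static double cosineDistance(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return 1.0 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the k chunks closest to the query, nearest first. */
    static List<ScoredChunk> topK(float[] query, List<StoredChunk> chunks, int k) {
        return chunks.stream()
                .map(c -> new ScoredChunk(c.documentId(), c.chunkIndex(),
                        cosineDistance(query, c.embedding())))
                .sorted(Comparator.comparingDouble(ScoredChunk::distance))
                .limit(k)
                .toList();
    }
}
```

A smaller distance means a closer match, so the list is sorted ascending before the limit is applied.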

3. Query (LLM) Pipeline

Chunks → Context → LLM → Answer

Combine retrieved chunks
Send structured prompt to LLM
Generate final answer
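The "combine chunks, then build a structured prompt" step might look like the sketch below. The prompt wording, the `---` separator, and the class name `PromptBuilder` are illustrative assumptions; the project's actual prompt is not shown in this README.

```java
import java.util.List;

/** Hypothetical prompt assembly: retrieved chunks become the context block. */
final class PromptBuilder {

    static String build(String question, List<String> chunkContents) {
        // Separate chunks visibly so the model can tell where one ends and the next begins.
        String context = String.join("\n---\n", chunkContents);
        return """
                Answer the question using only the context below.
                If the context does not contain the answer, say so.

                Context:
                %s

                Question: %s
                """.formatted(context, question);
    }
}
```

The resulting string would then be passed to the LLM as the user message, for example through Spring AI's ChatClient.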

4. Final Response Structure

{
  "answer": "...",
  "sources": [
    {
      "docId": "...",
      "index": 0
    }
  ]
}

✅ Key Design Decision:

The LLM generates only the answer
Sources come from the retrieval layer, so source metadata is never hallucinated
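One way to enforce this decision in code is to build the source list directly from the retrieved chunks and treat the LLM output as an opaque string. The record and class names here are hypothetical; only the JSON field names (`answer`, `sources`, `docId`, `index`) come from the response structure above.

```java
import java.util.List;

/** Hypothetical response assembly: sources never pass through the LLM. */
final class RagResponseAssembler {

    record Source(String docId, int index) {}

    record QueryResponse(String answer, List<Source> sources) {}

    /** A chunk as returned by the search step (assumed shape). */
    record RetrievedChunk(String documentId, int chunkIndex, String content) {}

    static QueryResponse assemble(String llmAnswer, List<RetrievedChunk> retrieved) {
        // Source metadata is copied from the database rows, not parsed from model output.
        List<Source> sources = retrieved.stream()
                .map(c -> new Source(c.documentId(), c.chunkIndex()))
                .toList();
        return new QueryResponse(llmAnswer, sources);
    }
}
```

Because the records use the same component names as the JSON fields, a serializer such as Jackson can typically emit the structure shown above without extra mapping.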

🧩 Key Components

🔹 DocumentChunkService

Splits large documents into chunks
Maintains:
- documentId
- chunkIndex

🔹 EmbeddingService (Interface-based)

Supports:
- Manual OpenAI REST
- Spring AI Embeddings
Easily swappable
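The interface-based design could be as small as the sketch below. The method signature `float[] embed(String)` is an assumption about the project's EmbeddingService, and `FakeEmbeddingService` is a hypothetical third implementation, useful mainly for tests that should not call OpenAI.

```java
/** Assumed shape of the abstraction; the real interface may differ. */
interface EmbeddingService {
    float[] embed(String text);
}

/** Hypothetical test double: deterministic and offline, not semantically meaningful. */
final class FakeEmbeddingService implements EmbeddingService {

    private final int dimensions;

    FakeEmbeddingService(int dimensions) {
        if (dimensions <= 0) {
            throw new IllegalArgumentException("dimensions must be positive");
        }
        this.dimensions = dimensions;
    }

    @Override
    public float[] embed(String text) {
        // Fold character codes into a fixed-size vector so equal text gives equal vectors.
        float[] vector = new float[dimensions];
        for (int i = 0; i < text.length(); i++) {
            vector[i % dimensions] += text.charAt(i);
        }
        return vector;
    }
}
```

Callers depend only on `EmbeddingService`, so switching between the REST-based and Spring AI implementations becomes a wiring change rather than a code change.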

🔹 SearchService

Uses JdbcTemplate
Executes pgvector similarity queries
Returns ranked chunks

🔹 QueryService

Core RAG orchestration
Builds prompt
Calls LLM via ChatClient
Assembles response

🔹 VectorUtil

Converts float[] → pgvector format
Stateless and thread-safe
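pgvector accepts vectors as text literals of the form `[0.1,0.2,0.3]`, so the conversion can be a pure function. The sketch below is an illustration under that assumption, not the project's VectorUtil; the class name `VectorLiteral` and the rejection of non-finite values are choices made here.

```java
import java.util.StringJoiner;

/** Hypothetical float[] to pgvector text-literal conversion; no state, so thread-safe. */
final class VectorLiteral {

    private VectorLiteral() {} // static utility, not meant to be instantiated

    static String toPgVector(float[] embedding) {
        StringJoiner joiner = new StringJoiner(",", "[", "]");
        for (float value : embedding) {
            if (Float.isNaN(value) || Float.isInfinite(value)) {
                // pgvector does not store NaN or infinite components
                throw new IllegalArgumentException("non-finite component: " + value);
            }
            joiner.add(Float.toString(value));
        }
        return joiner.toString();
    }
}
```

For example, `VectorLiteral.toPgVector(new float[]{0.1f, 2.0f, -3.5f})` returns `[0.1,2.0,-3.5]`, which can be bound as a string parameter and cast with `?::vector` in a JdbcTemplate query.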

🛠️ Tech Stack

Java 17
Spring Boot 3.x
PostgreSQL + pgvector
Spring AI (LLM integration)
OpenAI Embeddings (text-embedding-3-small)
JdbcTemplate (for vector queries)
Lombok

⚙️ Setup Instructions

1. Clone Repo

git clone <your-repo-url>
cd semantic-knowledge-engine

2. PostgreSQL Setup

Flyway migration scripts for this setup are included and run on startup; the equivalent SQL is listed here for reference.

CREATE EXTENSION vector;

CREATE TABLE knowledge_chunks (
    id UUID PRIMARY KEY,
    document_id TEXT,
    chunk_index INT,
    content TEXT,
    embedding VECTOR(1536)
);

3. Application Config

spring:
  datasource:
    url: jdbc:postgresql://localhost:5432/yourdb
    username: postgres
    password: password

  ai:
    openai:
      api-key: YOUR_API_KEY
      chat:
        options:
          model: gpt-4o-mini
          temperature: 0.2

🚀 API Endpoints

1. Ingest Document

POST /rag/ingest

Body:

Raw text document

2. Search (Semantic Retrieval)

POST /rag/search

Body:

"What is Spring Boot?"

3. Query (Full RAG)

POST /rag/query

Body:

"What is Spring Boot?"
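The endpoints can be exercised from Java with the JDK's built-in HTTP client. This sketch assumes the application listens on `http://localhost:8080` and accepts a plain-text body; both are assumptions, as is the class name `RagClientExample`.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Hypothetical client for the /rag/query endpoint. */
final class RagClientExample {

    /** Builds the POST request without sending it. */
    static HttpRequest queryRequest(String baseUrl, String question) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/rag/query"))
                .header("Content-Type", "text/plain") // assumed; adjust if the API expects JSON
                .POST(HttpRequest.BodyPublishers.ofString(question))
                .build();
    }

    /** Sends the request and returns the raw response body (the answer-plus-sources JSON). */
    static String send(HttpRequest request) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```

With the application running, `RagClientExample.send(RagClientExample.queryRequest("http://localhost:8080", "What is Spring Boot?"))` should return the response JSON; the same pattern applies to `/rag/ingest` and `/rag/search` by changing the path.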

🧪 How to Test

Step 1: Ingest Large Document

Use sample content such as:

Architecture docs
Kafka / Microservices notes
Any long technical content

Step 2: Run Queries

Example Questions:

What is Spring Boot?
How does Kafka scale?
What is event-driven architecture?

Step 3: Validate Output

Check:

✅ Answer is coherent
✅ Sources are returned
✅ Chunks are relevant

📈 Key Features

✅ End-to-end RAG pipeline
✅ Vector search using pgvector
✅ Source attribution (no hallucinated metadata)
✅ Pluggable embedding strategy
✅ JDBC-based high-performance retrieval
✅ Clean separation of concerns

🔥 Future Enhancements

Neighbor chunk stitching (context continuity)
Hybrid search (keyword + vector)
Streaming LLM responses
UI with source highlighting
Multi-document filtering
Re-ranking models

💡 Why This Project Matters

This project demonstrates:

Real-world AI system design
Integration of LLMs with enterprise Java
Understanding of vector databases
Production-grade RAG architecture

👨‍💻 Author

Built as a hands-on exploration of modern AI systems using Java and the Spring ecosystem, bridging traditional backend engineering with AI-driven applications.

⭐ If you found this useful

Star ⭐ the repo and share!
