Murali-SpringAI/semantic-knowledge-engine

🚀 Semantic Knowledge Engine (RAG) – Spring Boot + pgvector + Spring AI

A production-ready Retrieval-Augmented Generation (RAG) system built using Java 17, Spring Boot, PostgreSQL (pgvector), and Spring AI.

This project demonstrates how to build an end-to-end AI-powered semantic search and question-answering system using modern enterprise Java technologies.

Built with Spring AI, OpenAI embeddings, and pgvector for scalable semantic retrieval.

🧠 What This Project Does

  • Ingests large documents
  • Splits them into semantic chunks
  • Generates vector embeddings
  • Stores them in PostgreSQL using pgvector
  • Performs similarity search
  • Uses an LLM to generate contextual answers
  • Returns structured responses with source attribution

🏗️ Architecture Overview

                ┌────────────────────┐
                │   Client / UI      │
                └─────────┬──────────┘
                          │
                          ▼
                ┌────────────────────┐
                │  Query Controller  │
                └─────────┬──────────┘
                          │
                          ▼
                ┌────────────────────┐
                │   Query Service    │
                └─────────┬──────────┘
                          │
        ┌─────────────────┼─────────────────┐
        ▼                                   ▼
┌────────────────────┐            ┌────────────────────┐
│ Embedding Service  │            │  Search Service    │
│ (OpenAI / SpringAI)│            │ (pgvector SQL)     │
└─────────┬──────────┘            └─────────┬──────────┘
          │                                 │
          ▼                                 ▼
   ┌───────────────┐                ┌──────────────────┐
   │ Vector Query  │                │ Knowledge Chunks │
   │ (Question)    │                │ (PostgreSQL)     │
   └───────────────┘                └──────────────────┘

                          │
                          ▼
                ┌────────────────────┐
                │    ChatClient      │
                │   (Spring AI)      │
                └─────────┬──────────┘
                          ▼
                ┌────────────────────┐
                │   Final Response   │
                │ Answer + Sources   │
                └────────────────────┘

🔄 RAG Pipeline (Implemented)

1. Ingestion Pipeline

Document → Chunking → Embeddings → PostgreSQL (pgvector)

Steps:

  • Accept raw text/document

  • Split into chunks (with overlap)

  • Generate embeddings

  • Store:

    • documentId
    • chunkIndex
    • content
    • embedding
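The chunking step above can be sketched as a sliding character window. This is a minimal illustration, not the repository's `DocumentChunkService`: the names `Chunk` and `ChunkingSketch`, the character-based sizing, and the parameter names are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical record for illustration; field names follow the stored columns listed above.
record Chunk(String documentId, int chunkIndex, String content) {}

final class ChunkingSketch {

    // Splits text into fixed-size character windows. Each window starts
    // (chunkSize - overlap) characters after the previous one, so adjacent
    // chunks share `overlap` characters.
    static List<Chunk> split(String documentId, String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("require chunkSize > 0 and 0 <= overlap < chunkSize");
        }
        List<Chunk> chunks = new ArrayList<>();
        int step = chunkSize - overlap;
        int index = 0;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + chunkSize, text.length());
            chunks.add(new Chunk(documentId, index++, text.substring(start, end)));
            if (end == text.length()) {
                break; // the last window reached the end of the text
            }
        }
        return chunks;
    }
}
```

A real splitter would probably break on sentence or token boundaries rather than raw character offsets; the fixed window is used here only to keep the overlap arithmetic visible.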

2. Search Pipeline

User Question → Embedding → Vector Search → Top K Chunks

Steps:

  • Convert question → embedding
  • Perform similarity search using pgvector
  • Retrieve top relevant chunks
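To make the ranking concrete, here is an in-memory analogue of what a pgvector similarity query computes. It assumes cosine distance (pgvector's `<=>` operator); the README does not say which distance operator the project uses, and `StoredChunk` and `SimilaritySketch` are hypothetical names.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical type for illustration only.
record StoredChunk(String documentId, int chunkIndex, String content, float[] embedding) {}

final class SimilaritySketch {

    // Cosine distance = 1 - cosine similarity; smaller means more similar.
    // Note: a zero vector yields NaN because its norm is 0.
    static double cosineDistance(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return 1.0 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the k chunks closest to the query embedding, nearest first.
    static List<StoredChunk> topK(float[] query, List<StoredChunk> chunks, int k) {
        return chunks.stream()
                .sorted(Comparator.comparingDouble((StoredChunk c) -> cosineDistance(query, c.embedding())))
                .limit(k)
                .toList();
    }
}
```

In the actual pipeline this sort happens inside PostgreSQL, which avoids loading every embedding into the JVM.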

3. Query (LLM) Pipeline

Chunks → Context → LLM → Answer

Steps:

  • Combine retrieved chunks
  • Send structured prompt to LLM
  • Generate final answer
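The "combine chunks, then prompt" step might look like the sketch below. The prompt wording and the `PromptSketch` name are assumptions, not the project's actual template, and the LLM call itself is left out to keep the example dependency-free.

```java
import java.util.List;

final class PromptSketch {

    // Joins retrieved chunk texts into a numbered context block and appends the question.
    static String buildPrompt(String question, List<String> chunkTexts) {
        StringBuilder context = new StringBuilder();
        for (int i = 0; i < chunkTexts.size(); i++) {
            context.append("[").append(i + 1).append("] ")
                   .append(chunkTexts.get(i).strip())
                   .append("\n");
        }
        // Instruction text is illustrative; tune it for the model in use.
        return """
                Answer the question using only the context below.
                If the context is insufficient, say that you do not know.

                Context:
                %s
                Question: %s
                """.formatted(context, question);
    }
}
```

In recent Spring AI releases the resulting string could be sent with something like `chatClient.prompt().user(prompt).call().content()`; which Spring AI version and call style the project uses is not stated here.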

4. Final Response Structure

{
  "answer": "...",
  "sources": [
    {
      "docId": "...",
      "index": 0
    }
  ]
}

✅ Key Design Decision:

  • LLM generates only the answer
  • Sources come from the retrieval layer, so document IDs and chunk indexes cannot be hallucinated by the model
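One way to keep that separation explicit in code is to build the source list only from retrieval results and never parse it out of the model output. The record names below are hypothetical; they mirror the JSON shape shown above rather than the repository's DTOs.

```java
import java.util.List;

// Hypothetical records mirroring the JSON response shape.
record Source(String docId, int index) {}
record RagResponse(String answer, List<Source> sources) {}
record RetrievedChunk(String documentId, int chunkIndex, String content) {}

final class ResponseSketch {

    // The answer string comes from the LLM; sources are copied from the
    // retrieved chunks, de-duplicated, and keep their retrieval order.
    static RagResponse assemble(String llmAnswer, List<RetrievedChunk> retrieved) {
        List<Source> sources = retrieved.stream()
                .map(c -> new Source(c.documentId(), c.chunkIndex()))
                .distinct()
                .toList();
        return new RagResponse(llmAnswer, sources);
    }
}
```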

🧩 Key Components

🔹 DocumentChunkService

  • Splits large documents into chunks

  • Maintains:

    • documentId
    • chunkIndex

🔹 EmbeddingService (Interface-based)

  • Supports:

    • Manual OpenAI REST
    • Spring AI Embeddings
  • Easily swappable


🔹 SearchService

  • Uses JdbcTemplate
  • Executes pgvector similarity queries
  • Returns ranked chunks
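The kind of query such a service runs can be sketched as follows. Plain `java.sql` is used here instead of JdbcTemplate so the example needs no Spring dependency; the cosine-distance operator `<=>`, the `RankedChunk` record, and the method signature are assumptions. Table and column names follow the schema in the setup section.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical result type for illustration.
record RankedChunk(String documentId, int chunkIndex, String content, double distance) {}

final class SearchSqlSketch {

    // <=> is pgvector's cosine-distance operator; the operator the project
    // actually uses is an assumption.
    static final String TOP_K_SQL = """
            SELECT document_id, chunk_index, content,
                   embedding <=> CAST(? AS vector) AS distance
            FROM knowledge_chunks
            ORDER BY distance
            LIMIT ?
            """;

    // queryVectorLiteral is the query embedding in pgvector text form, e.g. "[0.1,0.2,0.3]".
    static List<RankedChunk> search(Connection conn, String queryVectorLiteral, int k) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(TOP_K_SQL)) {
            ps.setString(1, queryVectorLiteral);
            ps.setInt(2, k);
            try (ResultSet rs = ps.executeQuery()) {
                List<RankedChunk> out = new ArrayList<>();
                while (rs.next()) {
                    out.add(new RankedChunk(
                            rs.getString("document_id"),
                            rs.getInt("chunk_index"),
                            rs.getString("content"),
                            rs.getDouble("distance")));
                }
                return out;
            }
        }
    }
}
```

With JdbcTemplate the same SQL string and parameters would be passed to a query method with a row mapper; the SQL is the part that carries over.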

🔹 QueryService

  • Core RAG orchestration
  • Builds prompt
  • Calls LLM via ChatClient
  • Assembles response

🔹 VectorUtil

  • Converts float[] → pgvector format
  • Stateless and thread-safe
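A conversion of this kind can be very small. The sketch below is an assumption about what such a utility does, not the repository's `VectorUtil`; it produces pgvector's bracketed, comma-separated text form.

```java
final class VectorLiteralSketch {

    // Formats a float[] as pgvector's text representation, e.g. [0.5,1.0,-2.25].
    // Uses only local state, so concurrent calls do not interfere.
    static String toPgVector(float[] vector) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < vector.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(vector[i]);
        }
        return sb.append(']').toString();
    }
}
```

The resulting string can be bound as a query parameter and cast to `vector` on the SQL side. NaN and infinite values are not valid vector elements, so a stricter version would reject them before formatting.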

🛠️ Tech Stack

  • Java 17
  • Spring Boot 3.x
  • PostgreSQL + pgvector
  • Spring AI (LLM integration)
  • OpenAI Embeddings (text-embedding-3-small)
  • JdbcTemplate (for vector queries)
  • Lombok

⚙️ Setup Instructions

1. Clone Repo

git clone <your-repo-url>
cd semantic-knowledge-engine

2. PostgreSQL Setup

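Flyway migration scripts for this setup are included and run on startup; the SQL is shown here for reference. The pgvector extension must be installed on the PostgreSQL server.

CREATE EXTENSION IF NOT EXISTS vector;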

CREATE TABLE knowledge_chunks (
    id UUID PRIMARY KEY,
    document_id TEXT,
    chunk_index INT,
    content TEXT,
    embedding VECTOR(1536)
);

3. Application Config

spring:
  datasource:
    url: jdbc:postgresql://localhost:5432/yourdb
    username: postgres
    password: password

  ai:
    openai:
      api-key: ${OPENAI_API_KEY}  # read from the environment; avoid committing real keys
      chat:
        options:
          model: gpt-4o-mini
          temperature: 0.2

🚀 API Endpoints

1. Ingest Document

POST /rag/ingest

Body:

Raw text document
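For calling this endpoint from Java, a small helper built on the JDK's `java.net.http` client is one option. The base URL `http://localhost:8080`, the `text/plain` content type, and the `RagClientSketch` name are assumptions; the README does not state the port or the expected media type.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class RagClientSketch {

    // Builds (but does not send) a POST request carrying the given text as the raw body.
    static HttpRequest post(String baseUrl, String path, String body) {
        return HttpRequest.newBuilder(URI.create(baseUrl + path))
                .header("Content-Type", "text/plain; charset=UTF-8") // assumed media type
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }
}
```

A request built with `RagClientSketch.post("http://localhost:8080", "/rag/ingest", documentText)` can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. The same helper works for the other endpoints by changing the path.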

2. Search (Semantic Retrieval)

POST /rag/search

Body:

"What is Spring Boot?"

3. Query (Full RAG)

POST /rag/query

Body:

"What is Spring Boot?"

🧪 How to Test

Step 1: Ingest Large Document

Use sample content such as:

  • Architecture docs
  • Kafka / Microservices notes
  • Any long technical content

Step 2: Run Queries

Example Questions:

What is Spring Boot?
How does Kafka scale?
What is event-driven architecture?

Step 3: Validate Output

Check:

  • ✅ Answer is coherent
  • ✅ Sources are returned
  • ✅ Chunks are relevant


📈 Key Features

  • ✅ End-to-end RAG pipeline
  • ✅ Vector search using pgvector
  • ✅ Source attribution (no hallucinated metadata)
  • ✅ Pluggable embedding strategy
  • ✅ JDBC-based retrieval (JdbcTemplate executing pgvector SQL directly)
  • ✅ Clean separation of concerns

🔥 Future Enhancements

  • Neighbor chunk stitching (context continuity)
  • Hybrid search (keyword + vector)
  • Streaming LLM responses
  • UI with source highlighting
  • Multi-document filtering
  • Re-ranking models

💡 Why This Project Matters

This project demonstrates:

  • Real-world AI system design
  • Integration of LLMs with enterprise Java
  • Understanding of vector databases
  • Production-grade RAG architecture

👨‍💻 Author

Built as a hands-on exploration of modern AI systems using Java and the Spring ecosystem, bridging traditional backend engineering and AI-driven applications.


⭐ If you found this useful

Star ⭐ the repo and share!