Addressing Hallucinations in Generative AI Agents using Observability and Dual Memory Knowledge Graphs

Implementation repository for the paper:

Matharaarachchi et al., 2026 — Knowledge-Based Systems
Addressing Hallucinations in Generative AI Agents using Observability and Dual Memory Knowledge Graphs
https://www.sciencedirect.com/science/article/pii/S0950705126002121

Core Idea

This framework reduces hallucinations in agentic LLM systems using:

- Observability logging
- Diagnostics modules
  - Root Cause Analysis (RCA)
  - Knowledge-Based Verification (KBV)
  - Human-In-the-Loop review (HIL)
- Dual Memory Knowledge Graph
  - Experience memory (successful traces)
  - Insight memory (failure explanations)
- Reasoning agents
  - ReAct
  - Reflexion
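
To make the dual memory split concrete, here is a minimal in-memory sketch, assuming each trace arrives with a success flag and, for failures, a short explanation. The `DualMemory` class and its `add_trace` method are hypothetical names used only for illustration; the repository stores both memories in a Neo4j knowledge graph, which this sketch does not attempt to reproduce.

```python
from dataclasses import dataclass, field


@dataclass
class DualMemory:
    """Illustrative stand-in for the two memories (hypothetical, not the repo's class)."""

    experience: list = field(default_factory=list)  # successful traces
    insight: list = field(default_factory=list)     # failure explanations

    def add_trace(self, trace, succeeded, explanation=""):
        """Route a trace to experience memory on success, insight memory on failure."""
        if succeeded:
            self.experience.append(trace)
        else:
            # Keep the explanation next to the failed trace it describes.
            self.insight.append({"trace": trace, "explanation": explanation})


memory = DualMemory()
memory.add_trace("trace-1", succeeded=True)
memory.add_trace("trace-2", succeeded=False, explanation="wrong tool selected")
```

An agent could then consult `experience` for examples to imitate and `insight` for mistakes to avoid.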

Repository Structure


agents/               # ReAct + Reflexion agents (baseline + dual memory)
classifier/           # Intent / entity / attribute classification
diagnostics/          # RCA, KBV, HIL
eval/                 # Experimental evaluation
knowledge_graph/      # Dual memory Neo4j KG
log_transformation/   # LangSmith → ReAct trace pipeline
scripts/              # End-to-end execution scripts
common/               # Shared config, models, logging

Setup Instructions

Requirements

- Python 3.10+
- Neo4j with vector index support
- A LangSmith account
- Azure OpenAI, configured as the LLM provider

Create virtual environment

python -m venv .venv

Activate it:

source .venv/bin/activate

Install dependencies

pip install --upgrade pip
pip install -r requirements.txt

Configure environment variables

Create a .env file based on .env.example:

OPENAI_API_KEY=...
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=...
NEO4J_DATABASE=neo4j
LANGSMITH_PROJECT_ID=...
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=...
AZURE_OPENAI_DEPLOYMENT_NAME=...
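
Before running the pipeline it can help to confirm that every variable is set. The sketch below checks the names listed above. It assumes the variables are already in the process environment (for example, exported in the shell or loaded from .env by a tool of your choice), and it treats all of them as required, which the repository may not. `missing_env_vars` is a hypothetical helper, not part of the repository.

```python
import os

# Variable names copied from the .env listing above.
REQUIRED_VARS = [
    "OPENAI_API_KEY",
    "NEO4J_URI",
    "NEO4J_USER",
    "NEO4J_PASSWORD",
    "NEO4J_DATABASE",
    "LANGSMITH_PROJECT_ID",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_DEPLOYMENT_NAME",
]


def missing_env_vars(env=None):
    """Return the required names that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```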

Full Pipeline Execution

Below is the recommended end-to-end workflow.

Step 1 — Export LangSmith runs

python scripts/run_export.py

Output:

output/langsmith_runs_<...>.json

Step 2 — Convert to ReAct trace format

python scripts/run_format_trace.py

Output:

output/langsmith_runs_<...>.react.txt

Step 3 — Run Root Cause Analysis (RCA)

python scripts/run_rca.py

Output:

output/langsmith_runs_<...>.rca.json

Step 4 — Run Knowledge-Based Verification (KBV)

python scripts/run_kbv.py

Output:

output/langsmith_runs_<...>.kbv.json

Step 5 — Human Review (HIL)

python scripts/run_hil_streamlit.py

This launches the Streamlit interface for rating traces.

Output:

output/langsmith_runs_<...>.hil.json

Step 6 — Classify traces (intent, attributes, entities)

python scripts/run_classify.py

Outputs:

output/langsmith_runs_<...>.classified.json
output/langsmith_runs_<...>.classified_insights.json
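
The output names in Steps 1 to 6 suggest that each stage reuses the export file's base name and changes only the suffix. Assuming that holds, the expected artifact paths can be derived from the export path as sketched below. `stage_paths` is a hypothetical helper, and `langsmith_runs_example.json` is a placeholder file name, not one the scripts are known to produce.

```python
from pathlib import Path

# Suffixes taken from the "Output" lines of Steps 2 to 6.
STAGE_SUFFIXES = {
    "react": ".react.txt",
    "rca": ".rca.json",
    "kbv": ".kbv.json",
    "hil": ".hil.json",
    "classified": ".classified.json",
    "classified_insights": ".classified_insights.json",
}


def stage_paths(export_path):
    """Map each stage to the path expected next to the Step 1 export file."""
    export = Path(export_path)
    base = export.name.removesuffix(".json")  # str.removesuffix needs Python 3.9+
    return {
        stage: export.with_name(base + suffix)
        for stage, suffix in STAGE_SUFFIXES.items()
    }


if __name__ == "__main__":
    # Placeholder export name, for illustration only.
    for stage, path in stage_paths("output/langsmith_runs_example.json").items():
        status = "found" if path.exists() else "missing"
        print(f"{stage:20s} {path} ({status})")
```

Running it after each step gives a quick view of which stages have already produced output.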

Step 7 — Insert into Dual Memory Knowledge Graph

Make sure Neo4j is running.

python scripts/run_insert_obs.py

This script:

- Creates the vector index
- Inserts embeddings
- Stores experience and insight memory
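
The seven steps above can be chained from a single driver. The sketch below is one way to do that, assuming each script runs without arguments exactly as shown in the steps and is started from the repository root. `build_commands` and `run_pipeline` are hypothetical helpers. Step 5 opens an interactive Streamlit interface, so a real run would likely pause there until the review is finished.

```python
import subprocess
import sys

# Script paths in the order given in Steps 1 to 7.
PIPELINE_SCRIPTS = [
    "scripts/run_export.py",
    "scripts/run_format_trace.py",
    "scripts/run_rca.py",
    "scripts/run_kbv.py",
    "scripts/run_hil_streamlit.py",
    "scripts/run_classify.py",
    "scripts/run_insert_obs.py",
]


def build_commands(python=sys.executable):
    """Return one command (argument list) per pipeline step."""
    return [[python, script] for script in PIPELINE_SCRIPTS]


def run_pipeline(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands():
        print("$ " + " ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

With `dry_run=True` the driver only prints the commands, which is a safe way to confirm the order before executing anything.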

Running Agents

ReAct Baseline / Dual Memory

python scripts/run_react_agent.py

Reflexion Baseline / Dual Memory

python scripts/run_reflexion_agent.py

📊 Evaluation

Evaluation compares:

- ReAct
- Reflexion
- ReAct + Dual Memory
- Reflexion + Dual Memory

Metrics:

- Exact match accuracy
- Relevancy
- Faithfulness
- Consistency
- Latency
- Cost

By default, results are written to CSV files in eval/data.
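
Of the metrics listed, exact match accuracy is the simplest to illustrate. The sketch below assumes answers are compared after lowercasing and collapsing whitespace; the evaluation code in eval/ may normalize differently. `normalize` and `exact_match_accuracy` are hypothetical helpers.

```python
def normalize(text):
    """Lowercase and collapse whitespace (an assumed normalization)."""
    return " ".join(text.lower().split())


def exact_match_accuracy(predictions, references):
    """Fraction of predictions that equal their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)


if __name__ == "__main__":
    score = exact_match_accuracy(["Paris", " berlin "], ["paris", "Rome"])
    print(f"Exact match accuracy: {score:.2f}")
```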

📄 Citation

If you use this repository, please cite:

@article{matharaarachchi2026addressing,
  title={Addressing Hallucinations in Generative AI Agents using Observability and Dual Memory Knowledge Graphs},
  author={Matharaarachchi, Amali and Moraliyage, Harsha and Mills, Nishan and Gamage, Gihan and De Silva, Daswin and Manic, Milos},
  journal={Knowledge-Based Systems},
  pages={115469},
  year={2026},
  publisher={Elsevier}
}
