Skip to content

PrajwalAmte/AutoSQL

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 

Repository files navigation

AutoSQL

Self-optimizing SQL query pipeline. Inspired by karpathy/autoresearch.

LLM rewrites query → run it → measure (speed + correctness) → keep if better → repeat

One file. Zero fluff.


The Problem

Slow SQL queries are everywhere. Correlated subqueries, redundant scans, missing pre-aggregations — they kill app performance and nobody catches them until prod is on fire.

AutoSQL fixes this automatically:

  1. You give it a query
  2. It runs the query, records the baseline time and result hash
  3. An LLM rewrites the query trying to make it faster
  4. It runs the new query, checks the result is identical, measures the speedup
  5. Keeps the best version, feeds the history back into the next iteration
  6. Repeats until the budget is spent

Quickstart

# 1. Install dependency
pip install groq

# 2. Get a free API key from https://console.groq.com
export GROQ_API_KEY=gsk_...

# 3. Optimize your query
python AutoSQL.py --query slow.sql --db myapp.sqlite

# Or inline
python AutoSQL.py --query "SELECT * FROM ..." --db myapp.sqlite --iterations 15

# Use a different model
python AutoSQL.py --query slow.sql --db myapp.sqlite --model llama-3.1-8b-instant

Example output

──────────────────────────────────────────────────────────────
  AutoSQL — Self-optimizing Query Pipeline
  Model  : llama-3.3-70b-versatile  (Groq)
──────────────────────────────────────────────────────────────

  Measuring baseline … 1842.3 ms

  [01/10] Generating … ✓  312.4 ms  (5.89× faster)  ← NEW BEST
  [02/10] Generating … ✓  298.1 ms  (6.18× faster)  ← NEW BEST
  [03/10] Generating … ✗  wrong result
  [04/10] Generating … ✓  291.7 ms  (6.32× faster)  ← NEW BEST
  ...

──────────────────────────────────────────────────────────────
  Final Report
──────────────────────────────────────────────────────────────
  Baseline : 1842.3 ms
  Best     : 291.7 ms  (6.32× faster)

What it optimizes

AutoSQL tackles the classic anti-patterns that kill query performance:

Anti-pattern What AutoSQL does
Correlated subquery in WHERE Replaces with pre-aggregated JOIN / CTE
Correlated subquery in SELECT Replaces with pre-aggregated JOIN
Multiple passes over the same table Collapses into a single scan

Correctness guarantee

AutoSQL never accepts a faster query that returns different results.
Every candidate is run and its output hashed. Only an exact match against the baseline hash counts as correct.


Outputs

After the run, autosql_log.json contains every attempt:

{
  "baseline_ms": 1842.3,
  "best_ms": 291.7,
  "speedup": 6.32,
  "iterations": [
    { "iteration": 1, "time_ms": 312.4, "speedup": 5.89, "correct": true, "query": "..." },
    ...
  ]
}

Options

Flag Default Description
--query (required) SQL string or .sql file path
--db (required) SQLite database path
--iterations 10 How many rewrites to attempt
--model llama-3.3-70b-versatile Groq model to use

Extending

AutoSQL intentionally stays minimal. Easy extension points:

  • Other databases — swap sqlite3 for psycopg2, duckdb, etc. Only run_query() changes.
  • Other metrics — add memory, I/O reads, or query plan cost alongside wall-clock time.
  • Indexes — let the LLM propose CREATE INDEX statements alongside the query rewrite.
  • Batch mode — feed a folder of .sql files, get a report for each.

Requirements

  • Python 3.10+
  • groq (pip install groq)
  • GROQ_API_KEY environment variable (free at console.groq.com)
  • SQLite database

About

AutoSQL — Self-optimizing SQL query pipeline powered by Groq LLM. Automatically rewrites slow queries, measures speedup, and keeps improvements. One file, zero dependencies, free API.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages