SQL Vibe Coding Challenge

An objective benchmark for multi-agent software development. Build a SQL database from scratch. Pass 6 million tests. Win the trophy.

The Only Metric That Matters: Calendar Time

Commits and lines of code are proxies. What matters is days to completion. Can 1,000 agents working in parallel beat 100 agents? Does your orchestration framework maintain productivity as you scale? This benchmark will tell you.

The Challenge

Objective

Build a SQL database engine from scratch that passes the SQLLogicTest suite. This is the same test suite used to validate SQLite, DuckDB, and other production databases.

Success Criteria

  • 100% pass rate on the SQLLogicTest suite
  • ~6 million individual test assertions
  • All 622 test files passing
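For a sense of what one of those assertions looks like, here is a minimal sketch of a reader for SQLLogicTest files. It assumes only the basic shape of the format: records separated by blank lines, `statement` and `query` headers, and a `----` line before expected results. It skips `skipif`/`onlyif` prefixes without honoring them and ignores control records such as `hash-threshold` and `halt`. The names `Record` and `parse_records` are hypothetical and do not come from any seed repo.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class Record:
    kind: str          # "statement" or "query"
    header: list[str]  # tokens after the kind, e.g. ["ok"] or ["II", "nosort"]
    sql: str
    expected: list[str] = field(default_factory=list)


def parse_records(text: str) -> list[Record]:
    """Parse a simplified subset of the SQLLogicTest file format."""
    records = []
    # Records are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text):
        lines = [ln for ln in block.splitlines()
                 if ln.strip() and not ln.startswith("#")]
        # Simplification: drop conditional prefixes instead of evaluating them.
        while lines and lines[0].split()[0] in ("skipif", "onlyif"):
            lines.pop(0)
        if not lines:
            continue
        head = lines[0].split()
        if head[0] not in ("statement", "query"):
            continue  # control records are ignored in this sketch
        body = lines[1:]
        if "----" in body:
            cut = body.index("----")
            sql_lines, expected = body[:cut], body[cut + 1:]
        else:
            sql_lines, expected = body, []
        records.append(Record(head[0], head[1:], "\n".join(sql_lines), expected))
    return records


# Made-up sample input for illustration, not taken from the real suite.
SAMPLE = """\
# create and fill a table
statement ok
CREATE TABLE t1(a INTEGER, b INTEGER)

statement ok
INSERT INTO t1 VALUES(1, 2)

query II nosort
SELECT a, b FROM t1
----
1
2
"""

if __name__ == "__main__":
    for rec in parse_records(SAMPLE):
        print(rec.kind, rec.header, repr(rec.sql), rec.expected)
```

A real runner would also need to handle sort modes, result hashing, and the conditional prefixes this sketch skips.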

The Rule: Execution Boundary

Existing database systems may be studied, benchmarked, and used for external analysis, but they must never cross the execution boundary of your submitted system.

Outside the Boundary (Allowed)

  • Study source code and algorithms
  • Benchmark to identify slow query classes
  • Compare query plans and behaviors
  • Use to guide design decisions

Inside the Boundary (Disqualified)

  • Execute queries via existing engines
  • Use as fallback for unsupported features
  • Use as correctness oracle during tests
  • Link, embed, or invoke at runtime

Key test: Removing SQLite must not change whether your engine builds, runs, or passes tests.
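One rough way to self-check against the key test is to scan your source tree for references to existing engines. The sketch below is only a heuristic, and the pattern list and file suffixes are assumptions chosen for illustration. A hit is not proof of a violation (a comment that mentions SQLite is fine), and a clean scan is not proof of compliance. The function name `find_engine_references` is hypothetical.

```python
import re
from pathlib import Path

# Assumed, non-exhaustive list of names that suggest an existing engine is in use.
FORBIDDEN = re.compile(r"sqlite3|libsqlite|duckdb|psycopg|libpq", re.IGNORECASE)


def find_engine_references(root, suffixes=(".py", ".rs", ".go", ".c", ".toml", ".txt")):
    """Return (path, line number, line) for each line that mentions a listed engine."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(errors="ignore")
        for lineno, line in enumerate(text.splitlines(), 1):
            if FORBIDDEN.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits


if __name__ == "__main__":
    for path, lineno, line in find_engine_references("."):
        print(f"{path}:{lineno}: {line}")
```

The stronger check is the one the rule itself describes: build and run the tests on a machine where no other database engine is installed.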

Do / Don't Examples

DO (Allowed)

  • Read SQLite source code to understand B-tree implementation
    Study and learn from existing implementations
  • Run DuckDB to benchmark JOIN performance and identify optimization targets
    External analysis to guide your implementation
  • Compare your query plan output against PostgreSQL's EXPLAIN
    Learning tool that doesn't affect your execution
  • Use scripts that call SQLite to pre-compute expected test outputs
    Offline reference data, not runtime dependency

DON'T (Disqualified)

  • Fall back to SQLite for window functions "temporarily"
    Crosses execution boundary, even if planned for removal
  • Run queries through SQLite to verify correctness during tests
    Using as oracle means tests depend on external engine
  • Link against libsqlite3 for "just the parser"
    Embedded dependency violates clean-room implementation
  • Shell out to DuckDB for complex aggregation queries
    Delegation of execution, regardless of method

Common Question

Q: Can I use SQLite during development?

A: Yes — for reading code, benchmarking, and analysis. No — for executing queries, validating correctness, or acting as part of your system.

The Trophy

The Vibe Coding Trophy

A physical trophy will be awarded to each record holder. The design reflects the spirit of "vibe coding" — a gold-plated wand mounted on walnut with brass nameplates.

Your name goes on the trophy when you beat the current record by at least 5%.

Award Rules

  • 5% improvement required — beat the previous record by at least 5% (in calendar days) to claim the trophy
  • Public repository — your code must be publicly available for verification
  • 100% pass rate — all 622 SQLLogicTest files must pass
  • Verifiable git history — first commit date to 100% pass rate determines your time
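The timing rule can be sketched as a small calculation over two commit timestamps. This assumes ISO 8601 timestamps with an explicit UTC offset, such as those produced by git's `%aI` or `%cI` format placeholders. Whether the challenge counts fractional elapsed time or whole calendar days is not stated here, so the sketch returns the raw fractional value. The function name `calendar_days` and the example dates are made up for illustration.

```python
from datetime import datetime

SECONDS_PER_DAY = 86_400


def _parse(ts: str) -> datetime:
    # Older Python versions reject a trailing "Z", so rewrite it as an explicit offset.
    if ts.endswith("Z"):
        ts = ts[:-1] + "+00:00"
    return datetime.fromisoformat(ts)


def calendar_days(first_commit: str, passing_commit: str) -> float:
    """Elapsed days between the first commit and the commit that reaches 100%.

    Both timestamps must carry a UTC offset; mixing offset-aware and
    offset-naive values raises TypeError.
    """
    start, end = _parse(first_commit), _parse(passing_commit)
    if end < start:
        raise ValueError("passing commit is earlier than first commit")
    return (end - start).total_seconds() / SECONDS_PER_DAY


if __name__ == "__main__":
    # Hypothetical dates, not those of any real submission.
    print(calendar_days("2025-10-01T12:00:00+00:00", "2025-10-26T12:00:00+00:00"))  # 25.0
```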

Current Record Holder

25 days

VibeSQL (Baseline)

October - November 2025

275fb925

Beat this by 5%? That's 24 days or less to claim the trophy.

Why This Challenge?

Objective Measurement

No subjective code reviews. Either the tests pass or they don't. 6 million assertions leave no room for ambiguity.

Real Complexity

SQL databases require parsers, optimizers, and execution engines. This isn't a toy problem—it's production-grade engineering.

Time Is Truth

Calendar days to completion is the ultimate metric. Does parallelizing to 1,000 agents help? Now you can find out.

Get Started

Start from scratch in any language, or use one of our seed repos for convenience. Each seed includes the SQLLogicTest suite, a test runner, and a CI workflow.

Seed Repos (optional)

1. Start Your Project

Create a new repo from scratch, or fork a seed above for a head start. Get the SQLLogicTest suite. Your first commit starts the clock.

2. Build Your Database

Implement a SQL parser, query executor, and storage engine. Use any AI tools—Claude, Copilot, or your own agents. Run make test to track progress.
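Tracking progress toward 100% can be as simple as counting passing files against the 622 in the suite. The sketch below assumes you already have per-file results as a mapping from file name to pass/fail. How `make test` actually reports results depends on the seed repo, so the input shape and the name `summarize` are hypothetical.

```python
from __future__ import annotations


def summarize(results: dict[str, bool], total_files: int = 622) -> dict:
    """Summarize per-file test results; files missing from `results` count as not passing."""
    passed = sum(1 for ok in results.values() if ok)
    failing = sorted(name for name, ok in results.items() if not ok)
    return {
        "passed": passed,
        "total": total_files,
        "pass_rate": passed / total_files,
        "failing": failing,
        "complete": passed == total_files,
    }


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    report = summarize({"select1.test": True, "joins.test": False})
    print(f"{report['passed']}/{report['total']} files passing")
    print("still failing:", report["failing"])
```

Logging this summary with each commit gives you a pass-rate curve over calendar time, which is the metric the challenge scores.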

3. Hit 100% and Submit

When all 622 test files pass, open an issue at vibesql-challenge/submissions with your repo link and commit hashes. Beat 25 days to join the leaderboard.
