The Only Metric That Matters: Calendar Time
Commits and lines of code are proxies. What matters is days to completion. Can 1,000 agents working in parallel beat 100 agents? Does your orchestration framework maintain productivity as you scale? This benchmark will tell you.
The Challenge
Objective
Build a SQL database engine from scratch that passes the SQLLogicTest suite. This is the same test suite used to validate SQLite, DuckDB, and other production databases.
Success Criteria
- 100% pass rate on SQLLogicTest suite
- ~6 million individual test assertions
- All 622 test files passing
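For orientation, here is a minimal sketch of reading the records in a SQLLogicTest file. It assumes the commonly documented script layout (records separated by blank lines, `statement ok` / `statement error` headers, `query <types> <sort-mode>` headers, and expected values after a `----` line). The `Record` class and `parse_records` function are hypothetical names, not part of any seed repo. The sketch drops `skipif` / `onlyif` prefixes without evaluating them and ignores control records such as `halt` and `hash-threshold`, which a real runner would need to honor.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Record:
    kind: str                      # "statement" or "query"
    sql: str
    expect_ok: bool = True         # statements only
    type_string: str = ""          # queries only, e.g. "IIT"
    sort_mode: str = "nosort"      # nosort, rowsort, or valuesort
    expected: list = field(default_factory=list)


def parse_records(text):
    """Split a test script into statement and query records (simplified)."""
    records = []
    for block in re.split(r"\n\s*\n", text.strip()):
        # Drop comment lines and blank lines inside the record.
        lines = [ln.rstrip() for ln in block.splitlines()
                 if ln.strip() and not ln.startswith("#")]
        # Simplification: conditional prefixes are dropped, not evaluated.
        while lines and lines[0].split()[0] in ("skipif", "onlyif"):
            lines.pop(0)
        if not lines:
            continue
        head = lines[0].split()
        if head[0] == "statement":
            records.append(Record("statement", "\n".join(lines[1:]),
                                  expect_ok=(head[1:2] == ["ok"])))
        elif head[0] == "query":
            body = lines[1:]
            cut = body.index("----") if "----" in body else len(body)
            records.append(Record(
                "query", "\n".join(body[:cut]),
                type_string=head[1] if len(head) > 1 else "",
                sort_mode=head[2] if len(head) > 2 else "nosort",
                expected=body[cut + 1:],
            ))
    return records
```

Each parsed record carries its own expected result, so a runner built on this needs nothing but the test files and your engine.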
The Rule: Execution Boundary
Existing database systems may be studied, benchmarked, and used for external analysis, but they must never cross the execution boundary of your submitted system.
Outside the Boundary (Allowed)
- Study source code and algorithms
- Benchmark to identify slow query classes
- Compare query plans and behaviors
- Use to guide design decisions
Inside the Boundary (Disqualified)
- Execute queries via existing engines
- Use as fallback for unsupported features
- Use as correctness oracle during tests
- Link, embed, or invoke at runtime
Key test: Removing SQLite must not change whether your engine builds, runs, or passes tests.
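One rough way to self-check the key test before submitting is to scan your source tree and build manifests for references to existing engines. The sketch below is only a heuristic under stated assumptions: the function name `find_engine_references`, the pattern, and the file suffixes are all illustrative choices, and it is not the organizers' verification procedure. It will flag harmless mentions (for example, a comment citing SQLite) and cannot see dependencies pulled in indirectly.

```python
import re
from pathlib import Path

# Hypothetical pattern: extend it with whatever engines are relevant to you.
FORBIDDEN = re.compile(r"sqlite|duckdb", re.IGNORECASE)


def find_engine_references(root, suffixes=(".rs", ".go", ".cpp", ".h",
                                           ".toml", ".mod", ".txt")):
    """Return (path, line number, line) for every suspicious line under root."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(errors="ignore")
            for lineno, line in enumerate(text.splitlines(), 1):
                if FORBIDDEN.search(line):
                    hits.append((str(path), lineno, line.strip()))
    return hits
```

An empty result is a good sign but not proof. The stronger check is the one stated above: build and run the tests on a machine where no other database engine is installed.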
Do / Don't Examples
DO (Allowed)
- Read SQLite source code to understand B-tree implementation. Study and learn from existing implementations.
- Run DuckDB to benchmark JOIN performance and identify optimization targets. External analysis to guide your implementation.
- Compare your query plan output against PostgreSQL's EXPLAIN. Learning tool that doesn't affect your execution.
- Use scripts that call SQLite to pre-compute expected test outputs. Offline reference data, not runtime dependency.
DON'T (Disqualified)
- Fall back to SQLite for window functions "temporarily". Crosses execution boundary, even if planned for removal.
- Run queries through SQLite to verify correctness during tests. Using as oracle means tests depend on external engine.
- Link against libsqlite3 for "just the parser". Embedded dependency violates clean-room implementation.
- Shell out to DuckDB for complex aggregation queries. Delegation of execution, regardless of method.
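SQLLogicTest files already carry their expected results, so a test runner can check correctness without calling any external engine. The sketch below shows one way to do that comparison, assuming the usual SQLLogicTest conventions: `rowsort` sorts whole rows, `valuesort` sorts individual values, and large results are stored as "N values hashing to <md5>", where the MD5 covers each value followed by a newline. The function names are hypothetical, and `str(v)` stands in for the suite's real value rendering (three-decimal floats, `NULL`, `(empty)`), which a real runner must implement.

```python
import hashlib


def canonical_values(rows, sort_mode="nosort"):
    """Flatten result rows into the ordered value list that gets compared."""
    rows = [[str(v) for v in row] for row in rows]  # simplified rendering
    if sort_mode == "rowsort":
        rows = sorted(rows)
    values = [v for row in rows for v in row]
    if sort_mode == "valuesort":
        values = sorted(values)
    return values


def matches_expected(rows, expected_lines, sort_mode="nosort"):
    """Compare engine output with the expected lines stored in the test file."""
    values = canonical_values(rows, sort_mode)
    if len(expected_lines) == 1 and " values hashing to " in expected_lines[0]:
        payload = "".join(v + "\n" for v in values).encode()
        digest = hashlib.md5(payload).hexdigest()
        return expected_lines[0] == f"{len(values)} values hashing to {digest}"
    return values == expected_lines
```

Because the expected values come from the test file itself, this check stays on the allowed side of the execution boundary.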
Common Question
Q: Can I use SQLite during development?
A: Yes, for reading code, benchmarking, and analysis. No, for executing queries, validating correctness during tests, or acting as part of your system.
The Trophy
The Vibe Coding Trophy
A physical trophy will be awarded to each record holder. The design reflects the spirit of "vibe coding" — a gold-plated wand mounted on walnut with brass nameplates.
Your name goes on the trophy when you beat the current record by at least 5%.
Award Rules
- 5% improvement required — beat the previous record by at least 5% (in calendar days) to claim the trophy
- Public repository — your code must be publicly available for verification
- 100% pass rate — all 622 SQLLogicTest files must pass
- Verifiable git history — first commit date to 100% pass rate determines your time
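The timing rule can be sketched as a small calculation. The example assumes commit timestamps in ISO 8601 form (for instance as printed by `git log --format=%cI`) and assumes fractional days count. The page does not say how partial days are rounded, so treat that as an open detail. The function names and the 40-day record in the usage note are hypothetical.

```python
from datetime import datetime


def elapsed_days(first_commit_iso, passing_commit_iso):
    """Calendar time between the first commit and the 100% commit, in days."""
    start = datetime.fromisoformat(first_commit_iso)
    end = datetime.fromisoformat(passing_commit_iso)
    return (end - start).total_seconds() / 86400


def claims_trophy(candidate_days, record_days, margin=0.05):
    """True if the candidate beats the record by at least `margin` (5%)."""
    return candidate_days <= record_days * (1 - margin)
```

With a hypothetical 40-day record, a 37-day run would qualify (the cutoff is 38 days) while a 39-day run would not.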
Current Record Holder
Beat this by 5%? That's 24 days or less to claim the trophy.
Why This Challenge?
Objective Measurement
No subjective code reviews. Either the tests pass or they don't. Roughly 6 million assertions leave no room for ambiguity.
Real Complexity
SQL databases require parsers, optimizers, and execution engines. This isn't a toy problem—it's production-grade engineering.
Time Is Truth
Calendar days to completion is the ultimate metric. Does parallelizing to 1,000 agents help? Now you can find out.
Get Started
Start from scratch in any language, or use one of our seed repos for convenience. Each seed includes the SQLLogicTest suite, a test runner, and a CI workflow.
Seed Repos (optional)
Rust
Cargo build system, zero-cost abstractions, memory safety without GC.
Fork on GitHub →
C++
CMake build system, maximum performance, full control over memory.
Fork on GitHub →
Go
Simple toolchain, fast compilation, excellent concurrency primitives.
Fork on GitHub →
Start Your Project
Create a new repo from scratch, or fork a seed above for a head start. Get the SQLLogicTest suite. Your first commit starts the clock.
Build Your Database
Implement a SQL parser, query executor, and storage engine. Use any AI tools: Claude, Copilot, or your own agents. Run make test to track progress.
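As a taste of the first component, here is a minimal tokenizer sketch: the step that turns SQL text into tokens a parser can consume. It is illustrative only. The token categories, the keyword set, and the `tokenize` name are assumptions made for this example, and it covers a small slice of the SQL the suite exercises (no comments, quoted identifiers, or exponent literals).

```python
import re

TOKEN_RE = re.compile(r"""
    \s*(?:
        (?P<number>\d+(?:\.\d+)?)
      | (?P<string>'(?:[^']|'')*')
      | (?P<ident>[A-Za-z_][A-Za-z0-9_]*)
      | (?P<op><>|<=|>=|!=|[-+*/=<>(),.;])
    )
""", re.VERBOSE)

# Deliberately tiny keyword set; a real engine needs many more.
KEYWORDS = {"SELECT", "FROM", "WHERE", "AND", "OR", "NOT", "NULL",
            "INSERT", "INTO", "VALUES", "CREATE", "TABLE", "ORDER", "BY"}


def tokenize(sql):
    """Return a list of (kind, text) pairs; raise ValueError on bad input."""
    sql = sql.rstrip()
    tokens = []
    pos = 0
    while pos < len(sql):
        m = TOKEN_RE.match(sql, pos)
        if not m:
            raise ValueError(f"unexpected character at {pos}: {sql[pos]!r}")
        kind = m.lastgroup
        text = m.group(kind)
        if kind == "ident" and text.upper() in KEYWORDS:
            kind, text = "keyword", text.upper()
        tokens.append((kind, text))
        pos = m.end()
    return tokens
```

For example, `tokenize("SELECT a FROM t WHERE x >= 10")` yields keyword, identifier, and operator tokens in source order, with `>=` kept as a single operator.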
Hit 100% and Submit
When all 622 test files pass, open an issue at vibesql-challenge/submissions with your repo link and commit hashes. Beat 25 days to join the leaderboard.