The "just use SQLite" take has been everywhere lately. For a lot of workloads it's right. SQLite is fast, embedded, zero-ops, and has more rigor in its test suite than most production databases. But if you reach for it without thinking about how it serializes writes, how it picks who gets the lock next, and how you'd grow past one machine, you can hit walls that are obvious in hindsight and painful in production.
This post walks through three of those walls with a small benchmark for each. None of them are reasons to never use SQLite. They are reasons to ask whether your workload sits inside its operating envelope before defaulting to it.
## The Problem
Three things that don't show up in a hello-world benchmark but matter in production:
- SQLite's lock acquisition is unfair. Multiple waiters do not form a queue.
- Write transactions take a global lock on the database file. Write concurrency is exactly one.
- SQLite is an embedded library on a filesystem. You scale a single machine, not a cluster.
Postgres and MySQL handle all three by paying an upfront cost: a server process, a connection protocol, a more involved deployment. That cost is real but not large. The rest of this post tries to make the cost of not paying it concrete.
## Prerequisites
- Python 3.10+ (the `sqlite3` module is in the stdlib)
- A Unix-ish machine for the multiprocess scripts
- A working knowledge of SQL transactions
## Technical Decisions
A few notes on how the experiments are set up.
### WAL mode for everything
Every measurement runs with `PRAGMA journal_mode=WAL` and `PRAGMA synchronous=NORMAL`. WAL mode is the modern default for any SQLite database with concurrent access, because it lets readers proceed in parallel with the (single) writer. The single-writer constraint is still there, but at least readers aren't blocked. Running this benchmark in legacy rollback-journal mode would make the numbers worse and less representative of a sensibly configured deployment.
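In `sqlite3` terms, the per-connection setup used throughout looks like this (`bench.db` is a placeholder name):

```python
import sqlite3

conn = sqlite3.connect("bench.db")
conn.execute("PRAGMA journal_mode=WAL")    # persistent: recorded in the db file
conn.execute("PRAGMA synchronous=NORMAL")  # per-connection, set on every open
print(conn.execute("PRAGMA journal_mode").fetchone()[0])  # -> "wal"
```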
### `BEGIN IMMEDIATE` instead of `BEGIN`
SQLite defaults to deferred transactions: the lock is acquired lazily, when the first write happens. This is convenient but produces a race where two transactions can both think they're going to be the writer, and one gets aborted with `SQLITE_BUSY`. `BEGIN IMMEDIATE` acquires the write lock at the start of the transaction and is the standard advice for any code that needs to coordinate writers. It also makes lock-acquisition timing easy to measure.
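A minimal sketch of the race in WAL mode; the second connection gets a zero timeout so the failure shows up immediately (filenames are illustrative):

```python
import sqlite3

a = sqlite3.connect("race.db", isolation_level=None)
b = sqlite3.connect("race.db", isolation_level=None, timeout=0)
a.execute("PRAGMA journal_mode=WAL")
a.execute("CREATE TABLE IF NOT EXISTS t (n INTEGER)")

b.execute("BEGIN")                              # deferred: no lock taken yet
b.execute("SELECT count(*) FROM t").fetchone()  # b pins a read snapshot

a.execute("BEGIN IMMEDIATE")                # a takes the write lock first
a.execute("INSERT INTO t VALUES (1)")
a.execute("COMMIT")

try:
    b.execute("INSERT INTO t VALUES (2)")   # b can't upgrade a stale snapshot
except sqlite3.OperationalError as e:
    print(e)                                # "database is locked" -- and no
    b.execute("ROLLBACK")                   # amount of busy_timeout fixes it
```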
### `busy_timeout` set to 30 seconds
Workers wait up to 30 seconds for the lock instead of failing fast with `SQLITE_BUSY`. This is what most application code does in practice. It's also what makes the unfairness visible: with a long timeout, no waiter gives up early, and we can see the order in which they actually get the lock.
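The two modes side by side (a sketch; `busy.db` is a placeholder):

```python
import sqlite3

holder = sqlite3.connect("busy.db", isolation_level=None)
holder.execute("PRAGMA journal_mode=WAL")
holder.execute("CREATE TABLE IF NOT EXISTS t (n INTEGER)")
holder.execute("BEGIN IMMEDIATE")           # hold the write lock

fail_fast = sqlite3.connect("busy.db", timeout=0, isolation_level=None)
try:
    fail_fast.execute("BEGIN IMMEDIATE")    # zero timeout: raises at once
except sqlite3.OperationalError as e:
    print(e)                                # "database is locked"

patient = sqlite3.connect("busy.db", timeout=30.0, isolation_level=None)
patient.execute("PRAGMA busy_timeout=30000")
# patient.execute("BEGIN IMMEDIATE") would now block for up to 30 s,
# sleeping and retrying, until `holder` commits or the timeout expires.
holder.execute("ROLLBACK")
```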
## Demo 1: Lock acquisition is unfair
The first claim is that SQLite has no queue for waiting writers. When the lock becomes available, any waiting process might get it.
To make this visible, start 8 workers that each try to take the write lock with `BEGIN IMMEDIATE` at the same moment, then look at how many times each worker is first, middle, and last to acquire it across many trials.
```python
import sqlite3, multiprocessing, time, os

DB = "lock_test.db"
WORKERS = 8
TRIALS = 100

def worker(wid, barrier):
    conn = sqlite3.connect(DB, timeout=30.0, isolation_level=None)
    conn.execute("PRAGMA busy_timeout=30000")
    conn.execute("PRAGMA synchronous=NORMAL")
    for trial in range(TRIALS):
        barrier.wait()                    # all 8 workers line up here
        t0 = time.perf_counter()
        conn.execute("BEGIN IMMEDIATE")   # returns once we hold the write lock
        t1 = time.perf_counter()          # t1 - t0 is this worker's wait
        conn.execute("INSERT INTO t (who, trial, ts) VALUES (?, ?, ?)",
                     (wid, trial, t1))
        conn.execute("COMMIT")
    conn.close()
```
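A minimal driver for the bottom of the same file (the schema and per-run cleanup are illustrative choices); it recovers each trial's acquisition order by sorting that trial's rows by `ts`:

```python
from collections import Counter

if __name__ == "__main__":
    if os.path.exists(DB):
        os.remove(DB)                        # fresh database per run
    setup = sqlite3.connect(DB)
    setup.execute("PRAGMA journal_mode=WAL") # persistent, recorded in the file
    setup.execute("CREATE TABLE t (who INTEGER, trial INTEGER, ts REAL)")
    setup.close()

    barrier = multiprocessing.Barrier(WORKERS)
    procs = [multiprocessing.Process(target=worker, args=(w, barrier))
             for w in range(WORKERS)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

    # per-trial acquisition order, recovered from the insert timestamps
    conn = sqlite3.connect(DB)
    position_counts = Counter()              # (worker, position) -> count
    for trial in range(TRIALS):
        order = conn.execute("SELECT who FROM t WHERE trial = ? ORDER BY ts",
                             (trial,)).fetchall()
        for pos, (who,) in enumerate(order, start=1):
            position_counts[(who, pos)] += 1
```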
After 100 trials with 8 workers on a stock laptop, the per-worker breakdown looks like this. "Position 1" is how many times that worker got the lock first; "Position 8" is how many times it got it last. If acquisition were FIFO from the moment all 8 blocked, each trial's order would mirror arrival order and the counts would be heavily skewed; if it were perfectly fair-random, every cell would be about 12.5 (100 trials spread over 8 positions).
| Worker | Position 1 | Position 4 | Position 8 | Max wait (ms) |
|---|---|---|---|---|
| 0 | 10 | 12 | 11 | 79.9 |
| 1 | 14 | 10 | 14 | 27.1 |
| 2 | 12 | 13 | 12 | 49.8 |
| 3 | 13 | 10 | 13 | 27.3 |
| 4 | 14 | 8 | 14 | 27.4 |
| 5 | 14 | 16 | 13 | 41.6 |
| 6 | 10 | 15 | 10 | 27.2 |
| 7 | 13 | 16 | 13 | 114.8 |
Two things to notice. First, position counts scatter around 12.5 with no FIFO pattern. Second, max-wait per worker varies wildly. Worker 1 never waited more than 27 ms, while worker 7 waited 115 ms at least once. That's a 4.2x spread for the same workload on the same machine. Some workers consistently get unlucky.
Why? SQLite's busy wait is a poll built on top of the OS's advisory file locks: the busy handler sleeps and retries until the timeout expires. When the lock is released, whichever waiter's retry happens to land first wins. There's no FIFO queue, no priority, nothing. In a quiet system this looks roughly fair on average. Under load the long tail can blow up, and there's no upper bound on starvation that you can derive from "I am next in line", because there is no line.
For batch jobs this is a non-issue. For an HTTP server with a tail-latency SLO, it matters. p99 is what users feel, and p99 is what gets blown up by random scheduling.
## Demo 2: One writer, total
The second claim is that the write lock is global to the entire database file. Adding more writer processes does not increase write throughput. It can decrease it.
```python
import sqlite3, multiprocessing, time, os

DB = "write_scaling.db"
TOTAL_INSERTS = 20_000

def worker(n_inserts):
    conn = sqlite3.connect(DB, timeout=30.0, isolation_level=None)
    conn.execute("PRAGMA busy_timeout=30000")
    conn.execute("PRAGMA synchronous=NORMAL")
    payload = "x" * 100
    for _ in range(n_inserts):
        conn.execute("BEGIN IMMEDIATE")   # one transaction per insert
        conn.execute("INSERT INTO t (payload) VALUES (?)", (payload,))
        conn.execute("COMMIT")
    conn.close()
```
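A minimal sweep harness for the same file (the schema and the fresh-database-per-data-point choice are illustrative; they keep the runs comparable):

```python
def run(n_workers):
    if os.path.exists(DB):
        os.remove(DB)                        # fresh database per data point
    setup = sqlite3.connect(DB)
    setup.execute("PRAGMA journal_mode=WAL")
    setup.execute("CREATE TABLE t (payload TEXT)")
    setup.close()

    procs = [multiprocessing.Process(target=worker,
                                     args=(TOTAL_INSERTS // n_workers,))
             for _ in range(n_workers)]
    t0 = time.perf_counter()
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return TOTAL_INSERTS / (time.perf_counter() - t0)

if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        print(f"{n:>2} writers: {run(n):10,.0f} inserts/sec")
```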
Each worker does its share, `TOTAL_INSERTS / N` inserts, so total work is constant. Sweeping N from 1 to 16 gives:
| Writers | Inserts/sec | Speedup |
|---|---|---|
| 1 | 122,604 | 1.00x |
| 2 | 116,545 | 0.95x |
| 4 | 112,282 | 0.92x |
| 8 | 96,949 | 0.79x |
| 16 | 56,033 | 0.46x |
Throughput goes down as you add writers, and at 16 it has more than halved. The work is identical (20,000 inserts), but lock thrash, kernel scheduling, and busy-handler retries eat into useful time. There's exactly one writer at any moment, by design. Adding processes adds contention without adding parallelism.
The corresponding Postgres or MySQL number on the same machine grows roughly linearly with writers until you hit CPU or disk limits, because they have row-level locking and a real concurrent-transaction model. SQLite gave that up in exchange for being a library with no server.
The fix-it advice you'll find online is "batch your writes" and "use a single writer process behind a queue". Both are correct. Both are also a small operational system that you now have to build, debug, and observe. If your write workload is naturally bursty or comes from independent processes that you don't control (think: many CGI workers, many background jobs), you'll be doing this work yourself.
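For concreteness, here's a minimal sketch of the single-writer pattern in a threaded app; the table name, batch size, and shutdown-free loop are all illustrative choices, not a drop-in library:

```python
import sqlite3, threading, queue

write_q = queue.Queue()                       # every write goes through here

def writer_loop(db_path):
    # The only connection that ever writes; other threads just enqueue.
    conn = sqlite3.connect(db_path, isolation_level=None)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA synchronous=NORMAL")
    conn.execute("CREATE TABLE IF NOT EXISTS t (payload TEXT)")
    while True:
        batch = [write_q.get()]               # block for the first row
        while len(batch) < 100:               # then drain up to 99 more
            try:
                batch.append(write_q.get_nowait())
            except queue.Empty:
                break
        conn.execute("BEGIN IMMEDIATE")        # never contended: one writer
        conn.executemany("INSERT INTO t (payload) VALUES (?)", batch)
        conn.execute("COMMIT")

threading.Thread(target=writer_loop, args=("app.db",), daemon=True).start()
write_q.put(("hello",))                        # any thread can enqueue
```

Batching amortizes the commit cost as well as the lock: one fsync covers up to 100 rows.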
## Demo 3: One file, one machine
The third claim doesn't need a benchmark, but it's the one with the longest tail of consequences.
A SQLite database is a file. You can open it from one machine. If you want a second machine to read from it, you copy the file (a consistent snapshot) or you put a service in front of SQLite that the second machine talks to. If you want a second machine to write, you can't, end of story.
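The consistent-snapshot half is at least easy: Python's stdlib exposes SQLite's online backup API, which copies a consistent image page by page while readers and the writer keep going (filenames are illustrative):

```python
import sqlite3

src = sqlite3.connect("app.db")
dst = sqlite3.connect("app-snapshot.db")
with dst:
    src.backup(dst)   # online backup: a consistent copy, no downtime
src.close()
dst.close()
```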
Postgres and MySQL ship with the primitives for the next step:
- Streaming replication for read scale-out
- Logical replication for partial replication or version migration
- Failover tooling, leader election, and broad ecosystem support for both
- Sharding extensions (Citus, Vitess) for the cases where one node is genuinely not enough
Building any of this on top of SQLite is a project. It has been done, and the projects work, but you're now operating a distributed system whose primitives you wrote, configured, or pulled in from a young ecosystem. The reason "vertical-only scaling" feels light when you write it down and heavy when you live with it is that the moment you cross the one-machine line, every operational concern you took for granted becomes your problem to solve.
This is fine for a personal project or a single-tenant SaaS where one big instance is plenty. It's not fine for a multi-region service or anything that needs to survive a single host going down without losing minutes of writes.
## How It All Fits Together
The three limits compound in a specific way:
- The unfair lock means tail latency under contention is unpredictable.
- The single global writer means you can't fix that by adding processes.
- The single-file model means you can't fix that by adding machines.
If your workload doesn't push on any of these (low write rate, single-machine deploy, no tight tail-latency SLO), SQLite is great and the operational simplicity is real. If your workload pushes on any one of them, the workarounds exist but they all cost more than just running Postgres or MySQL would have.
## Lessons Learned
A few takeaways from poking at this directly.
The unfairness is not theoretical. With 8 contending writers and 100 trials, max-wait per worker varied by 4.2x. That's the kind of variance that ruins p99 latency for any real-time service.
Adding writer processes makes things worse, not better. This is the most counterintuitive finding for engineers used to Postgres or MySQL, where adding workers usually helps until you saturate something. With raw SQLite, more writers means more lock thrash and lower throughput. Going from 1 to 16 writer processes more than halved insert/sec on this machine.
The one-machine ceiling is real, but the more interesting cost is the one before the ceiling. Long before you outgrow a single host, you'll want a read replica, or zero-downtime maintenance, or point-in-time recovery on a hot database. All of those are first-class in Postgres and MySQL. In SQLite they're external projects you bolt on.
None of this means "don't use SQLite". The ergonomic and operational wins are real. For local-first apps, embedded analytics, single-process services, single-tenant CLIs, and file-format-as-database use cases, SQLite is the right answer. The point is that "just use SQLite" should come with the same kind of fit-check you'd apply to any other database: what's the write concurrency, what's the read scale-out story, what's the failover plan?
For most server-side, multi-tenant, multi-process workloads, raw Postgres or MySQL is still the better default. The setup tax is small and the headroom is enormous.
## What's Next
A few directions worth measuring next:
- The unfairness benchmark with `synchronous=FULL`, to see if fsync timing reshuffles which worker gets the next lock.
- Mixed read/write workloads, to confirm that WAL's reader-writer parallelism really does isolate read latency from writer contention even at high write rates.
- `BEGIN IMMEDIATE` versus an explicit application-level write queue (a single writer goroutine fed by a channel), to see how much of the throughput loss in Demo 2 is reclaimable by serializing writers in userspace instead of the kernel.