When you are building something small, you never think about this. One user, one request, one database write. Clean. Simple.
Then you ship it.
Now you have ten thousand users hitting the same tables at the same time. And suddenly data looks wrong in ways that are hard to explain. The dashboard shows a number that should not be possible. A transaction goes through that should have been blocked. A report has a row that was not there a second ago.
This is not a bug in your code. This is a category of problem called Read Phenomena. And the database does not protect you from all of it by default.
The setup
Every database operation runs inside something called a transaction. You have probably seen this before.
BEGIN;
UPDATE accounts SET balance = balance - 20 WHERE id = 1;
UPDATE accounts SET balance = balance + 20 WHERE id = 2;
COMMIT;The idea is simple. Either both updates happen, or neither does. That is what makes transactions useful.
The problem is when two transactions run at the same time and they touch the same data. That is where things get weird.
1. Dirty Read
Transaction A updates a row but has not committed yet. Transaction B reads that row right now, mid-update. Then Transaction A rolls back.
Transaction B just read data that never officially existed.
Transaction A: balance = 10 → UPDATE to 20 (not committed yet)
Transaction B: READ → sees 20
Transaction A: ROLLBACK → balance is 10 again
Transaction B is now making decisions based on 20. That number is gone.
The real world version of this: a payment is initiated, something else reads the new balance, then the payment fails and rolls back. The thing that read the balance already acted on it.
2. Non-Repeatable Read
Transaction A reads a row. Gets a value. Reads the same row again inside the same transaction. Gets a different value.
Nothing in Transaction A changed. But the answer changed.
Transaction A: READ price → 10
Transaction B: UPDATE price to 20, COMMIT
Transaction A: READ price again → 20
This sounds unlikely until you think about a checkout flow. You read the price at the start of the transaction to show the user. You read it again at the end to charge them. In between, someone updated it.
Same transaction. Same query. Different result.
3. Phantom Read
This one is subtle. Transaction A runs a query that returns a set of rows. Runs the exact same query again. Gets extra rows back.
Transaction A: SELECT * WHERE salary > 10 → 10 rows
Transaction B: INSERT new row with salary 15, COMMIT
Transaction A: SELECT * WHERE salary > 10 → 11 rows
Non-Repeatable Read is about an existing row changing. Phantom Read is about new rows appearing out of nowhere. Same query, different result set.
If you are generating a report or running a calculation across multiple reads inside one transaction, this will silently break your numbers.
Why this does not always happen
The database gives you a way to control this. It is called an Isolation Level. You set it based on how much protection you need.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;There are four levels.
| Level | Dirty Read | Non-Repeatable Read | Phantom Read | |---|---|---|---| | Read Uncommitted | possible | possible | possible | | Read Committed | prevented | possible | possible | | Repeatable Read | prevented | prevented | possible | | Serializable | prevented | prevented | prevented |
Higher protection means more locking. More locking means less throughput. You are always trading one for the other.
PostgreSQL and MySQL do not agree on defaults
This is the part that actually bites people.
PostgreSQL defaults to Read Committed. MySQL InnoDB defaults to Repeatable Read.
That means the same query, running against two different databases, can behave completely differently under concurrent load. You move from one database to another and suddenly a race condition that was handled is now not. Or a lock that was not happening now is.
I did not know this for a long time. If you have ever had a bug that only showed up under load and only in production, this is worth checking.
The honest summary
Normally you will never see any of this. A small app with light traffic, a database with sensible defaults, everything just works.
But once you have real concurrency — thousands of users, millions of rows, transactions overlapping constantly — the guarantees you assumed you had might not be there.
Knowing what isolation level your database is running at, and what that actually protects you from, is one of those things that feels like trivia until the moment it is not.
All the examples above use small numbers on purpose. The phenomena work the same whether the value is 10 or 10,000,000. The number does not matter. The timing does.