VACUUM

What is VACUUM?

Vacuuming is the database's garbage collection for row versions — it removes dead tuples, reclaims space, and prevents internal ID wraparound. In PostgreSQL, it's essential housekeeping, usually handled automatically by autovacuum, but DBAs sometimes run it manually for large/busy tables.

Why vacuuming is needed

Databases that use MVCC (multi-version concurrency control) don't overwrite rows in place when you update or delete. Instead:

UPDATE: inserts a new row version (a tuple) and marks the old one as obsolete.
DELETE: just marks the row as dead, but keeps it until no transaction can still see it.

This means tables accumulate dead tuples (old versions of rows). If nothing removes them:

Tables and indexes bloat (wasted space, slower scans).
Queries may see outdated data if old tuples stick around incorrectly.
Transaction IDs can wrap around (in Postgres), which risks data corruption.

So vacuuming is the process of cleaning up old/dead tuples and maintaining system metadata.

VACUUM in different databases

PostgreSQL: VACUUM does several things: reclaims space, maintains visibility maps, advances relfrozenxid.
MySQL (InnoDB): doesn't have explicit VACUUM, but background purge threads clean up old row versions and undo logs.
Oracle: similar concept under the hood, but automated by undo/redo management; DBAs rarely "vacuum."
SQLite: has a VACUUM command, but it's different: it rebuilds the entire database file to reclaim free space and defragment it.

Key trade-offs

Regular vacuuming keeps tables healthy but consumes I/O and CPU.
Too little vacuuming → bloat, wraparound risks.
Too aggressive vacuuming → contention with queries (though Postgres is pretty good at minimizing this with autovacuum).

What is VACUUM?​

Why vacuuming is needed​

VACUUM in different databases​

Key trade-offs​

What is VACUUM?

Why vacuuming is needed

VACUUM in different databases

Key trade-offs