How Database Reads Work: Pages, Buffer Pools, and Disk

Quick Answer: When you query a database, it rarely reads a single row directly from disk. Instead, it relies on RAM. Databases load fixed-size blocks of data called "pages" (typically 8KB) into a memory cache known as the buffer pool. This minimizes slow disk reads and dramatically speeds up follow-on queries.
Let's say you need to fetch a specific order from your database. You write a standard query that looks something like this:
SELECT * FROM orders WHERE user_id = 6;
I see a lot of developers naively think the database engine behaves like a simple file reader: it spins up the physical disk, scans the orders table, finds the exact row for user 6, and hands that single row directly back to you.
In reality, 99% of the time, your database isn't touching the disk at all. And for that 1% of the time when it actually does go to disk, it absolutely is not returning a single row. It returns a page.
How does a database read data from disk?
Databases rarely read single rows directly from the physical disk. Instead, they read larger, fixed-size blocks of data called pages, pulling them directly into a specialized RAM cache called the buffer pool.
To visualize this, imagine your team is building an e-commerce backend. When an API requests an order, the database doesn't send a forklift into the warehouse (the disk) just to retrieve a single AAA battery (the row). That would be wildly inefficient. Instead, it grabs the entire pallet (the page) and brings it to the staging area (RAM).
This means almost every read operation you execute is actually interacting with memory, not the physical hard drive.
What is a database page?
A database page is the fundamental unit of storage in a relational database. It is a fixed-size container—usually about 8 kilobytes—that holds multiple rows of data alongside other structural information.
Because a page is 8KB, it contains a reasonable amount of information. When you insert records into a table, the database packs those rows into pages. The disk is essentially just a massive collection of these 8KB blocks. When the database engine needs to read or write, it operates entirely in terms of these pages, never in isolated rows.
How does the database buffer pool work during a query?
When you execute a query, the database engine checks the buffer pool in RAM to see if the required index and data pages are already loaded. If the pages are missing, it fetches them from disk, stores them in memory, and then returns your result.
Here is the exact step-by-step execution flow when you run a simple SELECT query:
- Query the Index in RAM: The database checks the buffer pool for the relevant index pages. (Hopefully, you have an index on your
user_idcolumn!). - Resolve the Page Location: The index tells the database exactly which page holds the target row, as well as the exact position (offset) of that row within the page.
- Check the Buffer Pool for Data: The engine checks RAM again. It asks, "Do I already have this specific data page in my buffer pool?" If yes, it returns the data immediately.
- Fetch from Disk (If Necessary): If the page is not in RAM, the database goes to disk, reads the entire 8KB page, and stores it in the buffer pool.
- Extract and Return: Finally, the database locates the specific row within that cached page and returns it to your application.
Why do databases read entire pages instead of single rows?
Databases read whole pages into RAM to optimize for future queries. Fetching data from a physical disk is an expensive operation, so loading adjacent rows simultaneously speeds up subsequent requests.
This concept relies heavily on spatial locality. If you load an entire page into RAM just to get one order for user 6, you have also loaded the adjacent rows into memory. When follow-on queries ask for user 7 or another related record, that data is likely already sitting directly in the buffer pool. You skip the slow disk read entirely.
Of course, this raises a logical next question: you obviously don't want to load your entire database into RAM. So, what happens when the buffer pool gets full? Well, that involves eviction algorithms, which is a topic for another video.
Frequently Asked Questions
What is a buffer pool in a database?
The buffer pool is a dedicated segment of RAM used by the database engine to cache table and index data. By keeping frequently accessed pages in memory, the database avoids slow disk I/O operations, making queries run exponentially faster.
How large is a standard database page?
In most major relational database systems, like PostgreSQL and SQL Server, the default page size is 8 kilobytes (8KB). Other databases might use 16KB or allow for custom page sizing, but 8KB is the industry standard.
What happens when the database RAM gets full?
When the buffer pool reaches its memory limit, the database must remove older data to make room for new pages. It typically uses algorithms like Least Recently Used (LRU) to evict pages that haven't been queried recently, ensuring the most active data remains in RAM.



