Write Amplification Factor (WAF) is a critical phenomenon in NAND flash devices that directly impacts SSD endurance, performance, and lifespan. In simple terms, lower WAF is always better, as it means fewer internal writes for the same amount of host data.
WAF is defined as a ratio, and in an ideal world it should be 1.0, meaning every byte written by the host is written exactly once to NAND. However, in real-world SSDs, this ideal condition is rarely achievable. Due to factors such as garbage collection, wear leveling, and flash translation layer (FTL) operations, the actual amount of data written to NAND is often higher than what the host requests.
So what is WAF?
As per the theoretical definition, WAF is the ratio of the data written to the NAND to the data written by the host:

WAF = Data written to NAND / Data written by host
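Expressed as code, the ratio is trivial to compute from the two counters. The function name and the numbers below are illustrative, not taken from any real drive:

```python
def write_amplification_factor(nand_bytes_written: float,
                               host_bytes_written: float) -> float:
    """WAF = data written to NAND / data written by the host."""
    if host_bytes_written == 0:
        raise ValueError("host write count must be non-zero")
    return nand_bytes_written / host_bytes_written

# Example: the host wrote 100 GiB, but garbage collection and
# relocation pushed 250 GiB to NAND in total.
print(write_amplification_factor(250, 100))  # 2.5
```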

At first glance, the definition and formula of Write Amplification Factor (WAF) appear straightforward. When I initially read about WAF, it seemed like nothing more than a simple ratio with a clean mathematical definition. However, once you start digging deeper, it quickly becomes clear that write amplification is far more complex than it looks on paper. To truly understand WAF, we need to look beyond the formula and explore how an SSD actually works internally.
Understanding the Internal Structure of an SSD
Before diving further into write amplification, it’s important to understand how an SSD is organized internally—specifically in terms of NAND dies, planes, blocks, and pages. A helpful way to visualize this is by thinking about the building blocks many of us played with as kids.
Just like stacking blocks to build a structure, NAND flash memory is organized in a hierarchical manner. At the lowest level are pages, which are grouped into blocks. Multiple blocks form a plane, and multiple planes together make up a NAND die.
In simple terms, the internal hierarchy of an SSD looks like this:
NAND Die → Plane → Block → Page
Visualizing this structure is key to understanding why write amplification occurs and why SSD write behavior is fundamentally different from traditional storage devices.
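A back-of-the-envelope sketch of this hierarchy helps make it concrete. The geometry below is purely illustrative; real NAND parts vary widely in page size, pages per block, and plane count:

```python
# Hypothetical NAND geometry, chosen only for illustration.
PAGES_PER_BLOCK = 256
BLOCKS_PER_PLANE = 1024
PLANES_PER_DIE = 4
PAGE_SIZE_KIB = 16

# Walk up the hierarchy: page -> block -> plane -> die.
pages_per_die = PAGES_PER_BLOCK * BLOCKS_PER_PLANE * PLANES_PER_DIE
die_capacity_gib = pages_per_die * PAGE_SIZE_KIB / (1024 * 1024)

print(f"{pages_per_die} pages per die ~= {die_capacity_gib:.0f} GiB")
```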

Now let’s move to one of the fundamental limitations of NAND flash memory:
you cannot erase a single page independently. In NAND flash, the erase operation is only possible at the block level. This means that if even one page needs to be erased, the entire block must be erased.
Now consider a scenario where all the pages within a block already contain valid data, and there isn’t a single free or empty page available in that block. In such a case, the SSD cannot simply erase or update an individual page without impacting the rest of the data stored in the block.
This block-level erase constraint is a core reason behind write amplification in SSDs, as it forces the controller to perform additional data movement and housekeeping operations before new data can be written.


Now imagine that a new write request arrives, but the current block has no free pages left. Do we simply return an error saying we’re out of space? Of course not. Instead, the SSD controller looks for empty pages in a different block.
To make room, the controller copies the valid data from the target page into a free page in another block. However, since NAND flash can only be erased at the block level, erasing the original block outright would also destroy the other valid pages stored in it—which is clearly unacceptable.
So what actually happens?
After the data is copied, the original page in the source block is marked as invalid. The data now lives in a new physical location, while the old copy remains until the entire block is eventually erased during garbage collection.
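The out-of-place update described above can be sketched in a few lines. The page states and four-page blocks here are hypothetical simplifications:

```python
FREE, VALID, INVALID = 0, 1, 2

source = [VALID, VALID, VALID, VALID]  # full block; page 2 must be updated
target = [FREE, FREE, FREE, FREE]      # another block with free pages

# Out-of-place update: the new version of the data is written elsewhere...
target[0] = VALID
# ...and the old copy is only *marked* invalid. The physical page is not
# reclaimed until the whole source block is erased by garbage collection.
source[2] = INVALID

print(source, target)  # [1, 1, 2, 1] [1, 0, 0, 0]
```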
Where Does Write Amplification Come In?
At this point, an important question arises:
Did the write count increase?
Yes—and this is exactly where write amplification occurs.
A single write request from the host results in:
- Extra data copies
- More writes to NAND flash
Even though the host issued only one write, the SSD performed multiple internal writes, increasing the total write count.
It’s also important to note that invalidating a page does not immediately free space. The physical space is only reclaimed when the entire block is erased later, which further contributes to write amplification and NAND wear.
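A quick worked example of that write count, with made-up numbers:

```python
PAGES_PER_BLOCK = 4  # illustrative geometry

# A victim block chosen for cleaning still holds 3 valid pages.
valid_pages_in_victim = 3

# To service a single 1-page host write when no free page is available,
# the controller first relocates the valid pages, then erases the block,
# and only then programs the host's data.
relocation_writes = valid_pages_in_victim
host_writes = 1

nand_writes = relocation_writes + host_writes
waf = nand_writes / host_writes
print(f"Host wrote {host_writes} page, NAND absorbed {nand_writes} -> WAF = {waf}")
```

One host write turned into four NAND writes, so the WAF for this operation is 4.0.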


Internally, an SSD maintains a mapping table that tracks which logical address from the host maps to which physical NAND page. Given this level of tracking, a natural question arises:
Why not simply erase the entire block repeatedly and avoid all this complex data relocation logic?
The answer lies in NAND flash endurance.
Erasing a block repeatedly would cause rapid wear-out of the NAND cells, significantly reducing the lifespan and efficiency of the SSD. Each block can endure only a limited number of program/erase (P/E) cycles, and excessive block erasures would quickly degrade the flash memory.
To avoid this, SSDs rely on data relocation and page invalidation mechanisms instead of frequent block erases. Valid data is moved to new locations, old pages are marked invalid, and the block is only erased later—when enough invalid pages have accumulated to make the operation efficient.
This relocation activity happens continuously in the background, driven automatically by the SSD firmware and system design through processes such as garbage collection and wear leveling. While this approach adds internal write overhead, it helps balance performance, endurance, and reliability, which is why it is a fundamental part of modern SSD architectures.
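One way to picture the "erase only when it's worth it" decision is a simple threshold check. The 75% cutoff below is an assumption for illustration, not a value used by any particular firmware:

```python
GC_THRESHOLD = 0.75  # illustrative: erase only once 75% of pages are invalid

def worth_erasing(invalid_pages: int, pages_per_block: int) -> bool:
    """Deferring the erase until enough pages are invalid means fewer
    valid pages must be relocated, so each erase reclaims more space."""
    return invalid_pages / pages_per_block >= GC_THRESHOLD

print(worth_erasing(10, 16))  # 10/16 = 0.625  -> False, keep waiting
print(worth_erasing(13, 16))  # 13/16 = 0.8125 -> True, erase is efficient
```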
Overprovisioning – Where Are These Free Blocks Coming From?
By now a question may have come to your mind: where are all these free blocks coming from?
Here comes the role of overprovisioning (OP), the so-called saviour of free blocks. You might have noticed that whenever you plug in a NAND flash–based storage drive, you don't see its full rated capacity. Plug in a 1TB SSD, for example, and it will not show the full 1TB. Have you ever wondered why? Or complained that you are being scammed? No, you are not being scammed: some space is deliberately kept aside for OP and is not exposed to the user.

Relationship Between Write Amplification Factor (WAF) and Over-Provisioning (OP)
Over-Provisioning (OP) plays a crucial role in controlling Write Amplification Factor (WAF) in NAND flash–based SSDs. OP refers to the portion of NAND capacity that is reserved internally by the SSD controller and is not visible to the host. This extra space is primarily used for garbage collection, wear leveling, and data relocation.
The relationship between WAF and OP is inversely proportional:
- Higher Over-Provisioning → Lower WAF
With more free blocks available, the SSD has greater flexibility to relocate valid data efficiently. This reduces unnecessary data movement, minimizes internal writes, and results in lower write amplification.
- Lower Over-Provisioning → Higher WAF
When fewer free blocks are available, the controller is forced to perform more frequent data copying and block erases to service new write requests. This increases internal NAND writes, leading to higher write amplification.
In essence, over-provisioning provides breathing room for the SSD. More OP allows the firmware to manage flash memory more efficiently, improving endurance, performance, and long-term reliability by keeping WAF under control.
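To see the inverse relationship in action, here is a toy random-write simulator with greedy garbage collection (always reclaim the block with the fewest valid pages). All geometry, workload, and policy choices below are illustrative simplifications of real FTL behavior, not a model of any specific drive:

```python
import random

def simulate_waf(op_fraction, blocks=32, pages_per_block=32,
                 host_writes=50_000, seed=1):
    """Toy model: random 1-page host writes over the user-visible space,
    out-of-place updates, and greedy GC when no free block remains."""
    rng = random.Random(seed)
    logical_pages = int(blocks * pages_per_block * (1 - op_fraction))
    lives_in = [None] * logical_pages            # logical page -> block
    pages_in = [set() for _ in range(blocks)]    # block -> valid logical pages
    free = list(range(blocks))
    active, fill, nand_writes = free.pop(), 0, 0

    for _ in range(host_writes):
        pending = [rng.randrange(logical_pages)]  # the host's write
        while pending:
            if fill == pages_per_block:           # open block is full
                if not free:                      # GC: reclaim emptiest block
                    victim = min((b for b in range(blocks) if b != active),
                                 key=lambda b: len(pages_in[b]))
                    movers = list(pages_in[victim])
                    for p in movers:
                        lives_in[p] = None        # old copies die with the erase
                    pages_in[victim].clear()
                    free.append(victim)
                    pending = movers + pending    # valid data must move first
                active, fill = free.pop(), 0
            lpn = pending.pop(0)
            if lives_in[lpn] is not None:         # invalidate the old copy
                pages_in[lives_in[lpn]].discard(lpn)
            lives_in[lpn] = active
            pages_in[active].add(lpn)
            fill += 1
            nand_writes += 1
    return nand_writes / host_writes

for op in (0.07, 0.15, 0.28):
    print(f"OP = {op:.0%} -> WAF ~ {simulate_waf(op):.2f}")
```

Running it prints one WAF estimate per OP level; the trend (more OP, lower WAF) is the point, not the exact numbers.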

Garbage Collection (GC) and Its Impact on Write Amplification Factor (WAF)
Another important phenomenon that has a significant impact on Write Amplification Factor (WAF) is Garbage Collection (GC). While we have already touched on GC indirectly, it’s worth calling it out explicitly because of its central role in SSD write behavior.
Garbage Collection is a background process in SSDs that helps reclaim NAND flash blocks. It works by identifying partially used blocks—blocks that contain a mix of valid and invalid pages—then relocating the remaining valid data to new locations. Once all valid data has been moved, the original block can be safely erased and returned to the pool of free blocks for future writes.
While GC is essential for maintaining available space and sustaining performance, it also introduces additional internal NAND writes. These extra data movements directly contribute to write amplification, making GC one of the key factors influencing WAF in real-world SSD workloads.
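The cost/benefit of collecting a single victim block makes the GC–WAF link concrete. The numbers below are arbitrary:

```python
def gc_cost(valid_pages: int, pages_per_block: int):
    """Relocation writes paid vs. free pages gained for one victim block."""
    relocation_writes = valid_pages
    pages_reclaimed = pages_per_block - valid_pages
    return relocation_writes, pages_reclaimed

# A mostly-invalid victim is cheap to collect...
print(gc_cost(4, 64))   # (4, 60): 4 extra writes buy 60 free pages
# ...while a mostly-valid one is expensive.
print(gc_cost(60, 64))  # (60, 4): 60 extra writes buy only 4 free pages
```

A good GC policy therefore prefers victims holding as little valid data as possible, paying the fewest relocation writes per block reclaimed.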

So the better the GC algorithm, the more it helps lower WAF.
Conclusion
In conclusion, the background processes involved in data relocation and rewriting inside an SSD mean that each host write can trigger multiple internal NAND writes. In addition to the original write from the host, several other writes occur behind the scenes as part of garbage collection, wear leveling, and flash management.
As a result, two writes issued by the host do not necessarily translate into only two writes to NAND. The actual number of internal writes is often much higher, and this discrepancy is precisely what Write Amplification Factor (WAF) captures.
Simply put, WAF explains why write operations in SSDs don’t follow simple arithmetic—and why internal firmware behavior plays a critical role in SSD endurance and performance.