Every byte your programs touch lives, however briefly, in DRAM: billions of leaky little capacitors that forget their contents in milliseconds and must be constantly refreshed. This is an interactive, ground-up tour — poke the simulations, scrub the timing diagrams, and watch a memory read travel from the CPU all the way down to one cell and back.
1
DRAM Cell (1T1C) A DRAM cell stores a single bit as electric charge on a tiny capacitor, gated by one access transistor — the canonical "one-transistor, one-capacitor" (1T1C) structure that has dominated main memory since the 1970s. Because the storage capacitor (only tens of femtofarads) constantly leaks charge through subthreshold, junction, and gate-induced paths, the bit decays within tens of milliseconds to seconds and must be periodically rewritten (refreshed). Reading the cell is destructive: connecting the capacitor to the bitline shares the stored charge onto the much larger bitline capacitance, collapsing the cell's full-rail voltage to a tiny perturbation, so a sense amplifier must detect the sign of that perturbation and then restore the bit. The relentless drive for density turned the capacitor 3D — buried trench capacitors etched into the substrate or stacked capacitors built above the transistor — to hold capacitance roughly constant while cell area shrank.
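The charge-sharing step can be put into numbers with a minimal sketch. The supply voltage, the capacitances, and the half-VDD bitline precharge level below are illustrative assumptions, not figures from any datasheet, and `bitline_swing` is a hypothetical helper name.

```python
def bitline_swing(v_cell, v_precharge, c_cell, c_bitline):
    """Bitline voltage change after the cell capacitor shares charge with it.

    Charge conservation gives:
        dV = (v_cell - v_precharge) * c_cell / (c_cell + c_bitline)
    """
    return (v_cell - v_precharge) * c_cell / (c_cell + c_bitline)


# Assumed, illustrative values.
VDD = 1.2            # array voltage in volts
C_CELL = 20e-15      # storage capacitor: 20 fF
C_BITLINE = 200e-15  # bitline capacitance: 10x the cell

# Bitline precharged to VDD/2 (assumption); cell holds a full-rail 1 or 0.
swing_one = bitline_swing(VDD, VDD / 2, C_CELL, C_BITLINE)
swing_zero = bitline_swing(0.0, VDD / 2, C_CELL, C_BITLINE)

print(f"stored 1: {swing_one * 1e3:+.1f} mV")   # +54.5 mV
print(f"stored 0: {swing_zero * 1e3:+.1f} mV")  # -54.5 mV
```

Under these assumptions a 0.6 V difference between cell and bitline collapses to a swing of about 55 mV, and only its sign tells the sense amplifier which bit was stored.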
2
Memory Array & Sense Amplifiers A DRAM chip stores bits in a 2D grid of one-transistor/one-capacitor (1T1C) cells, addressed by horizontal wordlines (rows) and vertical bitlines (columns). Reading is destructive: opening a wordline dumps each accessed cell's tiny stored charge onto a precharged bitline, nudging it only a few tens of millivolts up or down. A differential sense amplifier per bitline pair detects that minuscule swing, amplifies it to a full logic level via positive feedback, and simultaneously writes the data back, latching the row into the row buffer (the open page). Array layout (open-bitline vs folded/twisted-bitline) trades cell density against noise immunity in how the two lines feeding each sense amp are routed.
3
Organizational Hierarchy DRAM is organized as a nested hierarchy that fans out from the CPU's integrated memory controller through channels, DIMM modules, ranks, individual DRAM chips, bank groups, banks, and finally the two-dimensional row/column array of 1T1C capacitor cells inside each bank. Each level adds a distinct form of parallelism: channels are fully independent buses, ranks time-share a channel (one drives at a time), banks within a chip pipeline accesses, and the open-row/column array exposes spatial locality through the row buffer. A standard 64-bit (non-ECC) data path is physically assembled in lockstep from several narrow chips — for example eight x8 chips or sixteen x4 chips — all of which receive the same command on the shared command/address bus and each of which drives a fixed slice of the data bus.
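The lockstep assembly of the data path is simple arithmetic, sketched below. The function names are hypothetical, and the contiguous lane assignment is an assumption made for illustration; real modules may route data lines to chips in a different order.

```python
def chips_per_rank(bus_width_bits, device_width_bits):
    """Number of identical chips needed to fill the data bus in lockstep."""
    if bus_width_bits % device_width_bits != 0:
        raise ValueError("device width must divide the bus width evenly")
    return bus_width_bits // device_width_bits


def data_lanes(chip_index, device_width_bits):
    """Data-bus bit positions driven by one chip (assumed contiguous slices)."""
    start = chip_index * device_width_bits
    return list(range(start, start + device_width_bits))


print(chips_per_rank(64, 8))   # 8 chips of width x8
print(chips_per_rank(64, 4))   # 16 chips of width x4
print(data_lanes(1, 8))        # [8, 9, 10, 11, 12, 13, 14, 15]
```

Every chip in the rank sees the same command and address, so a single READ returns 64 bits per beat, one fixed slice from each chip.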
4
Commands & Read/Write Operation DRAM is not accessed by a simple address; the memory controller drives a small set of commands over a shared command/address bus that move each DRAM bank through a strict state machine. Reading or writing a byte requires first ACTIVATE-ing (opening) an entire row into a per-bank sense-amplifier "row buffer," then issuing one or more READ/WRITE column accesses (CAS) against that buffer, and finally PRECHARGE-ing to close the row and restore the bitlines before a different row can be opened. Whether the next access is a row-buffer hit, miss, or conflict determines what you pay: a hit costs just the CAS latency, a miss to an idle (precharged) bank adds an ACTIVATE first, and a conflict with a different open row pays the full PRE+ACT+CAS sequence. Every transfer moves a fixed-length burst sized to one 64-byte cache line: 8 beats on DDR4's 64-bit channel, or 16 beats on one of DDR5's 32-bit subchannels.
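The three row-buffer outcomes can be sketched as a toy per-bank state machine. The `Bank` class and its default cycle counts (CL 16, tRCD 18, tRP 18) are illustrative assumptions. The model deliberately ignores every other constraint, such as tRAS and refresh.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Bank:
    """Toy model of one DRAM bank: tracks only which row is open."""
    open_row: Optional[int] = None

    def access(self, row: int, cl: int = 16, trcd: int = 18,
               trp: int = 18) -> Tuple[str, int]:
        """Classify an access and return (kind, latency in clock cycles)."""
        if self.open_row == row:
            kind, cycles = "hit", cl                      # CAS only
        elif self.open_row is None:
            kind, cycles = "miss", trcd + cl              # ACT + CAS
        else:
            kind, cycles = "conflict", trp + trcd + cl    # PRE + ACT + CAS
        self.open_row = row  # open-page policy: leave the row open
        return kind, cycles


bank = Bank()
print(bank.access(5))  # ('miss', 34)
print(bank.access(5))  # ('hit', 16)
print(bank.access(9))  # ('conflict', 52)
```

With these assumed timings a conflict costs more than three times as much as a hit, which is why controllers work hard to keep accesses inside an already-open row.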
5
Timing Parameters DRAM performance is governed by a set of timing parameters expressed in memory clock cycles, each bounding a distinct physical operation inside the array: sensing a row, charging bitlines, precharging back to a neutral state, and refreshing leaky capacitors. The familiar "16-18-18-38" sticker on a DDR4 module is shorthand for CL-tRCD-tRP-tRAS, the four parameters that dominate random-access latency. Because the parameters are counted in clock cycles but the underlying silicon takes a roughly fixed amount of time in nanoseconds, faster-clocked memory needs proportionally larger cycle counts to honor the same physical delays — which is why CL keeps climbing from DDR3 to DDR5 even as actual latency stays nearly flat.
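Converting cycle counts to nanoseconds makes the flat-latency point concrete. In the sketch below, the DDR4-3200 data rate assigned to the sticker and the three speed grades are assumptions chosen for illustration, and `cycles_to_ns` is a hypothetical helper.

```python
def cycles_to_ns(cycles, data_rate_mtps):
    """Convert a timing in memory clock cycles to nanoseconds.

    DDR moves two transfers per clock, so the clock frequency in MHz
    is half the data rate in MT/s.
    """
    clock_mhz = data_rate_mtps / 2
    return cycles * 1000.0 / clock_mhz


# The "16-18-18-38" sticker, assuming it is on a DDR4-3200 module.
sticker = {"CL": 16, "tRCD": 18, "tRP": 18, "tRAS": 38}
sticker_ns = {name: cycles_to_ns(c, 3200) for name, c in sticker.items()}
print(sticker_ns)  # {'CL': 10.0, 'tRCD': 11.25, 'tRP': 11.25, 'tRAS': 23.75}

# Illustrative speed grades (assumed): name -> (data rate in MT/s, CL cycles)
grades = {
    "DDR3-1600": (1600, 11),
    "DDR4-3200": (3200, 22),
    "DDR5-4800": (4800, 40),
}
cl_ns = {name: cycles_to_ns(cl, rate) for name, (rate, cl) in grades.items()}
for name, ns in cl_ns.items():
    print(f"{name}: CL {grades[name][1]} cycles = {ns:.2f} ns")
```

In this example CL nearly quadruples from 11 to 40 cycles, yet the time it represents only moves from 13.75 ns to about 16.67 ns.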
6
Refresh DRAM stores each bit as charge on a tiny capacitor that leaks away through transistor sub-threshold and junction leakage, so every cell must be read and rewritten ("refreshed") before its charge decays past the sense amplifier's detection threshold. The JEDEC retention guarantee is 64 ms at normal temperature and 32 ms above 85 °C, met by issuing periodic auto-refresh (REF) commands roughly every 7.8 µs (tREFI). Each REF takes tRFC (tens to hundreds of nanoseconds, growing with density) during which the affected banks are unavailable, creating a refresh overhead "tax" that worsens as chips scale to higher capacities. Modern devices add per-bank refresh, fine-granularity refresh, temperature-controlled refresh, and self-refresh modes to cut this cost and keep retention reliable.
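The refresh tax is the fraction of each refresh interval spent in tRFC, as the sketch below estimates. Spreading 8192 REF commands across the retention window is the usual arrangement but is not stated in the text above, and the tRFC values are illustrative assumptions, not datasheet figures.

```python
RETENTION_MS = 64.0      # retention window at normal temperature
REFRESH_COMMANDS = 8192  # assumed number of REF commands per window

# Average interval between REF commands, in microseconds.
trefi_us = RETENTION_MS * 1000 / REFRESH_COMMANDS
print(f"tREFI = {trefi_us} us")  # 7.8125 us


def refresh_overhead(trfc_ns, trefi_us):
    """Fraction of time a rank is busy refreshing (all-bank REF assumed)."""
    return trfc_ns / (trefi_us * 1000)


# Assumed tRFC values, growing with chip density.
for trfc_ns in (160, 350, 550):
    share = refresh_overhead(trfc_ns, trefi_us)
    print(f"tRFC = {trfc_ns} ns -> {share:.2%} of time spent refreshing")
```

Halving tREFI for the 32 ms high-temperature window doubles each of these percentages, which is one reason the finer-grained refresh modes exist.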
7
Memory Controller & Address Mapping The memory controller is the bridge between the CPU's flat physical address space and the deeply hierarchical, parallel structure of DRAM. It decodes each physical address into a tuple of channel, rank, bank group, bank, row, and column bits; chooses an address-interleaving scheme that spreads consecutive accesses across independent resources to maximize parallelism; and runs a scheduler (classically FR-FCFS, first-ready first-come-first-served) that reorders queued read/write requests to exploit already-open rows while honoring dozens of DRAM timing constraints (tRCD, tRP, CL, tFAW, tWTR, bus turnaround, etc.). Its page policy decides whether a row stays buffered after an access: an open-page policy bets on locality and keeps the row open for further hits, while a closed-page policy precharges immediately so random traffic never pays for a conflict. Good address mapping and scheduling can swing real-world DRAM throughput and latency by 2x or more.
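Address decoding amounts to slicing bit fields out of the physical address. The field order and widths below are a hypothetical layout chosen to show interleaving; real controllers use vendor-specific mappings and often hash address bits together.

```python
# Hypothetical mapping, listed from least-significant bits upward.
FIELDS = [
    ("offset", 6),      # byte within a 64-byte cache line
    ("channel", 1),     # consecutive lines alternate channels
    ("bank_group", 2),  # then rotate across bank groups
    ("bank", 2),
    ("column", 7),
    ("rank", 1),
    ("row", 16),
]


def decode(addr):
    """Split a physical address into DRAM coordinates."""
    coords = {}
    for name, bits in FIELDS:
        coords[name] = addr & ((1 << bits) - 1)  # take the low `bits` bits
        addr >>= bits                            # then drop them
    return coords


# Four consecutive cache lines land on different channels and bank groups.
for addr in (0, 64, 128, 192):
    c = decode(addr)
    print(addr, "-> channel", c["channel"], "bank_group", c["bank_group"])
```

Placing the channel and bank-group bits just above the cache-line offset means a sequential stream touches independent resources on every access, while the row bits at the top keep nearby addresses in the same row.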
8
DDR Evolution & Data Rate DRAM interfaces evolved from SDR (one data transfer per clock cycle) to DDR (two transfers per cycle, on both rising and falling clock edges), and have scaled across five generations (DDR1-DDR5) largely by widening the internal prefetch buffer (2n in DDR1 -> 4n in DDR2 -> 8n in DDR3 and DDR4 -> 16n in DDR5) rather than by speeding up the slow DRAM cell array. The core capacitor array has barely improved in latency over 25 years (it still takes tens of nanoseconds), so each generation fetches more bits per internal access and streams them out faster on the I/O bus to bridge the growing gap between internal array speed and external I/O bandwidth. DDR4 kept the 8n prefetch and instead added bank groups: back-to-back column accesses to different groups can be issued closer together than accesses within one group, so interleaving across groups sustains the higher data rate. DDR5 then moved to a 16n prefetch, with burst length growing from BL8 to BL16 to match.
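The relationship between data rate, prefetch, and burst size can be checked with a few lines of arithmetic. The speed grades below are illustrative assumptions, the helper names are hypothetical, and the 32-bit width used for DDR5 is that of one of its two subchannels per module.

```python
def peak_bandwidth_gb_s(data_rate_mtps, bus_width_bits=64):
    """Peak transfer rate in GB/s: transfers per second times bytes each."""
    return data_rate_mtps * 1e6 * bus_width_bits / 8 / 1e9


def core_rate_mhz(data_rate_mtps, prefetch):
    """Rate at which the cell array must supply one prefetch-wide fetch."""
    return data_rate_mtps / prefetch


def burst_bytes(burst_length, bus_width_bits):
    """Bytes moved by one burst on a bus of the given width."""
    return burst_length * bus_width_bits // 8


print(peak_bandwidth_gb_s(3200))  # about 25.6 GB/s on a 64-bit channel

# The I/O rate doubles each step while the array rate stays put.
for name, rate, prefetch in (("DDR-400", 400, 2),
                             ("DDR2-800", 800, 4),
                             ("DDR3-1600", 1600, 8)):
    print(name, core_rate_mhz(rate, prefetch), "MHz core")  # 200.0 each

print(burst_bytes(8, 64))   # 64 bytes: BL8 on a 64-bit channel
print(burst_bytes(16, 32))  # 64 bytes: BL16 on a 32-bit subchannel
```

Both burst configurations deliver exactly one 64-byte cache line, which is why the longer DDR5 burst does not change the transfer granularity the CPU sees.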