A typical SRAM cell consists of two back-to-back inverters. Note that due to the nature of the cell and its logic gates (metastability and regenerative properties), upon power-up the stored bit is random.
The inverters can either be CMOS (2 MOS + 2 MOS for the cross-coupled pair, plus 2 control MOS = 6T), or use a resistive-load implementation.
These SRAM cells are arranged in a lattice where individual stored data can be accessed using wordlines and output on bitlines.
The cell select line is activated in response to the CPU address, and turns on the two control MOS so that the inverter outputs are fed into a sense amplifier.
We need a sense amplifier because upon read, the change on the bitlines is very small, since all memory cells on a bitline are connected in parallel (large bitline capacitance). The sense amplifier is biased to Vdd/2, and upon a small change it will be pulled to either 0 or Vdd.
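The regenerative pull to a rail can be illustrated with a toy model. This is a sketch only: the gain, supply voltage, and idealized linear-then-clamped inverter characteristic are all assumed, not taken from any real device.

```python
# Toy model of a regenerative sense amplifier: two cross-coupled
# inverters amplify a small imbalance around Vdd/2 until the nodes
# saturate at the rails. All numbers are illustrative assumptions.

VDD = 3.3
GAIN = 4.0  # assumed small-signal inverter gain near the metastable point

def inverter(v_in):
    """Idealized inverter: high linear gain around Vdd/2, clamped to rails."""
    v_out = VDD / 2 - GAIN * (v_in - VDD / 2)
    return min(VDD, max(0.0, v_out))

def sense(v1, v2, steps=10):
    """Iterate the cross-coupled pair until it regenerates to a stable state."""
    for _ in range(steps):
        v1, v2 = inverter(v2), inverter(v1)
    return v1, v2

# A 10 mV imbalance around Vdd/2 is enough to reach full rails:
print(sense(VDD / 2 + 0.01, VDD / 2 - 0.01))  # (3.3, 0.0)
```

The deviation from Vdd/2 is multiplied by the gain on every iteration, which is exactly why the biased-at-Vdd/2 amplifier resolves a tiny bitline swing to a full logic level.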
For a write, the cell select is asserted HIGH, then the CPU drives data onto the data in/out lines (bitlines). The higher-power drive through the control MOS overpowers the inverters and forces the data into the appropriate stable state.
No two chips exhibit identical timing due to differences in temperature, material, capacitance, and voltage. For example, a higher voltage can make a circuit work faster because capacitances can be charged and discharged more quickly. Another example: when a digital pin switches from a 1 to a 0 (discharging capacitance), a current flows into ground. The ground path has intrinsic inductance, so the Vdd seen inside the digital chip can drop, making the circuit run slower.
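The voltage-versus-speed trade-off can be made concrete with a rough first-order model. The formula (delay proportional to C·Vdd divided by drive current, with drive current growing faster than Vdd) and every parameter value below are assumptions for illustration, not data from a real process.

```python
# Rough sketch of why higher Vdd speeds up a gate, using an
# alpha-power-law-style delay model: delay ~ C*Vdd / (K*(Vdd - Vt)^alpha).
# All device parameters here are made-up illustrative values.

C = 10e-15   # load capacitance in farads (assumed)
VT = 0.5     # threshold voltage in volts (assumed)
ALPHA = 1.5  # velocity-saturation exponent (assumed)
K = 1e-4     # drive-strength constant (assumed)

def gate_delay(vdd):
    """First-order gate delay: swing grows with Vdd, but drive grows faster."""
    return C * vdd / (K * (vdd - VT) ** ALPHA)

for vdd in (1.8, 2.5, 3.3):
    print(f"Vdd = {vdd} V -> delay ~ {gate_delay(vdd) * 1e12:.1f} ps")
```

The point is qualitative: the drive-current gain from raising Vdd outpaces the larger voltage swing that must be charged, so delay falls as Vdd rises.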
Key design decisions for read operation:
There exists a period of time between transitions of different address signals. This is to ensure that we turn one memory device OFF before turning another memory device ON, to avoid contention. This is done using the Address Strobe (AS_L) active-low signal. AS_L turns HIGH before the address changes, and turns on a bit after the address changes.
Other delays can come from buffers. In this case, we’re not even considering the propagation delay in the wires.
The total delay, t_total, after DTACK is received is t_PAB + t_AA + t_PDB + t_DSU, or t_PAB + t_PAD + t_ACE + t_DSU.
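The two sums above are alternative paths through the same read cycle, so the slower one sets the budget. Here is a quick sketch with hypothetical nanosecond values (none of these numbers come from the text; they are placeholders for a real datasheet's figures):

```python
# Hypothetical timing budget for the read path (all ns values assumed):
t_PAB = 5    # address-buffer propagation delay
t_AA  = 20   # address-to-data access time of the SRAM
t_PDB = 5    # data-buffer propagation delay
t_DSU = 10   # data setup time required by the CPU
t_PAD = 8    # address-decoder propagation delay
t_ACE = 22   # chip-enable access time

# Two paths race; the slower one determines whether the cycle is met.
path1 = t_PAB + t_AA + t_PDB + t_DSU
path2 = t_PAB + t_PAD + t_ACE + t_DSU
t_total = max(path1, path2)
print(path1, path2, t_total)  # 40 45 45
```

With a real part, you would plug in the datasheet maxima and compare t_total against the available bus-cycle time.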
Synchronous memory can have the data clocked, so a clock is needed. SSRAMs have internal counters that automatically increment the last address, allowing burst reading or writing of data (quickly bursting different data without changing the address).
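The burst behavior can be sketched as a tiny model. The class name, sizes, and wrap-around behavior below are assumptions for illustration; real SSRAM burst counters have device-specific burst lengths and wrap modes.

```python
# Minimal sketch (assumed behavior) of an SSRAM internal burst counter:
# latch an address once, then each clock returns data at the
# auto-incremented address without the CPU re-driving the address bus.

class BurstSSRAM:
    def __init__(self, size=256):
        self.mem = [0] * size
        self.addr = 0

    def load_address(self, addr):
        self.addr = addr  # latch the starting address

    def burst_read(self, length):
        """Return `length` consecutive words; the counter auto-increments."""
        out = []
        for _ in range(length):
            out.append(self.mem[self.addr])
            self.addr = (self.addr + 1) % len(self.mem)
        return out

ram = BurstSSRAM()
ram.mem[4:8] = [10, 11, 12, 13]
ram.load_address(4)
print(ram.burst_read(4))  # [10, 11, 12, 13]
```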
It is more important to consider timing for a synchronous memory system because the CPU usually clocks a lot faster than the memory access time (1 GHz corresponds to a 1 ns period, much faster than the 10 ns typical of fast SRAM). So either we need to slow the CPU down, or have a robust protocol to communicate with the SRAM.
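"Slowing the CPU down" concretely means inserting wait cycles. Using the text's own 1 GHz / 10 ns example, a quick sketch (the one-cycle-per-access baseline is an assumption):

```python
import math

# How many extra cycles a CPU must wait for memory, given the clock
# rate and the memory access time. Assumes the CPU would otherwise
# complete an access in `cycles_per_access` cycles (assumed baseline).

def wait_states(cpu_clock_hz, mem_access_ns, cycles_per_access=1):
    period_ns = 1e9 / cpu_clock_hz
    cycles_needed = math.ceil(mem_access_ns / period_ns)
    return max(0, cycles_needed - cycles_per_access)

print(wait_states(1e9, 10))    # 9  -> a 1 GHz CPU idles 9 cycles per access
print(wait_states(100e6, 10))  # 0  -> at 100 MHz the SRAM keeps up
```

This is exactly why a handshaking protocol (or wait-state insertion) is needed rather than assuming the memory answers within one CPU cycle.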
Recall our process-memory interface:
R/W_L indicates the direction of data transfer.
UDS_L/LDS_L active-low signals indicate which byte to transfer.
DTACK_L (Data Transfer Acknowledge) means that the memory has done its job and we can move on.
😇 The advantage is that async memory is flexible and allows both fast and slow memory to communicate.
😱 The disadvantage is that this DTACK “handshaking” has lots of overhead, which puts an upper limit on transfer rates in high-speed systems.
Here is an example of an implementation of a 68K async SRAM interface:
DTACK active low signal generator.
We need to look at both the read and write operations and what they look like (they have different timings).
The read operation takes 4 clock cycles, further split into 8 states (S0 to S7); this is the fastest we can go. The arrows on the diagram below imply the causal relationships between signals.
LDS* is disabled (HIGH) prior to any read operation.
R/W* is asserted first, before we do anything else (S0).
AS* is now asserted along with LDS*; now the chip has all the signals it needs to start producing its data to the output buffers.
If DTACK* from memory is asserted low before the end of S4, then we go to S5. Otherwise, the state machine goes to wait states.
Once DTACK* is asserted low, the data latch internal to the CPU is latched. Control signals to memory (LDS*) are disabled, which makes OE* also disabled. At that point, DTACK* is negated too. This is like taking an exam away from the student when time is up during a midterm.
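The S0–S7 sequence with wait-state insertion can be sketched as a tiny state machine. This is a simplified model under assumptions (half-clock "ticks", DTACK* sampled once at the end of S4, waits modeled as a repeated "Sw" state); the real 68K samples on specific clock edges.

```python
# Sketch of the 68K read bus cycle S0..S7 with wait states inserted
# between S4 and S5 until DTACK* is asserted. Simplified/assumed model.

def read_cycle(dtack_ready):
    """dtack_ready(tick) -> True once memory has asserted DTACK* (low).
    Returns the sequence of states traversed ("Sw" = wait state)."""
    trace = []
    tick = 0
    for s in range(5):                # S0..S4: drive R/W*, then AS*/LDS*
        trace.append(f"S{s}")
        tick += 1
    while not dtack_ready(tick):      # sample DTACK* at the end of S4
        trace.append("Sw")            # insert wait states until it's low
        tick += 1
    for s in range(5, 8):             # S5..S7: latch data, negate strobes
        trace.append(f"S{s}")
        tick += 1
    return trace

# Fast memory: DTACK* already low -> no wait states, 8 states total.
print(read_cycle(lambda t: True))
# Slow memory: DTACK* low only from tick 7 -> two wait states inserted.
print(read_cycle(lambda t: t >= 7))
```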
Note: Quartus simulations of the soft core do not simulate the tristate of the address bus. So the address is actually asserted a lot earlier. This is good for us because the address decoder can start working earlier, reducing delay.
8 MHz Clock
12.5 MHz Clock
The 68k is capable of extending the bus cycle indefinitely. We often want to insert wait states because DTACK* doesn't arrive in time.
DTACK* is important for DRAM because DRAM is leaky and requires refreshing. During a refresh, no read/write is possible, and thus DTACK must be delayed.
For fast SRAM and ROM, we don't need to delay DTACK*. When AS* goes HIGH, it makes the counter load the preloaded timer value; and when AS* goes LOW, the counter starts.
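A minimal model of that DTACK* delay counter, under assumptions (counter width, down-counting, and DTACK* asserted at zero are illustrative choices; the real generator's polarity and preset come from the board design):

```python
# Sketch of a DTACK* delay counter: while AS* is HIGH (negated) the
# counter reloads its preset; once AS* goes LOW it counts down each
# clock and "asserts DTACK*" when it reaches zero. Assumed behavior.

class DtackCounter:
    def __init__(self, preset):
        self.preset = preset
        self.count = preset

    def clock(self, as_l):
        """One clock edge; as_l is the AS* level (True = HIGH/negated).
        Returns True when DTACK* should be asserted (low)."""
        if as_l:
            self.count = self.preset      # reload while AS* is negated
        elif self.count > 0:
            self.count -= 1               # count down during the bus cycle
        return self.count == 0            # DTACK* asserted at zero

c = DtackCounter(preset=3)
print([c.clock(False) for _ in range(4)])  # [False, False, True, True]
```

Choosing the preset per address range is how slower devices get more delay before DTACK* without any handshake logic inside the memory chip itself.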
The input data must be presented to the memory chip at the same time WE and CE are activated. OE can be ignored because we don't care about it during a write. The memory address must remain valid throughout the write cycle.
In a write operation, there are two setup-and-hold constraints: address setup and hold time, and data setup and hold time.
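These two constraints can be checked mechanically against a datasheet. The function and all required minimums below are hypothetical placeholders, not values for any specific part:

```python
# Hypothetical write-cycle timing check: compare measured margins (ns)
# against a chip's required minimums (ns). All numbers are assumed.

def check_write_timing(addr_setup, addr_hold, data_setup, data_hold,
                       req_addr_setup=10, req_addr_hold=0,
                       req_data_setup=8, req_data_hold=2):
    """Return the list of violated constraints (empty list = timing met)."""
    violations = []
    if addr_setup < req_addr_setup:
        violations.append("address setup")
    if addr_hold < req_addr_hold:
        violations.append("address hold")
    if data_setup < req_data_setup:
        violations.append("data setup")
    if data_hold < req_data_hold:
        violations.append("data hold")
    return violations

print(check_write_timing(12, 1, 9, 3))  # [] -> all constraints met
print(check_write_timing(12, 1, 9, 0))  # ['data hold']
```

Note that a hold requirement of 0 ns (as assumed for address hold here) still fails if the address changes before the strobes deassert, which is exactly the soft-core hazard described below.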
CE will be removed when one of R/W or UDS/LDS is deactivated, whichever comes first.
Note: the soft core has similar timing diagrams. However, the address hold time is violated because the address changes simultaneously with the deactivation of UDS/LDS/AS. Propagation or capacitance delays may cause UDS/LDS/AS (and as a result CE) to be delayed and arrive later than the address-line changes, which could be problematic.