muchen 牧辰

Project 4 - Assignment

Updated 2019-11-11

1. NAND3 Simulation and Layout

Design Summary

| Layout Area | Delay | Area × Delay |
| --- | --- | --- |
| 2.9068 μm² | 70.5 ps | 204.9294 μm²·ps |

NAND3 Layout

The following figure is the layout of the NAND3.

nand3_layout

For a typical NAND3, the theoretical sizing is 2W for PMOS and 3W for NMOS. However, simulation shows that this ratio is not sufficient to make tpHL and tpLH match. For my design, I've therefore chosen 4W for PMOS and 8W for NMOS, where W is 120nm.

The distance from each input (A, B, and C) to the output (F) is about 0.9μm.

Waveform

The following waveform shows the input against the output for the worst-case test-bench setup (see the next section for the setup itself). Essentially, the worst case occurs when two of the inputs to the NAND3 are tied to a fixed value. The input pattern transitions like this: 001000, and the output should toggle between 0 and 1.

The simulation uses the extracted view from the layout, so it accounts for all the parasitic capacitances.

q1_delay_graph

The tpHL is 69.5ps and tpLH is 71.5ps.
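
As a quick arithmetic check, the sketch below recomputes the summary figures from the two measured delays. It assumes the reported delay is the plain average of tpHL and tpLH, which is consistent with the 70.5ps in the summary but is not stated explicitly. The variable names are my own.

```python
# Reported post-layout figures for the NAND3 (values taken from this report).
area_um2 = 2.9068   # layout area in square micrometres
t_phl_ps = 69.5     # high-to-low propagation delay in ps
t_plh_ps = 71.5     # low-to-high propagation delay in ps

# Assumption: the summary delay is the average of the two propagation delays.
delay_ps = (t_phl_ps + t_plh_ps) / 2

# Area-delay product used as the figure of merit.
area_delay = area_um2 * delay_ps

print(f"delay = {delay_ps:.1f} ps")
print(f"area x delay = {area_delay:.4f} um^2*ps")
```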

Schematics

The diagram below shows the schematic for the test bench used to test the worst-case scenario of the NAND3. The waveform in the previous section is generated using this schematic and configured to use the extracted view from the layout.

nand3_tb_schematic

Important notes:

The diagram below is the schematic of the NAND3.

nand3_schematic

2. Static CMOS Logic

1573518002698

Logic Function

We can read the logic from the pull-down network: ~F=(A+B)CD, so by De Morgan's laws F=(~A~B)+~C+~D.
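
The two forms of F can be checked against each other by brute force over all 16 input combinations. This is only a sanity check of the Boolean algebra; the function names are made up for illustration.

```python
from itertools import product

def f_pulldown(a, b, c, d):
    """F as the complement of the pull-down network (A+B)CD."""
    return not ((a or b) and c and d)

def f_demorgan(a, b, c, d):
    """F after applying De Morgan's laws: (~A~B) + ~C + ~D."""
    return (not a and not b) or (not c) or (not d)

# Collect every input combination where the two forms disagree.
mismatches = [bits for bits in product([False, True], repeat=4)
              if f_pulldown(*bits) != f_demorgan(*bits)]

print("forms agree on all 16 inputs:", not mismatches)
```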

CMOS Sizing

We want to size the transistors so that the worst-case pull-up and pull-down resistances match those of a reference inverter whose NMOS width is 4λ (1W) and whose PMOS width is 8λ (2W).

1573518480157

The worst case for the pull-up network is when only a single branch conducts, specifically the A-B PMOS branch, since it has more resistance due to the two PMOS in series. Therefore, we have to make them 4W each, or 16λ. C and D are 2W, or 8λ, each.

For the pull-down network, the worst case is when only one of the A or B NMOS is on. The three NMOS in series then need to match the 1W NMOS of the inverter, so they're all 3W, or 12λ, each.
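
The sizing rule can be written out as a small calculation. The sketch assumes on-resistance is inversely proportional to width, so n devices in series must each be n times wider to match a single device. `stack_width` and the dictionary labels are hypothetical names, not part of the design files.

```python
UNIT_W_LAMBDA = 4  # 1W expressed in lambda (reference inverter NMOS width)

def stack_width(target_w, n_series):
    """Width per device (in units of W) so that n_series devices in series
    match the resistance of one device of width target_w.
    Assumes resistance is inversely proportional to width."""
    return target_w * n_series

sizes_w = {
    "PMOS A, B (2 in series, target 2W)": stack_width(2, 2),
    "PMOS C, D (single device, target 2W)": stack_width(2, 1),
    "NMOS (3 in series, target 1W)": stack_width(1, 3),
}

for name, w in sizes_w.items():
    print(f"{name}: {w}W = {w * UNIT_W_LAMBDA} lambda")
```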

Worst Case Transitions

High-Low

For tpHL, we need a scenario where as many nodes as possible are charged up to VDD before the transition. By this logic, D must be off and C must be on, so the input patterns ABCD = 1010, 0110, and 1110 all work.

During the transition, we need as few NMOS as possible turned on to discharge all of those nodes to ground. The only NMOS we can turn on to do this is D, so the pattern ABCD becomes xxx1 (where x is whatever the input was before).

For instance, a worst-case HL transition is ABCD = 1010 → 1011.

Low-High

Before the transition, we want as many nodes as possible to be at 0, so the C and D NMOS are both on (the C and D PMOS are off). Since the only node in the pull-up network that is capable of being 0 is the node between B and A, we set A to 0 (the PMOS for A is on). Therefore, the input pattern before the transition is ABCD = 0111.

During the transition, we turn on as few PMOS as possible to pull the output up, so we only set B to 0 (turning on the B PMOS). The input pattern is ABCD = 0011.

The worst-case LH transition is therefore ABCD = 0111 → 0011.
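
The sketch below checks only that the two chosen input transitions move the output in the intended direction, using F = ~((A+B)CD). It does not model charge on internal nodes, so it cannot confirm that these are the slowest cases; that comes from the simulation. `gate_output` is a hypothetical helper.

```python
def gate_output(pattern):
    """Evaluate F = NOT((A+B)CD) for a 4-character bit string 'ABCD'."""
    a, b, c, d = (ch == "1" for ch in pattern)
    return int(not ((a or b) and c and d))

# Input patterns before and after each transition, as chosen in the text.
transitions = {"HL": ("1010", "1011"), "LH": ("0111", "0011")}

for name, (before, after) in transitions.items():
    print(f"{name}: ABCD {before} -> {after}, "
          f"F {gate_output(before)} -> {gate_output(after)}")
```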

Simulation Schematic

The schematic of the logic gates and the test-benches are shown in the diagrams below:

Function logic module:

q2f_schematic

Test bench for transitioning ABCD = 1010 to 1011:

q2f_tb1_schematic

Test bench for transitioning ABCD = 0111 to 0011:

q2f_tb2_schematic

Simulation Waveform

The worst-case tpHL, read from the simulated waveform of the first test bench, is 218ps.

q2f_tphl

The worst-case tpLH, read from the simulated waveform of the second test bench, is 111ps.

q2f_tplh

3. Transmission Gate Logic

The following circuit has transmission gate sizing as $W_N=W_P=4\lambda$, and $L=2\lambda$. The inverters have PMOS W/L of $8\lambda:2\lambda$, NMOS W/L of $4\lambda: 2\lambda$.

1573521798706

The expression of the output function is derived as follows:

Therefore the combined logic is:

\[\begin{aligned} OUT&=\neg C\\ C&=(\neg A\wedge sel)\vee(\neg B \wedge selB)\\ OUT &= (A\vee\neg sel)\wedge(B\vee \neg selB) \end{aligned}\]
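
The step from the expression for C to the final expression for OUT is De Morgan's law applied to the OR and then to each AND term. Written out:

\[\begin{aligned} OUT&=\neg\left[(\neg A\wedge sel)\vee(\neg B \wedge selB)\right]\\ &=\neg(\neg A\wedge sel)\wedge\neg(\neg B \wedge selB)\\ &=(A\vee\neg sel)\wedge(B\vee \neg selB) \end{aligned}\]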

A to C

The path from A to C passes through a single inverter and a single TG. There are capacitances C1 and C2 at the two nodes along the path.

1573532736244

The equivalent circuit looks like this:

1573532607301

As derived in class, we can approximate the TG resistance as $R_{TG}=R_{eqn}(L/W)$. And we can further approximate $R_{inv}=R_{TG}=(12.5k\Omega)(2/4)=6.25k\Omega$.

For C1, we need to account for capacitance from the input inverter and the TG. The input inverter has capacitance of $C_{inv}=C_{eff}(4\lambda+8\lambda)$. The TG is turned on because sel is on, so its capacitance is $C_{TG}=C_{eff}\cdot2(4\lambda)+C_g(4\lambda)$. Combined, we get:

\[\begin{aligned} C_1 &= (4\lambda+8\lambda)C_{eff}+2(4\lambda)C_{eff}+(4\lambda)C_g\\ &=20\lambda C_{eff}+4\lambda C_g\\ &=20(0.1)(1.0)+4(0.1)(2.0)\\ &=2.8~\mathrm{fF} \end{aligned}\]

For C2, we get the TG capacitance in the same manner. But we also consider the load inverter, which is $f$ times larger. The capacitance from the load inverter is given as $C_{li}=f C_g(4\lambda + 8\lambda)$. Combined:

\[\begin{aligned} C_2&=f(4\lambda+8\lambda)C_g+2(4\lambda)C_{eff}+(4\lambda)C_g\\ &=C_g(12f\lambda +4\lambda)+8\lambda C_{eff}\\ &=(2.0)(12f(0.1)+4(0.1))+8(0.1)(1.0)\\ &=2.4f+1.6~\mathrm{fF} \end{aligned}\]
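
The two capacitance expressions can be evaluated directly from their symbolic form. This is a minimal sketch; the units are an assumption on my part (λ in μm and the per-width capacitances in fF/μm, so that the results come out in fF), and the function names are made up.

```python
# Assumed units (not stated explicitly in the text): lambda in um and
# capacitance per unit width in fF/um, giving results in fF.
LAMBDA = 0.1  # um
C_EFF = 1.0   # fF/um, effective diffusion capacitance per unit width
C_G = 2.0     # fF/um, gate capacitance per unit width

def tg_capacitance():
    """C_TG for an on TG: 2(4*lambda)*C_eff + (4*lambda)*C_g."""
    return 2 * 4 * LAMBDA * C_EFF + 4 * LAMBDA * C_G

def c1():
    """Node C1: input inverter term (4+8)*lambda*C_eff plus the TG term."""
    return (4 + 8) * LAMBDA * C_EFF + tg_capacitance()

def c2(f):
    """Node C2: gate load of an inverter f times larger plus the TG term."""
    return f * (4 + 8) * LAMBDA * C_G + tg_capacitance()

print(f"C1 = {c1():.2f} fF")
for f in (1, 2, 3):
    print(f"f = {f}: C2 = {c2(f):.2f} fF")
```

For any f, `c2(f)` reduces to 2.4f + 1.6 fF, matching the closed form above.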

A to C Delay

1573532607301

Using Elmore delay to find the total delay:

\[t_D=R_{inv}C_1+(R_{inv}+R_{tg})C_2=RC_1+2RC_2,\quad R=6.25\mathrm{k\Omega}\]
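
The Elmore sum for a simple RC ladder can be sketched as a short function: each node capacitance is weighted by the total resistance between the source and that node. Working in kΩ and fF gives delays in ps. `elmore_delay` and `a_to_c_delay` are hypothetical names, and the capacitance values are left as parameters.

```python
def elmore_delay(resistances, capacitances):
    """Elmore delay of an RC ladder.

    resistances[i] is the resistor feeding node i and capacitances[i] is
    the capacitance at node i. Each capacitance is multiplied by the sum
    of all resistances upstream of its node.
    """
    total = 0.0
    upstream = 0.0
    for r, c in zip(resistances, capacitances):
        upstream += r
        total += upstream * c
    return total

R = 6.25  # kOhm, used for both R_inv and R_tg

def a_to_c_delay(c1_ff, c2_ff):
    """t_D = R*C1 + 2R*C2 for the A-to-C path, in ps (kOhm * fF = ps)."""
    return elmore_delay([R, R], [c1_ff, c2_ff])
```

For example, `a_to_c_delay(1.0, 1.0)` evaluates R·1 + 2R·1 for two arbitrary 1 fF capacitances, which is a convenient way to check the weighting before plugging in C1 and C2.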

A to OUT

Now we want to find the delay through the entire circuit (from A to OUT).

1573534837156

The equivalent circuit is now as follows:

1573534938035

C1 and C2 are the same as before, except now we also have to consider the resistance of the additional inverter, and C3 at the output node.

\[\begin{aligned} C_3&=f\cdot C_{eff}(4\lambda + 8\lambda) + C_{load},\quad C_{load}=50~\mathrm{fF}\\ &=f(1.0)(12(0.1))+50\\ &=1.2f+50~\mathrm{fF} \end{aligned}\]

\[R_{inv1}=\frac{R_{inv}}{f}\]

Total delay using Elmore delay, with $R_{inv}=R_{tg}=R=6.25\mathrm{k\Omega}$ and $C_1=20\lambda C_{eff}+4\lambda C_g=2.8~\mathrm{fF}$:

\[\begin{aligned} t_D&=RC_1+2RC_2+ \left(2R+\frac{R}{f}\right) C_3\\ &=R\left[(2.8)+2(2.4f+1.6)+\left(2+\frac{1}{f}\right)(1.2f+50)\right]\\ &=(6.25\mathrm{k})\left(7.2f+107.2+\frac{50}{f}\right) \end{aligned}\]

To minimize the delay, we minimize the $f$-dependent terms $7.2f+50/f$. Plotting them shows a minimum at $f=2.635$:

1573538810644

Note: ignore the y-axis values. The graph is only made to find the value of $f$ by looking for the minimum.
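
The same value of $f$ can be cross-checked without a plot. Setting the derivative of $7.2f+50/f$ to zero gives $f=\sqrt{50/7.2}$, and a coarse grid search should land on the same point. This is a sketch with names of my own choosing; the grid range of 0.5 to 10 is an arbitrary assumption.

```python
import math

def f_dependent_terms(f):
    """The part of the total delay that depends on the sizing factor f."""
    return 7.2 * f + 50.0 / f

# Setting the derivative 7.2 - 50 / f**2 to zero gives f = sqrt(50 / 7.2).
f_opt = math.sqrt(50.0 / 7.2)

# Cross-check with a grid search over 0.5 <= f <= 10 in steps of 0.001.
grid = [0.5 + 0.001 * i for i in range(9501)]
f_grid = min(grid, key=f_dependent_terms)

print(f"analytic f = {f_opt:.3f}, grid f = {f_grid:.3f}")
```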