Tuesday, August 11, 2009

Temporal Logic

CTL*

CTL vs. LTL

Model checking

Symbolic Model Checking



Kripke Structure: M = (S,R,L)



- transform a FSM with guarded transition to M, by modeling original transitions as states in M



you can unroll M to get an execution tree



formulas and logic:

- Ef, Af

- Xf, Ff, Gf, fUg

-

Friday, August 7, 2009

Multi-level Logic Synthesis

RTL === (logic synthesis) ==> netlist

Decomposition
Extraction
Factoring
Substituttion
Collapsing/Flattening

F = d * q + r

algebraic expressions
- product
- divisor

Boolean vs. Algebraic

What's an algebraic expression?
What's a cube-free expression?
What's a primary divisor?

Weak division

Rectangle Covering

What's a kernel?
- what is the level of a kernel?
What's a co-kernel?
- how to extract kernels and co-kernels from a function?

Thursday, August 6, 2009

Retiming

Leiserson and Saxe

- moving registers to minimize area, time (clock period), or both (given one is constrained)
- it is done under several (untrivial) assumptions

- D matrix - path delay matrix
- W matrix - path matrix for min #register for any pair of vertices.

Floyd-Warshall

Bellman-Ford
- n - 1 iterations
- an extra iteration to check negative weigh cycle
- d(v) < d(u) + w(u,v) <== relaxation

clock skew vs. retiming

Peripherial retiming
- use retiming to augment the circuit
- and perform optimization on the combinational parts

The initial value problem
- decreasing the number of registers, then there are less initial values
- if initial values determine the behavior for the circuit, then retiming may not preserve the original circuit

Tuesday, August 4, 2009

Technology Binding (DAGON)

Technology independent mapping of circuit to cell library components. They called this technique DAG rewriting. The complexity of finding a minimal match, or cover, is NP-complete.

It assumes non-local optimizations are done at global optimization or decomposition stages.

Two essential steps
- partitioning into forest of trees (special case of DAG)
- partitioning split any fanout > 1 and is O(G).
- optimally match trees to patterns based on cost (e.g. time, area, or both)
- matching is done with twig (based on Aho-Corasick) is O(T).

It borrows from code generator in compilers
- the use of grammar to specify families of patterns
- peehole optimizer to handle cross-tree boundary opt.

Quality enhancements:
- use annotations to export local information globally
- e.g. inverse, signal, and fanout


--------------------------------------------

(circuit) base function ==> subject graph

(library) base function ==> pattern graphs

Partitioning

- single fanout - split multiple fanout

- single cone - traverse from sink nodes

- may have overlap clustering because of the input-output constraint from patterns

Dynamic programming

- store intermediate best result for each stage

- at each new stage, compute best matches using previous best results

Saturday, August 1, 2009

Linear Algebra

adjoint, or adj(A), is the transpose of the cofactor matrix of A.

Crame's Rule:
- solves AX = B
- xi = det(Ai) / det(A), where Ai is replacing the i-th column with B.

vs. Gaussian Elimination
- Crame's Rule requires computing (n+1) determinants
- computing each determinant requires as much work as solution using G.E.

Jacobian is the det(M'), where M' is the partial derivative matrix of M containing F(f1,....,fn).


Vector:
- v dot product,
- projection, +, -, norm
- span of vector(s)

Thursday, July 30, 2009

Two-Level Logic Minimization

Multivalue circuit terminology:
- literal (variable)
- part
- term
- function

Splitting L, R partitions for MV functions

Identify variable as unate
- if all parts are either monotone increasing or decreasing.
- columns w/ all 1's = monotone increasing
- all columns not all 1's are identical = monotone decreasing
- in the boolean case, if there's no 0's is positive unate, no 1's is negative unate.


prime = cube that cannot be expanded (or reduced).

essential prime = contains a minterm not found in other prime cubes.

irredundant = cover that contains all prime cubes.

orthogonal = two cubes that do not share any common minterms.



PLA


Quine-McCluskey:

  • - implicant table
  • - cover table
    - solving for cyclic core

- Find the complement of a unate cover (NP-complete)

- check if a cover is unate

- check if a function is unate, (is it easy? why or why not?)

-

Tuesday, July 28, 2009

Timing Analysis

statically sensitizable

statically co-sensitizable


true path

critical path = longest true path


Delay Models:

- fixed

- monotone speedup

- bounded


ENF

- leaf DAG


Timed D-Calculus

Monday, July 27, 2009

Timed Automata (Alur, '97)

Terminology:
- location
- switch
- invariant
- time constraint
- labels
- clocks

- A := (L, L0, Sigma, X, I, E)
- E := (s, a, phi, s')

The semantics is a transition system with infinite states and transitions
- because of clock increments
- transition can due to taking a switch or increment of clock(s).

Defining an equivalent relation ~ with automata with finite quotients:
- there are three requirements: stable, Lf-sensitive, and finite.
- examples are zone and region automata

Constructing zone automata can be done algorithmically:
- starting from the initial zone (s0, v), where v is a clock zone, or a set of clock constraints.
- we can apply a formula to derive the next clock zone

Solving the reachability problem for the zone automata is the same as the original timed automata
- there are only finite # of zones
- can be structured as BDD problem


two systems are bisimilar if they match each other's moves

Tuesday, July 21, 2009

Layout (GORDIAN)

Standard cell array - gridline for cells
Sea-of-gate - there is not gridline for the cells


GORDIAN
- global optimization, rectangular disection and final placement
- it's an improvement over approach that combines min-cut bisection with divide-and-conquer


Global optimization
- minimizing distance^2 instead of distance, it decouples x from y

Partitioning
- brute force enumeration
- FM
- repartition (after global opt.)

Final Placement

- For standard cell




shape function

LQP: for x w.l.o.g.,

Minimize "spring distance", squared distances

min: L = sum(xu + c - xn)^2)
constraints: Xc = sum(wu * xu)

LQP ==> UQP

- pick xd in terms of xi

- sub xd's in L ==> Lt

- min Lt by solving Lt' = 0, which is a (Ax = b) problem, use G.E.

Complexity:

- space: O(m)

- time: (O(GO) + O(RD)) * levels of slicing tree ==> (m^3 + m^2lg(m)) * lg(m)

Saturday, July 18, 2009

Compaction

feature + mask

3 ways to reduce layout area:
- reduce spacing betw. features
- reduce feature size
- reshape feature

design rules:
- spacing constraint
- grouping constraint (e.g. via + wire==>sliding ports)
- compatibility (pi)
- diverse constraint (max distance btw features)


Variations:
- 1-D compaction and 2-D compaction
- Constraint-graph based compaction vs. virtual grid based compaction
- packing is a special case where M=1, dij=0, pi=id




constraint generation - why O(n^2)?
- scan-line
- shadow propagation

Constraint solving:
- union-find alg.
- how to deal with equality constraints?
- how to deal with cycles?

compression ridge method

- tile graph to encode empty space + max-flow alg.

-



Layout graph

- generate graph

-- plane-sweep (bottom-to-top scan)

--- include compatibility relation

-- branch-and-bound

- find longest path

-- Dijkstra vs. Bellman-Ford





Topological Compaction

Wednesday, July 15, 2009

Routing

Greedy Channel Router (Rivest and Fiduccia '82):
- problem definition (given channel length, list of T and B pins, and set of L and R nets).
- terms: pins, jog, track, collapse,
- heuristics: rising/falling/steady,

For each column:
1. bring in T and B nets
2. collapse split nets (prove it exponential)
3. reduce the range of split nets
4. raise or lower nets
5. widen channel to bring in new nets


The Path Connection Problem (Lee '61) - The "Lee's Algorithm"
- definition: (C, S, N, Gamma, M), F, and c*, c**
- basically BFS, using cell mass for path weight, and chain coordinates for backtracing

- Dijskra shortest path

Miminum Spanning Tree - useful for city road planning, or laying TV cables within an area.
- for any cut of the graph, the min edge across is part of MST

3D Circuit & 3D Routing
- for 2D, there are usually 6-7 layers

The "switchbox" problem is to solve the intersection of channel

Tuesday, July 14, 2009

Circuit Simulation

what devices are there?

Sparse Tableau Analysis (STA)
- KCL, KVL, BCR, incidence matrix

Nodal Analysis
1. A * i = 0
2. A * f(v) = 0
3. A * f(transpose(A) * e) = 0

Device stamps: filling in matrix A values by input inspection

Modified Nodal Analysis (MNA)

Is an ODE always solvable analytically?

DAE system: d/dt q(x(t)) + f(x(t)) + b(t) = 0
- x: unknowns; q(): charges/fluxes; f(): resistive; b(t): inputs

GE
- pivoting
- partial O(n), complete O(n^2)
- numerical stability problem
- O(n^3) complexity

LU factorization
- forward + backward substitution
- sparsity maintainance


Other topics: sensitivity analysis and noise analysis

* proving linearity of functions

Newton-Raphson Method
- assumptions:
- heuristics for solution
- step size
- initial condition
- (prove) quadratic convergence
- what are NR weaknesses?
- convergence
- abstol, reltol, and residual
- e ~ mV (10^-3), i ~ uA (10^-6)



Transient Simulation (DAE and ODE treatments)
- Lipschitz condition (existence and uniqueness)
- LMS: (FE, BE, TRAP) and their stability
- local truncation error
- how to choose step size based on truncation error
- global error
- Gear's Method

Smoothing
- step() function using tanh()


AC Analysis
DC Analysis
Transient Simulation

Monday, July 13, 2009

219A Lectures

Model of circuit elements is the relationships between electrical quantities at terminals.

Constitutive relations between:
➔ current and voltage: resistor
➔ charge and voltage: capacitor
➔ current and flux: inductor
➔ charge and flux: memristor


Important features in modeling:
- linearity (What are the implications?)
- smoothness is essential for robustness and requirement for some algs.

Modified Nodal Analysis (SPICE)
- e.g. KCL, KVL

Newton-Raphson Method

LU Factorization

Numberical Alg. for solving ODEs/DAEs

DAE Linearization
AC Analysis
Sensitivity Analysis

Noise Analysis
Stochastic Modelling and Simulation

Model Reduction (macromodeling)
- whitebox vs. blackbox approaches

Thursday, July 9, 2009

Testing (ATPG)

Goal: Test for stuck-at faults by generating input patterns that creates output different from the expected output.

D-calculus: 0,1,D,!D,X

1. Justification
2. Pick a variable (decision making)
3. set = 0 / 1
4. implication (reasoning)
5. conflict, not done, or found the test.

SAT solver implementation
- learn thru SCC
- non-local clauses

Wednesday, July 8, 2009

OBDD (Randal Bryant)

The size of BDD's is sensitive to the ordering of variables.
- With linear arrangment, the size is bounded by 2^(wf * 2^wr).
- where wf is the size (# of wires) of the forward cross section
- and wr = reversed cross section.

At every node of a BDD, there is a boolean function (e.g. P = v*L + !vR).

Prove BDD is cannonical
- prove by induction
- (n + 1) case:

BDD implementation:
-

APPLY is a generalization of operations on two boolean functions:
- f*g = x(fx op

Reordering
- sifting

Possible questions:
- Analyze the BDD size for a given function
- explain the effect of variable ordering on BDD size

Friday, June 19, 2009

Graph Partitioning

Problem Search space is hugh:
- 2^V - 2 (for weighed nodes)
- reasoned by powerset
- V choose V/2 (for non-weighed nodes)


Partitioning is used to generate constraints for placement, as opposed to its original use for divide-and-conquer a problem. It's useful in placement where cells and wires are abstracted to nodes and edges. Delays are represented by weighs. The problem is formulated as partitioning a set of nodes into clusters of approx. equal size. The problem is known to be NP-complete. K&L's algorithm leverages heuristics to give good solutions.

Kernighan and Lin's algorithm:
Rely on gain calculation (for exchange a, b): g = Da - Db - 2*Cab, where Di = Ei (external cost) - Ii (internal cost). KL goes through a series of n exchanges, and chooses the culmulative gain maximum.
- O(n^2 log n)

Feduccia and Mattheyses
- claims O(n)


Multilevel scheme
1. Coarsening phase
2. Partitioning phase
3. Uncoarsening phase

- The main idea is that coarsening reduces the number of nodes and edges, which helps to speed up the partitioning alg.

- Collapsing vertexes into multinodes by matching
- Matching is not unique, so how do we find good matching?

Complex Theory

Good day for a little reading on complexity theories.

A complete problem is a representative problem for its class (e.g. NP-complete). All other problems within the class can be reduced to it.

NL
P
NP
PSPACE
EXPSPACE

Thursday, June 18, 2009

Programming Paradigms for Dummies: What Every Programmer Should Know (Peter Van Roy)

Peter Van Roy. Programming Paradigms for Dummies: What EveryProgrammer Should Know, New Computational Paradigms for ComputerMusic, G. Assayag and A. Gerzso (eds.), IRCAM/Delatour France, 2009.
http://www.info.ucl.ac.be/~pvr/VanRoyChapter.pdf

Tuesday, June 16, 2009

Rate-Monotonic vs. EDF

Liu and Layland (1973) first discussed the max utilization for RM and EDF.

RM may have higher number of preemptions than EDF.

It has useful graphs for:
- execution load vs. # of preemptions
- # of tasks vs. # of preemptions

Jitter and Latency
- Control applications

Resource Sharing

Aperiodic Task

Monday, June 15, 2009

244 Timing Analysis

The basic clock cycle model is:


Tmax + Tsetup + Tskew + Tclk−to−Q ≤ T


- Tmax = time to propagate through the gates,

- Tsetup = time for signal to be stable within a cycle (before a clock tick)

- Tskew = mis-alignment of a clock signal to FF's

- Tclk-to-Q = time to change FF's output


in additional, there is the hold constraint:


Tmin ≥ Thold + Tskew


- Thold = time for signal to be stable within a cycle (after a clock tick). This says we have to hold the input signals to FFs for a while after a clock tick.


Tsetup is how fast a FF can operate (sufficient condition). Thold is the minimum time a FF need to operate (necessary condition). If Tsetup ≥ Thold, then we can ignore Thold.


Timing analysis helps to automatically infer the clock cycle time to run the circuit. It needs to model delay for wires and gates.