Filter Compute Engine
The filter compute engine is a dual multiply-accumulator
(MAC) data path with a microcoded FIR sequencer. The filter
compute engine can implement a single FIR or a set of
filters. For example, the filter chain could include two
halfband filters, a shaping (matched) filter and a resampling
filter, all with different decimations. The following filter types
are currently supported by the architecture and microcode:
- Even symmetric decimation filters with an even number of taps
- Even symmetric decimation filters with an odd number of taps (including HBFs)
- Odd symmetric decimation filters with an even number of taps
- Odd symmetric decimation filters with an odd number of taps
- Asymmetric decimation filters
- Complex filters
- Interpolation filters (up to interpolate by 4)
- Interpolation halfband filters
- Resampling filters (under resampler NCO control)
- Fixed resampling ratio filter (within the available number of coefficients)
- Quadrature to real filtering (with fs/4 up conversion)
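As an illustration of the symmetry classes named in the list above, the following minimal sketch classifies a real coefficient set as even symmetric (h[k] = h[N-1-k]), odd symmetric (h[k] = -h[N-1-k]), or asymmetric, and counts how many coefficients would need to be stored if symmetry is exploited. The function names and the storage count are assumptions made for illustration only; they are not taken from the device microcode.

```python
def classify_symmetry(coeffs, tol=0.0):
    """Return 'even', 'odd', or 'asymmetric' for a real FIR coefficient list."""
    n = len(coeffs)
    # Pair each coefficient with its mirror image about the filter center.
    pairs = [(coeffs[k], coeffs[n - 1 - k]) for k in range(n)]
    if all(abs(a - b) <= tol for a, b in pairs):
        return "even"   # h[k] == h[N-1-k]
    if all(abs(a + b) <= tol for a, b in pairs):
        return "odd"    # h[k] == -h[N-1-k]; center tap of an odd-length filter is 0
    return "asymmetric"


def unique_coefficients(coeffs):
    """Illustrative count of coefficients to store once symmetry is exploited."""
    n = len(coeffs)
    if classify_symmetry(coeffs) == "asymmetric":
        return n
    return (n + 1) // 2  # half the taps, plus the center tap when N is odd
```

For example, a 5-tap even symmetric filter needs only 3 stored values under this assumption, which is consistent with the coefficient memory being smaller than the data memory.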
The input to the filter compute engine comes from one of
three sources: a CIC filter output (which can also be
another backend section), the output of the filter compute
engine (fed back to the input), or the magnitude and dφ/dt fed
back from the Cartesian-to-polar coordinate converter.
The number and size of the filters in the chain are limited by the
number of clock cycles available (determined by the
decimation) and by the data and coefficient RAM/ROM
resources. The data RAM is 384 words (I/Q pairs) deep. The
data addressing is modulo in power-of-2 blocks, so the
maximum filter size is 256 (the largest power of 2 that fits
in 384 words). The block size and the block starting
memory address for each filter are programmable so that the
available memory can be used efficiently. The coefficient RAM
is 192 words deep. It is half the size of the data memory
because filter coefficients are typically symmetric. ROMs are
provided with halfband filter coefficients, resampling filter
coefficients, and constants. The filter compute engine exploits
symmetry where possible so that each MAC can compute two
filter taps per clock by doing a pre-add before multiplying. In the
case of halfband filters, the zero-valued coefficients are skipped
for extra efficiency. There is an overhead of one clock cycle per
input sample for each filter in the chain (for writing the data into
the data RAM) and (except in special cases) a two clock cycle
overhead for the entire chain for program flow control
instructions.
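The modulo addressing and the symmetric pre-add described above can be sketched in software as follows. This is a behavioral model under stated assumptions, not the device's microcode: the class `DelayLine`, the functions `fir_direct` and `fir_preadd`, and the example base address, block size, and coefficients are all hypothetical names and values chosen for illustration. It shows a circular delay line confined to a power-of-2 block of a 384-word RAM, and an even symmetric FIR that adds mirrored samples before multiplying so that one multiply covers two taps.

```python
class DelayLine:
    """Circular sample buffer inside a power-of-2 block of a shared data RAM."""

    def __init__(self, ram, base, block_size):
        if block_size <= 0 or block_size & (block_size - 1):
            raise ValueError("block_size must be a power of 2")
        if base < 0 or base + block_size > len(ram):
            raise ValueError("block must fit inside the RAM")
        self.ram = ram
        self.base = base
        self.mask = block_size - 1  # modulo addressing via a bit mask
        self.head = 0               # offset of the newest sample in the block

    def write(self, sample):
        self.head = (self.head + 1) & self.mask
        self.ram[self.base + self.head] = sample

    def read(self, age):
        """Return the sample written `age` writes ago (0 = newest)."""
        return self.ram[self.base + ((self.head - age) & self.mask)]


def fir_direct(line, coeffs):
    """Reference FIR: one multiply per tap."""
    return sum(c * line.read(k) for k, c in enumerate(coeffs))


def fir_preadd(line, half, n_taps):
    """Even symmetric FIR: pre-add mirrored samples, one multiply per pair."""
    acc = 0
    multiplies = 0
    for k in range(n_taps // 2):
        acc += half[k] * (line.read(k) + line.read(n_taps - 1 - k))
        multiplies += 1
    if n_taps % 2:  # unpaired center tap of an odd-length filter
        acc += half[n_taps // 2] * line.read(n_taps // 2)
        multiplies += 1
    return acc, multiplies


# Hypothetical example: a 7-tap filter in an 8-word block at address 256.
ram = [0] * 384
line = DelayLine(ram, base=256, block_size=8)
coeffs = [1, 3, 5, 7, 5, 3, 1]
for x in range(1, 21):
    line.write(x)
direct = fir_direct(line, coeffs)
folded, mults = fir_preadd(line, coeffs[:4], len(coeffs))
```

In this sketch the folded form produces the same result as the direct form while using 4 multiplies instead of 7, and only the 8 words of the selected block are ever written.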
The output of the filter compute engine is routed through a
FIFO in the main output path. The FIFO is provided to more
evenly space the FIR outputs when they are produced in bursts
(as when computing resampling or interpolation filters). The
FIFO is four samples deep. The FIFO is loaded by the output of
the filter when that path is selected. It is unloaded by a counter.
The spacing of the output samples is specified in clock periods.
The spacing can be set from 1 (fall through) to 4096 samples
[Figure: filter compute engine block diagram. Recoverable labels: input MUX selected by INMUX (1:0) with an R/dφ/dt input; I/Q data RAM, 384 WORDS, with RAMR/Wb, ADDRA (8:0), and ADDRB (8:0); SWAP blocks; ALUs; multipliers with COEF (21:0) and SHIFT (1:0); DOWN SHIFT 0, 1, 2 PLACES; accumulators; SHFT REG; LIMIT; output REG and MUX stages. NOTE: PIPELINE DELAYS OMITTED FOR CLARITY.]
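The output FIFO behavior described earlier (loaded by the filter in bursts, unloaded by a counter at a programmed spacing) can be modeled roughly as below. This is a simplified sketch under assumptions: the function name `space_outputs`, the rule that a sample leaving on its arrival clock does not occupy a FIFO slot, and the example timings are hypothetical and not taken from the device documentation.

```python
def space_outputs(arrival_clocks, spacing, depth=4):
    """Return the clock at which each sample leaves the FIFO.

    arrival_clocks: non-decreasing clock indices at which the filter
    produces outputs. spacing: minimum clocks between unloaded samples
    (1 means fall through). depth: FIFO depth in samples.
    """
    if spacing < 1:
        raise ValueError("spacing must be at least 1 clock")
    departures = []
    for arrival in arrival_clocks:
        # Samples whose departure is still in the future occupy the FIFO.
        waiting = sum(1 for d in departures if d > arrival)
        if waiting >= depth:
            raise OverflowError("FIFO full at clock %d" % arrival)
        earliest = departures[-1] + spacing if departures else arrival
        departures.append(max(arrival, earliest))
    return departures


# Two bursts of four outputs, re-spaced to one sample every 4 clocks.
burst_times = [0, 1, 2, 3, 16, 17, 18, 19]
spaced = space_outputs(burst_times, spacing=4)
```

With these example timings the bursty outputs leave the model FIFO at a uniform 4-clock spacing, and a spacing of 1 passes each sample straight through.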
ISL5216