parallel and pipeline implementation of FIR

Status
Not open for further replies.

alimassster

Hi all,
I want to know: when cascading DSP48 blocks, if we need, for example, N clocks to feed the multipliers with N inputs in an N-tap FIR filter to calculate Y, what is the difference between the parallel form and a single MACC-based form (assuming both forms have the same delay in clock cycles)?

What are the advantages of the parallel implementation?
I know that in the parallel form we get a result every clock cycle,
but is that because of pipelining or something else?

Are the inputs (not the coefficients) fed into the multipliers simultaneously, or one by one using BCIN-BCOUT? If one by one, then again, what is the advantage of the parallel form? Is it pipelining, or what?
Thanks in advance.
[Attached diagram: 37_1163942194.jpg]
 

A parallel implementation gives the output in one clock,
provided your multipliers produce their result in one clock.
A pipelined implementation doesn't,
because there are pipeline stages (registers) after each section (say, cascaded IIR sections). Your operating clock speed increases, but so does your latency.
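The clock speed versus latency trade-off mentioned here can be put into rough numbers. This is only a minimal sketch: the stage delays are hypothetical values, not figures for any real device, and register setup and clock-to-output overhead are ignored.

```python
def timing(stage_delays_ns, pipelined):
    """Return (clock period in ns, latency in clocks, f_max in MHz).

    stage_delays_ns: hypothetical combinational delay of each section.
    Register overhead (setup, clock-to-out) is ignored in this sketch.
    """
    if pipelined:
        # A register after each section: the clock only has to cover
        # the slowest single section, but the result needs one clock
        # per section to travel through.
        period = max(stage_delays_ns)
        latency_clocks = len(stage_delays_ns)
    else:
        # No registers between sections: one clock must cover the
        # whole chain of sections.
        period = sum(stage_delays_ns)
        latency_clocks = 1
    f_max_mhz = 1000.0 / period
    return period, latency_clocks, f_max_mhz


if __name__ == "__main__":
    delays = [2.0, 2.0, 2.0, 2.0]  # hypothetical: four sections of 2 ns each
    print("unpipelined:", timing(delays, pipelined=False))
    print("pipelined:  ", timing(delays, pipelined=True))
```

With these made-up delays the pipelined version can be clocked four times faster, while each result takes four clocks instead of one to appear.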
 

I think that in the pipelined form, once all the registers are full (which may take some clock cycles), there is an output on every clock. Am I right?

My question is: are the inputs fed into the multipliers simultaneously, or with a delay, one by one as a stream, using BCIN-BCOUT in the DSP48 block (in Xilinx Virtex-4)?
And if it is one by one, how does using, for example, 64 multipliers in a 64-tap FIR give better performance? By pipelining, or what?

I want to know how using a huge number of MUL blocks can increase the performance (if the inputs are fed one by one with a delay, I see no difference from a single MACC-based form).

Thanks a million.
 

Say your multiply and add (MAC) is done in one clock.
* 64 taps in parallel implies 64 multipliers and 64 adders operating in parallel, so there is one output per clock.
* 64 taps in serial implies 1 multiplier and 1 adder operating serially, so there is one output per 64 clocks (reuse of resources).
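The two cases above can be compared with a small behavioural model. This is a rough sketch, assuming one multiply-accumulate per clock; the function names `fir_parallel` and `fir_serial_mac` are made up for illustration, and nothing here models the DSP48 itself.

```python
def fir_parallel(x, h):
    """One multiplier per tap: all products are formed in the same clock."""
    taps = [0] * len(h)          # delay line holding the last len(h) samples
    y, clocks = [], 0
    for sample in x:
        taps = [sample] + taps[:-1]                    # shift in the new sample
        y.append(sum(c * d for c, d in zip(h, taps)))  # all taps at once
        clocks += 1                                    # one clock per output
    return y, clocks


def fir_serial_mac(x, h):
    """A single MAC reused for every tap: one product per clock."""
    taps = [0] * len(h)
    y, clocks = [], 0
    for sample in x:
        taps = [sample] + taps[:-1]
        acc = 0
        for c, d in zip(h, taps):
            acc += c * d          # one multiply-accumulate ...
            clocks += 1           # ... costs one clock
        y.append(acc)
    return y, clocks


if __name__ == "__main__":
    x = list(range(1, 9))         # 8 input samples
    h = [1, 2, 3, 4]              # 4 taps (hypothetical coefficients)
    y_par, clk_par = fir_parallel(x, h)
    y_ser, clk_ser = fir_serial_mac(x, h)
    print(y_par == y_ser)         # same filter output either way
    print(clk_par, clk_ser)       # 8 clocks versus 32 clocks
```

Both versions compute the same samples; the difference is how many clocks each output costs and how many multipliers are occupied.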
 

Say your multiply and add (MAC) is done in one clock.
* 64 taps in parallel implies 64 multipliers and 64 adders operating in parallel, so there is one output per clock.

Are the inputs stored in memory and fed simultaneously into the MUL blocks, or fed as a stream, one by one, using BCIN-BCOUT?
 

The inputs (data, not coefficients) are fed into the input port x(n)
of the diagram you showed.
As for the coefficients, if your filter is fixed (say, Fc is always 1000 Hz), you can load all of them at once (only once, initially), and they can be stored in the flip-flops (registers, perhaps) shown in the diagram.
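To illustrate how samples entering one by one at x(n) can still yield one result per clock, here is a clock-by-clock sketch of a cascaded (systolic) FIR. It assumes that each tap passes the sample on through two registers and the partial sum through one register; whether that matches the attached diagram or the exact DSP48 configuration is an assumption, and `systolic_fir` is a made-up name.

```python
def systolic_fir(x, h):
    """Clock-by-clock model of a cascaded FIR with fixed coefficients.

    Assumed structure: tap k holds two data registers (a1, a2) and one
    partial-sum register (p). Samples move tap to tap through a1 -> a2,
    partial sums move tap to tap through p.
    """
    n = len(h)
    a1, a2, p = [0] * n, [0] * n, [0] * n
    out = []
    for t in range(len(x) + n):              # extra clocks flush the pipeline
        x_in = x[t] if t < len(x) else 0
        # All registers update together, from the values before the edge.
        new_a1 = [x_in] + a2[:-1]            # data arriving from previous tap
        new_a2 = a1[:]
        new_p = [(p[k - 1] if k > 0 else 0) + h[k] * a1[k] for k in range(n)]
        a1, a2, p = new_a1, new_a2, new_p
        out.append(p[-1])                    # last tap drives the output
    return out[n:]                           # first n clocks are fill latency


def direct_fir(x, h):
    """Reference: y[i] = sum over k of h[k] * x[i - k]."""
    return [sum(h[k] * x[i - k] for k in range(len(h)) if i - k >= 0)
            for i in range(len(x))]


if __name__ == "__main__":
    x, h = [1, 2, 3, 4], [1, 1, 1]
    print(systolic_fir(x, h))                # [1, 3, 6, 9]
    print(systolic_fir(x, h) == direct_fir(x, h))
```

In this model the first result appears after as many clocks as there are taps, and after that a new result appears on every clock, because every multiplier is busy with a different sample at the same time. A single MACC has to finish all taps of one output before it can start the next.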
 
