count_enable
To make a long story short:
- I have a large number (>100) of 8-bit signed integers, produced in parallel as BRAM outputs.
- I need to find the saturated sum of these integers, which leads me to a huge 100-input adder.
To simplify, let's assume that the number of inputs is always a power of 2.
My first idea is to build a binary tree of 2-input adders, i.e. to compute A+B+C+D as (A+B)+(C+D). Maybe somebody has already written parameterized code for this?
Here is where I am stuck (VHDL):
- I need an array of intermediate signals between the layers of adders, but the array is triangular rather than rectangular: for 128 inputs I need 64 adders on the 1st level, 32 on the second, and so on. So if I create a 2D signal array of size (NUM_LAYERS)*(NUM_INPUTS), more than 70% of the signals will be unused. But I don't know the number of layers in advance (it should be parameterizable), so I can't declare individual vectors of the appropriate length for each layer's intermediate signals.
I am also afraid that such a tree will be very slow.
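For reference, here is a rough sketch of one way the triangular storage might be avoided: keep all tree nodes in a single 1-D array with heap-style indexing, so that node(i) = node(2i) + node(2i+1) and the inputs sit in the upper half of the array. This needs only 2*N-1 signals and no per-layer declarations. Every adder output is registered, which is one possible answer to the speed concern. The entity name, the generics (LOG2_INPUTS, IN_WIDTH, OUT_WIDTH), the packed din port, and the choice to saturate once at the end on a full-precision sum are all assumptions made for illustration. The sketch has not been verified on real hardware.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity adder_tree is
  generic (
    LOG2_INPUTS : natural  := 7;  -- 2**7 = 128 inputs (assumed power of 2)
    IN_WIDTH    : positive := 8;  -- width of each signed input
    OUT_WIDTH   : positive := 8   -- width of the saturated result (assumption)
  );
  port (
    clk  : in  std_logic;
    -- Input i occupies bits (i+1)*IN_WIDTH-1 downto i*IN_WIDTH (assumed packing).
    din  : in  std_logic_vector(2**LOG2_INPUTS * IN_WIDTH - 1 downto 0);
    dout : out signed(OUT_WIDTH - 1 downto 0)
  );
end entity adder_tree;

architecture rtl of adder_tree is
  constant NUM_INPUTS : positive := 2**LOG2_INPUTS;
  -- Wide enough that no partial sum can overflow.
  constant SUM_WIDTH  : positive := IN_WIDTH + LOG2_INPUTS;
  constant MAX_OUT    : integer  := 2**(OUT_WIDTH - 1) - 1;
  constant MIN_OUT    : integer  := -(2**(OUT_WIDTH - 1));

  type node_array is array (natural range <>) of signed(SUM_WIDTH - 1 downto 0);
  -- Heap layout: node(1) is the root, node(NUM_INPUTS) to
  -- node(2*NUM_INPUTS-1) are the leaves (sign-extended inputs).
  signal node : node_array(1 to 2*NUM_INPUTS - 1);
begin

  -- Leaves: sign-extend each input to the full sum width.
  leaves : for i in 0 to NUM_INPUTS - 1 generate
    node(NUM_INPUTS + i) <=
      resize(signed(din((i + 1)*IN_WIDTH - 1 downto i*IN_WIDTH)), SUM_WIDTH);
  end generate leaves;

  -- Internal nodes: one registered 2-input adder per node.
  adders : for i in 1 to NUM_INPUTS - 1 generate
    process (clk)
    begin
      if rising_edge(clk) then
        node(i) <= node(2*i) + node(2*i + 1);
      end if;
    end process;
  end generate adders;

  -- Saturate the full-precision sum once, at the root.
  dout <= to_signed(MAX_OUT, OUT_WIDTH) when node(1) > MAX_OUT else
          to_signed(MIN_OUT, OUT_WIDTH) when node(1) < MIN_OUT else
          resize(node(1), OUT_WIDTH);

end architecture rtl;
```

With the tree balanced and every level registered, the result would appear LOG2_INPUTS clock cycles after the inputs, and each cycle only has to cover a single 2-input addition. Saturating once on the full-precision sum is not the same as saturating after every adder, so which of the two is wanted has to be decided first.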
My second idea is to use the fast DSP48E blocks as 4-way adders (I have 64 of them in a V5), but I have little experience using them as adders.
Any ideas on how to solve this efficiently?