I was implementing my design in a Virtex-4 FPGA. One module uses
fifteen 8x8-bit multipliers, which were originally
done with DSP48s. The entire design uses
Number of Slices 27455 out of 89088 30%
In an experiment I constrained this module's mult_style to LUT. As
expected, the number of DSP48s dropped by 15.
What puzzles me is that the final P&R report now says
Number of Slices 25915 out of 89088 29%
I like the result, but how do I explain it?
Timing-wise, the two implementations differ by only 0.04 ns, which I
attribute to savings in routing.
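
To put the two slice counts side by side, here is a quick sketch of the arithmetic in Python. It uses only the figures from the two P&R reports quoted above; the variable and function names are my own and purely illustrative.

```python
# Slice figures copied from the two P&R reports quoted in this post.
TOTAL_SLICES = 89088    # slices available in the device
SLICES_DSP48 = 27455    # multipliers implemented in DSP48s
SLICES_LUT = 25915      # multipliers forced to LUTs (mult_style = LUT)


def utilization(used, total=TOTAL_SLICES):
    """Return slice utilization as a percentage of the device."""
    return 100.0 * used / total


# Positive value means the LUT-based run used fewer slices.
saved = SLICES_DSP48 - SLICES_LUT

print(f"DSP48 run: {utilization(SLICES_DSP48):.2f}% of slices")
print(f"LUT run:   {utilization(SLICES_LUT):.2f}% of slices")
print(f"Slices saved by the LUT run: {saved} "
      f"({utilization(saved):.2f}% of the device)")
```

So the LUT version uses 1540 fewer slices even though it builds 15 extra multipliers out of fabric, which is the part I cannot explain. Judging from these numbers, the percentages in the report appear to be truncated rather than rounded (about 30.8% is shown as 30%).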