Assignment of DSP Slices in FPGA

Status
Not open for further replies.
expertengr

Guest
Hello, modern FPGAs contain DSP slices. Do they complete an operation in a single clock cycle? For example, if an FPGA has 220 DSP slices and each can handle an 18 x 25 multiplication, does this mean that each DSP slice performs this operation in a single clock cycle?

Second question: how can I deliberately assign a DSP slice in VHDL code? Is that possible at all? The following is an example.

Code:
architecture behav of multiply_unsigned is
begin
  Res <= A * B;
end behav;

Is this (the assignment of DSP slices) done automatically during synthesis and tool optimization, or can it be done manually in the VHDL code?
 
This is referred to as inferring hardware from behavioural VHDL code. If your code matches a template that the synthesis tool recognises, the tool will treat your behavioural code as a candidate for specific hardware.
But you need to understand the DSP slices. Both Altera and Xilinx provide documentation for their devices and synthesis tools that tells you how to code for them.

So to answer your questions
1. The actual multiplier itself is not clocked, but it is surrounded by registers that are. So yes, the result will always be computed in a single clock cycle, and you can add extra pipeline stages in the slice depending on how you configure it (or infer it). Synthesis tools also offer register retiming, where pipeline stages after a multiply can be moved around by the tool to use the internal registers more efficiently than you coded them. But yes, an 18x25 can be done in a single cycle. You could even do a 72x72 in a single cycle, as the architecture allows chaining of DSPs (but don't forget to design for your timing).

2. Your code should infer an un-clocked DSP.

Some features of the DSP may not be available from inference, and you may need to generate a core and instantiate it in your VHDL. But in most cases, inference is fine.
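
If you want to nudge the tool towards a DSP slice without instantiating anything, a synthesis attribute is one option. The sketch below assumes Xilinx Vivado, which documents a use_dsp attribute; other tools use different attribute names, so check your synthesis guide. The entity name mult_dsp_hint and the internal signal prod are made up for illustration, and the signed 25 x 18 widths are chosen to match the multiplier size discussed above.

Code:
```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Illustrative entity (name and widths are assumptions, not from the thread)
entity mult_dsp_hint is
  port (
    a   : in  signed(24 downto 0);
    b   : in  signed(17 downto 0);
    res : out signed(42 downto 0)   -- 25 + 18 = 43 result bits
  );
end entity mult_dsp_hint;

architecture rtl of mult_dsp_hint is
  signal prod : signed(42 downto 0);

  -- Assumes Vivado: asks the tool to map logic driving prod onto a DSP block
  attribute use_dsp : string;
  attribute use_dsp of prod : signal is "yes";
begin
  prod <= a * b;   -- combinational multiply, no registers inferred
  res  <= prod;
end architecture rtl;
```

The attribute is only a request. Check the utilisation report after synthesis to confirm that a DSP was actually used.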
 

As it's not under a clocked process, the above code will infer a multiplier whose result will be available in "ZERO" clocks (not one clock).
As with any combinatorial logic there'll be an output delay, and it's somewhat incorrect to analyze this delay in terms of the "number of clocks it will take"...

With that said, embedded FPGA multipliers are highly optimized areas of silicon with a very low propagation delay, and are therefore capable of being integrated in designs that run at high speeds (even without pipelining the multiplication ITSELF).
Even so, if the design allows, it's a good idea to register the output of the multiplier to ease the tool's timing effort when connecting it to the downstream logic (this register is free as it's part of the DSP block).

Code:
process (clock)
begin
  if rising_edge(clock) then
    result <= a * b;
  end if;
end process;
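
If the design can tolerate more latency, the same idea extends to registering the inputs and the product as well. The following is a sketch only: the entity name mult_pipelined, the signal names and the signed 25 x 18 widths are my own assumptions. The registers have no asynchronous reset, because the registers inside the DSP block only support synchronous reset and the tool may otherwise be unable to absorb them.

Code:
```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Illustrative entity (name and widths are assumptions)
entity mult_pipelined is
  port (
    clock  : in  std_logic;
    a      : in  signed(24 downto 0);
    b      : in  signed(17 downto 0);
    result : out signed(42 downto 0)
  );
end entity mult_pipelined;

architecture rtl of mult_pipelined is
  signal a_r : signed(24 downto 0) := (others => '0');  -- input register
  signal b_r : signed(17 downto 0) := (others => '0');  -- input register
  signal m_r : signed(42 downto 0) := (others => '0');  -- multiplier register
  signal p_r : signed(42 downto 0) := (others => '0');  -- output register
begin
  process (clock)
  begin
    if rising_edge(clock) then
      a_r <= a;            -- stage 1: capture the operands
      b_r <= b;
      m_r <= a_r * b_r;    -- stage 2: register the product
      p_r <= m_r;          -- stage 3: register the output
    end if;
  end process;

  result <= p_r;           -- valid 3 clock cycles after a and b are presented
end architecture rtl;
```

Whether all three stages end up inside the DSP block depends on the tool and the device, so treat the mapping as something to verify in the synthesis report.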
 

Is there an example of adding a DSP slice as a component in a VHDL module and then instantiating it with a port map?
 

Not sure what you'd get out of it, but here's an example of instantiating Xilinx's DSP48E1 block:

Code:
--   DSP48E1   : In order to incorporate this function into the design,
--    VHDL     : the following instance declaration needs to be placed
--  instance   : in the body of the design code.  The instance name
-- declaration : (DSP48E1_inst) and/or the port declarations after the
--    code     : "=>" declaration may be changed to properly reference and
--             : connect this function to the design.  All inputs and outputs
--             : must be connected.

--   Library   : In addition to adding the instance declaration, a use
-- declaration : statement for the UNISIM.vcomponents library needs to be
--     for     : added before the entity declaration.  This library
--   Xilinx    : contains the component declarations for all Xilinx
-- primitives  : primitives and points to the models that will be used
--             : for simulation.

--  Copy the following two statements and paste them before the
--  Entity declaration, unless they already exist.

Library UNISIM;
use UNISIM.vcomponents.all;

-- <-----Cut code below this line and paste into the architecture body---->

   -- DSP48E1: 48-bit Multi-Functional Arithmetic Block
   --          Artix-7
   -- Xilinx HDL Language Template, version 2017.4

   DSP48E1_inst : DSP48E1
   generic map (
      -- Feature Control Attributes: Data Path Selection
      A_INPUT => "DIRECT",               -- Selects A input source, "DIRECT" (A port) or "CASCADE" (ACIN port)
      B_INPUT => "DIRECT",               -- Selects B input source, "DIRECT" (B port) or "CASCADE" (BCIN port)
      USE_DPORT => FALSE,                -- Select D port usage (TRUE or FALSE)
      USE_MULT => "MULTIPLY",            -- Select multiplier usage ("MULTIPLY", "DYNAMIC", or "NONE")
      USE_SIMD => "ONE48",               -- SIMD selection ("ONE48", "TWO24", "FOUR12")
      -- Pattern Detector Attributes: Pattern Detection Configuration
      AUTORESET_PATDET => "NO_RESET",    -- "NO_RESET", "RESET_MATCH", "RESET_NOT_MATCH" 
      MASK => X"3fffffffffff",           -- 48-bit mask value for pattern detect (1=ignore)
      PATTERN => X"000000000000",        -- 48-bit pattern match for pattern detect
      SEL_MASK => "MASK",                -- "C", "MASK", "ROUNDING_MODE1", "ROUNDING_MODE2" 
      SEL_PATTERN => "PATTERN",          -- Select pattern value ("PATTERN" or "C")
      USE_PATTERN_DETECT => "NO_PATDET", -- Enable pattern detect ("PATDET" or "NO_PATDET")
      -- Register Control Attributes: Pipeline Register Configuration
      ACASCREG => 1,                     -- Number of pipeline stages between A/ACIN and ACOUT (0, 1 or 2)
      ADREG => 1,                        -- Number of pipeline stages for pre-adder (0 or 1)
      ALUMODEREG => 1,                   -- Number of pipeline stages for ALUMODE (0 or 1)
      AREG => 1,                         -- Number of pipeline stages for A (0, 1 or 2)
      BCASCREG => 1,                     -- Number of pipeline stages between B/BCIN and BCOUT (0, 1 or 2)
      BREG => 1,                         -- Number of pipeline stages for B (0, 1 or 2)
      CARRYINREG => 1,                   -- Number of pipeline stages for CARRYIN (0 or 1)
      CARRYINSELREG => 1,                -- Number of pipeline stages for CARRYINSEL (0 or 1)
      CREG => 1,                         -- Number of pipeline stages for C (0 or 1)
      DREG => 1,                         -- Number of pipeline stages for D (0 or 1)
      INMODEREG => 1,                    -- Number of pipeline stages for INMODE (0 or 1)
      MREG => 1,                         -- Number of multiplier pipeline stages (0 or 1)
      OPMODEREG => 1,                    -- Number of pipeline stages for OPMODE (0 or 1)
      PREG => 1                          -- Number of pipeline stages for P (0 or 1)
   )
   port map (
      -- Cascade: 30-bit (each) output: Cascade Ports
      ACOUT => ACOUT,                   -- 30-bit output: A port cascade output
      BCOUT => BCOUT,                   -- 18-bit output: B port cascade output
      CARRYCASCOUT => CARRYCASCOUT,     -- 1-bit output: Cascade carry output
      MULTSIGNOUT => MULTSIGNOUT,       -- 1-bit output: Multiplier sign cascade output
      PCOUT => PCOUT,                   -- 48-bit output: Cascade output
      -- Control: 1-bit (each) output: Control Inputs/Status Bits
      OVERFLOW => OVERFLOW,             -- 1-bit output: Overflow in add/acc output
      PATTERNBDETECT => PATTERNBDETECT, -- 1-bit output: Pattern bar detect output
      PATTERNDETECT => PATTERNDETECT,   -- 1-bit output: Pattern detect output
      UNDERFLOW => UNDERFLOW,           -- 1-bit output: Underflow in add/acc output
      -- Data: 4-bit (each) output: Data Ports
      CARRYOUT => CARRYOUT,             -- 4-bit output: Carry output
      P => P,                           -- 48-bit output: Primary data output
      -- Cascade: 30-bit (each) input: Cascade Ports
      ACIN => ACIN,                     -- 30-bit input: A cascade data input
      BCIN => BCIN,                     -- 18-bit input: B cascade input
      CARRYCASCIN => CARRYCASCIN,       -- 1-bit input: Cascade carry input
      MULTSIGNIN => MULTSIGNIN,         -- 1-bit input: Multiplier sign input
      PCIN => PCIN,                     -- 48-bit input: P cascade input
      -- Control: 4-bit (each) input: Control Inputs/Status Bits
      ALUMODE => ALUMODE,               -- 4-bit input: ALU control input
      CARRYINSEL => CARRYINSEL,         -- 3-bit input: Carry select input
      CLK => CLK,                       -- 1-bit input: Clock input
      INMODE => INMODE,                 -- 5-bit input: INMODE control input
      OPMODE => OPMODE,                 -- 7-bit input: Operation mode input
      -- Data: 30-bit (each) input: Data Ports
      A => A,                           -- 30-bit input: A data input
      B => B,                           -- 18-bit input: B data input
      C => C,                           -- 48-bit input: C data input
      CARRYIN => CARRYIN,               -- 1-bit input: Carry input signal
      D => D,                           -- 25-bit input: D data input
      -- Reset/Clock Enable: 1-bit (each) input: Reset/Clock Enable Inputs
      CEA1 => CEA1,                     -- 1-bit input: Clock enable input for 1st stage AREG
      CEA2 => CEA2,                     -- 1-bit input: Clock enable input for 2nd stage AREG
      CEAD => CEAD,                     -- 1-bit input: Clock enable input for ADREG
      CEALUMODE => CEALUMODE,           -- 1-bit input: Clock enable input for ALUMODE
      CEB1 => CEB1,                     -- 1-bit input: Clock enable input for 1st stage BREG
      CEB2 => CEB2,                     -- 1-bit input: Clock enable input for 2nd stage BREG
      CEC => CEC,                       -- 1-bit input: Clock enable input for CREG
      CECARRYIN => CECARRYIN,           -- 1-bit input: Clock enable input for CARRYINREG
      CECTRL => CECTRL,                 -- 1-bit input: Clock enable input for OPMODEREG and CARRYINSELREG
      CED => CED,                       -- 1-bit input: Clock enable input for DREG
      CEINMODE => CEINMODE,             -- 1-bit input: Clock enable input for INMODEREG
      CEM => CEM,                       -- 1-bit input: Clock enable input for MREG
      CEP => CEP,                       -- 1-bit input: Clock enable input for PREG
      RSTA => RSTA,                     -- 1-bit input: Reset input for AREG
      RSTALLCARRYIN => RSTALLCARRYIN,   -- 1-bit input: Reset input for CARRYINREG
      RSTALUMODE => RSTALUMODE,         -- 1-bit input: Reset input for ALUMODEREG
      RSTB => RSTB,                     -- 1-bit input: Reset input for BREG
      RSTC => RSTC,                     -- 1-bit input: Reset input for CREG
      RSTCTRL => RSTCTRL,               -- 1-bit input: Reset input for OPMODEREG and CARRYINSELREG
      RSTD => RSTD,                     -- 1-bit input: Reset input for DREG and ADREG
      RSTINMODE => RSTINMODE,           -- 1-bit input: Reset input for INMODEREG
      RSTM => RSTM,                     -- 1-bit input: Reset input for MREG
      RSTP => RSTP                      -- 1-bit input: Reset input for PREG
   );

   -- End of DSP48E1_inst instantiation
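
The template above leaves every port connected to a same-named signal, so it does nothing useful until you drive the control inputs. As a rough sketch, this is how it might be wrapped as a plain registered multiplier, P = A * B. The entity name dsp48e1_mult and its ports are hypothetical, generics not listed are assumed to keep the default values shown in the template, and the OPMODE/ALUMODE/INMODE settings are my reading of the 7 Series DSP48E1 user guide, so verify them in simulation before relying on them.

Code:
```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

library UNISIM;
use UNISIM.vcomponents.all;

-- Hypothetical wrapper: signed 25 x 18 multiply on one DSP48E1
entity dsp48e1_mult is
  port (
    clk : in  std_logic;
    a   : in  signed(24 downto 0);
    b   : in  signed(17 downto 0);
    p   : out signed(42 downto 0)
  );
end entity dsp48e1_mult;

architecture rtl of dsp48e1_mult is
  signal a_ext  : std_logic_vector(29 downto 0);
  signal b_slv  : std_logic_vector(17 downto 0);
  signal p_full : std_logic_vector(47 downto 0);
begin
  -- The A port is 30 bits wide; only A(24 downto 0) feeds the multiplier
  a_ext <= std_logic_vector(resize(a, 30));
  b_slv <= std_logic_vector(b);
  p     <= signed(p_full(42 downto 0));

  DSP48E1_inst : DSP48E1
  generic map (
    USE_DPORT => FALSE,       -- pre-adder / D port not used
    USE_MULT  => "MULTIPLY",
    AREG => 1, BREG => 1,     -- one input register on A and B
    MREG => 1,                -- multiplier register
    PREG => 1                 -- output register
  )
  port map (
    CLK => clk,
    A => a_ext, B => b_slv, P => p_full,
    -- Constant control: X = M, Y = M, Z = 0 and ALU set to add, so P = A * B
    OPMODE => "0000101", ALUMODE => "0000",
    INMODE => "00000", CARRYINSEL => "000", CARRYIN => '0',
    -- Unused data and cascade inputs tied off
    C => (others => '0'), D => (others => '0'),
    ACIN => (others => '0'), BCIN => (others => '0'),
    PCIN => (others => '0'), CARRYCASCIN => '0', MULTSIGNIN => '0',
    -- All clock enables active
    CEA1 => '1', CEA2 => '1', CEAD => '1', CEALUMODE => '1',
    CEB1 => '1', CEB2 => '1', CEC => '1', CECARRYIN => '1',
    CECTRL => '1', CED => '1', CEINMODE => '1', CEM => '1', CEP => '1',
    -- All resets inactive
    RSTA => '0', RSTALLCARRYIN => '0', RSTALUMODE => '0', RSTB => '0',
    RSTC => '0', RSTCTRL => '0', RSTD => '0', RSTINMODE => '0',
    RSTM => '0', RSTP => '0',
    -- Unused outputs
    ACOUT => open, BCOUT => open, PCOUT => open,
    CARRYCASCOUT => open, MULTSIGNOUT => open, CARRYOUT => open,
    OVERFLOW => open, UNDERFLOW => open,
    PATTERNDETECT => open, PATTERNBDETECT => open
  );
end architecture rtl;
```

With one register each on the inputs, the multiplier and the output, the product should appear 3 clock cycles after the operands. For a simple multiply like this, inference is usually less work and more portable than instantiating the primitive.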
 
