FPGA Linear interpolator, problem in difference terms

zermelo · Aug 6, 2014

Hi,

I am trying to build my own linear interpolator, that is, to implement y(x) = y1 + [(y2 - y1)/(x2 - x1)]*(x - x1). To do that, I designed a simple pipeline: one stage computes the difference terms, and the following stages compute the division, the product and the final sum.

The problem is that the s_x_x1 term (see the code below) is computed correctly, whereas s_y2_y1 and s_x2_x1 remain stuck at zero (and the operands are never equal, of course).

Here is the code:

Library ieee;
library work;
 
use ieee.std_logic_1164.all ;
use ieee.numeric_std.all;
use work.skp_math_pkg.all;
 
 
Entity lin_itpl is
Generic
  (G_WORD_SIZE  : integer := 8); 
Port  
  (-- Reset & clock/clock en   
   i_rst : in std_logic;                 
   i_clk : in std_logic;
   i_conv : in std_logic;
   -- Inputs 
   i_x   : in signed(G_WORD_SIZE - 1 downto 0);   
   i_x1  : in signed(G_WORD_SIZE - 1 downto 0);
   i_x2  : in signed(G_WORD_SIZE - 1 downto 0); 
   i_y1  : in signed(G_WORD_SIZE - 1 downto 0);
   i_y2  : in signed(G_WORD_SIZE - 1 downto 0);
   -- Outputs  
   o_data_rdy : out std_logic;
   o_data     : out signed(G_WORD_SIZE - 1 downto 0));
    
end lin_itpl;
 
Architecture rtl of lin_itpl is 
 
signal s_sgn_quot : std_logic;
 
-- pipe ctrl
signal s_p1_trg : std_logic;
signal s_p2_trg : std_logic;
signal s_p3_trg : std_logic;
signal s_p4_trg : std_logic;
 
-- Arithmetic
signal s_x_x1 : signed(G_WORD_SIZE downto 0); -- Sign bit!
signal s_x2_x1: signed(G_WORD_SIZE downto 0); 
signal s_y2_y1: signed(G_WORD_SIZE downto 0); 
signal s_quot : signed(G_WORD_SIZE downto 0); -- Unsigned division of operands of the same size , converted to signed (+1 bit!) 
signal s_prod : signed(G_WORD_SIZE downto 0);
signal s_sum  : signed(G_WORD_SIZE - 1 downto 0);
 
 
 
begin
 
-- Quotient sign (operands in 2's comp)
s_sgn_quot <= s_y2_y1(s_y2_y1'left) xor s_x2_x1(s_x2_x1'left); 
 
P_PIPE_ITPL : process(i_clk)
begin
  if (i_clk'event and i_clk = '1') then
    if (i_rst = '1') then
      -- pipe ctrl
      s_p1_trg <= '0';
      s_p2_trg <= '0';
      s_p3_trg <= '0';
      s_p4_trg <= '0';
      
      -- Arithmetic
      s_x_x1  <= (others => '0');
      s_x2_x1 <= (others => '0');
      s_y2_y1 <= (others => '0');
      s_sum   <= (others => '0');
      s_prod  <= (others => '0');
    else
      -- Pipeline triggers are pulsed
      s_p1_trg <= '0';
      s_p2_trg <= '0';
      s_p3_trg <= '0';
      s_p4_trg <= '0';
    
      -- Difference terms are independent: compute in parallel
      if (i_conv = '1') then 
        s_x_x1   <= resize(i_x, s_x_x1'length)  - resize(i_x1,s_x_x1'length);      
        s_x2_x1  <= resize(i_x2,s_x2_x1'length) - resize(i_x1,s_x2_x1'length);  
        s_y2_y1  <= resize(i_y2,s_y2_y1'length) - resize(i_y1,s_y2_y1'length);            
        s_p1_trg <= '1';
      end if; 
      
      -- Unsigned division of operands of the same size , converted to signed 
      -- (include sign bit and to_std_logic_vector first!!)      
      if (s_p1_trg = '1') then  
        s_quot   <= resize(signed(std_logic_vector(s_sgn_quot&f_divide(unsigned(s_y2_y1),unsigned(s_x2_x1)))),s_quot'length);
        s_p2_trg <= '1';
      end if;
      
      -- Truncated product term
      if (s_p2_trg = '1') then 
        s_prod   <= resize(s_quot*s_x_x1,s_prod'length);
        s_p3_trg <= '1';
      end if;      
      
      -- Final sum (resized back to the output width)
      if (s_p3_trg = '1') then
        s_sum    <= resize(s_prod + i_y1,s_sum'length) ;
        s_p4_trg <= '1';   
      end if;  
    end if; 
  end if;
end process P_PIPE_ITPL;
 
 
-- Outputs
o_data_rdy <= s_p4_trg;  
o_data     <= s_sum;
 
     
 
end rtl;

I found this error while simulating, in ModelSim, a larger component that instantiates the interpolator. The instantiation of this component is:

U_ITPL_TTL_HIGH: lin_itpl
Generic map
  (G_WORD_SIZE => 12)
Port map 
  (-- Reset & clock/clock en   
   i_rst  => i_rst,                 
   i_clk  => i_clk,
   i_conv => s_tx_cr_p_zc,
   -- Inputs 
   i_x  => signed(s_tx_tmr),   
   i_x1 => to_signed(0,12), 
   i_x2 => signed(c_tx_tmr_tc), 
   i_y1 => signed(s_freq_hilen),
   i_y2 => signed(s_freq_lolen),
   -- Outputs  
   o_data_rdy => s_ttl_data_hi_rdy,
   o_data     => s_ttl_data_hi);

The mapped signals/constants are declared in the architecture of the top component as:

signal s_tx_tmr     : unsigned(11 downto 0);
constant c_tx_tmr_tc : unsigned(s_tx_tmr'range):=to_unsigned(1000,s_tx_tmr'length); 
signal s_freq_lolen  : unsigned(s_tx_tmr'range); 
signal s_freq_hilen  : unsigned(s_tx_tmr'range);

The signals i_x1, i_x2, i_y1 and i_y2 take constant values (as required for the interpolator). Any hint as to what could be going on here?

Thanks in advance. In the meantime I'll try to simulate the interpolator in isolation with different values.

Note:

I found the "divide" function somewhere on the internet, and it seemed to work fine, at least in behavioral simulation. It is the only thing I use from work.skp_math_pkg. It is defined for unsigned only and, as I want the interpolator to be as general as possible (and to avoid instantiating Xilinx/Altera IPs), I had to do that nasty conversion (I hope it works).

Here is the code for the function:

function  f_divide  (a : unsigned ; b : unsigned ) return unsigned is
 
variable a1 : unsigned(a'length-1 downto 0):= a;  -- Dividend (holds the quotient on return)
variable b1 : unsigned(b'length-1 downto 0):= b;  -- Divisor
variable p1 : unsigned(b'length downto 0):= (others => '0');
 
begin
  for i in 0 to b'length-1 loop
    p1(b'length-1 downto 1) := p1(b'length-2 downto 0);
    p1(0) := a1(a'length-1);
    
    a1(a'length-1 downto 1) := a1(a'length-2 downto 0);
    p1 := p1-b1;
    
    if(p1(b'length-1) ='1') then
      a1(0) :='0';
      p1    := p1+b1;
    else
      a1(0) :='1';
    end if;
  end loop;
  
  return a1;
 
end f_divide;
 
end skp_math_pkg ;

TrickyDicky · Aug 6, 2014

I don't really know why they are stuck at 0 either, unless your i_conv input is stuck at '0' or i_rst is stuck at '1'.

Some other comments on the code though:
1. Why are you enabling each stage of the pipeline? Why not just make a normal pipeline without the enables and pass i_conv through a shift register with a delay of 4? That would have the same effect and use less routing logic.

2. Your divide function will produce a lot of chained logic, giving you a poor fmax. Also, the way you have done the divide means you are going to lose most of the resolution. For example, with your code 0x800/0x800 = 1, but 0x800/0x801 = 0. There is also an oddity: with the function, 0xFFF/0xFFF = 0xFFC. I would probably avoid this function. The "/" function in the numeric_std library is synthesisable by Altera, but I'm not sure Xilinx likes it when the divisor is not a power of 2 (i.e. when the divide is not just a bit shift). But you still have the issue of the lack of pipelining and the poor fmax.

But in the example you gave, X1 and X2 are constants, so you could easily have input a 1/X value and done a multiply instead (which can be done in a single clock).
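A minimal sketch of the multiply-by-reciprocal idea, not code from the thread. The entity name `recip_slope`, the generic `G_FRAC_BITS` and the port names are made up for illustration, and the reciprocal is assumed to be precomputed outside the FPGA as round(2**G_FRAC_BITS / (x2 - x1)):

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: slope = (y2 - y1) * (1/(x2 - x1)) in fixed point.
entity recip_slope is
  generic (
    G_WORD_SIZE : positive := 12;   -- width of the original operands
    G_FRAC_BITS : positive := 16);  -- fractional bits of the reciprocal
  port (
    i_clk   : in  std_logic;
    i_y2_y1 : in  signed(G_WORD_SIZE downto 0);
    -- Assumed precomputed as round(2**G_FRAC_BITS / (x2 - x1));
    -- this only fits in G_FRAC_BITS bits if x2 - x1 > 1.
    i_recip : in  unsigned(G_FRAC_BITS - 1 downto 0);
    -- Slope with G_FRAC_BITS fractional bits.
    o_slope : out signed(G_WORD_SIZE + G_FRAC_BITS + 1 downto 0));
end entity recip_slope;

architecture rtl of recip_slope is
begin
  process (i_clk)
    variable v_recip : signed(G_FRAC_BITS downto 0);
  begin
    if rising_edge(i_clk) then
      -- Prepend '0' so the unsigned reciprocal stays positive as a signed value.
      v_recip := signed('0' & i_recip);
      -- numeric_std "*" returns a result whose length is the sum of the
      -- operand lengths, which matches o_slope exactly.
      o_slope <= i_y2_y1 * v_recip;
    end if;
  end process;
end architecture rtl;
```

The result still has to be multiplied by (x - x1) and shifted right by G_FRAC_BITS before y1 is added. More fractional bits reduce the rounding error of the reciprocal at the cost of a wider multiplier.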

The resolution is going to be a problem with any divide function. You need to extend the dividend with extra (fractional) bits to allow for the extra resolution.
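The resolution point can be illustrated by giving the numerator extra fractional bits before dividing. This is only a sketch: the package `frac_div_pkg` and the function `f_divide_frac` are hypothetical names, and it relies on the "/" operator that numeric_std defines for unsigned operands, so it is still purely combinational and does nothing for the fmax problem:

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

package frac_div_pkg is
  -- Hypothetical helper: returns (a * 2**frac_bits) / b, i.e. the quotient
  -- a/b with frac_bits fractional bits.
  function f_divide_frac (a, b : unsigned; frac_bits : natural) return unsigned;
end package frac_div_pkg;

package body frac_div_pkg is
  function f_divide_frac (a, b : unsigned; frac_bits : natural) return unsigned is
    -- Widened numerator so the left shift does not drop any bits.
    variable v_num : unsigned(a'length + frac_bits - 1 downto 0);
  begin
    v_num := shift_left(resize(a, v_num'length), frac_bits);
    -- The result has the length of the (widened) numerator.
    -- b = 0 raises a numeric_std assertion in simulation.
    return v_num / b;
  end function f_divide_frac;
end package body frac_div_pkg;
```

With a = 0x800, b = 0x801 and frac_bits = 12, the plain integer quotient is 0, whereas this version returns 4094, which reads as roughly 0.9995 when interpreted in units of 1/4096.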

To get around the pipelining and resolution problems, you need to use an IP core (or use the 1/X input so you can do a multiply instead). The way to ensure compatibility with both the Altera and Xilinx tools is to write a wrapper around their IP: create one file for Xilinx and one for Altera with the same interface, and then include the appropriate file in your project.
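Point 1 above (a free-running pipeline with i_conv delayed through a shift register) could look roughly like this. The entity name `valid_delay` and its ports are invented for the sketch; the default depth of 4 matches the four register stages in the posted interpolator:

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper entity: delays a one-clock "valid" pulse by G_DEPTH
-- clock cycles so it lines up with the data leaving a free-running
-- pipeline of the same depth.
entity valid_delay is
  generic (G_DEPTH : positive := 4);  -- this sketch assumes G_DEPTH >= 2
  port (
    i_clk   : in  std_logic;
    i_rst   : in  std_logic;   -- synchronous, active high
    i_valid : in  std_logic;   -- e.g. i_conv
    o_valid : out std_logic);  -- e.g. o_data_rdy
end entity valid_delay;

architecture rtl of valid_delay is
  signal s_sr : std_logic_vector(G_DEPTH - 1 downto 0) := (others => '0');
begin
  process (i_clk)
  begin
    if rising_edge(i_clk) then
      if i_rst = '1' then
        s_sr <= (others => '0');
      else
        -- Shift left by one; the new valid bit enters at index 0.
        s_sr <= s_sr(G_DEPTH - 2 downto 0) & i_valid;
      end if;
    end if;
  end process;

  -- The pulse appears here G_DEPTH clock edges after it was sampled.
  o_valid <= s_sr(G_DEPTH - 1);
end architecture rtl;
```

The arithmetic stages then register their results on every clock without any enable, and only the delayed valid bit tells the consumer which output samples are meaningful.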

zermelo · Aug 6, 2014

Hi again

I simulated the interpolator in isolation and now the difference terms behave properly. So I was probably passing wrong data to the inputs in the larger design.

The problem now is the quotient: it remains stuck at zero. I simulated several y2-y1 and x2-x1 values.
Maybe that is related to the resolution issue you mentioned for the division.

I'll follow your recommendations: there is no special reason to implement the pipeline the way I did. Also, that divide function I found looks suspicious, so I'll use the IP wrapper method.

But I have to keep y1, y2, x1, x2 as signals (I cannot implement the *(1/constant) & scale method), since the slope is variable in my design.

I'll keep you updated.

Thanks for your time

Jose

- - - Updated - - -

Hi again,

Regarding the IP wrapper method you mentioned: is there any way to define generics for the IP instantiation template?

I am working with Altera. Went through the Megawizard for the ALT_DIV function and re-edited the template to add generics for numerator, denominator and result size. I know Altera has a clear "

Quartus generated without warnings related to the division, but ModelSim gives me a "component not bound" warning for the divide_wrapper and an undefined output for the result signal.

Any hint to work around this?

Regards

zermelo

std_match · Aug 6, 2014

zermelo said:
But I have to keep y1, y2, x1, x2 as signals (I cannot implement the *(1/constant) & scale method), since the slope is variable in my design.

y1 and y2 must of course be variable.
It is enough if the difference (x2-x1) is a constant or always a power of 2. Then the division will be very simple. If it is a constant, multiply by 1/(x2-x1). If it is a power of 2, use a variable bit shift.
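The power-of-2 case could be sketched as below. The entity `lin_itpl_pow2`, the generic `G_SHIFT_BITS` and the port `i_log2_dx` (carrying n, where x2 - x1 = 2**n) are assumptions made for the example, and everything is squeezed into one clock for brevity where a real design would likely split it into stages:

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: y = y1 + ((y2 - y1) * (x - x1)) / 2**n.
entity lin_itpl_pow2 is
  generic (
    G_WORD_SIZE  : positive := 12;
    G_SHIFT_BITS : positive := 4);
  port (
    i_clk     : in  std_logic;
    i_x_x1    : in  signed(G_WORD_SIZE downto 0);         -- x - x1
    i_y2_y1   : in  signed(G_WORD_SIZE downto 0);         -- y2 - y1
    i_log2_dx : in  unsigned(G_SHIFT_BITS - 1 downto 0);  -- n, with x2 - x1 = 2**n
    i_y1      : in  signed(G_WORD_SIZE - 1 downto 0);
    o_y       : out signed(G_WORD_SIZE - 1 downto 0));
end entity lin_itpl_pow2;

architecture rtl of lin_itpl_pow2 is
begin
  process (i_clk)
    variable v_prod : signed(2*G_WORD_SIZE + 1 downto 0);
  begin
    if rising_edge(i_clk) then
      -- Multiply first so no resolution is lost before the shift.
      v_prod := i_y2_y1 * i_x_x1;
      -- Arithmetic right shift replaces the division by 2**n
      -- (it rounds towards minus infinity for negative products).
      v_prod := shift_right(v_prod, to_integer(i_log2_dx));
      -- resize truncates; the result is correct as long as it fits in o_y.
      o_y <= resize(v_prod + i_y1, o_y'length);
    end if;
  end process;
end architecture rtl;
```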

TrickyDicky · Aug 6, 2014

zermelo said:
Hi again

Quartus generated without warnings related to the division, but ModelSim gives me a "component not bound" warning for the divide_wrapper and an undefined output for the result signal.

A "component not bound" error means the simulator cannot map the component to an entity. Is the problem related to the wrapper or to the Altera IP?

PS: You don't need to use the MegaWizard for Altera IPs. You can include the following library:

library altera_mf;
use altera_mf.altera_mf_components.all;

and then instantiate the alt_div yourself. All of the generics are explained in the alt_div documentation.

zermelo · Aug 7, 2014

Hi again,

Yes, I went through that. The component simulates now. I rechecked my requirements: for the dividends and divisors I use, the result will always be a positive number between 0 and 1 (exclusive).

I checked the ALT_FP IP, but the resource usage is prohibitive for a Cyclone II.

After re-thinking, I can pre-compute those values (they are just a few) and pass them as the slope of the linear interpolation. That would force me to work with fixed point arithmetic in the interpolator, right?

FvM · Aug 7, 2014

zermelo said:
After re-thinking, I can pre-compute those values (they are just a few) and pass them as the slope of the linear interpolation. That would force me to work with fixed point arithmetic in the interpolator, right?

Yes. I think there are two resource-efficient implementation variants of table interpolators:
- Having 2^N + 1 equidistant table points, so that the division can be performed as a simple right shift. The calculation takes at least two clock cycles because two table points must be read to calculate the slope.
- Having separate table entries for y and dy/dx.
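The second variant, with the slope supplied as a precomputed fixed-point value, might be sketched as follows. The entity `lin_itpl_slope`, the generic `G_FRAC_BITS` and the port `i_slope` are illustrative names, and the slope is assumed to be stored as dy/dx scaled by 2**G_FRAC_BITS:

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: y = y1 + slope * (x - x1), slope in fixed point.
entity lin_itpl_slope is
  generic (
    G_WORD_SIZE : positive := 12;
    G_FRAC_BITS : positive := 12);
  port (
    i_clk   : in  std_logic;
    i_x     : in  signed(G_WORD_SIZE - 1 downto 0);
    i_x1    : in  signed(G_WORD_SIZE - 1 downto 0);
    i_y1    : in  signed(G_WORD_SIZE - 1 downto 0);
    -- Assumed format: dy/dx * 2**G_FRAC_BITS, with one extra sign bit.
    i_slope : in  signed(G_FRAC_BITS downto 0);
    o_y     : out signed(G_WORD_SIZE - 1 downto 0));
end entity lin_itpl_slope;

architecture rtl of lin_itpl_slope is
begin
  process (i_clk)
    variable v_dx   : signed(G_WORD_SIZE downto 0);
    variable v_prod : signed(G_WORD_SIZE + G_FRAC_BITS + 1 downto 0);
  begin
    if rising_edge(i_clk) then
      -- One extra bit so the difference cannot overflow.
      v_dx   := resize(i_x, v_dx'length) - resize(i_x1, v_dx'length);
      -- Full-width product; its length is the sum of the operand lengths.
      v_prod := i_slope * v_dx;
      -- Drop the fractional bits, then add the table value y1.
      o_y <= resize(shift_right(v_prod, G_FRAC_BITS) + i_y1, o_y'length);
    end if;
  end process;
end architecture rtl;
```

No divider is needed at all in this form; the cost is one multiplier plus the extra table storage for the slope entries.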
