vandelay
Advanced Member level 4
I am implementing a very simple Direct-Form-I 2nd-order IIR lowpass filter on a dsPIC30. I do not have much experience with fixed-point math or math hardware, so I have a very basic question.
The MAC is 16x16 bit with a 40-bit accumulator. I use 16-bit signed filter coefficients with two bits in front of the binary point and thus 13 bits behind it (Q2.13, plus the sign bit).
I use the MAC with register w8 pointing to an array of filter coefficients (b, a) and register w10 pointing to an array of current and previous input values (x, 12-bit samples from the ADC). I do something like mac w5*w6,a,[w8]+=2,w5,[w10]+=2,w6 to step through y(n) = b0*x(n) + b1*x(n-1) + b2*x(n-2) + ...
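For illustration, the feed-forward accumulation described above can be sketched in plain C. This is only a host-side model under stated assumptions, not the dsPIC assembly itself: the function name ff_accumulate is hypothetical, and an int64_t stands in for the 40-bit hardware accumulator.

```c
#include <stdint.h>

/* Coefficients are assumed to be Q2.13: 1 sign bit, 2 integer bits,
   13 fractional bits. Inputs are plain integers (12-bit ADC samples). */
#define Q_FRAC_BITS 13

/* Accumulates b0*x(n) + b1*x(n-1) + b2*x(n-2).
   x[0] is the current sample, x[1] and x[2] the two previous ones.
   int64_t models the 40-bit accumulator (an assumption for this sketch). */
int64_t ff_accumulate(const int16_t b[3], const int16_t x[3])
{
    int64_t acc = 0;
    for (int i = 0; i < 3; i++) {
        /* 16x16 -> 32-bit product, like one MAC step */
        acc += (int32_t)b[i] * (int32_t)x[i];
    }
    /* An integer times a Q2.13 value has 13 fractional bits. */
    return acc;
}
```

Each product here has 13 fractional bits because the inputs have none, which is why these terms can be summed directly.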
So far so good. The problem is when I get to the output feedback: ... + (-a1)*y(n-1) + (-a2)*y(n-2)
My problem: the filter outputs are not 12-bit values like the inputs but much wider (12 integer bits plus 13 fractional bits). MAC'ing them directly with the coefficients would yield 26 fractional bits, so they cannot be added to the feed-forward terms (which have 13 fractional bits) without scaling them first.
How do I do this most efficiently? How is it commonly done? No doubt I can do it the hard way (manual multiply, scale by shifting right 13 bits, add to the accumulator), but I suspect I have missed something.
EDIT: Can I simply round the output to 12 bits (the same format as the input)?
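The idea in the EDIT can be sketched in C as follows. This is a host-side model under assumptions, not a confirmed dsPIC implementation: the type biquad_t and the function biquad_step are hypothetical names, int64_t again stands in for the 40-bit accumulator, and the output is rounded back to the integer sample format before it is stored as feedback state. Rounding the stored outputs adds quantization noise to the feedback path, so whether this is acceptable depends on the filter.

```c
#include <stdint.h>

#define Q_FRAC_BITS 13  /* coefficients assumed Q2.13 */

/* Hypothetical state for one Direct-Form-I biquad section. */
typedef struct {
    int16_t b[3];   /* b0, b1, b2 in Q2.13 */
    int16_t na[2];  /* -a1, -a2 in Q2.13 (already negated) */
    int16_t x[2];   /* x(n-1), x(n-2) */
    int16_t y[2];   /* y(n-1), y(n-2), rounded to the input format */
} biquad_t;

int16_t biquad_step(biquad_t *f, int16_t xn)
{
    int64_t acc = 0;  /* models the 40-bit accumulator */

    /* Feed-forward and feedback terms all have 13 fractional bits,
       because x and the stored y values are plain integers. */
    acc += (int32_t)f->b[0]  * xn;
    acc += (int32_t)f->b[1]  * f->x[0];
    acc += (int32_t)f->b[2]  * f->x[1];
    acc += (int32_t)f->na[0] * f->y[0];
    acc += (int32_t)f->na[1] * f->y[1];

    /* Round to nearest: add half an LSB, then shift out the fraction.
       Assumes >> on a negative value is an arithmetic shift, which is
       implementation-defined in C but typical. */
    acc += (int64_t)1 << (Q_FRAC_BITS - 1);
    acc >>= Q_FRAC_BITS;

    /* Saturate to 16 bits before storing. */
    if (acc > INT16_MAX) acc = INT16_MAX;
    if (acc < INT16_MIN) acc = INT16_MIN;
    int16_t yn = (int16_t)acc;

    /* Shift the delay lines. */
    f->x[1] = f->x[0];  f->x[0] = xn;
    f->y[1] = f->y[0];  f->y[0] = yn;
    return yn;
}
```

With b0 = 1.0 (8192) and -a1 = 0.5 (4096), an input of 100 followed by zeros gives 100, 50, 25, which shows the rounded output being reused as feedback in the same format as the input.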