Welcome to EDAboard.com

Welcome to our site! EDAboard.com is an international Electronic Discussion Forum focused on EDA software, circuits, schematics, books, theory, papers, asic, pld, 8051, DSP, Network, RF, Analog Design, PCB, Service Manuals... and a whole lot more! To participate you need to register. Registration is free. Click here to register now.

Register Log in

A question for the math experts

Status
Not open for further replies.

suiram

Junior Member level 3
Joined
Mar 27, 2001
Messages
25
Helped
0
Reputation
0
Reaction score
0
Trophy points
1,281
Activity points
201
Hi all,
I have to evaluate a function that contains termens like x^0.8+y^3.6
Since I have to execute this a few hundred times per second on a relatively slow uC, I can't afford to use the standard pow from the C library because it's slow.
Isn't any way to decompose x^0.8 for example into an expression faster to evaluate that using the standard pow algorithm?
I would apreciate any advices or pointers.
Thanks
 

zorro

Advanced Member level 4
Joined
Sep 6, 2001
Messages
1,131
Helped
356
Reputation
710
Reaction score
298
Trophy points
1,363
Location
Argentina
Activity points
8,903
Hi suiram,

I advice to use polynomial approximations: you need only multiplications and additions. For finding the coefficients of the approximationg polynomial, you need to specify the range of interest and a error criterium to minimize (least-squares, maximum absolute-value, integral absolute-value,). The polynomial order depends of the allowed error.

You can use several methods for calculating the polynomial. One of them is the funcion POLYFIT of matlab (uses least squares). For example, with this function I found this approximation for z=y^3.6 using a 3rd order poly in the range [0,1]:
zaprox=(1.62257972218847)*y^3 + (-0.78204046757632)*y^2 + 0.16022572661344*y + (-0.00767901243029)

Once you have the coefficients, you have different forms to implement the program: one possibility is

y2 =y*y;
y3=y2*y;
z=b3*y3+b2*y2+b1*y+b0;

But another one is:

z= (((y+a0)*y+a1)*y+a2)*a3;

It depends of your uC and how you code (C or assembly, fixed or floating point) which one is more efficient.

Regards

Z
 

bunalmis

Full Member level 5
Joined
Jan 3, 2003
Messages
268
Helped
19
Reputation
38
Reaction score
11
Trophy points
1,298
Location
Ireland
Activity points
2,147
" Once you have the coefficients, you have different forms to implement the program: one possibility is

y2 =y*y;
y3=y2*y;
z=b3*y3+b2*y2+b1*y+b0;

But another one is:

z= (((y+a0)*y+a1)*y+a2)*a3;

It depends of your uC and how you code (C or assembly, fixed or floating point) which one is more efficient. "

I think second equaition fast works and short, becouse formula need only 3 multiplification and 3 addition.
 

dipchip

Member level 2
Joined
May 7, 2001
Messages
48
Helped
1
Reputation
2
Reaction score
1
Trophy points
1,288
Location
middle USA
Activity points
365
Just a question. What degree of accuracy does the approximation need to have and what are the expected ragne in the varables x & y?
 

suiram

Junior Member level 3
Joined
Mar 27, 2001
Messages
25
Helped
0
Reputation
0
Reaction score
0
Trophy points
1,281
Activity points
201
pow

The idea with polyfit sounds great.
Unfortunatelly I don't know the range.

To be more specific here is the code that I try to optimize:

static float EvaluateC_PF(DEB_PARAM*deb_param,float RD,BOOL first_call,void* cache_buf)
{
float C;
float D,L1,L2,A,M;
float tmp1,tmp2,tmp3;
float* cache=(float*)cache_buf;
A = pow((19000.0*deb_param->beta)/RD,0.8);
if(first_call)
{
D=deb_param->Dt*1000.0;
L1=L2=25.4/D;
M=2*L2/(1-deb_param->beta);
tmp1=(0.043+0.080*exp(-10.0*L1)-0.123*exp(-7.0*L1))*(deb_param->beta4/deb_param->beta1m4)*(1-0.11*A);
tmp1-=0.031*(M-0.8*pow(M,1.1))*deb_param->beta13;
tmp2=0.5961+0.0261*deb_param->beta2-0.216*deb_param->beta8;
tmp3=0.11*(0.75-deb_param->beta)*(2.8-D/25.4);
cache[0]=D;
cache[1]=L1;
cache[2]=L2;
cache[3]=M;
cache[4]=tmp1;
cache[5]=tmp2;
cache[6]=tmp3;
}
else
{
D=cache[0];
L1=cache[1];
L2=cache[2];
M=cache[3];
tmp1=cache[4];
tmp2=cache[5];
tmp3=cache[6];
}
RD=1e6/RD;
C=tmp2+0.000521*pow(RD*deb_param->beta,0.7);
C+=(0.0188+0.0063*A)*deb_param->beta35*pow(RD,0.3);
C+=tmp1;
if(D<71.12)
C+=tmp3;
return C;
}
It's very hard to know the range of parameters in such equation.
 

zorro

Advanced Member level 4
Joined
Sep 6, 2001
Messages
1,131
Helped
356
Reputation
710
Reaction score
298
Trophy points
1,363
Location
Argentina
Activity points
8,903
Hi suiram,

You can proceed in this way for covering all the needed range:
Suppose you want to calculate y=x^A (for x positive).
Let x=M*2^E with 0.5<=M<1 (M and E are the floating-point mantissa and exponent of x).
Then y = (M*2^E)^A = M^A * (2^A)^E.
As 0.5<=M<1, you can calculate by polynomial approximation the function M^A valid in the range [0.5, 1) in the way I described above, and multiply the result by (2^A)^E. As the possible values of E are intigers and limited, you can construct a look-up table for this factor, or perform this (but it is slower):

#define pow2_A ...... // 2^A
#define pow2_minusA ...... // 2^-A

factor = 1;
if (E>0) {
for (ii=E; ii; ii--)
factor *= pow2_A;
}
else if (E<0) {
for (ii=E; ii; ii++)
factor *= pow2_minusA;
}
// Now, factor==(2^A)^E

Stay connected.
Regards

Z
 

suiram

Junior Member level 3
Joined
Mar 27, 2001
Messages
25
Helped
0
Reputation
0
Reaction score
0
Trophy points
1,281
Activity points
201
pow

Hi zorro,

I've implemented the algorithm you described. The idea is so simple...after you know it. This algorithm executes 4 times faster than my standard pow function.
Still I do have problems if base is greater than 1. Please take a look at attached file.
 

zorro

Advanced Member level 4
Joined
Sep 6, 2001
Messages
1,131
Helped
356
Reputation
710
Reaction score
298
Trophy points
1,363
Location
Argentina
Activity points
8,903
Hi suiram,

your program seems to be OK. What do you mean saying that you do have problems if base is greater than 1?

Some suggestions that could optimize (just a little) your code on uC:

1) Many uC do have a hardware multiplier (but dont't have a hardware divider). In that case division is slower than multiplication (even in floating point). You can avoid the final division (1.0/Answer) using the constant POW_2_3x6 when exponent is negative.

2) (RemainingMultiplicity&1) is faster than (RemainingMultiplicity%2)

3) (RemainingMultiplicity >> 1) is faster than (RemainingMultiplicity /= 2) and is the same because !(RemainingMultiplicity%2)

4) arithmetic with float types could be faster than with double (this is implementation dependent). [But it's better to test the algorithm with doubles].

About evaluation of errors:

Rather than the average of errors, the squared-root of the average of squared relative errors (or average of absolute relative errors) is more meaningful for evaluating the approximation.
Using a high-order approximation (as your 8th order polynomial), when you compare the results with those of the standard pow() function, the errors of this function should be considered because they could be comparable to the errors of your xpow() function. The errors of the C library are implementation-dependent; they can be not too accurate because accurate functions cuold be very slow with a uC [pow() function performs log, multiplication and exp]. Results from the PC program are reliable as reference, but non necessarily if you test on the target uC.

Regards

Z
 

brmadhukar

Advanced Member level 3
Joined
Jun 21, 2002
Messages
840
Helped
42
Reputation
84
Reaction score
11
Trophy points
1,298
Location
India
Activity points
6,783
Hi,
Is there a range for x and y. If you know the ranges it can be easy to get a better approximation.

For example,

if x < 2.0, then the square root can be easliy evaluated, refer attachment but you may optimize on it.

say z = sqrt(x), x^(0.8) = x^(5/2/2) = z^5

Pl remember the number of iterations depends on accuracy required.

If you need simple implementations try mclaurin/taylors expansion.
 

Rainbow Wizard

Junior Member level 2
Joined
Mar 3, 2002
Messages
20
Helped
1
Reputation
2
Reaction score
1
Trophy points
1,283
Activity points
128
Fast Exponentials in assembly or C

There is another approach to this problem if you have some spare memory in your prototype. Basically, you use a Look Up Table (LUT). Simply generate a log and an antilog table using any programming tool, and load the tables at program start-up. Then your equation: x^0.8+y^3.6 reduces to

result=antilog[log(x)*.8] +antilog[log(y)*3.6]

Hope this helps
 

Status
Not open for further replies.

Part and Inventory Search

Welcome to EDABoard.com

Sponsor

Top