floating point to fixed point

Status
Not open for further replies.

smileysam

Member level 4
Hi,
I need help converting floating point to fixed point.
Any help will be appreciated.

rsrinivas

Advanced Member level 1
Hi,
What format is your data in, and which Q format do you need?
Why not try the fixed point toolbox in MATLAB?
It would be really helpful.

cheers
srinivas

Thinkie

Full Member level 3
Generally speaking, you take the bit pattern of the floating point mantissa and SHIFT it by an amount determined by the exponent and by the position of the binary point in your fixed point format.

I will try to dig out some code from my archives
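In the meantime, here is a minimal sketch of the shifting idea, not code from anyone's archive. It assumes `float` is a 32-bit IEEE 754 single and that the target is a signed 32-bit value with `frac_bits` fractional bits (0 to 30). The name `float_bits_to_fixed` is hypothetical. Subnormals are flushed to zero, values too large for the format saturate, and NaN is not treated specially.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build a fixed point value directly from the
   IEEE 754 bit pattern. Assumes 32-bit IEEE 754 float, 0 <= frac_bits <= 30. */
int32_t float_bits_to_fixed(float f, int frac_bits)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);            /* read the raw bit pattern */

    int      negative = (int)(bits >> 31);     /* sign bit */
    int      exponent = (int)((bits >> 23) & 0xFF);
    uint32_t mantissa = bits & 0x7FFFFFu;      /* 23 stored fraction bits */

    if (exponent == 0)                         /* zero or subnormal: flush to 0 */
        return 0;

    mantissa |= 0x800000u;                     /* restore the implicit leading 1 */

    /* value = mantissa * 2^(exponent - 127 - 23); fixed = value * 2^frac_bits */
    int shift = exponent - 127 - 23 + frac_bits;

    uint32_t magnitude;
    if (shift >= 8) {
        magnitude = 0x7FFFFFFFu;               /* would not fit in 31 bits: saturate */
    } else if (shift >= 0) {
        magnitude = mantissa << shift;         /* 24-bit mantissa, shift <= 7 */
    } else {
        int right = -shift;
        magnitude = (right >= 24) ? 0u : (mantissa >> right);  /* truncates toward zero */
    }

    return negative ? -(int32_t)magnitude : (int32_t)magnitude;
}
```

With 8 fractional bits, `float_bits_to_fixed(1.31f, 8)` gives 335, the same result as multiplying by 256 and truncating.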

smileysam

Member level 4
I have to implement it in C.

echo47

Advanced Member level 5
C does many numeric conversions automatically. However, I'm not clear what you mean by "fixed point". Do you mean an integer with an implied scaling factor? If you can show us example numbers, someone can probably give you example C code.

Kral

Advanced Member level 4
smileysam,
Visit the following site:
steve.hollasch.net/cgindex/coding/ieeefloat.html
It gives an overview of the IEEE 754 floating point standard, which is the most commonly used format today.
Regards,
Kral

lambtron

Full Member level 5
smileysam said:
I need help converting floating point to fixed point
How about something like this ...

Code:
// Set this to the number of desired binary digits to the right of the radix.
#define NUM_FRACT_BIN_DIGITS  8

// The Fixed Point data type, which must be a fundamental type (e.g., int, short, long, etc.).
typedef int FIXEDPOINT;

// Convert float value to fixed point.
FIXEDPOINT FloatToFixed( float val )
{
    return (FIXEDPOINT)( val * ( 1 << NUM_FRACT_BIN_DIGITS ) );
}

// Convert fixed point value to float.
float FixedToFloat( FIXEDPOINT val )
{
    return (float)val / ( 1 << NUM_FRACT_BIN_DIGITS );
}

// Extract the whole part of a fixed point value.
int IntegerPart( FIXEDPOINT val )
{
    return val >> NUM_FRACT_BIN_DIGITS;
}

// Extract the fractional part of a fixed point value.
int FractionalPart( FIXEDPOINT val )
{
    return val & ( ( 1 << NUM_FRACT_BIN_DIGITS ) - 1 );
}

amitkumargupta_amitkumar

Newbie level 4
See page 805 of "Digital Signal Processing: A Practical Approach" by Emmanuel C. Ifeachor. This will help, I am sure.

just4me

Newbie level 4
No need to do such complicated stuff. Just typecast using (int):

float f;
int i;

f = 10.5;

i = (int)f;

This converts the float, which C stores in floating point notation, into an integer, i.e. a fixed point number with no fractional bits (the .5 is discarded).

I am not sure whether this is what you want, or whether you want a program that accepts a floating point number as input and displays its representation in fixed point.
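To make the difference concrete, here is a small sketch comparing a plain cast with scaling before the cast. The function names are hypothetical and the choice of 8 fractional bits is only an assumption for illustration.

```c
/* Plain cast: keeps only the integer part (truncates toward zero). */
int cast_only(float f)
{
    return (int)f;
}

/* Scale by 2^8 before casting, so 8 fractional bits survive.
   The 8 is an arbitrary choice for this example. */
int scale_then_cast(float f)
{
    return (int)(f * 256.0f);
}
```

For 10.5, `cast_only` returns 10, while `scale_then_cast` returns 2688, which is 10.5 * 256 and still carries the fractional half.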

smileysam

Member level 4
Hi, thanks for your help.
As far as I know, you first convert the floating point number to fixed point by scaling it up by the number of fraction bits (a left shift, i.e. multiplying by 2^n), and do the opposite at the end to convert the fixed point result back to floating point.
1) Now the problem is multiplication. Multiplying two 1.31 numbers gives a result in 2.62 format. We right shift by 31, but one sign bit is redundant, so we shift left by one. In C, how do I handle a 64-bit number?
2) When adding two 1.31 numbers, the answer should be in 2.30 format rather than 1.31. How do I handle that? For example, -1 + (-1) cannot be represented in 1.31 format.
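One possible way to handle both points in C99, as a sketch rather than a definitive answer: use `int64_t` from `<stdint.h>` for the intermediate result, and saturate when the result does not fit back into 1.31. The names `q31_t`, `q31_mul` and `q31_add_sat` are hypothetical. The code assumes that right-shifting a negative signed value is an arithmetic shift, which is implementation-defined in C but true on common compilers.

```c
#include <stdint.h>

typedef int32_t q31_t;   /* hypothetical name: 1.31 format, range [-1, 1) */

/* Multiply two 1.31 values. The 64-bit product is in 2.62 format;
   shifting right by 31 brings it back to 1.31. */
q31_t q31_mul(q31_t a, q31_t b)
{
    int64_t p = (int64_t)a * (int64_t)b;   /* 2.62 intermediate */
    p >>= 31;                              /* assumes arithmetic shift */
    if (p > INT32_MAX)                     /* only (-1) * (-1) gets here */
        p = INT32_MAX;
    return (q31_t)p;
}

/* Add two 1.31 values, clamping instead of wrapping on overflow. */
q31_t q31_add_sat(q31_t a, q31_t b)
{
    int64_t s = (int64_t)a + (int64_t)b;   /* cannot overflow in 64 bits */
    if (s > INT32_MAX) return INT32_MAX;
    if (s < INT32_MIN) return INT32_MIN;
    return (q31_t)s;
}
```

With this, 0.5 * 0.5 (`0x40000000` times itself) gives `0x20000000`, i.e. 0.25, and -1 + (-1) clamps to -1 instead of wrapping around. Saturation is only one design choice; keeping the sum in a wider type or using a format with more integer bits are alternatives.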

echo47

Advanced Member level 5
I don't think you need to worry about floating point conversion details, because C does it for you, and your compiler will probably optimize it very well. Simply multiply your float by a suitable scaling factor, and cast it to an int. Of course, your floating point data must have the same format that's used by your C compiler.
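As a rough sketch of "multiply by a scaling factor and cast", with two additions that are my assumptions rather than part of the advice above: rounding to nearest instead of truncating, and clamping values that do not fit in 32 bits. The name `float_to_fixed_rounded` is hypothetical, and `frac_bits` is assumed to be between 0 and 30.

```c
#include <stdint.h>

/* Hypothetical helper: scale by 2^frac_bits, round to nearest
   (halves away from zero), clamp to the int32_t range, then cast. */
int32_t float_to_fixed_rounded(float x, int frac_bits)
{
    double scaled = (double)x * (double)((int64_t)1 << frac_bits);

    scaled += (scaled >= 0.0) ? 0.5 : -0.5;    /* round before the cast truncates */

    if (scaled >= 2147483647.0)  return INT32_MAX;
    if (scaled <= -2147483648.0) return INT32_MIN;
    return (int32_t)scaled;                    /* cast truncates toward zero */
}
```

With 8 fractional bits, 1.999 becomes 512 here, whereas a plain truncating cast would give 511.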

lambtron

Full Member level 5
Here I am using code presented in an earlier post.

The sum of two fixed point values has a consistent radix alignment, so the sum needs no radix adjustment:

Code:
FIXEDPOINT a, sum;
float fsum;
a = FloatToFixed( 1.31 );     // a = 335
sum = a + a;                  // sum = 670
fsum = FixedToFloat( sum );   // fsum = 2.617 (approximately 2.62)

Multiplication
The product of two fixed point values includes the SQUARE of the fixed-float conversion factor, so the result must be divided by the conversion factor as shown in this function:

Code:
FIXEDPOINT FixedMultiply( FIXEDPOINT a, FIXEDPOINT b )
{
    return ( a * b ) >> NUM_FRACT_BIN_DIGITS;
}

FIXEDPOINT a, product;
float fproduct;
a = FloatToFixed( 1.31 );             // a = 335
product = FixedMultiply( a, a );      // product = 438
fproduct = FixedToFloat( product );   // fproduct = 1.71

Regarding the handling of 64 bit (or larger) values: Usually, the reason for using fixed point is to accelerate arithmetic performance for values that are restricted to a relatively small dynamic range. If your value range is so large that the math doesn't fit your cpu's native integer size then maybe you should reconsider your decision to use fixed point math.

If you absolutely must have a fixed point data size that is larger than a native integer, you may want to code your fixed point math routines in assembly language. Although this can be done in C/C++, the math functions will be an order of magnitude faster if done in assembly.
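A middle ground, offered as a sketch: keep the 32-bit fixed point type, but widen only the intermediate product so that `a * b` cannot overflow before the shift. This assumes the compiler provides `int64_t` (C99 `<stdint.h>`) and uses arithmetic right shift for negative values. `FixedMultiplyWide` is a hypothetical name, and `FIXEDPOINT` is redefined here as `int32_t` only to make the snippet self-contained.

```c
#include <stdint.h>

#define NUM_FRACT_BIN_DIGITS  8

typedef int32_t FIXEDPOINT;   /* assumption: 32-bit fixed point type */

/* Multiply via a 64-bit intermediate so the product of two 32-bit
   values cannot overflow before the radix is realigned. */
FIXEDPOINT FixedMultiplyWide( FIXEDPOINT a, FIXEDPOINT b )
{
    int64_t wide = (int64_t)a * (int64_t)b;
    return (FIXEDPOINT)( wide >> NUM_FRACT_BIN_DIGITS );
}
```

For example, 200.0 * 200.0 (51200 * 51200 in this format) overflows a 32-bit intermediate, but the widened version returns 10240000, which is 40000.0. The final result must still fit in 32 bits.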

echo47

Advanced Member level 5
smileysam, can you tell us what hardware you are using and why you are trying to do the conversion manually? Perhaps someone here can suggest a different, easier approach.

For some processors, floating point arithmetic is equally fast or faster than integer arithmetic, and conversions can cause slowdowns.

If you are using the common IEEE 754 floating point format, try searching for a copy of IEEE Std 754-1985. You may find it helpful.
