float code replacement?



The only float calculation I have is, sadly, in a frequently executed place (every 32nd step, 32*8 values).


The calculation should add or subtract velocity to (all) CCs, depending on:

  the currently received MIDI Note-On value (Velo_From_Note)

   the Assign-CC matrix: whether a CC is affected at all, and whether a specified amount is added or subtracted. The raw value is centered by subtracting 64 (beat[port].Velo_Morph[x]-64.0), so 64 ends up at +-0, 0 at -64 and 127 at +63.

   the Velocity Morph Offset controller: when turned off, it should reduce the effect of the parameters above to a minimum, while at MAX the offset should have full effect (beat[port].Velo_Morph_Offset)


working float code:

static float value = 0;
static float morph = 0;
// calculate CC Value  
morph = (((Velo_From_Note[port]/127.0) * (beat[port].Velo_Morph[x]-64.0))  / 64.0)   *   beat[port].Velo_Morph_Offset;

int morphint = morph;
value = value + morph;
if(value <=   0) { value = 0; }    // only in a range of 0-127
if(value >= 127) { value = 127; }  // only in a range of 0-127
int valueint = value;


I tried bit shifting with large (s32) integers, but in practice it did not work out. I was always really bad at maths, and to this day it's a pain in the ... for me.

For example, this does not work:

static u32 value = 0;
static s32 morph = 0;

morph =
                //   to get to full u8 range      // no need to shift we need half
               (( ((  (Velo_From_Note[port] << 1)   *  (beat[port].Velo_Morph[x]<<8) )>>8)        *  (beat[port].Velo_Morph_Offset<<9))>>24);
value = value + morph;
  if(value <=   0) { value = 0; }    // only in a range of 0-127
  if(value >= 127) { value = 127; }  // only in a range of 0-127

Maybe there is an integer expert out there who can help with a workaround?

Edited by Phatline


Hi  Phatline,

This may be a bit late, but what you are trying to do shouldn't be that much of a problem :)


I refactored your code a little to understand better what it should do:

int32_t value;

uint8_t note_vel = Velo_From_Note[port];
uint8_t morph_vel = beat[port].Velo_Morph[x];
uint8_t morph_offset = beat[port].Velo_Morph_Offset;
int32_t morph = ( ((note_vel / 127.0) * (morph_vel - 64.0)) / 64.0 ) * morph_offset;

If all input values are ints then the only operations we should worry about are divisions.
If we modify the formula so that there are no divisions anymore and we can verify that our data fits into a 32bit value we are fine.
Note that divisions by constants get optimized by more recent compilers, so unless you want to emphasize the use of a shift operation you can simply use normal * and / operators and avoid feeling like driving your brain through a meat grinder ;)

Because multiplications are associative and commutative we can write the formula like this:


int32_t morph = note_vel * (morph_vel - 64) * morph_offset / (127 * 64);

/* Now we multiply by 127 * 64 = 8128 so that no division is left and everything stays integer */

int32_t morph_8128 = note_vel * (morph_vel - 64) * morph_offset;

Let's check that this fits into our 32-bit value by adding up the bit widths of the factors.
This works because the product of an x-bit value and a y-bit value always fits into x+y bits.
A signed 32-bit int has only 31 bits for the magnitude, as the MSB is used for the sign.

31 >= 8 + 8 + 8
That holds, so if the variables all have 8 bits, the product fits into 31 (or even 24) bits.
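To double-check the bit count above with real numbers, here is a small sketch that computes the worst-case product. It assumes all three inputs are 7-bit MIDI values (0..127), which the thread suggests but does not state outright; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Largest magnitude of note_vel * (morph_vel - 64) * morph_offset,
   assuming every input is a 7-bit MIDI value (0..127). */
static int32_t worst_case_product(void)
{
    const int32_t max_note_vel     = 127;
    const int32_t max_signed_morph = 64;   /* |0 - 64|, the largest offset from center */
    const int32_t max_offset       = 127;
    return max_note_vel * max_signed_morph * max_offset;
}

/* Number of bits needed to hold a non-negative value (without sign bit). */
static int bits_needed(int32_t v)
{
    assert(v >= 0);
    int bits = 0;
    while (v > 0) {
        bits++;
        v >>= 1;
    }
    return bits;
}
```

Under that assumption the worst case is 127 * 64 * 127 = 1032256, which needs 20 bits plus the sign bit, so there is plenty of headroom in an int32_t.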

When you add it to value, we need one bit in reserve: if both operands are ((1<<24) - 1),
the sum needs 25 bits.
value = morph_8128 + value

Your code that checks the value needs to be adjusted, too:

if (value < 0)
    value = 0;
else if (value > 127 * 8128)
    value = 127 * 8128;

return value / 8128;
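Putting the pieces together, the whole thing could look roughly like this. It is only a sketch: morph_step and MORPH_SCALE are hypothetical names, the accumulator is assumed to be kept in scaled units (real value times 8128) between calls, and the inputs are assumed to be 0..127.

```c
#include <assert.h>
#include <stdint.h>

#define MORPH_SCALE (127 * 64)  /* 8128: the divisor removed from the formula */

/* Hypothetical helper: adds one morph step to an accumulator kept in
   scaled units (real value * 8128) and returns the CC value 0..127. */
static uint8_t morph_step(int32_t *value_scaled,
                          uint8_t note_vel,
                          uint8_t morph_vel,
                          uint8_t morph_offset)
{
    /* assumption: all inputs are 7-bit MIDI values */
    assert(note_vel <= 127 && morph_vel <= 127 && morph_offset <= 127);

    /* same product as morph_8128 above, no division needed */
    int32_t morph_scaled = (int32_t)note_vel
                         * ((int32_t)morph_vel - 64)
                         * (int32_t)morph_offset;

    *value_scaled += morph_scaled;

    /* clamp in scaled units so the result stays within 0..127 */
    if (*value_scaled < 0)
        *value_scaled = 0;
    else if (*value_scaled > 127 * MORPH_SCALE)
        *value_scaled = 127 * MORPH_SCALE;

    /* the only division, by a constant, happens when the CC value is needed */
    return (uint8_t)(*value_scaled / MORPH_SCALE);
}
```

Called once per CC with a per-CC accumulator, e.g. morph_step(&acc, Velo_From_Note[port], beat[port].Velo_Morph[x], beat[port].Velo_Morph_Offset), this should behave like the float version, with the difference that the fractional part is kept in the accumulator instead of being carried as a float.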


You can speed up the final division drastically if you stick to powers of 2 (it then becomes a shift), but you might get away with this here ;)


Cheers, Roman





Thx... I guess. I read that and understood maybe half. I have to implement it in my code; the project was sleeping anyway, time to wake it up.



Just as romsom said, an easy way to avoid floats is to calculate with 32-bit integers and treat a few bits of these integers as the fractional part (fixed-point arithmetic).

I.e. if you have 32-bit integers available on your platform (MIOS32 has this) and you want 8 bits of fractional precision, you have 24 bits left for the integer part.
You can then just multiply the input "floats" by 256 (2^8) and calculate everything with the "upscaled" integer values.
Depending on the operation, you need to divide again; that can be done quickly by bit shifting (e.g. >> 8).

If you need the output, just divide by 256 again, or bitshift down.

In short: just calculate with upscaled numbers, and downscale again when results are needed or when an operation (e.g. multiplying two upscaled numbers) requires it.
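A minimal sketch of the upscale/downscale idea with 8 fractional bits might look like this. The type and function names (fix24_8, fix_mul and so on) are invented for the example and are not part of MIOS32. It divides by 256 instead of shifting, because right-shifting negative values is implementation-defined in C; a compiler will usually turn the constant division into cheap code anyway.

```c
#include <assert.h>
#include <stdint.h>

#define FRAC_BITS 8                 /* 8 fractional bits => scale factor 256 */
#define FIX_ONE   (1 << FRAC_BITS)  /* the value 1.0 in upscaled form        */

typedef int32_t fix24_8;            /* hypothetical name: 24.8 fixed point   */

/* upscale an integer */
static fix24_8 fix_from_int(int32_t i) { return i * FIX_ONE; }

/* downscale again; truncates toward zero */
static int32_t fix_to_int(fix24_8 f)   { return f / FIX_ONE; }

/* num/den as an upscaled number, e.g. 1/2 -> 128 */
static fix24_8 fix_ratio(int32_t num, int32_t den)
{
    assert(den != 0);
    return (num * FIX_ONE) / den;
}

/* The product of two upscaled numbers carries the scale twice,
   so it has to be divided by the scale once. The 64-bit intermediate
   is a choice made here to avoid overflow for large operands. */
static fix24_8 fix_mul(fix24_8 a, fix24_8 b)
{
    return (fix24_8)(((int64_t)a * b) / FIX_ONE);
}
```

For example, fix_mul(fix_from_int(3), fix_ratio(1, 2)) gives 384, i.e. 1.5 in upscaled form, and fix_to_int() of that gives 1.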

Many greets and good luck!

