Faster fade for 32bit modes
#1
So that the poor QBN-ers are not left out of the good stuff :D (I posted this on TBN yesterday):



Here is a good fade algorithm that is about twice as fast as the usual ones:

"tr" specifies the fade level and ranges from 0 to 256 (not 0 to 255!), so the division at the end becomes a plain shift by 8 bits.

Code:
case 2:  /*Two multiply fade*/
for (j=0;j<480000;j+=800){
  t=j+800;
  for (i=j;i<t;i++){
   t2=(img[i] & 0xFF00FF)*tr;
   wData[i]=MakeRGB(
     (t2 & 0xFF000000)>>24,
     ((img[i] & 0xFF00)*tr)>>16,
     (t2 & 0xFF00)>>8);             /*wData: 800x600 window data*/
  }
}
break;

As you can see, I tested it with an 800x600 image. It ran at around 10 FPS on my 233 MHz Pentium, which is acceptable. I hope the code is understandable :) img is an array of unsigned 32-bit integers holding RGB values, with the highest byte of each left unused (the usual way of storing 24-bit images). wData is the bitmap of the window the image is drawn to.
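For anyone who wants to try the idea outside of my program, here is a minimal self-contained sketch of the two multiply fade. The names make_rgb, fade_pixel and fade_image are made up for this example, and the 0x00RRGGBB packing in make_rgb is an assumption, since MakeRGB itself is not shown above.

```c
#include <stdint.h>
#include <stddef.h>

/* Assumption: pixels are packed as 0x00RRGGBB. The real MakeRGB is not
   shown in the post, so this is only a stand-in. */
static uint32_t make_rgb(uint32_t r, uint32_t g, uint32_t b)
{
    return (r << 16) | (g << 8) | b;
}

/* Fade one pixel. tr runs from 0 (black) to 256 (unchanged). */
static uint32_t fade_pixel(uint32_t px, uint32_t tr)
{
    /* One multiply handles red and blue together:
       red*tr lands in bits 16-31, blue*tr in bits 0-15. */
    uint32_t rb = (px & 0xFF00FFu) * tr;
    /* Second multiply handles green: green*tr lands in bits 8-23. */
    uint32_t g = (px & 0x00FF00u) * tr;
    return make_rgb((rb & 0xFF000000u) >> 24, g >> 16, (rb & 0xFF00u) >> 8);
}

/* Fade n pixels from img into out. */
static void fade_image(const uint32_t *img, uint32_t *out, size_t n, uint32_t tr)
{
    for (size_t i = 0; i < n; i++)
        out[i] = fade_pixel(img[i], tr);
}
```

Because tr never exceeds 256, red*tr and blue*tr each fit into 16 bits, so the two products cannot run into each other inside the 32-bit value.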

If I have reinvented something that has existed for ages, sorry, but I had never run into this before :)





Here is yet another fade algorithm, a bit lossy but twice as fast :)

Code:
case 4:  /*Dither fade*/
dith++;  /*dith must be even in one run, odd in the next.
         **The image has an even number of lines, that's why this is needed*/
for (j=0;j<480000;j+=800){
  dith++;
  dith=dith % 2;
  t=j+800;
  for (i=j+dith;i<t;i+=2){
    t2=(img[i] & 0xFF00FF)*tr;
    wData[i]=MakeRGB(
      (t2 & 0xFF000000)>>24,
      ((img[i] & 0xFF00)*tr)>>16,
      (t2 & 0xFF00)>>8);             /*wData: 800x600 window data*/
  }
}
break;

It draws only every second pixel in each update, so each run needs half the time. You don't notice that it skips half of the image in each run, since the next run fills exactly the gaps that were left. The image does not look dithered either: there is not much difference between two consecutive frames, and the ongoing change hides it. It ran at around 14 FPS for me (not 20 FPS, because updating the display takes up a larger and larger share of the time).
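To check the claim that the second run fills exactly the gaps left by the first, here is a small sketch that counts how often each pixel is touched. It uses a tiny 8x6 grid instead of 800x600 (an assumption for the example, the height only has to be even), and the names dither_run and two_runs_cover_all are made up here.

```c
#include <stdint.h>
#include <string.h>

enum { W = 8, H = 6 };   /* small stand-in for 800x600; H must be even */

/* One dither-fade run: touch every second pixel, alternating the start
   column on each row. Returns the parity state to pass to the next run. */
static int dither_run(uint8_t *hits, int dith)
{
    dith++;                        /* flip the starting parity for this run */
    for (int j = 0; j < W * H; j += W) {
        dith = (dith + 1) % 2;     /* alternate per row: checkerboard */
        for (int i = j + dith; i < j + W; i += 2)
            hits[i]++;             /* the real code fades pixel i here */
    }
    return dith;
}

/* Returns 1 if two consecutive runs touch every pixel exactly once. */
static int two_runs_cover_all(void)
{
    uint8_t hits[W * H];
    int dith = 0;

    memset(hits, 0, sizeof hits);
    dith = dither_run(hits, dith);
    dith = dither_run(hits, dith);
    for (int i = 0; i < W * H; i++)
        if (hits[i] != 1)
            return 0;
    return 1;
}
```

With an even number of rows the per-row flipping ends each run on the same parity it started with, which is why the extra dith++ at the top is needed to swap the pattern between runs.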

Of course this can be made faster by skipping even more pixels in each run, but the effect gets worse and worse. The best way would be to measure the PC's speed before doing the fades and pick the most suitable algorithm based on that.

Setting the step to 3 made the skipping slightly noticeable, and 4 resulted in a "strange" effect. It did not look bad, but it was not the fade I wanted :)

Code:
case 4:  /*Dither fade*/
diths++;
diths=diths % 4;
switch (diths){            /*Set up start positions*/
  case 0: dith=0; break;
  case 1: dith=3; break;
  case 2: dith=1; break;
  case 3: dith=2; break;
}
for (j=0;j<480000;j+=800){
  dith+=2;
  dith=dith % 4;
  t=j+800;
  for (i=j+dith;i<t;i+=4){
    t2=(img[i] & 0xFF00FF)*tr;
    wData[i]=MakeRGB(
      (t2 & 0xFF000000)>>24,
      ((img[i] & 0xFF00)*tr)>>16,
      (t2 & 0xFF00)>>8);             /*wData: 800x600 window data*/
  }
}
break;

This is a nice 1/4 fade algorithm that runs at the same speed as using a step of 4 in the code above, but still looks continuous (although the dithering is slightly visible). And bingo! This gave a smooth 17 FPS fade on my PC (P233, with that 800x600 image). When I set it to fade out over 6 seconds it looked completely continuous... Something I had never seen on 32-bit OSes so far: a Sumatran tiger fading in and out smoothly in the background, covering almost the whole screen :cool: Impressive...
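The start positions 0, 3, 1, 2 together with the +2 shift per row are meant to make four consecutive runs cover the whole image. A small sketch to verify that, again on a tiny 8x6 grid (an assumption for the example) and with made-up names quarter_run and four_runs_cover_all:

```c
#include <stdint.h>
#include <string.h>

enum { W = 8, H = 6 };   /* small stand-in for 800x600 */

/* Start column for each of the four phases, as in the switch above. */
static const int start_offset[4] = {0, 3, 1, 2};

/* One 1/4 dither run: touch every fourth pixel, shifting the start
   column by 2 on each row. Returns the phase to pass to the next run. */
static int quarter_run(uint8_t *hits, int diths)
{
    diths = (diths + 1) % 4;
    int dith = start_offset[diths];
    for (int j = 0; j < W * H; j += W) {
        dith = (dith + 2) % 4;
        for (int i = j + dith; i < j + W; i += 4)
            hits[i]++;             /* the real code fades pixel i here */
    }
    return diths;
}

/* Returns 1 if four consecutive runs touch every pixel exactly once. */
static int four_runs_cover_all(void)
{
    uint8_t hits[W * H];
    int diths = 0;

    memset(hits, 0, sizeof hits);
    for (int run = 0; run < 4; run++)
        diths = quarter_run(hits, diths);
    for (int i = 0; i < W * H; i++)
        if (hits[i] != 1)
            return 0;
    return 1;
}
```

On any given row the four phases start at four different columns modulo 4, so after four runs each pixel has been updated exactly once.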

It is certainly annoying that Windows is so slow at updating the window... The maximum FPS with an empty update seems to be around 21, or even less. I think I will look for a better method, as this has turned out to be the weakest point.

Assuming 21 FPS for the empty update, I tried to estimate the real FPS of my algorithms (measured / estimated):
Original fade: 4FPS / 5FPS
Two multiply fade: 10FPS / 19FPS
Dither fade 2: 14FPS / 42FPS
Dither fade 4: 17FPS / 85FPS

As I said, these values are not exact; the real performance might differ from the calculated one by as much as 25%, but they seem to fit my predictions. The "original fade" is an algorithm with three multiplications and three divisions, supporting fade levels from 0 to 255 as usual.
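The exact calculation behind the second number in each row is not given above. One plausible way to get such an estimate, shown here only as an assumption, is to treat each frame's time as fade time plus window update time and subtract the update part. The function name estimate_fade_fps is made up for this sketch.

```c
/* Estimate the FPS of the fade alone.
   Assumption: time per frame = fade time + window update time, so
   1/measured = 1/fade + 1/update. Only meaningful while measured_fps
   is below update_fps. */
static double estimate_fade_fps(double measured_fps, double update_fps)
{
    return 1.0 / (1.0 / measured_fps - 1.0 / update_fps);
}
```

With a 21 FPS update limit this gives roughly 4.9, 19.1, 42 and 89 for the measured 4, 10, 14 and 17 FPS, which is close to the listed figures. It also shows why the estimate gets so unreliable near the limit: a small error in the measured value changes the result a lot.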
After 60 million years a civilization will search for a meteorite destroying most of the living creatures around this age...

There must be a better future for the Cheetahs!

http://rcs.fateback.com/
#2
WOW. That is awesome! Now we just need to port it to QB for QB-ers to use also!
974277320612072617420666C61696C21 (Hexadecimal for those who don't know)
#3
What is so hard about porting those ~3*15 lines of code? :lol: It is just an idea; I only tested it in C because that is the language I understand best. Why for QB? Who uses SVGA in QB :D Rather for FreeBasic.
#4
I found a way to measure the FPS of these routines properly by taking the "slow Windows" factor out. So here are the results with the same 800x600 Sumatran tiger:

Usual 256 level 3 multiply / 3 divide fade: 4.5FPS
257 level 3 multiply fade: 9FPS
257 level 2 multiply fade: 10.5FPS
257 level multiplication table fade: 4FPS
257 level ditherfade 4: 34FPS

So overall I badly overestimated the speed of those algorithms, although they are still not slow (at least the ditherfade). The strangest thing is probably that under these conditions the multiplication table fade performed awfully.

Code:
t2=((t3=img[i]) & 0xFF00FF)*tr;
wData[i]=MakeRGB(
   (t2 & 0xFF000000)>>24,
   ((t3 & 0xFF00)*tr)>>16,
   (t2 & 0xFF00)>>8);             /*wData: 800x600 window data*/

This little change (reading img[i] only once, into t3) seemed to improve the speed very slightly, but I am not sure...
257 level 2 multiply fade: 11.5FPS
257 level ditherfade 4: 34FPS

Unrolling the loop (4 pixels per iteration) with the original code gave this:
257 level 2 multiply fade: 13FPS

The same with the new code:
257 level 2 multiply fade: 14.5FPS


I think something is still left in there slowing things down, since ditherfade 4 did not run 4 times faster than the two multiply fade. Probably the message processing... Although I certainly overestimated :P the optimizations gave some of the "lost" speed back. 58 FPS for ditherfade 4 is still nice, although it is certainly far from 85...
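For reference, a self-contained sketch of what the unrolled version with the single read per pixel might look like. This is not the exact code I measured: the names fade_px and fade_unrolled4 are made up, and the 0x00RRGGBB packing stands in for MakeRGB, which is an assumption.

```c
#include <stdint.h>
#include <stddef.h>

/* Fade one pixel, reading the source value only once (the t3 trick).
   Assumption: pixels are packed as 0x00RRGGBB. */
static inline uint32_t fade_px(uint32_t px, uint32_t tr)
{
    uint32_t rb = (px & 0xFF00FFu) * tr;           /* red and blue together */
    uint32_t r = (rb & 0xFF000000u) >> 24;
    uint32_t g = ((px & 0xFF00u) * tr) >> 16;
    uint32_t b = (rb & 0xFF00u) >> 8;
    return (r << 16) | (g << 8) | b;
}

/* Fade n pixels, four per loop iteration. tr runs from 0 to 256. */
static void fade_unrolled4(const uint32_t *img, uint32_t *out,
                           size_t n, uint32_t tr)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        out[i]     = fade_px(img[i],     tr);
        out[i + 1] = fade_px(img[i + 1], tr);
        out[i + 2] = fade_px(img[i + 2], tr);
        out[i + 3] = fade_px(img[i + 3], tr);
    }
    for (; i < n; i++)      /* leftovers when n is not a multiple of 4 */
        out[i] = fade_px(img[i], tr);
}
```

How much this gains depends on the compiler; an optimizing compiler may already unroll the simple loop on its own, so it is worth measuring both.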