[vox-tech] float vs double, int vs short

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Thu Feb 16 17:47:24 PST 2006


Boris Jeremic wrote:
> I think that on x86, both float and double get expanded to long double, 
> then all the computations locally (in the CPU) get done in long double, 
> and the results are then converted to whatever they were intended to be 
> to begin with. See the attached example code for finding machine epsilon 
> (plain C); just do
> 
> 
> gcc LUGODtest.cpp
> 
> and run ./a.out
> 
> 
> On other architectures (tried on SUN) there is actually a difference 
> (as it should be).
> 
> So to conclude, I do not think that you gain much (if anything) on the CPU 
> with floats versus doubles. There might be a gain in moving data to 
> and from memory and the like, but computationally, probably the best thing 
> is to run a million loops on an example (maybe even the one for machine 
> epsilon attached below) and see where it goes...

This result is somewhat language/compiler dependent.  For historical
reasons, it can be tricky to use single-precision floating-point data
in C/C++ without unintentional "widening" that slows down the
single-precision processing.

The f77 program below runs about 5% faster when the data type is
"REAL" instead of "DOUBLE PRECISION".  I believe this is because
Fortran doesn't widen and narrow at the drop of a hat the way
C++ often does. (I haven't written Fortran in quite a while...
sorry for the crude code.)

It is possible to switch the 80x87 FPU mode to different precision
settings regardless of the fact that it ALWAYS uses 80-bit numbers.
Thus, it will stop a division before working out all the bits if
it is set to do so and thereby shorten the computation time for
individual floating point operations. It just so happens that
it is normally set to a particular level of precision [1] so
you shouldn't see much difference between float and double
on most x86 platforms if the handling of the data outside of the
FPU is similar. (If [2] is to be believed, Linux keeps the FPU
at a long-double level of accuracy, while Windows keeps it at
double, which could give Windows a performance edge.)

I don't mean to suggest that fast computation is not possible
in C++... to the contrary, it should be able to go quite fast.
The difficulty may lie in avoiding legacy Standard C numerical
routines, learning your compiler's options, and getting
familiar with best numerical-analysis practices in C++.  I don't
particularly claim such expertise, though I have read quite a bit
about computation for embedded applications, where floating
point may even be an unavailable luxury.

[1] http://www.stereopsis.com/FPU.html
[2] http://www.intel.com/support/performancetools/fortran/sb/cs-007783.htm

=========
        PROGRAM TESTFLT
        IMPLICIT NONE
        DOUBLE PRECISION NUM1, NUM2, ANS
        INTEGER CNT
        NUM1 = 1.0
        NUM2 = 1.0/3.0
        DO 100, CNT=1,1000000000
          ANS=NUM1*NUM2
100    CONTINUE
        WRITE(*,*) ANS
        STOP
        END
=========

-- 
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
---------------------------------------------------------------------------
