[vox-tech] another gcc question

Charles Polisher vox-tech@lists.lugod.org
Wed, 27 Feb 2002 14:08:34 -0800


On Wed, Feb 27, 2002 at 10:03:22AM -0800, ME wrote:
<snip>
> $ gcc -funroll-all-loops -S sample.c
> 
> When I inspect the above, I see loops included.
> -12(%ebp) (3 32-bit offset from %ebp) is set to 5 and -4(%ebp) is incl
> until it is cmpl to be no longer less than -12(%ebp).
> 
> Labels even show loops when you watch it. I count about 3 when I quickly
> scan it.
> 
> This would lead me to believe the generated asm code is not unrolled if I
> understand the expectation of the unrolling process. (I would guess
<snip>

From the gcc manual: "-funroll-all-loops ... usually
makes the program run slower."

gcc doesn't unroll loops unless optimization is turned
on: without -O, the -funroll-* flags have no effect on
their own. Also, unrolling a loop can make it too big to
fit in the instruction cache, which makes the code larger
and can make it much slower once the processor has to
deal with cache misses. The break-even point is hard to
predict, but intuitively a loop
that does tons of processing and loops maybe 3 times
is not a good candidate, but a loop that executes
thousands of times and does diddly would see a speedup.
ISTR that gcc unrolls loops intelligently where it can,
so a loop that executes n times won't end up with
n copies of the code, generally speaking. 
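
To make "not n copies" concrete, here is a rough sketch of
what partial unrolling looks like when done by hand in C.
The function names and the unroll factor of 4 are made up
for illustration; gcc picks its own factor and its output
will not match this exactly.

```c
#include <stddef.h>

/* Plain loop: one add, one compare and one branch per element. */
long sum_rolled(const int *a, size_t n)
{
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}

/*
 * Hand-unrolled by an assumed factor of 4 (hypothetical choice).
 * The body is copied 4 times, not n times, so the loop test and
 * branch run once per four elements.
 */
long sum_unrolled4(const int *a, size_t n)
{
    long total = 0;
    size_t i = 0;

    /* Main body: four elements per trip through the loop. */
    for (; i + 4 <= n; i += 4) {
        total += a[i];
        total += a[i + 1];
        total += a[i + 2];
        total += a[i + 3];
    }

    /* Remainder loop: the 0 to 3 elements left when n % 4 != 0. */
    for (; i < n; i++)
        total += a[i];
    return total;
}
```

Both functions return the same sum for any n. The unrolled
one trades code size for fewer branches, which is the
cache-versus-speed tradeoff described above.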

You'll also want to look at -fmove-all-movables, 
-frerun-cse-after-loop, -frerun-loop-opt, and
-fexpensive-optimizations. They're all documented
in the gcc manual. 

-- 
A watched process never cores.