The following sections describe the general optimizations done by the compiler; they are not processor specific. Some of these require a compiler switch, while others are done automatically (those which require a switch are noted as such).
In Free Pascal, if the operand(s) of an operator are constants, they will be evaluated at compile time.
Example:
    x:=1+2+3+6+5;
will generate the same code as
    x:=17;
Furthermore, if an array index is a constant, the offset will be evaluated at compile time. This means that accessing MyData[5] is as efficient as accessing a normal variable.
Finally, calling the Chr, Hi, Lo, Ord, Pred, or Succ functions with constant parameters generates no run-time library calls; instead, the values are evaluated at compile time.
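The three kinds of compile-time evaluation described above can be seen together in a small sketch. The names Table, x and c are invented for illustration; the comments describe what the text says the compiler does, not inspected compiler output.

```pascal
program ConstFold;

{ Illustrative sketch only: Table, x and c are hypothetical names. }

var
  Table: array[0..9] of Longint;
  x: Longint;
  c: Char;

begin
  x := 1 + 2 + 3 + 6 + 5;   { constant operands: same as x := 17 }
  Table[5] := x;            { constant index: offset known at compile time }
  c := Chr(Ord('A') + 1);   { Chr and Ord on constants: same as c := 'B' }
  x := Succ(10);            { Succ on a constant: same as x := 11 }
  WriteLn(Table[5], ' ', c, ' ', x);
end.
```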
Using the same constant string, floating point value or constant set two or more times generates only one copy of that constant.
Evaluation of a boolean expression stops as soon as the result is known, which makes code execute faster than if all boolean operands were evaluated.
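A minimal sketch of what this means in practice, assuming the short-circuit state is in effect (it is requested explicitly with the $B- directive here). The function Expensive and the variable p are hypothetical names used only for this illustration.

```pascal
program ShortCircuit;

{$B-}  (* request short-circuit boolean evaluation explicitly *)

(* Hypothetical helper: reports when it is actually called. *)
function Expensive: Boolean;
begin
  WriteLn('Expensive was called');
  Expensive := True;
end;

var
  p: ^Longint;

begin
  p := nil;

  (* The first operand is False, so the result of "and" is already
     known and p^ is never dereferenced. *)
  if (p <> nil) and (p^ > 0) then
    WriteLn('positive');

  (* The first operand is True, so the result of "or" is already
     known and Expensive is not called. *)
  if (p = nil) or Expensive then
    WriteLn('done');
end.
```

Besides being faster, this allows the common pattern of guarding a pointer dereference with a nil check in the same expression.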
Using the in operator is always more efficient than using the equivalent <>, =, <=, >=, < and > operators. This is because range comparisons can be done more easily with in than with normal comparison operators.
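As an illustration, the two hypothetical functions below test the same range, once with comparison operators and once with in. The program only checks that both forms give the same answer; it does not measure speed.

```pascal
program InOperator;

(* Range test written with ordinary comparison operators. *)
function IsDigitCmp(c: Char): Boolean;
begin
  IsDigitCmp := (c >= '0') and (c <= '9');
end;

(* The same test written with the in operator. *)
function IsDigitIn(c: Char): Boolean;
begin
  IsDigitIn := c in ['0'..'9'];
end;

var
  c: Char;

begin
  (* Compare both forms over every possible Char value. *)
  for c := #0 to #255 do
    if IsDigitCmp(c) <> IsDigitIn(c) then
    begin
      WriteLn('mismatch at ', Ord(c));
      Halt(1);
    end;
  WriteLn('both forms agree');
end.
```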
Sets which contain fewer than 33 elements can be directly encoded using a 32-bit value; therefore no run-time library calls are required to evaluate operands on these sets, as they are directly encoded by the code generator.
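A small sketch of a set that falls into this category. The type names TFlag and TFlags are hypothetical; the point is only that a set over four elements fits comfortably in a 32-bit value, so operations such as intersection, union and membership can be handled inline.

```pascal
program SmallSet;

type
  (* 4 elements, well under the 33-element limit mentioned above *)
  TFlag = (fRead, fWrite, fExec, fHidden);
  TFlags = set of TFlag;

var
  a, b: TFlags;

begin
  a := [fRead, fWrite];
  b := [fWrite, fExec];

  (* Intersection of the two sets *)
  if (a * b) = [fWrite] then
    WriteLn('intersection ok');

  (* Membership test on the union of the two sets *)
  if fRead in (a + b) then
    WriteLn('union ok');
end.
```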
Assignments of constants to variables are range checked at compile time, which removes the need to generate run-time range checking code.
When the second operand of a mod on an unsigned value is a constant power of 2, an and instruction is used instead of an integer division. This generates more efficient code.
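The equivalence behind this optimization can be checked with a short sketch. The divisor 8 and the mask 7 are arbitrary illustrative choices; for an unsigned value, taking the remainder modulo 2^n keeps exactly the low n bits.

```pascal
program ModPow2;

var
  i: Cardinal;

begin
  (* For unsigned i, i mod 8 keeps the low three bits, which is
     exactly what i and 7 computes. *)
  for i := 0 to 1000 do
    if (i mod 8) <> (i and 7) then
    begin
      WriteLn('mismatch at ', i);
      Halt(1);
    end;
  WriteLn('i mod 8 equals i and 7 for all tested values');
end.
```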
When one of the operands in a multiplication is a power of two, the multiplication is encoded using arithmetic shift instructions, which generates more efficient code.
Similarly, if the divisor in a div operation is a power of two, it is encoded using arithmetic shift instructions.
The same is true when accessing array indexes which are powers of two: the address is calculated using arithmetic shifts instead of the multiply instruction.
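The arithmetic identity these optimizations rely on can be sketched as follows. The sketch deliberately restricts itself to unsigned values, where a plain shl or shr gives the same result as the multiplication or division; the factors 16 and 4 are arbitrary examples.

```pascal
program ShiftArith;

var
  i: Cardinal;

begin
  for i := 0 to 1000 do
  begin
    (* Multiplying by 16 = 2^4 matches shifting left by 4 bits. *)
    if (i * 16) <> (i shl 4) then
      Halt(1);
    (* Dividing by 4 = 2^2 matches shifting right by 2 bits. *)
    if (i div 4) <> (i shr 2) then
      Halt(1);
  end;
  WriteLn('shift and multiply/divide results agree');
end.
```

There is no need to write the shifts by hand: the source can keep the more readable * and div forms and leave the substitution to the compiler.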
By default, all variables larger than a byte are guaranteed to be aligned at least on a word boundary.
Alignment on the stack and in the data section is processor dependent.
Smart linking removes all unreferenced code from the final executable file, making the executable file much smaller.
Smart linking is switched on with the -CX command-line switch, or using the {$SMARTLINK ON} global directive.
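A minimal sketch of a unit that enables the directive. The unit name DemoUtils and both functions are hypothetical. The idea is that a program which only calls Used would not need to carry the code of NeverCalled in its executable; whether that happens in a given build also depends on how the program itself is compiled and linked.

```pascal
unit DemoUtils;   (* hypothetical unit name *)

{$SMARTLINK ON}

interface

function Used(x: Longint): Longint;
function NeverCalled(x: Longint): Longint;

implementation

(* Referenced by the hypothetical main program. *)
function Used(x: Longint): Longint;
begin
  Used := x + 1;
end;

(* Not referenced anywhere: a candidate for removal by smart linking. *)
function NeverCalled(x: Longint): Longint;
begin
  NeverCalled := x * 2;
end;

end.
```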
The following runtime library routines are coded directly into the final executable: Lo, Hi, High, Sizeof, TypeOf, Length, Pred, Succ, Inc, Dec and Assigned.
When using the -O1 (or higher) switch, case statements will be generated using a jump table if appropriate, to make them execute faster.
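A case statement over a dense range of small ordinal labels, as in the sketch below, is the kind of code where a jump table may be appropriate. The function DayName is a hypothetical example, and whether a table is actually generated is the compiler's decision.

```pascal
program CaseTable;

(* Hypothetical example: seven consecutive labels, 1 through 7. *)
function DayName(d: Longint): string;
begin
  case d of
    1: DayName := 'Mon';
    2: DayName := 'Tue';
    3: DayName := 'Wed';
    4: DayName := 'Thu';
    5: DayName := 'Fri';
    6: DayName := 'Sat';
    7: DayName := 'Sun';
  else
    DayName := '?';
  end;
end;

begin
  WriteLn(DayName(3));
end.
```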
Under specific conditions, the stack frame (entry and exit code for the routine, see section CallingConventions) will be omitted, and variables will be accessed directly via the stack pointer.
Conditions for omission of the stack frame:
When using the -Or switch, local variables or parameters which are used very often will be moved to registers for faster access.
This section lists the low-level optimizations performed, on a per-processor basis.
Here follows a listing of the optimizing techniques used in the compiler:
Although you can enable uncertain optimizations in most cases, for people who do not understand the following technical explanation, it might be safest to leave them off.
Remark: If uncertain optimizations are enabled, the CSE algorithm assumes that
The practical upshot of this is that you cannot use the uncertain optimizations if you both write and read local or global variables directly and through pointers (this includes Var parameters, as those are pointers too).
The following example will produce bad code when you switch on uncertain optimizations:
    Var temp: Longint;

    Procedure Foo(Var Bar: Longint);
    Begin
      If (Bar = temp) Then
      Begin
        Inc(Bar);
        If (Bar <> temp) then Writeln('bug!')
      End
    End;

    Begin
      Foo(Temp);
    End.

The reason it produces bad code is that you access the global variable Temp both through its name Temp and through a pointer, in this case using the Bar variable parameter, which is nothing but a pointer to Temp in the above code.
On the other hand, you can use the uncertain optimizations if you access global/local variables or parameters through pointers, and only access them through this pointer.
For example:
    Type
      TMyRec = Record
        a, b: Longint;
      End;
      PMyRec = ^TMyRec;
      TMyRecArray = Array [1..100000] of TMyRec;
      PMyRecArray = ^TMyRecArray;

    Var
      MyRecArrayPtr: PMyRecArray;
      MyRecPtr: PMyRec;
      Counter: Longint;

    Begin
      New(MyRecArrayPtr);
      For Counter := 1 to 100000 Do
      Begin
        MyRecPtr := @MyRecArrayPtr^[Counter];
        MyRecPtr^.a := Counter;
        MyRecPtr^.b := Counter div 2;
      End;
    End.

will produce correct code, because the global variable MyRecArrayPtr is not accessed directly, but only through a pointer (MyRecPtr in this case).
In conclusion, one could say that you can use uncertain optimizations only when you know what you're doing.
Using the -O2 switch does several optimizations in the code produced, the most notable being:
subl $4,%esp") instead of slower, smaller instructions ("enter $4"). This is the default setting.
    movzbl (mem), %eax
to a combination of simpler instructions
    xorl %eax, %eax
    movb (mem), %al
for the Pentium.