3 Using Assembly language
Free Pascal supports inserting assembler statements in your code. The
mechanism for this is the same as under Turbo Pascal. There are, however,
some substantial differences, which are explained in the following
sections.
1 Intel syntax
Free Pascal supports Intel syntax for the Intel family of Ix86 processors
in its asm blocks.
The Intel syntax in your asm block is converted to AT&T syntax by the
compiler, after which it is inserted in the compiled source.
The supported assembler constructs are a subset of the normal assembly
syntax. In what follows we specify what constructs are not supported in
Free Pascal, but which exist in Turbo Pascal:
- The TBYTE qualifier is not supported.
- The & identifier override is not supported.
- The HIGH operator is not supported.
- The LOW operator is not supported.
- The OFFSET and SEG operators are not supported.
Use LEA and the various Lxx instructions instead.
- Expressions with constant strings are not allowed.
- Access to record fields via parentheses is not allowed.
- Typecasts with normal Pascal types are not allowed; only
recognized assembler typecasts are allowed. Example:
mov al, byte ptr MyWord -- allowed,
mov al, byte(MyWord) -- allowed,
mov al, shortint(MyWord) -- not allowed.
- Pascal type typecasts on constants are not allowed.
Example:
const s= 10; const t = 32767;
in Turbo Pascal:
mov al, byte(s) -- useless typecast.
mov al, byte(t) -- syntax error!
In this parser, either of those cases produces a syntax error.
- Reference expressions that consist only of constants are not
allowed (in any case, they do not work in protected mode
under Linux i386). Examples:
mov al,byte ptr ['c'] -- not allowed.
mov al,byte ptr [100h] -- not allowed.
(This is due to the limitation of Turbo Assembler).
- Brackets within brackets are not allowed
- Expressions with segment overrides fully in brackets are
presently not supported, but they can easily be implemented
in BuildReference if requested. Example:
mov al,[ds:bx] -- not allowed
use instead:
mov al,ds:[bx]
- The allowed indexing forms are as follows:
- SReg:[REG+REG*SCALING+/-disp]
- SReg:[REG+/-disp]
- SReg:[REG]
- SReg:[REG+REG+/-disp]
- SReg:[REG+REG*SCALING]
Here SReg is optional and specifies the segment override.
Notes:
- Contrary to Turbo Pascal, the order of the terms is important.
- The scaling value must be a literal value, not an identifier
that refers to a symbol. Example:
const myscale = 1;
...
mov al,byte ptr [esi+ebx*myscale] -- not allowed.
use:
mov al, byte ptr [esi+ebx*1]
- Possible variable identifier syntax is as follows:
(Id = Variable or typed constant identifier.)
- ID
- [ID]
- [ID+expr]
- ID[expr]
Possible fields are as follows:
- ID.subfield.subfield ...
- [ref].ID.subfield.subfield ...
- [ref].typename.subfield ...
- Local labels: Contrary to Turbo Pascal, local labels must
contain at least one character after the local symbol indicator.
Example:
@: -- not allowed
use instead, for example:
@1: -- allowed
- Contrary to Turbo Pascal, local labels cannot be used as references,
only as displacements. Example:
lds si,@mylabel -- not allowed
- Contrary to Turbo Pascal, SEGCS, SEGDS, SEGES and
SEGSS segment overrides are presently not supported.
(This is a planned addition though).
- Contrary to Turbo Pascal, where memory size specifiers can
be practically anywhere, the Free Pascal Intel inline assembler requires
memory size specifiers to be outside the brackets. Example:
mov al,[byte ptr myvar] -- not allowed.
use:
mov al,byte ptr [myvar] -- allowed.
- Base and Index registers must be 32-bit registers.
(limitation of the GNU Assembler).
- XLAT is equivalent to XLATB.
- Only Single and Double FPU opcodes are supported.
- Floating point opcodes are currently not supported
(except those which involve only floating point registers).
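Several of the rules above can be seen together in one small block. The following is only a sketch for an i386 target: the `{$ASMMODE intel}` directive is an assumption (this section does not say how the Intel reader is selected), and the names `IntelForms`, `Table` and `Total` are made up for illustration.

```pascal
{$ASMMODE intel}  { assumption: this directive selects the Intel-syntax reader }
program IntelForms;

const
  { Typed constant, so it has an address that LEA can load. }
  Table: array[0..3] of LongInt = (10, 20, 30, 40);

var
  Total: LongInt;

begin
  asm
    lea  esi, Table                  { LEA instead of the unsupported OFFSET }
    xor  eax, eax                    { running sum }
    xor  ebx, ebx                    { element index }
  @next:                             { local label: at least one character after @ }
    add  eax, dword ptr [esi+ebx*4]  { size specifier outside the brackets, literal scale }
    inc  ebx
    cmp  ebx, 4
    jl   @next                       { the local label is used as a jump displacement }
    mov  Total, eax
  end ['EAX', 'EBX', 'ESI'];
  WriteLn(Total);
end.
```

The memory operand uses 32-bit base and index registers and the form REG+REG*SCALING, which is one of the allowed indexing forms listed above.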
The Intel inline assembler supports the following macros:
- @Result represents the function result return value.
- Self represents the object method pointer in methods.
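As an illustration of the @Result macro, the sketch below returns a value from an asm block. It assumes an i386 target and assumes that the `{$ASMMODE intel}` directive selects the Intel reader; `ResultDemo` and `AddFour` are hypothetical names.

```pascal
{$ASMMODE intel}  { assumption: this directive selects the Intel-syntax reader }
program ResultDemo;

function AddFour(x: LongInt): LongInt;
begin
  asm
    mov  eax, x          { load the parameter }
    add  eax, 4
    mov  @Result, eax    { store EAX as the function result }
  end ['EAX'];
end;

begin
  WriteLn(AddFour(38));
end.
```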
2 AT&T Syntax
Free Pascal uses the GNU as assembler to generate its object files for
the Intel Ix86 processors. Since
the GNU assembler uses AT&T assembly syntax, the code you write should
use the same syntax. The differences between AT&T and Intel syntax as used
in Turbo Pascal are summarized in the following:
- The opcode names include the size of the operand. In general, one can
say that the AT&T opcode name is the Intel opcode name, suffixed with a
'l', 'w' or 'b' for, respectively, longint (32 bit),
word (16 bit) and byte (8 bit) memory or register references. As an example,
the Intel construct 'mov al, bl' is equivalent to the AT&T style 'movb
%bl,%al' instruction.
- AT&T immediate operands are designated with '$', while Intel syntax
doesn't use a prefix for immediate operands. Thus the Intel construct
'mov ax, 2' becomes 'movw $2, %ax' in AT&T syntax.
- AT&T register names are preceded by a '%' sign.
They are undelimited in Intel syntax.
- AT&T indicates absolute jump/call operands with '*', Intel
syntax doesn't delimit these addresses.
- The order of the source and destination operands is switched. AT&T
syntax uses 'Source, Dest', while Intel syntax features 'Dest,
Source'. Thus the Intel construct 'add eax, 4' transforms to
'addl $4, %eax' in the AT&T dialect.
- Immediate long jumps are prefixed with the 'l' prefix. Thus the
Intel 'call/jmp section:offset' is transformed to 'lcall/ljmp
$section,$offset'. Similarly the far return is 'lret', instead of the
Intel 'ret far'.
- Memory references are specified differently in AT&T and Intel
assembly. The Intel indirect memory reference
Section:[Base + Index*Scale + Offs]
is written in AT&T syntax as:
Section:Offs(Base,Index,Scale)
Where Base and Index are optional 32-bit base and index
registers, and Scale is used to multiply Index. It can take the
values 1,2,4 and 8. The Section is used to specify an optional section
register for the memory operand.
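The differences listed above can be put side by side in a short block. This is a sketch only, for an i386 target; the `{$ASMMODE att}` directive is an assumption not stated in this section, and `AttForms`, `Table` and `Picked` are hypothetical names. The Intel equivalents are given in the comments.

```pascal
{$ASMMODE att}  { assumption: this directive selects the AT&T-syntax reader }
program AttForms;

const
  Table: array[0..3] of LongInt = (10, 20, 30, 40);

var
  Picked: LongInt;

begin
  asm
    leal  Table, %esi            { Intel: lea esi, Table (source first, destination last) }
    movl  $2, %ebx               { Intel: mov ebx, 2 }
    movl  (%esi,%ebx,4), %eax    { Intel: mov eax, dword ptr [esi+ebx*4] }
    addl  $4, %eax               { Intel: add eax, 4 }
    movl  %eax, Picked           { Intel: mov Picked, eax }
  end ['EAX', 'EBX', 'ESI'];
  WriteLn(Picked);
end.
```

The third instruction is an instance of the Offs(Base,Index,Scale) form with the section and offset parts omitted.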
More information about the AT&T syntax can be found in the as manual,
although the following differences with normal AT&T assembly must be taken
into account:
The AT&T inline assembler supports the following macros:
- __RESULT represents the function result return value.
- __SELF represents the object method pointer in methods.
- __OLDEBP represents the old base pointer in recursive routines.
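A minimal sketch of the __RESULT macro in use follows. It assumes an i386 target and a compiler version that still accepts this macro name; the `{$ASMMODE att}` directive and the names `AttResultDemo` and `AddFour` are assumptions made for illustration.

```pascal
{$ASMMODE att}  { assumption: this directive selects the AT&T-syntax reader }
program AttResultDemo;

function AddFour(x: LongInt): LongInt;
begin
  asm
    movl  x, %eax          { load the parameter }
    addl  $4, %eax
    movl  %eax, __RESULT   { store EAX as the function result }
  end ['EAX'];
end;

begin
  WriteLn(AddFour(38));
end.
```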
The inline assembler reader for the Motorola 680x0 family of processors
uses the Motorola assembler syntax. A few differences do exist:
- Local labels start with the @ character, such as
@MyLabel:
- The XDEF directive in an assembler block will
make the symbol available publicly with the specified name
(this name is case sensitive)
- The DB, DW, DD directives can only
be used to declare constants which will be stored in the
code segment.
- The Align directive is not supported.
- Arithmetic operations on constant expressions use the same
operators as the Intel version (e.g. AND, XOR ...).
- Segment directives are not supported.
- Only 68000 opcodes are currently supported.
The inline assembler supports the following macros:
- @Result represents the function result return value.
- Self represents the object method pointer in methods.
3 Signaling changed registers
When the compiler uses variables, it sometimes stores them, or the result of
some calculations, in the processor registers. If you insert assembler code
in your program that modifies the processor registers, then this may
conflict with what the compiler assumes the registers contain. To avoid this
problem, Free Pascal allows you to tell the compiler which registers have changed.
The compiler will then avoid using these registers. Telling the compiler
which registers have changed is done by specifying a set of register names
after an assembly block, as follows:
asm
...
end ['R1', ... ,'Rn'];
Here R1 to Rn are the names of the registers you
modify in your assembly code.
As an example:
asm
movl %ebp,%eax { copy the frame pointer }
movl 4(%eax),%eax { load the value stored 4 bytes above it }
movl %eax,__RESULT
end ['EAX'];
This example tells the compiler that the EAX register was modified.
Free Pascal Compiler
2001-09-22