8 Memory issues


1 The memory model.

The Free Pascal compiler generates 32-bit or 64-bit code. This has several consequences:

Because 16-bit code is no longer used, some of the older Turbo Pascal constructs and functions are obsolete. The following is a list of functions which shouldn't be used anymore:

Seg()
: Returned the segment of a memory address. Since segments have no more meaning, zero is returned in the Free Pascal run-time library implementation of Seg.
Ofs()
: Returned the offset of a memory address. Since segments have no more meaning, the complete address is returned in the Free Pascal implementation of this function. This has as a consequence that the return type is longint or int64 instead of Word.
Cseg(), Dseg()
: Returned, respectively, the code and data segments of your program. This returns zero in the Free Pascal implementation of the system unit, since both code and data are in the same memory space.
Ptr
: Accepted a segment and offset from an address, and would return a pointer to this address. This has been changed in the run-time library: it now simply returns the offset.
memw and mem
: These arrays gave access to the DOS memory. Free Pascal supports them on the go32v2 platform, where they are mapped into DOS memory space. You need the go32 unit for this. On other platforms, they are not supported.

You shouldn't use these functions, since they are very non-portable: they're specific to DOS and the 80x86 processor. The Free Pascal compiler is designed to be portable to other platforms, so you should keep your code as portable as possible, and not system specific. That is, unless you're writing some driver units, of course.


2 Data formats

This section gives information on the storage space occupied by the different possible types in Free Pascal. Information on internal alignment will also be given.

1 integer types

The storage size of the default integer types is given in the Reference guide. In the case of user-defined types, the storage space occupied depends on the bounds of the type.

2 char types

A char, or a subrange of the char type is stored as a byte.

3 boolean types

The boolean type is stored as a byte and can take a value of true or false.

A ByteBool is stored as a byte, a WordBool type is stored as a word, and a longbool is stored as a longint.
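
The sizes stated above can be checked with SizeOf. The following is a minimal sketch; the program name is only illustrative.

```pascal
program BoolSizes;

begin
  { Each line prints the storage size in bytes of one boolean type. }
  WriteLn('Boolean  : ', SizeOf(Boolean));
  WriteLn('ByteBool : ', SizeOf(ByteBool));
  WriteLn('WordBool : ', SizeOf(WordBool));
  WriteLn('LongBool : ', SizeOf(LongBool));
end.
```

The expected sizes are 1, 1, 2 and 4 bytes respectively.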

4 enumeration types

By default all enumerations are stored as a cardinal (4 bytes), which is equivalent to specifying the {$Z4}, {$PACKENUM 4} or {$PACKENUM DEFAULT} switches.

This default behavior can be changed by compiler switches, and by the compiler mode.

In the tp compiler mode, or while the {$Z1} or {$PACKENUM 1} switches are in effect, the storage space used is shown in the table "Enumeration storage".


Table: Enumeration storage
Number of elements in enumeration    Storage space used
0..255                               byte (1 byte)
256..65535                           word (2 bytes)
> 65535                              cardinal (4 bytes)

When the {$Z2} or {$PACKENUM 2} switches are in effect, the value is stored in 2 bytes (word) if the enumeration has 65535 elements or fewer; otherwise, the enumeration value is stored as a 4-byte value (cardinal).
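
As an illustration of the switches described above, the following sketch declares the same kind of small enumeration under three {$PACKENUM} settings and prints the resulting sizes. The type names are hypothetical and chosen only for this example.

```pascal
program EnumSizes;
{$mode objfpc}

type
  {$PACKENUM 1}
  TSmallEnum  = (seA, seB, seC);   { fewer than 256 elements }
  {$PACKENUM 2}
  TMediumEnum = (meA, meB, meC);
  {$PACKENUM 4}
  TLargeEnum  = (leA, leB, leC);

begin
  { The switch in effect at the point of declaration decides the size. }
  WriteLn('PACKENUM 1: ', SizeOf(TSmallEnum));
  WriteLn('PACKENUM 2: ', SizeOf(TMediumEnum));
  WriteLn('PACKENUM 4: ', SizeOf(TLargeEnum));
end.
```

With these settings the sizes should come out as 1, 2 and 4 bytes.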

5 floating point types

Floating point type sizes and mapping vary from one processor to another. Except for the Intel 80x86 architecture, the extended type maps to the IEEE double type.


Table: Processor mapping of real type
Processor                              Real type mapping
Intel 80x86                            double
Motorola 680x0 (with {$E-} switch)     double
Motorola 680x0 (with {$E+} switch)     single

Floating point types have a binary storage format divided into three distinct fields: the mantissa, the exponent and the sign bit, which stores the sign of the floating point value.

1 single

The single type occupies 4 bytes of storage space, and its memory structure is the same as the IEEE-754 single type.

The memory format of the single type looks like this:

[Figure: single.png, memory layout of the single type]
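
To make the three fields concrete, here is a small sketch that overlays a 32-bit unsigned value on a single variable and extracts the fields. It assumes the standard IEEE-754 single layout (1 sign bit, 8 exponent bits, 23 mantissa bits), which is taken from the IEEE standard rather than from the missing figure; the variable names are illustrative.

```pascal
program SingleFields;
{$mode objfpc}

var
  S: Single;
  Bits: LongWord absolute S;   { same 4 bytes, seen as an unsigned integer }

begin
  S := -6.25;                  { -1.5625 * 2^2, exactly representable }
  { Assumed IEEE-754 layout: bit 31 sign, bits 30..23 exponent,
    bits 22..0 mantissa (fraction). }
  WriteLn('sign     = ', Bits shr 31);
  WriteLn('exponent = ', (Bits shr 23) and $FF);
  WriteLn('mantissa = ', Bits and $7FFFFF);
end.
```

For -6.25 this should report a sign of 1, a biased exponent of 129 (127 + 2) and a mantissa of 4718592.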

2 double

The double type occupies 8 bytes of storage space, and its memory structure is the same as the IEEE-754 double type.

The memory format of the double type looks like this:

[Figure: double.png, memory layout of the double type]

On processors which do not support co-processor operations (and which have the {$E-} switch), the double type does not exist.

3 extended

For Intel 80x86 processors, the extended type has the format shown in figure XXX, and takes up 10 bytes of storage.

For all other processors which support floating point operations, the extended type is an alias for the double type. It has the same format and size as the double type. On processors which do not support co-processor operations (and which have the {$E-} switch), the extended type does not exist.

4 comp

For Intel 80x86 processors, the comp type has the format shown in figure XXX, and can contain integer values only. The comp type takes up 8 bytes of storage space.

On other processors, the comp type is not supported.

5 real

Contrary to Turbo Pascal, where the real type had a special internal format, under Free Pascal the real type simply maps to one of the other real types. It maps to the double type on processors which support floating point operations, while it maps to the single type on processors which do not support floating point operations in hardware. See the table "Processor mapping of real type" for more information on this.

6 pointer types

A pointer type is stored as a cardinal (unsigned 32-bit value) on 32-bit processors, and as a 64-bit unsigned value on 64-bit processors.

7 string types

1 ansistring types

The ansistring is a dynamically allocated string which has no length limitation. When the string is no longer being referenced (its reference count reaches zero), its memory is automatically freed.

If the ansistring is a constant, then its reference count will be equal to -1, indicating that it should never be freed. The structure in memory for an ansistring is shown in the table "AnsiString memory structure".


Table: AnsiString memory structure (32-bit model)
Offset Contains
-12 Longint with maximum string size.
-8 Longint with actual string size.
-4 Longint with reference count.
0 Actual array of char, null-terminated.

2 shortstring types

A shortstring occupies as many bytes as its maximum length plus one. The first byte contains the current dynamic length of the string. The following bytes contain the actual characters (of type char) of the string. The maximum size of a short string is the length byte followed by 255 characters.
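
The layout described above can be observed directly. The sketch below uses a hypothetical type TName with a maximum length of 10 and reads the length byte at index 0.

```pascal
program ShortStrLayout;
{$mode objfpc}

type
  TName = string[10];          { shortstring with maximum length 10 }

var
  N: TName;

begin
  N := 'abc';
  { Maximum length plus one length byte. }
  WriteLn('SizeOf(TName) = ', SizeOf(TName));
  { Index 0 holds the current dynamic length as a character. }
  WriteLn('Ord(N[0])     = ', Ord(N[0]));
  WriteLn('Length(N)     = ', Length(N));
end.
```

This should print 11 for the size and 3 for both length values.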

3 widestring types

The widestring (composed of unicode characters) is not supported in Free Pascal v1.0.

8 set types

A set is stored as an array of bits, where each bit indicates if the element is in the set or excluded from the set. The maximum number of elements in a set is 256.

If a set has fewer than 32 elements, it is coded as an unsigned 32-bit value. Otherwise it is coded as an 8-element array of 32-bit unsigned values (cardinal), hence a size of 256 bits (32 bytes).

The cardinal number of a specific element E is given by:

 CardinalNumber = (E div 32);

and the bit number within that 32-bit value is given by:

 BitNumber = (E mod 32);
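
The two formulas above can be tried out with a short sketch. It assumes that a set of Byte occupies 32 bytes (eight 32-bit values) and overlays an array on it; the final check additionally assumes a little-endian target, so treat it as an illustration rather than a guarantee.

```pascal
program SetBits;
{$mode objfpc}

var
  S: set of Byte;
  { Assumption: the set is 32 bytes, viewed here as eight cardinals. }
  Words: array[0..7] of LongWord absolute S;
  E: Byte;

begin
  S := [];
  E := 70;
  Include(S, E);
  WriteLn('CardinalNumber = ', E div 32);   { 70 div 32 = 2 }
  WriteLn('BitNumber      = ', E mod 32);   { 70 mod 32 = 6 }
  { On little-endian targets this is expected to print TRUE. }
  WriteLn((Words[E div 32] and (LongWord(1) shl (E mod 32))) <> 0);
end.
```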

9 array types

An array is stored as a contiguous sequence of variables of the components of the array. The components with the lowest indexes are stored first in memory. No alignment is done between each element of the array. A multi-dimensional array is stored with the rightmost dimension increasing first.
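
A short sketch of what "rightmost dimension increasing first" means in practice: the byte offset of A[i,j] from the start of the array is (i * Cols + j) * SizeOf(element). The constant and variable names are illustrative.

```pascal
program ArrayLayout;
{$mode objfpc}

const
  Rows = 3;
  Cols = 4;

var
  A: array[0..Rows - 1, 0..Cols - 1] of LongInt;
  Base, Elem: PtrUInt;

begin
  Base := PtrUInt(@A[0, 0]);
  Elem := PtrUInt(@A[1, 2]);
  { Measured offset of A[1,2] from the start of the array. }
  WriteLn('measured offset = ', Elem - Base);
  { Offset computed from the layout rule. }
  WriteLn('computed offset = ', (1 * Cols + 2) * SizeOf(LongInt));
end.
```

Both lines should print 24, since LongInt is 4 bytes and the element is the seventh one in storage order.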

10 record types

The fields of a record are stored in a contiguous sequence of variables, where the first field is stored at the lowest address in memory. In case of variant fields in a record, each variant starts at the same address in memory. Fields of a record are usually aligned, unless the packed directive is specified when declaring the record type. For more information on field alignment, consult the section "Structured types alignment".


11 object types

Objects are stored in memory just as ordinary records with an extra field: a pointer to the Virtual Method Table (VMT). This field is stored first, and all fields in the object are stored in the order they are declared (with possible alignment of field addresses, unless the object was declared as being packed).

This field is initialized by the call to the object's Constructor method. If the new operator was used to call the constructor, the data fields of the object will be stored in heap memory, otherwise they will directly be stored in the data section of the final executable.

If an object doesn't have virtual methods, no pointer to a VMT is inserted.
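
The effect of the optional VMT pointer can be seen by comparing two otherwise identical objects. The sketch below uses hypothetical type names; the exact sizes depend on the target and on alignment settings, so only the difference between the two is of interest.

```pascal
program ObjVmtSize;
{$mode objfpc}

type
  { No virtual methods: no VMT pointer is stored. }
  TPlain = object
    X: LongInt;
  end;

  { One virtual method: a VMT pointer is added to the data. }
  TVirt = object
    X: LongInt;
    constructor Init;
    procedure Show; virtual;
  end;

constructor TVirt.Init;
begin
  X := 0;
end;

procedure TVirt.Show;
begin
  WriteLn(X);
end;

begin
  WriteLn('SizeOf(TPlain) = ', SizeOf(TPlain));
  WriteLn('SizeOf(TVirt)  = ', SizeOf(TVirt));
end.
```

TVirt should be larger than TPlain by at least the size of one pointer.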

The allocated memory is laid out as shown in the table "Object memory layout".

Table: Object memory layout (32-bit model)
Offset What
+0 Pointer to VMT (optional).
+4 Data. All fields in the order they've been declared.
...  

The Virtual Method Table (VMT) for each object type consists of 2 check fields (containing the size of the data), a pointer to the object's ancestor's VMT (Nil if there is no ancestor), and then the pointers to all virtual methods. The VMT layout is illustrated in the table "Object Virtual Method Table memory layout". The VMT is constructed by the compiler.


Table: Object Virtual Method Table memory layout (32-bit model)
Offset What
+0 Size of object type data
+4 Minus the size of object type data. Enables determining of valid VMT pointers.
+8 Pointer to ancestor VMT, Nil if no ancestor available.
+12 Pointers to the virtual methods.
...  

12 class types

Just like objects, classes are stored in memory just as ordinary records with an extra field: a pointer to the Virtual Method Table (VMT). This field is stored first, and all fields in the class are stored in the order they are declared.

Contrary to objects, all data fields of a class are always stored in heap memory.
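
Because the fields live on the heap, a class variable is itself only a pointer. The following sketch illustrates this with a hypothetical class TPoint3; InstanceSize and ClassName are standard TObject methods whose results come from the information kept in the VMT.

```pascal
program ClassOnHeap;
{$mode objfpc}

type
  TPoint3 = class
    X, Y, Z: Double;
  end;

var
  P: TPoint3;

begin
  P := TPoint3.Create;
  { The variable holds a reference, not the fields themselves. }
  WriteLn('variable is pointer-sized: ', SizeOf(P) = SizeOf(Pointer));
  { Size of the heap block that holds the VMT pointer and the fields. }
  WriteLn('InstanceSize = ', TPoint3.InstanceSize);
  WriteLn('ClassName    = ', P.ClassName);
  P.Free;
end.
```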

The allocated memory is laid out as shown in the table "Class memory layout".

Table: Class memory layout (32-bit model)
Offset What
+0 Pointer to VMT.
+4 Data. All fields in the order they've been declared.
...  

The Virtual Method Table (VMT) of each class consists of several fields, which are used for runtime type information. The VMT layout is illustrated in the table "Class Virtual Method Table memory layout". The VMT is constructed by the compiler.


Table: Class Virtual Method Table memory layout (32-bit model)
Offset What
+0 Size of object type data
+4 Minus the size of object type data. Enables determining of valid VMT pointers.
+8 Pointer to ancestor VMT, Nil if no ancestor available.
+12 Pointer to the class name (stored as a shortstring).
+16 Pointer to the dynamic method table (using message with integers).
+20 Pointer to the method definition table.
+24 Pointer to the field definition table.
+28 Pointer to type information table.
+32 Pointer to instance initialization table.
+36 Reserved.
+40 Pointer to the interface table.
+44 Pointer to the dynamic method table (using message with strings).
+48 Pointer to the Destroy destructor.
+52 Pointer to the NewInstance method.
+56 Pointer to the FreeInstance method.
+60 Pointer to the SafeCallException method.
+64 Pointer to the DefaultHandler method.
+68 Pointer to the AfterConstruction method.
+72 Pointer to the BeforeDestruction method.
+76 Pointer to the DefaultHandlerStr method.
+80 Pointers to other virtual methods.
...  

13 file types

File types are represented as records. Typed files and untyped files are represented as a fixed record:

  filerec = packed record
    handle    : longint;
    mode      : longint;
    recsize   : longint;
    _private  : array[1..32] of byte;
    userdata  : array[1..16] of byte;
    name      : array[0..255] of char;
  End;

Text files are described using the following record:

  TextBuf = array[0..255] of char;
  textrec = packed record
    handle    : longint;
    mode      : longint;
    bufsize   : longint;
    _private  : longint;
    bufpos    : longint;
    bufend    : longint;
    bufptr    : ^textbuf;
    openfunc  : pointer;
    inoutfunc : pointer;
    flushfunc : pointer;
    closefunc : pointer;
    userdata  : array[1..16] of byte;
    name      : array[0..255] of char;
    buffer    : textbuf;
  End;

handle
The handle field contains the file handle (if the file is opened), as returned by the operating system.

mode
The mode field can take one of several values. When it is fmclosed, the file is closed, and the handle field is invalid. When the value is equal to fminput, the file is opened for read-only access. fmoutput indicates write-only access, and fminout indicates read-write access to the file.

name
The name field is a null-terminated character string representing the name of the file.

userdata
The userdata field is never used by Free Pascal, and can be used for special purposes by software developers.
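
As a sketch of how the mode field changes over the life of a file, the program below casts a text file variable to TextRec and compares its mode with the fm constants. It assumes that TextRec and the fm constants are visible through the System or SysUtils units (this depends on the compiler version), and the file name example.txt is only a placeholder.

```pascal
program FileModeDemo;
{$mode objfpc}

uses
  SysUtils;   { assumption: older releases declare TextRec here }

var
  F: Text;

begin
  Assign(F, 'example.txt');
  { After Assign the file is not yet open. }
  WriteLn('closed after Assign : ', TextRec(F).Mode = fmClosed);
  Rewrite(F);
  { Rewrite opens a text file for writing only. }
  WriteLn('output after Rewrite: ', TextRec(F).Mode = fmOutput);
  Close(F);
  WriteLn('closed after Close  : ', TextRec(F).Mode = fmClosed);
  Erase(F);   { remove the placeholder file again }
end.
```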

14 procedural types

A procedural type is stored as a generic pointer, which stores the address of the routine.


3 Data alignment

1 Typed constants and variable alignment

All static data (variables and typed constants) larger than a byte is usually aligned on a power of two boundary. This alignment applies only to the start address of the variables, and not to the alignment of fields within structures or objects, for example. For more information on structured alignment, see the section "Structured types alignment". The alignment is similar across the different target processors.


Table: Data alignment
Data size (bytes)    Alignment (small size)    Alignment (fast)
1                    1                         1
2-3                  2                         2
4-7                  2                         4
8+                   2                         4

The alignment columns indicate the address alignment of the variable, i.e. the start address of the variable will be aligned on that boundary. The small size alignment is used when the generated code should be optimized for size and not speed; otherwise, and by default, the fast alignment is used to align the data.


2 Structured types alignment

By default all elements in a structure are aligned to a 2 byte boundary, unless the $PACKRECORDS directive or packed modifier is used to align the data in another way. For example a record or object having a 1 byte element, will have its size rounded up to 2, so the size of the structure will actually be 2 bytes.
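
The difference between default and packed layout can be seen with a small sketch. The record names are hypothetical. The size of the default record depends on the target and on the $PACKRECORDS setting, so only the packed size is predictable here.

```pascal
program RecordAlignment;
{$mode objfpc}

type
  { Default alignment: padding may be inserted after B. }
  TDefaultRec = record
    B: Byte;
    L: LongInt;
  end;

  { Packed: fields follow each other without padding. }
  TPackedRec = packed record
    B: Byte;
    L: LongInt;
  end;

begin
  WriteLn('default: ', SizeOf(TDefaultRec));
  WriteLn('packed : ', SizeOf(TPackedRec));
end.
```

The packed record should occupy exactly 5 bytes (1 + 4); the default record is at least that large.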


4 The heap

The heap is used to store all dynamic variables, and to store class instances. The interface to the heap is the same as in Turbo Pascal, although the effects may differ. On top of that, the Free Pascal run-time library has some extra possibilities, not available in Turbo Pascal. These extra possibilities are explained in the next subsections.

1 Heap allocation strategy

The heap is a memory structure which is organized as a stack. The heap bottom is stored in the variable HeapOrg. Initially the heap pointer (HeapPtr) points to the bottom of the heap. When a variable is allocated on the heap, HeapPtr is incremented by the size of the allocated memory block. This has the effect of stacking dynamic variables on top of each other.

Each time a block is allocated, its size is normalized to have a granularity of 16 bytes.

When Dispose or FreeMem is called to dispose of a memory block which is not on the top of the heap, the heap becomes fragmented. The deallocation routines also add the freed blocks to the freelist, which is a linked list of free blocks. Furthermore, if the deallocated block was less than 8K in size, the free list cache is also updated.

The free list cache is a cache of free heap blocks which have specific lengths (the adjusted block size divided by 16 gives the index into the free list cache table). Accessing it is faster than searching through the entire freelist.

The format of an entry in the freelist is as follows:

 PFreeRecord = ^TFreeRecord;
 TFreeRecord = record
   Size : longint;
   Next : PFreeRecord;
   Prev : PFreeRecord;
 end;

The Next field points to the next free block, while the Prev field points to the previous free block.

The algorithm for allocating memory is as follows:

  1. The size of the block to allocate is adjusted to a 16 byte granularity.
  2. The cached free list is searched for a free block of the specified size or bigger. If one is found, it is allocated and the routine exits.
  3. The freelist is searched for a free block of the specified size or bigger. If one is found, it is allocated and the routine exits.
  4. If no block is found in the freelist, the heap is grown to allocate the specified memory, and the routine exits.
  5. If the heap cannot be grown anymore, a call to the operating system is made to grow the heap further. If the block to allocate is smaller than 256Kb, the heap is grown by 256Kb, otherwise it is grown by 1024Kb.
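
The size adjustment in step 1 and the growth rule in step 5 can be sketched as follows. This is not the run-time library's code: AdjustSize and GrowAmount are hypothetical helper names, and the constants are taken from the description above.

```pascal
program HeapSizing;
{$mode objfpc}

const
  Granularity = 16;            { block sizes are multiples of 16 bytes }
  SmallGrow   = 256 * 1024;    { 256Kb }
  LargeGrow   = 1024 * 1024;   { 1024Kb }

{ Hypothetical helper: round Size up to the next multiple of 16. }
function AdjustSize(Size: LongInt): LongInt;
begin
  Result := ((Size + Granularity - 1) div Granularity) * Granularity;
end;

{ Hypothetical helper: amount by which the heap is grown in step 5. }
function GrowAmount(Size: LongInt): LongInt;
begin
  if Size < SmallGrow then
    Result := SmallGrow
  else
    Result := LargeGrow;
end;

begin
  WriteLn(AdjustSize(1));            { 16 }
  WriteLn(AdjustSize(16));           { 16 }
  WriteLn(AdjustSize(17));           { 32 }
  WriteLn(GrowAmount(100));          { 262144 }
  WriteLn(GrowAmount(300 * 1024));   { 1048576 }
end.
```

With this rounding, the index into the free list cache table mentioned earlier would be AdjustSize(Size) div 16.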

2 The heap grows

Free Pascal supports the HeapError procedural variable. If this variable is non-nil, then it is called in case you try to allocate memory, and the heap is full. By default, HeapError points to the GrowHeap function, which tries to increase the heap.

The GrowHeap function issues a system call to try to increase the size of the memory available to your program. It first tries to increase memory in a 256Kb chunk if the size to allocate is less than 256Kb, or in a 1024Kb chunk otherwise. If this fails, it tries to increase the heap by the amount you requested from the heap.

If the call to GrowHeap has failed, then a run-time error is generated, or nil is returned, depending on the GrowHeap result.

If the call to GrowHeap was successful, then the needed memory will be allocated.

3 Debugging the heap

Free Pascal provides a unit that allows you to trace allocation and deallocation of heap memory: heaptrc.

If you specify the -gh switch on the command-line, or if you include heaptrc as the first unit in your uses clause, the memory manager will trace what is allocated and deallocated, and on exit of your program, a summary will be sent to standard output.

More information on using the heaptrc mechanism can be found in the Users' guide and Unit reference.
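
As a minimal sketch of what heaptrc is meant to catch, the following program allocates a block and deliberately never frees it. Compiled with the -gh switch, the summary printed at exit should report one unfreed block; the program name is illustrative.

```pascal
program LeakDemo;
{$mode objfpc}

var
  P: PLongInt;

begin
  New(P);       { allocate 4 bytes on the heap }
  P^ := 42;
  WriteLn(P^);
  { Dispose(P) is deliberately omitted, so the block is still
    allocated when the program exits. }
end.
```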

4 Writing your own memory manager

Free Pascal allows you to write and use your own memory manager. The standard functions GetMem, FreeMem, ReallocMem and MaxAvail use a special record in the system unit to do the actual memory management. The system unit initializes this record with the system unit's own memory manager, but you can read and set this record using the GetMemoryManager and SetMemoryManager calls:

procedure GetMemoryManager(var MemMgr: TMemoryManager);
procedure SetMemoryManager(const MemMgr: TMemoryManager);

The TMemoryManager record is defined as follows:

  TMemoryManager = record
    Getmem      : Function(Size:Longint):Pointer;
    Freemem     : Function(var p:pointer):Longint;
    FreememSize : Function(var p:pointer;Size:Longint):Longint;
    AllocMem    : Function(Size:longint):Pointer;
    ReAllocMem  : Function(var p:pointer;Size:longint):Pointer;
    MemSize     : function(p:pointer):Longint;
    MemAvail    : Function:Longint;
    MaxAvail    : Function:Longint;
    HeapSize    : Function:Longint;
  end;

As you can see, the elements of this record are procedural variables. The system unit does nothing but call these various variables when you allocate or deallocate memory.

Each of these functions corresponds to the corresponding call in the system unit. We'll describe each one of them:

Getmem
This function allocates a new block on the heap. The block should be Size bytes long. The return value is a pointer to the newly allocated block.
Freemem
This function should release a previously allocated block. The pointer P points to a previously allocated block. The memory manager should implement a mechanism to determine the size of the memory block. The return value is optional, and can be used to return the size of the freed memory.
FreememSize
This function should release the memory pointed to by P. The argument Size is the expected size of the memory block pointed to by P. This value should be disregarded, but can be used to check the behaviour of the program.
AllocMem
This function is the same as Getmem, only the allocated memory should be filled with zeroes before the call returns.
ReAllocMem
This function should allocate a memory block Size bytes large, and should fill it with the contents of the memory block pointed to by P, truncating this to the new size if needed. After that, the memory pointed to by P may be deallocated. The return value is a pointer to the new memory block.
MemSize
This function should return the size of the memory block pointed to by P. It may return zero if the memory manager cannot determine this information.
MemAvail
This function should return the total amount of memory available for allocation. It may return zero if the memory manager cannot determine this information.
MaxAvail
This function should return the size of the largest block of memory that is still available for allocation. It may return zero if the memory manager cannot determine this information.
HeapSize
This function should return the total size of the heap. It may return zero if the memory manager cannot determine this information.

To implement your own memory manager, it is sufficient to construct such a record and to issue a call to SetMemoryManager.

To avoid conflicts with the system memory manager, setting the memory manager should happen as soon as possible in the initialization of your program, i.e. before any call to getmem is processed.

This means in practice that the unit implementing the memory manager should be the first in the uses clause of your program or library, since it will then be initialized before all other units (except for the system unit).

This also means that it is not possible to use the heaptrc unit in combination with a custom memory manager, since the heaptrc unit uses the system memory manager to do all its allocation. Putting the heaptrc unit after the unit implementing the memory manager would overwrite the memory manager record installed by the custom memory manager, and vice versa.

The following unit shows a straightforward implementation of a custom memory manager using the memory manager of the C library. It is distributed as a package with Free Pascal.

unit cmem;

{$mode objfpc}

interface

Function Malloc (Size : Longint) : Pointer;cdecl;
  external 'c' name 'malloc';
Procedure Free (P : pointer); cdecl; external 'c' name 'free';
Procedure FreeMem (P : Pointer); cdecl; external 'c' name 'free';
function ReAlloc (P : Pointer; Size : longint) : pointer; cdecl;
  external 'c' name 'realloc';
Function CAlloc (unitSize,UnitCount : Longint) : pointer;cdecl;
  external 'c' name 'calloc';

implementation

Function CGetMem  (Size : Longint) : Pointer;

begin
  result:=Malloc(Size);
end;

Function CFreeMem (Var P : pointer) : Longint;

begin
  Free(P);
  Result:=0;
end;

Function CFreeMemSize(var p:pointer;Size:Longint):Longint;

begin
  Result:=CFreeMem(P);
end;

Function CAllocMem(Size : Longint) : Pointer;

begin
  Result:=calloc(Size,1);
end;

Function CReAllocMem (var p:pointer;Size:longint):Pointer;

begin
  { realloc may move the block, so the caller's pointer is updated too }
  p:=realloc(p,size);
  Result:=p;
end;

Function CMemSize (p:pointer): Longint;

begin
  Result:=0;
end;

Function CMemAvail : Longint;

begin
  Result:=0;
end;

Function CMaxAvail: Longint;

begin
  Result:=0;
end;

Function CHeapSize : Longint;

begin
  Result:=0;
end;


Const
 CMemoryManager : TMemoryManager =
    (
      GetMem : CGetmem;
      FreeMem : CFreeMem;
      FreememSize : CFreememSize;
      AllocMem : CAllocMem;
      ReallocMem : CReAllocMem;
      MemSize : CMemSize;
      MemAvail : CMemAvail;
      MaxAvail : CMaxAvail;
      HeapSize : CHeapSize;
    );

Var
  OldMemoryManager : TMemoryManager;

Initialization
  GetMemoryManager (OldMemoryManager);
  SetMemoryManager (CmemoryManager);

Finalization
  SetMemoryManager (OldMemoryManager);
end.
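
A program that wants to use this unit only has to list it first in its uses clause. The sketch below assumes the cmem unit is installed and that the C library can be linked; the program name is illustrative.

```pascal
program UseCMem;
{$mode objfpc}

uses
  cmem;   { must come first so it is initialized before other units }

var
  P: Pointer;

begin
  { These calls are now routed through malloc and free. }
  GetMem(P, 128);
  FillChar(P^, 128, 0);
  FreeMem(P);
  WriteLn('done');
end.
```

No other source changes are needed, since GetMem and FreeMem simply call whatever the installed memory manager record points to.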


5 Using DOS memory under the Go32 extender

Because Free Pascal for DOS is a 32-bit compiler, and uses a DOS extender, accessing DOS memory isn't trivial. What follows is an attempt at explaining how to access and use DOS or real mode memory.

In Protected Mode, memory is accessed through Selectors and Offsets. You can think of Selectors as the protected mode equivalents of segments.

In Free Pascal, a pointer is an offset into the DS selector, which points to the Data of your program.

To access the (real mode) DOS memory, you need a selector that points to the DOS memory. The go32 unit provides you with such a selector: the DosMemSelector variable, as it is conveniently called.

You can also allocate memory in DOS's memory space, using the global_dos_alloc function of the go32 unit. This function will allocate memory in a place where DOS sees it.

As an example, here is a procedure that allocates memory in real mode DOS and returns a selector and a real mode segment for it.

procedure dosalloc(var selector : word;
                   var segment : word;
                   size : longint);

var result : longint;

begin
     result := global_dos_alloc(size);
     selector := word(result);
     segment := word(result shr 16);
end;

(You need to free this memory using the global_dos_free function.)

You can access any place in memory using a selector. You can get a selector using the allocate_ldt_descriptors function, then let this selector point to the physical memory you want using the set_segment_base_address function, and set its length using the set_segment_limit function. You can manipulate the memory pointed to by the selector using the functions of the go32 unit, for instance the seg_fillchar function. After using the selector, you must free it again using the free_ldt_descriptor function.
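
As a small sketch of selector-based access, the program below fills the text screen through the DosMemSelector variable. It only makes sense on the go32v2 target, and it assumes a standard 80x25 colour text mode whose video memory starts at linear address $B8000. seg_fillword is used here as the word-sized counterpart of seg_fillchar.

```pascal
program FillScreen;
{ Sketch for the go32v2 (DOS) target only. }

uses
  go32;

begin
  { Assumption: 80x25 colour text mode at linear address $B8000.
    Each screen cell is one word: low byte = character,
    high byte = attribute. $0741 is the letter 'A' in light grey
    on black. }
  seg_fillword(dosmemselector, $B8000, 80 * 25, $0741);
end.
```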

More information on all this can be found in the Unit reference, the chapter on the go32 unit.



Free Pascal Compiler
2001-09-22