Delphi: How to use virtual Methods and Polymorphism Part 2

Date: 31-Oct-03
Category: Algorithm
Language: Delphi 2.x
Publisher: DSP, Administrator
Reference URL: DKB
Author: Danny Thorpe

Smart Linking

In Part I we explored the magic of polymorphism and its Object Pascal 
implementation, the virtual method. We discovered that the indicator of which 
virtual method to invoke on the instance data is stored in the instance data 
itself. 

In this installment, we conclude our exploration with a discussion of abstract 
interfaces and how virtual methods can defeat and enhance "smart linking." 

Abstract Interfaces

An abstract interface is a class type that contains no implementation and no data - 
only abstract virtual methods. Abstract interfaces allow you to completely separate 
the user of the interface from the implementation of the interface. 

And I do mean completely separate; with abstract interfaces, you can have an object 
implemented in a DLL and used by routines in an .EXE, just as if the object were 
implemented in the .EXE itself. Abstract interfaces can bridge: 

conceptual barriers within an application,
logistical barriers between an application and a DLL,
language barriers between applications written in different programming languages, and
address space barriers that separate Win32 processes.

In all cases, the client application uses the interface class just as it would any 
class it implemented itself. 

Let's now take a closer look at how an abstract interface class can bridge the gap 
between an application and a DLL. (By the way, abstract interfaces are the 
foundation of OLE programming.) 

Importing Objects from DLLs: The Hard Way. If you want an application to use a 
function in a DLL, you must create a "fake" function declaration that tells the 
compiler what it needs to know about the parameter list and result type of the 
function. Instead of a method body, this fake function declaration contains a 
reference to a DLL and function name. The compiler sees these and knows what code 
to generate to call the proper address in the DLL at run time. 
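
As a minimal sketch of such a "fake" declaration, the program below imports one function from a DLL. It uses the Win32 API function GetTickCount from kernel32.dll only because that DLL is present on every Windows machine; the program name is made up for illustration, and your own DLL and function names would go in their place.

```pascal
program ImportDemo;
{$APPTYPE CONSOLE}

{ The "fake" declaration: a parameter list and result type, but no body.
  Instead of a body, it names the DLL and the exported function. }
function GetTickCount: Cardinal; stdcall;
  external 'kernel32.dll' name 'GetTickCount';

begin
  { The compiler generates a call to the address in the DLL at run time. }
  WriteLn('Milliseconds since Windows started: ', GetTickCount);
end.
```

Every additional DLL routine needs another declaration of this kind, which is why the approach becomes tedious for an object with many methods.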

To have an application use an object that's implemented in a DLL, you could do 
essentially the same thing, declaring a separate function for each object method in 
the DLL. As the number of methods in the DLL object increases, however, keeping 
track of all those functions will become a chore. To make things a little easier to 
manage, you could set up the DLL to give you (the client application) an array of 
function pointers that you would use to call any of the DLL functions associated 
with a particular DLL class type. 

You can see where this is headed. A Virtual Method Table is precisely an array of 
function pointers (we discussed the VMT last month). Why do things the hard way 
when the compiler can do the dirty work for you? 

Importing Objects from DLLs: The Smart Way. The client module (the application) 
requires a class declaration that will make the compiler "visualize" a VMT that 
matches the desired DLL's array of function pointers. Enter the abstract interface 
class. The class contains a horde of virtual; abstract; method declarations in the 
same order as the functions in the DLL's array of function pointers. Of course, the 
abstract method declarations need parameter lists that match the DLL's functions 
exactly. 

Now you can fetch the array of function pointers from the DLL and typecast a 
pointer to that array into your application's abstract interface class type. (Okay; 
it actually needs to be a pointer to a pointer to an array of function addresses. 
The first pointer simulates the object instance, the second pointer simulates the 
VMT pointer embedded in the instance data, but who's counting?) 

With this typecast in place, the compiler will think you have an instance of that 
class type. When the compiler sees a method call on that typecast pointer, it will 
generate code to push the parameters on the stack, then look up the nth virtual 
method address in the "instance's VMT" (the pointer to the function table provided 
by the DLL), and call that address. Voilà! Your application is using an "object" 
that lives in a DLL as easily as one of its own classes. 
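
The typecast can be tried out inside a single program, without any DLL. The sketch below is illustrative only: TCalc, PlainTwice, Table, and FakeInstance are hypothetical names, and it assumes Delphi's native class layout (the first declared virtual method in slot 0 of the VMT) and the default register calling convention. It is not expected to work with other Pascal compilers, whose VMT layout may differ.

```pascal
program VmtCastDemo;
{$APPTYPE CONSOLE}

type
  { The abstract interface as the client sees it (hypothetical). }
  TCalc = class
    function Twice(X: Integer): Integer; virtual; abstract;
  end;

{ A plain routine standing in for a function inside the DLL.
  The hidden Self parameter of a method call arrives first. }
function PlainTwice(Self: Pointer; X: Integer): Integer;
begin
  Result := X * 2;
end;

var
  Table: array[0..0] of Pointer; { the "array of function pointers" }
  FakeInstance: Pointer;         { simulates instance data holding a VMT pointer }
  Calc: TCalc;
begin
  Table[0] := @PlainTwice;
  FakeInstance := @Table;        { the "VMT pointer" points at the table }
  Calc := TCalc(@FakeInstance);  { pointer to a pointer to the table }

  { The compiler looks up slot 0 of the "VMT" and calls that address. }
  WriteLn(Calc.Twice(21));
end.
```

Note that nothing here ever constructs a TCalc; the compiler simply trusts the typecast.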

Exporting Objects from DLLs. Now for the flip side. Where does the DLL get that 
array of function pointers? From the compiler, of course! On the DLL side, create a 
class type with virtual methods with the same order and parameter lists as defined 
by the "red-herring" array of function pointers, and implement those methods to 
perform the tasks of that class. Then implement and export a simple function from 
the DLL that creates an instance of the DLL's class and returns a pointer to it. 
Again, Voilà! Your DLL is exporting an object that can be used by any application 
that can handle pointers to arrays of function addresses. Also known as objects! 

Abstract Interfaces Link User and Implementor. Here's the clincher. How do you 
guarantee that the order and parameter lists of the methods in the application's 
abstract interface class exactly match the methods implemented in the DLL? 

Simple. Declare the DLL class as a descendant of the abstract interface class used 
by the application, and override all the abstract virtual methods. The abstract 
interface is shared between the application and the DLL; the implementation is 
contained entirely within the DLL. 
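
Here is a rough sketch of that arrangement, squeezed into one program so it can be compiled and run on its own. All names (TCalculator, TCalculatorImpl, CreateCalculator) are hypothetical. In a real project, the abstract class would sit in a unit used by both modules, while the descendant and the creation function would exist only in the DLL, with the function listed in the DLL's exports clause.

```pascal
program SharedInterfaceDemo;
{$APPTYPE CONSOLE}

type
  { Shared by application and DLL: no data, no implementation. }
  TCalculator = class
  public
    function Add(A, B: Integer): Integer; virtual; abstract;
    function Negate(A: Integer): Integer; virtual; abstract;
  end;

  { Implementor side only. Descending from the abstract class keeps the
    method order and parameter lists in step with the client's view. }
  TCalculatorImpl = class(TCalculator)
  public
    function Add(A, B: Integer): Integer; override;
    function Negate(A: Integer): Integer; override;
  end;

function TCalculatorImpl.Add(A, B: Integer): Integer;
begin
  Result := A + B;
end;

function TCalculatorImpl.Negate(A: Integer): Integer;
begin
  Result := -A;
end;

{ The simple creation function the DLL would export. It returns the
  abstract type, so the client never sees TCalculatorImpl. }
function CreateCalculator: TCalculator;
begin
  Result := TCalculatorImpl.Create;
end;

var
  Calc: TCalculator;
begin
  Calc := CreateCalculator;
  WriteLn('2 + 3 = ', Calc.Add(2, 3));
  WriteLn('Negate(7) = ', Calc.Negate(7));
  Calc.Free;
end.
```

The client code in the main block touches only TCalculator and CreateCalculator, which is all an application importing the object from a DLL would need to know about.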

Abstract Interfaces Cross Language Boundaries. This can also be done between 
modules written in different languages. The Microsoft Component Object Model (COM) 
is a language-independent specification that allows different programming languages 
to share objects as just described. At its core, COM is simply a specification for 
how an array of function pointers should be arranged and used. COM is the 
foundation of OLE. 

Since Delphi's native class type implementation conforms to COM specifications, 
there is no conversion required for Delphi applications to use COM objects, nor any 
conversion required for Delphi applications to expose COM objects for other modules 
to use. 

Of course, when dealing with multiple languages, you won't have the luxury of 
sharing the abstract interface class between the modules. You'll have to translate 
the abstract interface class into each language, but this is a small price to pay 
for the ability to share the implementation. 

The Delphi IDE is built entirely upon abstract interfaces, allowing the IDE main 
module to communicate with the editor and debugger kernel DLLs (implemented in 
BC++), and with the multitude of component design-time tools that live in the 
component library (CMPLIB32.DCL) and installable expert modules. 

Virtuals Defeat Smart Linking

When the Delphi compiler/linker produces an .EXE, the procedures, variables, and 
static methods that are not referenced by "live" code (code that is actually used) 
will be left out of the .EXE file. This process is called smart linking, and is a 
great improvement over normal linkers that merely copy all code into the .EXE 
regardless of whether it's actually needed. The result of smart linking is a 
smaller .EXE on disk that requires less memory to run. 

Smart Linking Rule for Virtuals. If the type information of a class is touched (for 
example, by constructing an instance) by live code, all the virtual methods of the 
class and its ancestors will be linked into the .EXE, regardless of whether the 
program actually uses the virtual methods. 

For the compiler, keeping track of whether an individual procedure is ever used in 
a program is relatively simple; figuring out whether a virtual method is used 
requires a great deal more analysis of the descendants and ancestors of the class. 
It's not impossible to devise a scheme to determine if a particular virtual method 
is never used in any descendants of a class type, but such a scheme would certainly 
require a lot more CPU cycles than normal smart linking, and the resulting 
reduction in code size would rarely be dramatic. For these reasons (lots of work, 
greatly reduced compile/link speed, and diminishing returns), adding smart linking 
of virtual methods to the Delphi linker has not been a high priority for Borland. 

If your class has a number of utility methods that you don't expect to use all the 
time, leaving them static will allow the smart linker to omit them from the final 
.EXE if they are not used by your program. 
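
A small sketch of this advice follows. TReport and its methods are hypothetical, and the "linked in" comments follow the smart linking rule described above rather than an inspection of an actual map file.

```pascal
program StaticUtilityDemo;
{$APPTYPE CONSOLE}

type
  TReport = class
  public
    { Virtual: linked in as soon as TReport is constructed in live code. }
    procedure Print; virtual;
    { Static utility: linked in only if live code actually calls it. }
    procedure ExportAsText;
  end;

procedure TReport.Print;
begin
  WriteLn('Printing report');
end;

procedure TReport.ExportAsText;
begin
  WriteLn('Exporting report as text');
end;

var
  R: TReport;
begin
  R := TReport.Create;
  R.Print;  { ExportAsText is never called, so the smart linker can drop it }
  R.Free;
end.
```

Had ExportAsText been declared virtual, constructing TReport would have been enough to pull it, and everything it uses, into the .EXE.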

Note that including virtual methods involves more than just the bytes of code in 
the method bodies. Anything that a virtual method uses or calls (including static 
methods) must also be linked into the .EXE, as well as anything those routines use, 
etc. Through this cascade effect, one method could potentially drag hundreds of 
other routines into the .EXE, sometimes at a cost of hundreds of thousands of bytes 
of additional code and data. If most of these support routines are used only by 
your unused virtual method, you have a lot of deadwood in your .EXE. 

The best general strategy for keeping unused virtual methods - and their associated 
deadwood - under control is to declare virtual methods sparingly. It's easier to 
promote an existing static method to virtual when a clear need arises than to 
demote virtual methods to statics at some late stage of your 
development cycle. 

Virtuals Enhance Smart Linking

Smart linking of virtuals is a two-edged sword: What is so often cursed for 
bloating executables with unused code can also be exploited to greatly reduce the 
amount of code in an executable in certain circumstances - even beyond what smart 
linking could normally achieve with ordinary static methods and procedures. The key 
is to turn the smart linking rule for virtuals inside out: 

Inverse Smart Linking Rule for Virtuals. If the type information of a class is not 
touched by live code, then none of that class' virtual methods will be linked into 
the executable. Even if those virtual methods are called polymorphically by live 
code! 

In a virtual method call, the compiler emits machine code to grab the VMT pointer 
from the instance data, and to call an address stored at a particular offset in the 
VMT. The compiler can't know exactly which method body will be called at run time, 
so the act of calling a virtual method does not cause the smart linker to pull any 
method bodies corresponding to that virtual method identifier into the final 
executable. 
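
To make the "VMT pointer in the instance data" a little more tangible, the sketch below peeks at the first pointer-sized field of an instance and compares it with the class reference returned by ClassType. The classes reuse the gadget names from the figures, but the program itself is only an illustration; PPointer is declared locally because older Delphi versions do not predefine it.

```pascal
program VmtPointerDemo;
{$APPTYPE CONSOLE}

type
  PPointer = ^Pointer; { declared here; newer Delphi versions also provide it }

  TBaseGadget = class
    procedure Whirr; virtual;
  end;

  TKitchenGadget = class(TBaseGadget)
    procedure Whirr; override;
  end;

procedure TBaseGadget.Whirr;
begin
  WriteLn('Base whirr');
end;

procedure TKitchenGadget.Whirr;
begin
  WriteLn('Kitchen whirr');
end;

var
  X: TBaseGadget;
begin
  X := TKitchenGadget.Create;

  { The instance data begins with the VMT pointer, which is the same
    value that ClassType hands back as a class reference. }
  if PPointer(X)^ = Pointer(X.ClassType) then
    WriteLn('Instance data starts with the VMT pointer of ', X.ClassName);

  { The call site only knows "slot of Whirr in whatever VMT X carries";
    the method body is chosen at run time. }
  X.Whirr;
  X.Free;
end.
```

The variable X is declared as TBaseGadget, yet the call reaches TKitchenGadget.Whirr, because the decision is made through the VMT stored in the instance rather than by the compiler at the call site.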

The same is true for dynamic methods. The act of constructing an instance of the 
class is what cues the linker to pull in the virtual methods of that particular 
class and its ancestors. This saves the program from the painful death that would 
surely result from calling virtual methods that were not linked into the program. 
After all, how could you possibly call a virtual method of an object instance 
defined and implemented in your program if you did not first construct said 
instance? The answer is: you can't. If you obtained the object instance from some 
external source, e.g. a DLL, then the virtual methods of that instance are in the 
DLL, not your program. 

So, if you have code that calls virtual methods of a class that is never 
constructed by routines used in the current project, none of the code associated 
with those virtual methods will be linked into the final executable. 

The code in Figure 1 will cause the linker to pull in all the virtual methods of 
TKitchenGadget and TOfficeManager, because those classes are constructed in live 
code (the main program block), and all the virtual methods of TBaseGadget, because 
it's the ancestor of TKitchenGadget.

type
  TBaseGadget = class
    constructor Create;
    procedure Whirr; virtual; { Linked in: YES }
  end;

  TOfficeGadget = class(TBaseGadget)
    procedure Whirr; override; { Linked in: NO }
    procedure Buzz; { Linked in: NO }
    procedure Pop; virtual; { Linked in: NO }
  end;

  TKitchenGadget = class(TBaseGadget)
    procedure Whirr; override; { Linked in: YES }
  end;

  TOfficeManager = class
  private
    FOfficeGadget: TOfficeGadget;
  public
    procedure InstantiateGadget; { Linked in: NO }
    { Linked in: YES }
    procedure Operate(AGadget: TOfficeGadget); virtual;
  end;

  { ... Non-essential code omitted ... }

procedure TOfficeManager.InstantiateGadget;
begin { Dead code, never called }
  FOfficeGadget := TOfficeGadget.Create;
end;

procedure TOfficeManager.Operate(AGadget: TOfficeGadget);
{ Live code, virtual method of a constructed class }
begin
  AGadget.Whirr;
end;

var
  X: TBaseGadget;
  M: TOfficeManager;
begin
  X := TKitchenGadget.Create;
  M := TOfficeManager.Create;

  X.Free;
  M.Free;
end.

Figure 1: Inverse virtual smart linking: TOfficeGadget.Whirr will not be linked 
into this program, although Whirr is touched by the live method 
TOfficeManager.Operate. 

Because TOfficeManager.Operate is virtual, its method body is all live code (even 
though Operate is never called). Therefore, the call to AGadget.Whirr is a live 
reference to the virtual method Whirr. However, TOfficeGadget is not constructed in 
live code in this example - TOfficeManager.InstantiateGadget is never used. Nothing 
of TOfficeGadget will be linked into this program, even though a live routine 
contains a call to Whirr through a variable of type TOfficeGadget.

Variations on a Theme. Let's see how the scenario changes with a few slight code 
modifications. The code in Figure 2 adds a call to AGadget.Buzz in the 
TOfficeManager.Operate method. Notice that the body of TOfficeGadget.Buzz is now 
linked in, but TOfficeGadget.Whirr is still not. Buzz is a static method, so any 
live reference to it will link in the corresponding code, even if the class is 
never constructed. 

type
  TBaseGadget = class
    constructor Create;
    procedure Whirr; virtual; { Linked in: YES }
  end;

  TOfficeGadget = class(TBaseGadget)
    procedure Whirr; override; { Linked in: NO }
    procedure Buzz; { Linked in: YES }
    procedure Pop; virtual; { Linked in: NO }
  end;

  TKitchenGadget = class(TBaseGadget)
    procedure Whirr; override; { Linked in: YES }
  end;

  TOfficeManager = class
  private
    FOfficeGadget: TOfficeGadget;
  public
    procedure InstantiateGadget; { Linked in: NO }
    { Linked in: YES }
    procedure Operate(AGadget: TOfficeGadget); virtual;
  end;

  { ... Non-essential code omitted ... }

procedure TOfficeManager.InstantiateGadget;
begin { Dead code, never called }
  FOfficeGadget := TOfficeGadget.Create;
end;

procedure TOfficeManager.Operate(AGadget: TOfficeGadget);
{ Live code, virtual method of a constructed class }
begin
  AGadget.Whirr;
  AGadget.Buzz; { This touches the static method body }
end;

var
  X: TBaseGadget;
  M: TOfficeManager;
begin
  X := TKitchenGadget.Create;
  M := TOfficeManager.Create;

  X.Free;
  M.Free;
end.

Figure 2: Notice how the addition of a call to the static Buzz method affects its 
linked-in status. TOfficeGadget.Whirr is still not included. 

The code in Figure 3 adds a call to the static method 
TOfficeManager.InstantiateGadget. This brings the construction of the TOfficeGadget 
class into the live code of the program, which brings in all the virtual methods of 
TOfficeGadget, including TOfficeGadget.Whirr (which is called by live code) and 
TOfficeGadget.Pop (which isn't). If you deleted the call to AGadget.Buzz, the 
TOfficeGadget.Buzz method would become dead code again. Static methods are linked 
in only if they are used in live code, regardless of whether their class type is 
used. 

type
  TBaseGadget = class
    constructor Create;
    procedure Whirr; virtual; { Linked in: YES }
  end;

  TOfficeGadget = class(TBaseGadget)
    procedure Whirr; override; { Linked in: YES }
    procedure Buzz; { Linked in: YES }
    procedure Pop; virtual; { Linked in: YES }
  end;

  TKitchenGadget = class(TBaseGadget)
    procedure Whirr; override; { Linked in: YES }
  end;

  TOfficeManager = class
  private
    FOfficeGadget: TOfficeGadget;
  public
    procedure InstantiateGadget; { Linked in: YES }
    { Linked in: YES }
    procedure Operate(AGadget: TOfficeGadget); virtual;
  end;

  { ... Non-essential code omitted ... }

procedure TOfficeManager.InstantiateGadget;
begin { Live code }
  FOfficeGadget := TOfficeGadget.Create;
end;

procedure TOfficeManager.Operate(AGadget: TOfficeGadget);
{ Live code, virtual method of a constructed class }
begin
  AGadget.Whirr;
  AGadget.Buzz; { This touches the static method body }
end;

var
  X: TBaseGadget;
  M: TOfficeManager;
begin
  X := TKitchenGadget.Create;
  M := TOfficeManager.Create;

  M.InstantiateGadget;

  X.Free;
  M.Free;
end.

Figure 3: With a call to InstantiateGadget, the construction of TOfficeGadget 
becomes live and all of TOfficeGadget's virtual methods are linked. 

Life in the Real World. Let's examine a slightly more complex (and more 
interesting) example of this virtual smart linking technique inside the VCL. 

The Delphi streaming system has two parts: TReader and TWriter, which descend from 
a common ancestor, TFiler:

TReader contains all the code needed to load components from a stream. 
TWriter contains everything needed to write components to a stream. 

These classes were split because many Delphi applications never need to write 
components to a stream - most applications only read forms from resource streams at 
program startup. If the streaming system were implemented in one class, all your 
applications would wind up carrying around all the stream output code, although 
many don't need it. 

So, splitting the streaming system into two classes improved smart linking. End of 
story? Not quite. 

In a careful examination of the code linked into a typical Delphi application, the 
Delphi R&D team noticed that bits of TWriter were being linked into the .EXE. This 
seemed odd, because TWriter was definitely never instantiated in the test program. 
Some of those TWriter bits touched a lot of other bits that piled up rather quickly 
into a lot of unused code. Let's backtrack a little to see what led to this code 
getting into the .EXE, and its surprising solution. 

Delphi's TComponent class defines virtual methods that are responsible for reading 
and writing the component's state in a stream, using TReader and TWriter classes. 
Because TComponent is the ancestor of just about everything of importance in 
Delphi, TComponent is almost always linked into your Delphi programs, along with 
all the virtual methods of TComponent.

Some of TComponent's virtual methods use TWriter methods to write the component's 
properties to a stream. Those TWriter methods were static methods. 

Therefore, TComponent virtual methods are always included in Delphi form-based 
applications, and some of those virtual methods (e.g. TComponent.WriteState) call 
static methods of TWriter (e.g. TWriter.WriteData). Thus, those static method 
bodies of TWriter were being linked into the .EXE. TWriter.WriteData is the kingpin 
method that drives the entire stream output system, so when it is linked in, almost 
all the rest of TWriter tags along (everything, ironically, except TWriter.Create). 

The solution to this code bloat (caused indirectly by the TComponent.WriteState 
virtual method) may throw you for a loop: To eliminate the unneeded TWriter code, 
make more methods of TWriter (e.g. WriteData) virtual! 

The all-or-none clumping of virtual methods that we curse for working against the 
smart linker can be used to our advantage, so that TWriter methods that must be 
called by live code are not actually included unless TWriter itself is instantiated 
in the program. Because methods such as TWriter.WriteData are always used when you 
use a TWriter, and TWriter is a mule class (no descendants), there is no 
appreciable cost to making TWriter.WriteData virtual. 

The benefits, however, are appreciable: Making TWriter.WriteData virtual shaved 
nearly 10KB off the size of a typical Delphi 2 .EXE. Thanks to this and other code 
trimming tricks, Delphi 2 packs more standard features (e.g. form inheritance and 
form linking) into smaller .EXEs than Delphi 1. 
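
The trick can be reduced to a toy example. TMiniComponent and TMiniWriter below are hypothetical stand-ins, not the actual VCL classes, and the comments describe what the inverse smart linking rule predicts rather than a measured result.

```pascal
program VirtualWriterDemo;
{$APPTYPE CONSOLE}

type
  TMiniWriter = class
  public
    { Virtual, so its body is linked in only if TMiniWriter is constructed.
      Declared static, it would be linked in by the call in WriteState. }
    procedure WriteData(Value: Integer); virtual;
  end;

  TMiniComponent = class
  public
    procedure WriteState(Writer: TMiniWriter); virtual;
  end;

procedure TMiniWriter.WriteData(Value: Integer);
begin
  WriteLn('Writing ', Value);
end;

procedure TMiniComponent.WriteState(Writer: TMiniWriter);
begin
  { Live code, because it is a virtual method of a constructed class.
    The call below refers to a VMT slot, not to a method body. }
  Writer.WriteData(42);
end;

var
  C: TMiniComponent;
begin
  C := TMiniComponent.Create; { TMiniWriter is never constructed }
  WriteLn('Component created; no writer is used in this program');
  C.Free;
end.
```

A detailed map file, or a breakpoint set in TMiniWriter.WriteData, is a reasonable way to check whether the method body really was left out of your build.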

What's Really in Your Executables? The simplest way to find out if a particular 
routine is linked into a particular project is to set a breakpoint in the body of 
that routine and run the program in the debugger. If the routine is not linked into 
the .EXE, the debugger will complain that you have set an invalid breakpoint. 

To get a complete picture of what's in your .EXE or DLL, configure the linker 
options to emit a detailed map file. From Delphi's main menu, select Project | 
Options to display the Project Options dialog box. Select the Linker tab. In the 
Map File group box, select Detailed. Now recompile your project. The map file will 
contain a list of the names of all the routines (from units compiled with $D+ 
debug information) that were linked into the .EXE. 

Because the 32-bit Delphi Compiled Unit (.DCU) file has none of the capacity 
limitations associated with earlier, 16-bit versions of the Borland Pascal product 
line, there is little reason to ever turn off debug symbol information storage in 
the .DCU. Leave the $D, $L, and $Y compiler switches enabled at all times so the 
information is available when you need it in the integrated debugger, map file, or 
object browser. (If hard disk space is a problem, collect the loose change beneath 
the cushions of your sofa and buy a new 1GB hard drive.) 

Novelty of Inverse Virtual Smart Linking. This technique of using virtual methods 
to improve smart linking is not unique to Delphi, but because Delphi's smart linker 
has a much finer granularity than other compiler products, this technique is much 
more effective in Delphi than in other products. 

Most compilers produce intermediate code and limited symbol information in an .OBJ 
format, and most linkers' atom of granularity for smart linking is the .OBJ file. 
If you touch something inside a library of routines stored in one .OBJ module, the 
entire .OBJ module is linked into the .EXE. Thus, C and C++ libraries are often 
broken into swarms of little .OBJ modules in the hope of minimizing dead code in 
the .EXE. 

Delphi's linker granularity is much finer - down to individual variables, 
procedures, and classes. If you touch one routine in a Delphi unit that contains 
lots of routines, only the thing you touch (and whatever it uses) is linked into 
the .EXE. Thus, there is no penalty for creating large libraries of 
topically-related routines in one Delphi unit. What you don't use will be left out 
of the .EXE. 

Developing clever techniques to avoid touching individual routines or classes is 
generally more rewarding in Delphi than in most other compiled languages. In other 
products, the routines you so carefully avoided will probably be linked into the 
.EXE anyway because you are still using one of the other routines in the same 
module. Measuring with a micrometer is futile when your only cutting tool is a 
chainsaw. 

Conclusion

Virtual methods are often maligned for bloating applications with unnecessary code. 
While it's true that virtuals can drag in code that your application doesn't need, 
this series has shown that careful and controlled use of virtual methods can 
achieve greater smart linking efficiency than would be possible with static methods 
alone.