and adds a header at block start; the release version of new doesn't perform either of these
tasks. Furthermore, a release version of an executable might have been optimized already
in several ways, including the elimination of unnecessary temporary objects, loop
unrolling (see the sidebar "A Few Compiler Tricks"), moving objects to the registers, and
inlining. For these reasons, you cannot reliably deduce from a debug version where the
performance bottlenecks are actually located.
A Few Compiler Tricks
A compiler can automatically optimize the code in several ways. The
named return value and loop unrolling are two instances of such
automatic optimizations.
Consider the following code:
int *buff = new int[3];
for (int i = 0; i < 3; i++)
  buff[i] = 0;
This loop is inefficient. The only useful work in each iteration is the
assignment of a value to the next array element, yet CPU time is also spent
on testing and incrementing the counter and on performing a jump. To
avoid this overhead, the compiler can unroll the loop into a sequence of
three assignment statements, as follows:
buff[0] = 0;
buff[1] = 0;
buff[2] = 0;
The named return value optimization is specific to C++. It eliminates the
construction and destruction of a temporary object. When a temporary
object is copied to another object using a copy constructor, and when both
these objects are cv-unqualified, the Standard allows the implementation
to treat the two objects as one, and not perform a copy at all. For example:
class A
{
public:
  A();
  ~A();
  A(const A&);
  A& operator=(const A&); // returns a reference, as is conventional
};
A f()
{
A a;
return a;
}
A a2 = f();
The object a does not need to be copied when f() returns. Instead, the
return value of f() can be constructed directly into the object a2, thereby
avoiding both the construction and destruction of a temporary object on
the stack.
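One way to observe this optimization is to count copy-constructor calls. The following is a minimal sketch, not part of the original example: the class Tracked, its static counter, and both functions are hypothetical names introduced for illustration. The count it reports depends on the compiler and its settings, so no particular value is promised beyond the upper bound of two.

```cpp
#include <cassert>

// Hypothetical instrumented class: counts copy-constructor calls.
class Tracked
{
public:
  Tracked() {}
  Tracked(const Tracked&) { ++copies; }
  Tracked& operator=(const Tracked&) { return *this; }
  static int copies;
};
int Tracked::copies = 0;

Tracked make_tracked()
{
  Tracked t;  // named local object
  return t;   // candidate for the named return value optimization
}

// Returns the number of copy-constructor calls triggered by one initialization.
int copies_for_one_initialization()
{
  Tracked::copies = 0;
  Tracked t2 = make_tracked();
  (void)t2;  // silence unused-variable warnings
  // Without any elision, at most two copies occur:
  // local object -> return value, and return value -> t2.
  assert(Tracked::copies <= 2);
  return Tracked::copies;
}
```

A result of 0 suggests that the compiler constructed the local object directly in t2. A larger result suggests that at least one copy was actually performed.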
Remember also that debugging and optimization are two distinct operations. Use the debug
version to trap bugs and to verify that the program is free from logical
errors. Use the tested release version for performance tuning and
optimization. Of course, applying the code optimization techniques that are presented in
this chapter can enhance the performance of the debug version as well, but the release
version is the one on which performance must be evaluated.
NOTE: It is not uncommon to find a "phantom bottleneck" in the debug
version, which the programmer strains hard to fix, only to discover later
that it has disappeared anyway in the release version. Andrew Koenig
wrote an excellent article that tells the story of an evasive bottleneck that
automatically dissolved in the release version ("An Example of Hidden
Library Overhead", C++ Report vol. 10:2, February 1998, page 11). The
lesson that can be learned from this article is applicable to everyone who
practices code optimization.
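To check whether a suspected bottleneck survives in the release version, it can help to time the code in question directly. The sketch below is one possible approach, assuming a compiler that supports the C++11 header <chrono>; the workload sum_squares() and the wrapper time_sum_squares() are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical workload, standing in for the code under suspicion.
long long sum_squares(int n)
{
  long long total = 0;
  for (int i = 0; i < n; ++i)
    total += static_cast<long long>(i) * i;
  return total;
}

// Runs the workload once, stores its result in 'result', and returns the
// elapsed wall-clock time in microseconds.
long long time_sum_squares(int n, long long& result)
{
  std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
  result = sum_squares(n);
  std::chrono::steady_clock::time_point stop = std::chrono::steady_clock::now();
  assert(stop >= start);  // steady_clock never goes backward
  return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Building the same source once without optimization and once with it (for example, with -O2 on GCC or Clang, or /O2 on Visual C++) and comparing the two timings shows how much of the measured cost belongs to the build configuration rather than to the code itself. An optimizer may simplify a workload this small considerably, which is itself an instance of the effect described above.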
Declaration Placement
The placing of declarations of variables and objects in the program can have significant
performance effects. Likewise, choosing between the postfix and prefix operators can
also affect performance. This section concentrates on four issues: initialization versus
assignment, relocation of declarations to the part of the program that actually uses them, a
constructor's member initialization list, and prefix versus postfix operators.
Prefer Initialization to Assignment
C (before the C99 standard) allows declarations only at a block's beginning, before any
program statements. For example:
void f();
void g()
{
int i;
double d;
char * p;
f();
}
In C++, a declaration is a statement; as such, it can appear almost anywhere within the
program. For example
void f();
void g()
{
int i;
f();
double d;
char * p;
}
The motivation for this change in C++ was to allow for declarations of objects right
before they are used. There are two benefits to this practice. First, this practice guarantees
that an object cannot be tampered with by other parts of the program before it has been
used. When objects are declared at the block's beginning and are used only 20 or 50 lines
later, there is no such guarantee. For instance, a pointer to an object that was allocated on
the free store might be accidentally deleted somewhere before it is actually used.
Declaring the pointer right before it is used, however, reduces the likelihood of such
mishaps.
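The first benefit can be sketched as follows. This is an illustrative fragment only: prepare() and late_declaration() are hypothetical names, and prepare() merely stands in for whatever unrelated work precedes the first use of the pointer.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, standing in for unrelated work done earlier in the block.
int prepare() { return 42; }

int late_declaration()
{
  int n = prepare();  // unrelated work comes first; the pointer does not exist yet

  // Declared and initialized at the point of first use: no earlier statement
  // in this block can delete or reassign the pointer by mistake.
  std::string* p = new std::string("data");
  assert(!p->empty());
  int len = static_cast<int>(p->size());
  delete p;
  return n + len;
}
```

Had p been declared at the top of the block, every statement between the declaration and the first use would have been a place where it could be misused.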
The second benefit in declaring objects right before their usage is the capability to
initialize them immediately with the desired value. For example
#include <string>
using namespace std;
void func(const string& s)
{
  bool emp = s.empty(); // a local declaration enables immediate initialization
}
For fundamental types, initialization is at best marginally more efficient than assignment,
and the two are often identical in terms of performance. Consider the following
version of func(), which applies assignment rather than initialization:
void func2() //less efficient than func()? Not necessarily
{
string s;
bool emp;
emp = s.empty(); //late assignment
}
My compiler produces the same assembly code as it did with the initialization version.
However, as far as user-defined types are concerned, the difference between initialization
and assignment can be quite noticeable. The following example demonstrates the