C/C++ Code Optimization Techniques: Rules To Remember


C/C++ Code Optimization Techniques

Rules to Remember
1. Find the bottleneck of your application or module. The 80-20 rule applies here: most of the execution time is spent in a small part of the code. Use a profiling tool to find the bottleneck of your application before you start optimizing.

2. C/C++ compilers are quite efficient and produce optimized binaries for the code as written, so micro-optimizations such as declaring register variables will not be useful. Let the compiler find the best way to use the variables.

3. Some guidelines for optimizing C/C++ code:

a. Move calculations and comparisons whose result does not change between iterations out of the loop.

for (int i=0; i<iMax; i++)
{
    if (Val > CONST_VAL)
    {
        ...
    }
    else
    {
        ....
    }
}

This can be optimized as:

if (Val > CONST_VAL)
{
    for (int i=0; i<iMax; i++)
    {
        .....
    }
}
else
{
    for (int i=0; i<iMax; i++)
    {
        .....
    }
}

b. Avoid type conversions: use the same type for all variables in an expression, including signedness (signed versus unsigned).

unsigned long x, sum;
for (int i = 0; i<Max_I; i++)
{
    sum = x*i;
}

Optimized:

unsigned long x, sum;
for (unsigned long i = 0; i<Max_I; i++)
{
    sum = x*i;
}
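Mixing signed and unsigned operands can also change results, not just cost a conversion. The following is a minimal sketch of that effect. The function names are hypothetical and do not come from the original code.

```cpp
// Hypothetical helpers for illustration only.

// Mixed types: 'i' is implicitly converted to unsigned long before the
// multiplication, so a negative 'i' turns into a very large value.
unsigned long product_mixed(unsigned long x, int i)
{
    return x * i;
}

// Same types: no implicit conversion takes place.
long product_same(long x, long i)
{
    return x * i;
}
```

With these definitions, product_same(2, -1) yields -2, while product_mixed(2, -1) yields a value just below the maximum of unsigned long, because -1 is converted to unsigned long first.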

c. Integer calculations are generally faster than floating-point calculations.

d. Keep the processor's word length in mind. On a 32-bit processor, 32-bit calculations and copies will be at least as fast as 16-bit or 8-bit ones.
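To illustrate guideline c, here is a small sketch that replaces a floating-point scale factor with integer arithmetic. The factor 0.75 and the function names are assumptions chosen for the example. Whether the integer version is actually faster depends on the processor and compiler, so measure before relying on it.

```cpp
// Hypothetical example: scale a value by 0.75.

// Floating-point version: converts to double, multiplies, converts back.
unsigned long scale_float(unsigned long v)
{
    return static_cast<unsigned long>(v * 0.75);
}

// Integer-only version: 0.75 == 3/4, so multiply by 3 and divide by 4.
// Assumes v * 3 does not overflow unsigned long.
unsigned long scale_int(unsigned long v)
{
    return (v * 3) / 4;
}
```

Both versions truncate toward zero, so they agree for the small values this sketch is meant for.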

e. Integer multiplication or division by two (or by any power of two) is fastest using bit-shift operations.
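A minimal sketch of guideline e, assuming unsigned operands. The helper names are hypothetical. For negative signed values a right shift need not give the same result as division, and in line with rule 2, compilers usually make this substitution themselves when the divisor is a constant.

```cpp
// Hypothetical helpers for illustration only.

// Shifting left by one bit doubles an unsigned value
// (wrapping around on overflow, exactly like v * 2 does).
unsigned long times_two(unsigned long v)
{
    return v << 1;
}

// Shifting right by one bit halves an unsigned value,
// discarding the remainder exactly like v / 2 does.
unsigned long half(unsigned long v)
{
    return v >> 1;
}
```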

f. Multiplication and division are generally costlier than addition and subtraction. Try to convert multiplications and divisions into additions and subtractions.

An Example:
Before

extentY, extentX, TopLeftX and TopLeftY are constant within the loop and are of type unsigned long.

#define Target(cx,cy) pOffBits[serverResX * (cy) + (cx)]
#define Source(cx,cy) pBitsI[extentX * (cy) + (cx)]

for (int y = 0; y<extentY; y++)
    for (int x = 0; x<extentX; x++)
    {
        Target(x + TopLeftX, y + TopLeftY) = Source(x,y);
    }

Optimized

unsigned long intermC = serverResX * TopLeftY + TopLeftX;
unsigned long intermZ = 0;
unsigned long intermY = 0;
for (unsigned long y = 0; y<extentY; y++)
{
    unsigned long intermA = intermY + intermC;
    for (unsigned long x = 0; x<extentX; x++)
    {
        pOffBits[intermA + x] = pBitsI[intermZ + x];
    }
    intermZ += extentX;
    intermY += serverResX;
}

Step By Step

1. Replace the #define macros and make the loop variables the same data type as the others.

for (unsigned long y = 0; y<extentY; y++)
    for (unsigned long x = 0; x<extentX; x++)
    {
        pOffBits[serverResX * (y + TopLeftY) + (x + TopLeftX)] = pBitsI[extentX * y + x];
    }

2. Rearrange the calculation.

pOffBits[serverResX * y + x + serverResX * TopLeftY + TopLeftX] = pBitsI[extentX * y + x];

Here (serverResX * TopLeftY + TopLeftX) is constant, so we can take it out of the loop.

unsigned long intermC = serverResX * TopLeftY + TopLeftX;
for (unsigned long y = 0; y<extentY; y++)
    for (unsigned long x = 0; x<extentX; x++)
    {
        pOffBits[serverResX * y + x + intermC] = pBitsI[extentX * y + x];
    }

3. Both serverResX * y and extentX * y are constant within the inner loop, so we can take them out of the inner loop.

unsigned long intermC = serverResX * TopLeftY + TopLeftX;
for (unsigned long y = 0; y<extentY; y++)
{
    unsigned long intermA = serverResX * y;
    unsigned long intermB = extentX * y;
    for (unsigned long x = 0; x<extentX; x++)
    {
        pOffBits[intermA + x + intermC] = pBitsI[intermB + x];
    }
}

4. The multiplication serverResX * y is still performed on every iteration of the outer loop, where y ranges from 0 to extentY. Because y increases by one each time, the product increases by serverResX each time, so we can convert the multiplication to an addition like this:

unsigned long intermC = serverResX * TopLeftY + TopLeftX;
unsigned long intermY = 0;
for (unsigned long y = 0; y<extentY; y++)
{
    unsigned long intermA = intermY;
    unsigned long intermB = extentX * y;
    for (unsigned long x = 0; x<extentX; x++)
    {
        pOffBits[intermA + x + intermC] = pBitsI[intermB + x];
    }
    intermY += serverResX;
}

5. Similarly for extentX * y. We can also get rid of intermA and intermB.

unsigned long intermC = serverResX * TopLeftY + TopLeftX;
unsigned long intermZ = 0;
unsigned long intermY = 0;
for (unsigned long y = 0; y<extentY; y++)
{
    for (unsigned long x = 0; x<extentX; x++)
    {
        pOffBits[intermY + x + intermC] = pBitsI[intermZ + x];
    }
    intermZ += extentX;
    intermY += serverResX;
}

6. Also, since intermY + intermC is constant within the inner loop, we can take it out.

unsigned long intermC = serverResX * TopLeftY + TopLeftX;
unsigned long intermZ = 0;
unsigned long intermY = 0;
for (unsigned long y = 0; y<extentY; y++)
{
    unsigned long intermA = intermY + intermC;
    for (unsigned long x = 0; x<extentX; x++)
    {
        pOffBits[intermA + x] = pBitsI[intermZ + x];
    }
    intermZ += extentX;
    intermY += serverResX;
}

This is the optimized algorithm. Note: all data types are the same here.
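When rewriting a loop in this way, it is worth checking that the result is unchanged. The sketch below wraps the straightforward loop (from step 1) and the optimized loop (from step 6) in functions and compares their output. The buffer element type (unsigned char), the serverResY parameter and all function names are assumptions made for this example, since the original does not state them.

```cpp
#include <cstddef>
#include <vector>

// Straightforward version from step 1: two multiplications per pixel.
void copy_before(std::vector<unsigned char>& pOffBits,
                 const std::vector<unsigned char>& pBitsI,
                 unsigned long serverResX,
                 unsigned long extentX, unsigned long extentY,
                 unsigned long TopLeftX, unsigned long TopLeftY)
{
    for (unsigned long y = 0; y < extentY; y++)
        for (unsigned long x = 0; x < extentX; x++)
            pOffBits[serverResX * (y + TopLeftY) + (x + TopLeftX)] =
                pBitsI[extentX * y + x];
}

// Optimized version from step 6: only additions inside the loops.
void copy_optimized(std::vector<unsigned char>& pOffBits,
                    const std::vector<unsigned char>& pBitsI,
                    unsigned long serverResX,
                    unsigned long extentX, unsigned long extentY,
                    unsigned long TopLeftX, unsigned long TopLeftY)
{
    unsigned long intermC = serverResX * TopLeftY + TopLeftX;
    unsigned long intermZ = 0;
    unsigned long intermY = 0;
    for (unsigned long y = 0; y < extentY; y++)
    {
        unsigned long intermA = intermY + intermC;
        for (unsigned long x = 0; x < extentX; x++)
        {
            pOffBits[intermA + x] = pBitsI[intermZ + x];
        }
        intermZ += extentX;
        intermY += serverResX;
    }
}

// Hypothetical check: fill a source image with a pattern, run both
// versions into separate targets and report whether the targets match.
// The caller must keep the copied rectangle inside the target, i.e.
// TopLeftX + extentX <= serverResX and TopLeftY + extentY <= serverResY.
bool same_result(unsigned long serverResX, unsigned long serverResY,
                 unsigned long extentX, unsigned long extentY,
                 unsigned long TopLeftX, unsigned long TopLeftY)
{
    std::vector<unsigned char> src(extentX * extentY);
    for (std::size_t i = 0; i < src.size(); i++)
        src[i] = static_cast<unsigned char>(i * 7 + 1);

    std::vector<unsigned char> a(serverResX * serverResY, 0);
    std::vector<unsigned char> b(serverResX * serverResY, 0);
    copy_before(a, src, serverResX, extentX, extentY, TopLeftX, TopLeftY);
    copy_optimized(b, src, serverResX, extentX, extentY, TopLeftX, TopLeftY);
    return a == b;
}
```

For example, same_result(16, 12, 5, 4, 3, 2) copies a 5 x 4 block to position (3, 2) of a 16 x 12 target with both versions and compares the two targets.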

Comments, questions, please mail me.
Akhilesh Singh
Last Updated 8th January 2005
Copyright 2006 - 2009
