
PF Datatypes Arrays - 11

The document discusses integer data types and how they are represented in binary and stored in memory on computers. It covers unsigned short, unsigned int, and unsigned long data types and how many bytes each uses to store integers. It also discusses issues that can arise from arithmetic operations when values exceed the range of the data type.

Uploaded by

muhammad shoaib
Copyright
© © All Rights Reserved

DATA TYPES & ARRAYS

INTEGER DATA TYPE


Clock arithmetic
• Suppose we have a clock face, but define 12 o’clock as
“0” o’clock
• The Europeans and military already do this…

• You know that:


• 5 hours after 9 o’clock is 2 o’clock
• 7 hours before 3 o’clock is 8 o’clock
• Specifically:
• 1 hour before 0 o’clock is 11 o’clock
• 1 hour after 11 o’clock is 0 o’clock

• This is arithmetic modulo 12


int and long
• We have seen integer data types up to this point:
int
unsigned int
long
unsigned long

• It has been suggested that


• An unsigned integer stores only non-negative numbers (0, 1, 2, …)
• A long can store more information than an int

• We will now see how integers are stored in the computer


Binary representations
• We have already described binary numbers
• On the computer, all integers are stored in binary
• Thus, to store each of these numbers, we must store the
corresponding binary digits (bits):
Decimal      Binary                          Bits
3            11                              2
42           101010                          6
616          1001101000                      10
299792458    10001110111100111100001001010   29

• To store a googol (10^100), we must store 333 bits:


10010010010011010110100100101100101001100001101111100111…
01011000010110010011110000100110001001100111000001011111…
10011100010101100111001000000100011100010000100011010011…
11100101010101011001001000011000010001010100000101110100…
01111000100000000000000000000000000000000000000000000000…
00000000000000000000000000000000000000000000000000000
Storage
• Do we store as many bits as are necessary?
• You could, but this would be exceedingly difficult to manage

• Instead, each primitive data type has a fixed amount of storage
• 8 bits are defined as 1 byte
• All data types are an integral number of bytes
• Usually 1, 2, 4, 8 or 16 bytes
• Because we use binary, powers of 2 are very common:

Exponent   Decimal   Binary
2^0        1         1
2^1        2         10
2^2        4         100
2^3        8         1000
2^4        16        10000
2^5        32        100000
2^6        64        1000000
unsigned int
• A variable declared unsigned int is allocated four bytes
• 4 bytes is 4 × 8 = 32 bits
• 32 different 1s and 0s can be stored
• The smallest and largest:
00000000000000000000000000000000
11111111111111111111111111111111
• The smallest represents 0
• The largest is one less than
100000000000000000000000000000000
32 zeros

• This equals 2^32; thus, the largest value that can be stored as an
unsigned int is 2^32 – 1 = 4294967295
• Approximately 4 billion
unsigned short
• Sometimes, you don’t need to store numbers this large
• Variables declared unsigned short are allocated two
bytes
• 2 bytes is 2 × 8 = 16 bits
• 16 different 1s and 0s can be stored
• The smallest and largest:
0000000000000000
1111111111111111
• The smallest represents 0
• The largest is one less than
10000000000000000
16 zeros

• This equals 2^16; thus, the largest value that can be stored as an
unsigned short is 2^16 – 1 = 65535
unsigned long
• Sometimes, you need to store very large numbers
• Variables declared unsigned long are allocated eight bytes
• 8 bytes is 8 × 8 = 64 bits
• 64 different 1s and 0s can be stored
• The smallest and largest:
0000000000000000000000000000000000000000000000000000000000000000
1111111111111111111111111111111111111111111111111111111111111111
• The smallest represents 0
• The largest is one less than
1 0000000000000000000000000000000000000000000000000000000000000000
64 zeros
• This equals 2^64; thus, the largest value that can be stored as an
unsigned long is 2^64 – 1 = 18446744073709551615
• This is approximately 18 billion billion, or 18 quintillion
Example
• Consider this program:
#include <iostream>

// Function declarations
int main();

// Function definitions
int main() {
    unsigned short a{42};
    unsigned int b{207500};
    unsigned long c{299792458};

    std::cout << (a + b + c) << std::endl;

    return 0;
}
Output:
300000000
Note:
First, a is upcast to unsigned int before the first addition
Then, the result is upcast to unsigned long before the second addition
Example
• On the stack, an appropriate number of bytes are
allocated to each variable

Example
• Each of these variables is then initialized

Example
• Generally, however, we display the bytes in memory as a
column of bytes, the values of which are concatenated

Wasted space?
• If an integer does not use all the bytes, the remaining bits
are nevertheless allocated until the variable goes out of
scope
• In general-purpose computing, this is often not a problem
• This is a critical issue, however, in embedded systems
• More memory:
• Costs more
• Uses more power
• Produces more heat
Determining the size of a type
• We have said short, int and long are 2, 4 and 8 bytes
• This is true on most 64-bit Linux and macOS systems, but not everywhere: on 64-bit Windows, for example, long is 4 bytes
• Unfortunately, the C++ specification doesn’t require these sizes
• Fortunately, the sizeof operator gives you this information
#include <iostream>
int main();
int main() {
std::cout << "An 'unsigned short' occupies "
<< sizeof ( unsigned short ) << " bytes" << std::endl;
std::cout << "An 'unsigned int' occupies "
<< sizeof ( unsigned int ) << " bytes" << std::endl;
std::cout << "An 'unsigned long' occupies "
<< sizeof ( unsigned long ) << " bytes" << std::endl;
return 0;
}
Output on ecelinux:
An 'unsigned short' occupies 2 bytes
An 'unsigned int' occupies 4 bytes
An 'unsigned long' occupies 8 bytes
Memory and initial values
• Question:
• What happens if the initial value cannot be stored?
#include <iostream>

int main();

int main() {
unsigned short c{299792458};
std::cout << "The speed of light is " << c
<< " m/s." << std::endl;

return 0;
}
Memory and initial values
• Fortunately, you get a warning:
example.cpp: In function 'int main()':
example.cpp:6:31: warning: narrowing conversion of '299792458' from 'int' to
'short unsigned int' inside { } [-Wnarrowing]
unsigned short c{299792458};
^
example.cpp:6:31: warning: large integer implicitly truncated to unsigned
type [-Woverflow]

• It still compiles and executes:


The speed of light is 30794 m/s.
Memory and initial values
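• Where does 30794 come from?

• Only the 16 low-order bits of 299792458, which is
10001110111100111100001001010 in binary, fit in an unsigned short:
0111100001001010
• The binary number 0b111100001001010 equals 30794 in base 10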
Memory and initial values
• Important:
All unsigned integers are stored:
modulo 2^16 for unsigned short
modulo 2^32 for unsigned int
modulo 2^64 for unsigned long
Memory and arithmetic
• What happens if the sum, difference or product of two
integers exceeds what can be stored?
#include <iostream>

int main();

int main() {
unsigned short m1{40000}, m2{42000};
int n1{40000}, n2{42000};
unsigned short sum{m1 + m2}, diff{m1 - m2}, prod{m1*m2};

std::cout << sum << "\t" << (n1 + n2) << std::endl;
std::cout << diff << "\t" << (n1 - n2) << std::endl;
std::cout << prod << "\t" << (n1*n2) << std::endl;

return 0;
}
Output:
16464 82000
63536 -2000
50176 1680000000
Memory and arithmetic
• Let’s look at the actual values and the evaluated results:
16464 0100000001010000
82000 10100000001010000

63536 1111100000110000
–2000 -0000011111010000

50176 1100010000000000
1680000000 1100100001000101100010000000000

• For the sum and product, the result ignores the higher-
order bits
• The negative number is a little odd….
Memory and arithmetic
• What happens if the sum, difference or product of two
integers exceeds what can be stored?
#include <iostream>
int main();
int main() {
unsigned short smallest{0}, largest{65535};

std::cout << "Smallest: " << smallest << std::endl;


std::cout << "Largest: " << largest << std::endl;
--smallest;
++largest;
std::cout << "Smallest minus 1: " << smallest << std::endl;
std::cout << "Largest plus 1: " << largest << std::endl;
return 0;
}
Output:
Smallest: 0
Largest: 65535
Smallest minus 1: 65535
Largest plus 1: 0
Memory and arithmetic
• Important:
All unsigned integer arithmetic is performed:
modulo 2^16 for unsigned short
modulo 2^32 for unsigned int
modulo 2^64 for unsigned long

• This is similar to all clock arithmetic being performed modulo 12
Addition
• Addition is easy:
• Like in elementary school, line them up and occasionally you
require a carry in the next column:
• The rules are:
• 0 + 0 → 0
• 0 + 1 → 1
• 1 + 1 → 10 → 0 with a carry of 1
• 1 + 1 + 1 → 11 → 1 with a carry of 1
• For example, adding two unsigned short:
              1  1111   1          (carries)
  20950     0101000111010110
+ 39620   + 1001101011000100
  60570     1110110010011010
Addition
• What if we go over? Adding these two unsigned short:
           1  1  1111   1          (carries)
  53718     1101000111010110
+ 39620   + 1001101011000100
  93338    10110110010011010

• The additional bit is discarded: addition is calculated modulo 2^16
• Thus, the answer is 0110110010011010, which is 27802
Subtraction
• Subtraction is more difficult:
• Like in elementary school, you learned to “borrow”, but borrowing
may require you to look way ahead:
0100000001010000
- 0001101011000101
?
• Our salvation: we are performing arithmetic modulo 65536
Subtraction
• Going back to the clock:
• Subtracting 10 is the same as adding 2
• Subtracting 4 is the same as adding 8
• Subtracting 9 is the same as adding 3
• Thus, to subtract n, add 12 – n

• In our case, to subtract n, add 65536 – n


                                     11       111      (carries)
  16464     0100000001010000          0100000001010000    16464
–  6853   – 0001101011000101        + 1110010100111011    58683 = 65536 – 6853
      ?                              10010010110001011

• The answer, ignoring the leading 1, is 0010010110001011, which is 9611
Subtraction
• The million-dollar question:
How do you calculate 65536 – n???

• When subtracting any number from 9999999999999, no borrows are needed
  9999999999999
– 5501496383498
  4498503616501
• Thus, to calculate 10000000000000 – n, instead calculate
(10000000000000 – 1) – n + 1 = (9999999999999 – n) + 1

• For example:
  10000000000000
–  5501496383498
   4498503616502
• This is called the base-10 complement or “10’s complement”; this is
how older adding machines performed subtraction
Subtraction
• In binary, the equivalent is base-2 complement or “2’s
complement”
• To calculate 65536 – 1970, calculate (65535 – 1970) + 1:
1111111111111111
– 0000011110110010
1111100001001101
+ 1
1111100001001110
• Thus, to calculate 2018 – 1970, just add the 2’s complement of
1970 to 2018:
0000011111100010
+ 1111100001001110
10000000000110000
• This is the binary representation of 48 = 2^5 + 2^4 = 32 + 16
• Remember, we ignore the leading 1
2’s complement
• To calculate the 2’s complement:
1. Complement all of the bits in the number
• This includes leading zeros
2. Add 1
• For example, the 2’s complement of the speed of light, stored as an
unsigned int, is
00010001110111100111100001001010
11101110001000011000011110110101
+ 1
11101110001000011000011110110110
2’s complement
• There is a faster way to compute it without the addition:
• Scan from right-to-left
• Find the first 1; leave it and the bits to its right unchanged, and flip each bit to the left of it

• The 2’s complement of each of the following is given below it
1011011111011111
0100100000100001

1010111111100000
0101000000100000

0000100100101100
1111011011010100
2’s complement
• The 2’s complement of 0 stored as an unsigned int is
00000000000000000000000000000000
11111111111111111111111111111111
+ 1
100000000000000000000000000000000

• This makes sense: any number minus zero is unchanged


2’s complement
• The 2’s complement algorithm is self-inverting:
• If n is a number, then 2^16 – (2^16 – n) = n
• The 2’s complement of the 2’s complement of a number is the
number itself
1110110010111110
0001001101000001
+ 1
0001001101000010
1110110010111101
+ 1
1110110010111110

• That is, f^–1 = f or f(f(n)) = n


Memory and arithmetic
• Try it yourself:
#include <iostream>

int main();

//////////////////////////////////////////////////
// Verify that every possible sum, difference,
// product, division and remainder option on
// 'unsigned short' is the actual operation
// modulo 65536
//////////////////////////////////////////////////

int main() {
long const TWO_15{32768};
long const TWO_16{2*TWO_15};

for ( long i{0}; i < TWO_16; ++i ) {


// Just print out the count when we get to multiples of 1024
if ( (i % 1024) == 0 ) {
std::cout << i << std::endl;
}

for ( long j{0}; j < TWO_16; ++j ) {


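// Explicit casts avoid a narrowing conversion inside { }
unsigned short si{static_cast<unsigned short>( i )};
unsigned short sj{static_cast<unsigned short>( j )};

// Perform the sum as unsigned shorts and as signed longs
unsigned short sk{static_cast<unsigned short>( si + sj )};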
long k{(i + j) % TWO_16};

if ( k < 0 ) {
k += TWO_16;
}

// If they differ, print out a warning


if ( k != sk ) {
std::cout << i << " + " << j << " = " << k
<< " != " << sk << std::endl;
}

// Perform the difference as unsigned shorts and as signed longs


sk = si - sj;
k = (i - j) % TWO_16;

if ( k < 0 ) {
k += TWO_16;
}

// If they differ, print out a warning


if ( k != sk ) {
std::cout << i << " - " << j << " = " << k
<< " != " << sk << std::endl;
}

// Perform the product as unsigned shorts and as signed longs


sk = si * sj;
k = (i * j) % TWO_16;

if ( k < 0 ) {
k += TWO_16;
}

// If they differ, print out a warning


if ( k != sk ) {
std::cout << i << " * " << j << " = " << k
<< " != " << sk << std::endl;
}

// We cannot perform division or remainder when the


// second operand is zero, so skip these operations
if ( j != 0 ) {
// Perform the division as unsigned shorts and as signed longs
sk = si / sj;
k = (i / j) % TWO_16;

if ( k < 0 ) {
k += TWO_16;
}

// If they differ, print out a warning


if ( k != sk ) {
std::cout << i << " / " << j << " = " << k
<< " != " << sk << std::endl;
}

// Perform the remainder as unsigned shorts and as signed longs


sk = si % sj;
k = (i % j) % TWO_16;

if ( k < 0 ) {
k += TWO_16;
}

// If they differ, print out a warning


if ( k != sk ) {
std::cout << i << " % " << j << " = " << k
<< " != " << sk << std::endl;
}
}
}
}

return 0;
}
Summary so far
• We have the following:
• Unsigned integers are stored as either 1, 2, 4 or 8 bytes
• The value is stored in the binary representation

Type            Bytes  Bits  Range              Approximate range
unsigned char   1      8     0, …, 2^8 – 1      0, …, 255
unsigned short  2      16    0, …, 2^16 – 1     0, …, 65535
unsigned int    4      32    0, …, 2^32 – 1     0, …, 4.3 billion
unsigned long   8      64    0, …, 2^64 – 1     0, …, 18 quintillion

• You should not memorize the exact ranges


Useful tool…
• Note that 2^10 = 1024, so 2^10 ≈ 1000 = 10^3
• We can use this to estimate magnitudes:
• 2^12 = 2^2 × 2^10 ≈ 4 × 1000 = 4000
• 2^16 = 2^6 × 2^10 ≈ 64 × 1000 = 64000
• 2^24 = 2^4 × 2^20 = 2^4 × (2^10)^2 ≈ 16 × 1000^2 = 16 million
• 2^32 = 2^2 × 2^30 = 2^2 × (2^10)^3 ≈ 4 × 1000^3 = 4 billion

• This approximation underestimates by approximately 2.4% for each factor of 2^10, so by about 7% for 2^32


Signed types
• We’ve seen that short, int and long all allow you to store
both positive and negative integers
• How do we store such negative numbers?
• Because we have two choices (positive or negative), we could
use one bit to represent the sign: 0 for positive, 1 for negative
• For example:
(the leading bit is the sign bit)
 32767   0111111111111111
     2   0000000000000010
     1   0000000000000001
     0   0000000000000000
    –0 ? 1000000000000000     –0 = 0, so do we have two zeros?
    –1   1000000000000001
    –2   1000000000000010
–32767   1111111111111111
Signed types
• This is similar to marking the hours of a clock as follows:

• Unfortunately, this leads to ugly arithmetic operations…


–1 + 1 = 0 or –0, but 7 + 1 = 8
–5 + 2 = –3, but 11 + 2 = 1
Signed types
• A better solution:

• Note that
–1 + 1 = 0, but also 11 + 1 = 0
–5 + 2 = –3, but also 7 + 2 = 9, which we are equating to –3
Signed integers
• Here is a workable solution:
• If the leading bit is 0:
• Assume the remainder of the number is the integer represented
• For short, this includes
0000000000000000 0
0111111111111111 2^15 – 1 = 32767
• This includes 2^15 different non-negative numbers
• If the leading bit is 1:
• Assume the number is negative and its magnitude can be found by
applying the 2’s complement algorithm
• Recall the 2’s complement algorithm is self-inverting
Signed integers
• For negative numbers stored as a short:
1000000000000000
0111111111111111
+ 1
1000000000000000
• This is the representation of the most negative number: –2^15 = –32768

1111111111111111
0000000000000000
+ 1
0000000000000001
• This is the representation of the negative number closest to zero: –1
Signed integers
• Here, you can compare these two techniques
• In both cases, we go from –12/2 to 12/2 – 1 and –2^16/2 to 2^16/2 – 1
Signed integers
• For example, 1111111111010110 is a negative short
1111111111010110
0000000000101001
+ 1
0000000000101010
• Thus, it represents –42
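• Let’s calculate –42 + 91 = 49 and –42 – 91 = –133:
   1111111111010110      –42
 + 0000000001011011      +91
  10000000000110001       49

   1111111111010110      –42
 + 1111111110100101      –91
  11111111101111011     –133
• In both cases, the leading 1 is ignored; the magnitude of the second
result is the 2’s complement of 1111111101111011, which is 10000101 = 133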
Summary
• To summarize:
• Integer types are stored as either 1, 2, 4 or 8 bytes
• Negative numbers are stored in the 2’s complement representation
Type            Bytes  Bits  Range                  Approximate range
unsigned char   1      8     0, …, 2^8 – 1          0, …, 255
unsigned short  2      16    0, …, 2^16 – 1         0, …, 65535
unsigned int    4      32    0, …, 2^32 – 1         0, …, 4.3 billion
unsigned long   8      64    0, …, 2^64 – 1         0, …, 18 quintillion
signed char     1      8     –2^7, …, 2^7 – 1       –128, …, 127
short           2      16    –2^15, …, 2^15 – 1     –32768, …, 32767
int             4      32    –2^31, …, 2^31 – 1     –2.1 billion, …, 2.1 billion
long            8      64    –2^63, …, 2^63 – 1     –9.2 quintillion, …, 9.2 quintillion
Summary
• Following this lesson, you now
• Understand the representation of unsigned integers
• Know how to perform subtraction using 2’s complement
• Similar to 10’s complement used a century ago
• Understand that signed integers store negative numbers in their
2’s complement representation
• Know that char is actually just an integer type
• It can be interpreted as a printable character if necessary
• Understand the ranges stored by char, short, int and long
References
[1] Wikipedia
https://en.wikipedia.org/wiki/Integer_(computer_science)
https://en.wikipedia.org/wiki/Two%27s_complement
FLOATING POINT DATA TYPES
Scientific notation
• Recall scientific notation from secondary school, which allows us to
write numbers clearly and succinctly:
Conventional notation                            Scientific notation
0.0000000000667408                               6.67408 × 10^–11
299792458                                        2.99792458 × 10^8
0.0000000000000000000000000000000006626070040    6.626070040 × 10^–34
0.00000000000000000016021766208                  1.6021766208 × 10^–19
8.3144598                                        8.3144598 × 10^0
3.14159265358979323                              3.14159265358979323 × 10^0

• In 6.67408 × 10^–11, the mantissa is 6.67408, the base is 10 and the
exponent is –11
Scientific notation
• The number of decimal digits used is the precision:

Scientific notation              Precision
6.67408 × 10^–11                 6
2.99792458 × 10^8                9
6.626070040 × 10^–34             10
1.6021766208 × 10^–19            11
8.3144598 × 10^0                 8
3.14159265358979323 × 10^0       18
Scientific notation
• Without going into detail, each data type has an
approximate maximum precision it can store
Data type   Approximate maximum precision (decimal digits)
float       7
double      16

• There is generally only one situation where float has acceptable
precision for engineering applications:
• Computer graphics
Scientific notation
• How could you store a floating-point number?
• Store the exponent and mantissa separately, and assume a
decimal point comes after the first digit
                             Representation*
Scientific notation          float           double
6.67408 × 10^–11             -11 6674080     -011 +6674080000000000
2.99792458 × 10^8            +08 2997925     +008 +2997924580000000
6.626070040 × 10^–34         -34 6626070     -034 +6626070040000000
1.6021766208 × 10^–19        -19 1602177     -019 +1602176620800000
8.3144598 × 10^0             +00 8314460     +000 +8314459800000000
3.14159265358979323 × 10^0   +00 3141593     +000 +3141592653589793

* In reality, these are stored in binary


Scientific notation
• This fixed precision leads to some weaknesses
• If the exponent is too large, the number cannot be stored

Data type   Minimum              Maximum
float       ± 1.401 × 10^–45     ± 3.403 × 10^38
double      ± 4.941 × 10^–324    ± 1.798 × 10^308

• There are special values for ±∞ for numbers too large to represent
• There are other values for NAN (not-a-number) to represent
calculations such as 0.0/0.0 and ∞ – ∞
• Numbers too small are represented by 0.0
Weaknesses
• This fixed precision leads to some weaknesses
• It can happen that x + y = x even if y ≠ 0
• The calculation x – y can be problematic if x ≈ y

• In your courses on numerical analysis you will learn how


to mitigate these weaknesses
Weaknesses
• Non-zero numbers act like zero
• Suppose we add these two numbers:

+000 3141592653589793    π
-019 5749522264293560    e^–42
• Calculating this:
  3.141592653589793
+ 0.0000000000000000005749522264293560
  3.1415926535897930005749522264293560

• The representation of this sum is


+000 3141592653589793

• There is no difference…
Weaknesses
• For example:
#include <iostream>
#include <cmath>

int main();

int main() {
    // Print floating-point numbers to 17 digits of precision
    std::cout.precision( 17 );

    double x{std::acos(-1.0)};
    double y{1e-10};
    double z{x + y};

    std::cout << " Pi + 1e-10 = " << z << std::endl;
    std::cout << "(Pi + 1e-10) - Pi = " << (z - x) << std::endl;
    std::cout << std::endl;

    y = 1e-16;
    z = x + y;

    std::cout << " Pi + 1e-16 = " << z << std::endl;
    std::cout << "(Pi + 1e-16) - Pi = " << (z - x) << std::endl;

    return 0;
}
Output:
Pi + 1e-10 = 3.1415926536897931
(Pi + 1e-10) - Pi = 1.000000082740371e-10

Pi + 1e-16 = 3.1415926535897931
(Pi + 1e-16) - Pi = 0
Weaknesses
• Subtraction results in a loss of precision
• Suppose we subtract these two numbers:
-001 8414709848079505    sin(1.0000000000001)
-001 8414709848078965    sin(1)

• Calculating this:
  0.8414709848079505
– 0.8414709848078965
  0.0000000000000540

• The representation of this difference is
-014 5400000000000000

• The correct answer is 5.403023058680976 × 10^–14


Weaknesses
• We can define the derivative of sin(x) at x = 1:

$$\left.\frac{d}{dx}\sin(x)\right|_{x=1} = \lim_{h \to 0} \frac{\sin(1+h) - \sin(1-h)}{2h}$$

• Thus, in theory, as h gets smaller and smaller,

$$\frac{\sin(1+h) - \sin(1-h)}{2h}$$

should be a better and better approximation


Weaknesses
• Let’s try this in C++:
#include <iostream>
#include <cmath>

int main();

int main() {
    std::cout.precision( 17 );
    double h{1e-10};

    std::cout << "Calculating the derivative of sin(x) at x = 1:" << std::endl;
    std::cout << " cos(1) = " << std::cos( 1.0 ) << std::endl;

    double dsin1{(std::sin( 1.0 + h ) - std::sin( 1.0 - h ))/(2*h)};
    std::cout << "When h = " << h << ", " << dsin1 << std::endl;

    h = 1e-15;
    dsin1 = (std::sin( 1.0 + h ) - std::sin( 1.0 - h ))/(2*h);
    std::cout << "When h = " << h << ", " << dsin1 << std::endl;

    return 0;
}
Output:
Calculating the derivative of sin(x) at x = 1:
cos(1) = 0.54030230586813977
When h = 1e-10, 0.54030224738710331
When h = 1.0000000000000001e-15, 0.55511151231257827
IEEE 754-2008
• Originally written in 1985, this document specifies the
representations of both float and double

• Whether you use C++, FORTRAN, Python, or MATLAB, your calculations
will produce exactly the same result
• Only the quality of your algorithms will affect your outcomes
• Java is not IEEE 754 compliant…
Kahan and Darcy, How Java’s Floating-Point Hurts Everyone Everywhere

• Some computers internally store approximately 20 decimal digits of
precision for intermediate calculations
• As soon as the number is written to main memory, only 16 decimal
digits are stored…
Summary
• Following this lesson, you now
• Know floating-point numbers are stored using fixed-precision
scientific notation
• Understand that there are issues—they are not perfect
• In a course on numerical analysis, you will learn to mitigate these
weaknesses
• The float data type is insufficiently precise for most engineering
computation
• Graphics are the one exception…
• Understand that this is defined by the IEEE 754 standard
ARRAYS
• In this lesson, we will:
– Describe the limitations of variables
– Introduce arrays
– Describe their design and use
– Consider all the consequences of using arrays
Limitations of parameters
• To this point, we have only had the possibility of supplying
either a fixed number of parameters or having a fixed
number of local variables
• Passing arguments to a function is expensive:
• Each argument must be copied onto the call stack

• Additionally, the number of parameters may vary


Limitations of primitive data types
• Suppose we want to calculate the average of five values:
double average( double x0, double x1, double x2,
double x3, double x4 ) {
return (x0 + x1 + x2 + x3 + x4)/5.0;
}

• Suppose we want to calculate the average of seven values:


double average( double x0, double x1, double x2,
double x3, double x4, double x5,
double x6 ) {
return (x0 + x1 + x2 + x3 + x4 + x5 + x6)/7.0;
}
Limitations of primitive data types
• In some cases, we don’t know how much data we have or require:
• You don’t always know how much memory will be required
• Additional operations may require arbitrary amounts of additional memory

• For example, your list of favourite movies may change over time:
The Good, the Bad and the Ugly
A Bridge Too Far
The Godfather Series
Lawrence of Arabia
In the Heat of the Night
The Matrix
Kill Bill
The Bridge on the River Kwai
Doctor Zhivago
Dr. Strangelove
Apocalypse Now
A Clockwork Orange
Beaufort
Forrest Gump
Letters from Iwo Jima
Thomas Crown Affair (both)
The Day of the Jackal
Star Wars
On Her Majesty's Secret Service
Living Daylights
Hurt Locker
The Alien Series
Ghostbusters
Arrays
• The logical approach is to use an approach similar to a
mathematical sequence:
a_0, a_1, a_2, a_3, a_4, a_5, …, a_(n – 1)

• Each entry in this sequence of n items can take on a


different value
• The first could be the most recent voltage reading, the next the
next-most recent reading, and so on
• The wiring in a circuit may have n nodes labeled 0 through n – 1
• Nodal analysis allows you to find the voltages at each of the nodes
Arrays
• We will now look at:
• Array declarations
• Array storage
• Initializing arrays
• Accessing array entries
• Assigning to array entries
Array declarations
• An array of capacity n is identified by the declaration
typename array_identifier[n];

• The capacity n must be a positive integer


• The compiler allocates sufficiently many contiguous bytes to
store n instances of the given datatype

• Examples:
int temperatures[10]; // an array of 10 integers
double voltages[23];  // an array of 23 floating-point numbers
Array storage
• An array of 10 int requires 10 × 4 = 40 bytes
• Each int requires 4 bytes

int temperatures[10]; // an array of 10 integers

• An array of 23 double requires 23 × 8 = 184 bytes
• Each double requires 8 bytes

double voltages[23];  // an array of 23 floating-point numbers
Array entries
• The entries of an array store values of the given type and
may be used like local variables
• The entries of
int data[4]; // an array of 4 integers
are accessed with
data[0] data[1] data[2] data[3]

• The indices of
datatype array_name[n];
always go from 0 to n - 1
Array initialization
• Consider this uninitialized array:
int main() {
    double data[4];

    std::cout << data[0] << std::endl;
    std::cout << data[1] << std::endl;
    std::cout << data[2] << std::endl;
    std::cout << data[3] << std::endl;

    return 0;
}
The output is
0
0
2.0733e-317
2.0731e-317
The first two, by chance, are zero
Array initialization
• This array has its four entries initialized:
int main() {
double data[4]{47.2, 48.3, 48.9, 49.4};

std::cout << data[0] << std::endl;


std::cout << data[1] << std::endl;
std::cout << data[2] << std::endl;
std::cout << data[3] << std::endl;

return 0;
}
The output is
47.2
48.3
48.9
49.4
Array initialization
• If you don’t give enough initial values, the rest are set to
zero:
int main() {
// Sets all entries to 0
double data[4]{};

std::cout << data[0] << std::endl;


std::cout << data[1] << std::endl;
std::cout << data[2] << std::endl;
std::cout << data[3] << std::endl;
return 0;
}
The output is
0
0
0
0
Array initialization
• You can initialize only some of the entries:
int main() {
// Entries 2 and 3 are set to 0
double data[4]{93.5, 97.2};

std::cout << data[0] << std::endl;


std::cout << data[1] << std::endl;
std::cout << data[2] << std::endl;
std::cout << data[3] << std::endl;

return 0;
}
The output is
93.5
97.2
0
0
Array initialization
• If you give too many, the compiler will let you know:
int main() {
// Too many initial values
double data[4]{1, 2, 3, 4, 5};

std::cout << data[0] << std::endl;


std::cout << data[1] << std::endl;
std::cout << data[2] << std::endl;
std::cout << data[3] << std::endl;

return 0;
}
example.cpp:6:33: error: too many initializers for 'double [4]'
double data[4]{1, 2, 3, 4, 5};
^
Array entries
• If an array has four entries, those four entries can be
accessed using an index from 0 to 3:
double data[4]; // an array of 4 doubles

// Do something with the array...


double average{(data[0] + data[1] + data[2] + data[3])/4.0};

std::cout << "The average entry is " << average << std::endl;

• We can use an array entry exactly the same as we would


any other local variable or parameter of the same type
• The entries of an array of bool can be used in logical expressions
Array entries
• We can examine the entries one at a time, for example, to find the maximum:
double data[4]; // an array of 4 doubles
// Do something with the array...

double maximum{data[0]};
if ( data[1] > maximum ) {
maximum = data[1];
}
if ( data[2] > maximum ) {
maximum = data[2];
}
if ( data[3] > maximum ) {
maximum = data[3];
}

std::cout << "The maximum entry is " << maximum << std::endl;
Array entry assignment
• Each of the ten entries of this array can be assigned a value
int temperature[10]{}; // an array of 10 integers

• The entries are accessed or manipulated like local variables by
using an index:
temperature[0] = 32;
temperature[1] = 35;
temperature[2] = 35;
// ...
temperature[9] = 31;

The indices for an array of capacity n go from 0 to n - 1


Array properties
• Like other local variables:
• Arrays go out of scope
• May or may not be initialized

• An array of double is not a double


• Suppose we declare:
double data[10]{};
• You can use data[3] in an arithmetic expression
• You cannot use data in an arithmetic expression

• Suppose we declare:
bool flags[5]{};
• You can use flags[2] in a logical expression
• You cannot use flags in a logical expression
Looping through an array
• Alternatively, we can loop through an array:
int main() {
double data[4]{25.23, 27.59, 28.10, 28.86};

for ( typename k{0}; k < 4; ++k ) {


std::cout << data[k] << std::endl;
}

return 0;
}

Question: what type for the index k?


int?
unsigned int?
Looping through an array
• Problem: 'unsigned int' is 4 bytes
• The largest index it can store is 232 – 1
• On a 64-bit processors, arrays can have a capacity as large as 264

• Solution: Use 'unsigned long'?


• Real solution: It depends on your processor…
Register size (bits)   Maximum array capacity   Appropriate type
64                     2^64                     unsigned long
32                     2^32                     unsigned int
16                     2^16                     unsigned short
8                      2^8                      unsigned char
Looping through an array
• Your compiler has a solution:
• Your compiler is written for a specific processor
• It is aware of the specifications of your processor
• The standard library has a specific type just for array capacities and
indices:
std::size_t

• Many type names defined by the standard library, rather than built into the language, end with a trailing _t


• std::size_t is an unsigned integer type:
• On a 64-bit processor, it will be 8 bytes
• On a 16-bit processor, it will be 2 bytes
Looping through an array
• Thus, we can loop through the array as follows:
int main() {
double data[4]{25.23, 27.59, 28.10, 28.86};

for ( std::size_t k{0}; k < 4; ++k ) {


std::cout << data[k] << std::endl;
}

return 0;
}
Looping through an array
• Here is another example:
int main() {
double data[4]{25.23, 27.59, 28.10, 28.86};

double maximum{data[0]};

for ( std::size_t k{1}; k < 4; ++k ) {


if ( data[k] > maximum ) {
maximum = data[k];
}
}

std::cout << "The maximum is " << maximum << std::endl;

return 0;
}
Array capacities
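• With some compilers, the array capacity need not be known at compile time (standard C++ requires a constant capacity; GCC and Clang accept the following as an extension):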
int main();

int main() {
std::size_t capacity{};
std::cout << "Enter the number of data points: ";
std::cin >> capacity;

double data[capacity];

for ( std::size_t k{0}; k < capacity; ++k ) {


std::cout << "Enter datum #" << k << ": ";
std::cin >> data[k];
}

// Do something with the array of data

return 0;
}
Array capacities
• When declared, however, a capacity must be given:
void f() {
// 'data' is local to f()
// - it must have a specified capacity
double data[];
}

example.cpp: In function 'void f()':


example.cpp:19:16: error: storage size of 'data' isn't known
double data[];
^
Value of an array variable
• We can assign values to the entries of an array
• Question: What is the value of the array itself?

• What is the output of this program?


int main();

int main() {
double data[10]{};

std::cout << data << std::endl;

return 0;
}

Output: a hexadecimal address, for example:
0x7fff2fa3bac0
Value of an array variable
• The “value” of an array is the address in memory where
the entries of the array are stored
• In this case, at address 0x7fff2fa3bac0
• Each double is 8 bytes, so we can determine exactly where each
entry is in memory:

0x7fff2fa3bac0 data[0]
0x7fff2fa3bac8 data[1]
0x7fff2fa3bad0 data[2]
0x7fff2fa3bad8 data[3]
0x7fff2fa3bae0 data[4]
0x7fff2fa3bae8 data[5]
0x7fff2fa3baf0 data[6]
0x7fff2fa3baf8 data[7]
0x7fff2fa3bb00 data[8]
0x7fff2fa3bb08 data[9]
Value of an array variable
• Unlike other local variables/parameters, you cannot
assign to arrays
#include <iostream>
int main();
int main() {
double pi{3.14};
double data[10];
double tmp_array[10];

pi = 3.1415926535897932; // This is okay


data = 0x0123456789abcdef;
data = tmp_array;

return 0;
}

example.cpp: In function 'int main()':
example.cpp:10:10: error: incompatible types in assignment of 'long int' to 'double [10]'
data = 0x0123456789abcdef;
^
example.cpp:11:10: error: invalid array assignment
data = tmp_array;
^
Value of an array variable
• Like local variables and parameters, the memory is on the
call stack:
#include <iostream>
int main();
int main() {
double data[10];
double pi{3.14};

return 0;
}
Arrays as parameters
• When a function is called, the arguments are evaluated
and copied to the locations for the parameters on the call
stack
• The parameters are variables restricted to the function
• The arguments can be local variables, but they can also be
expressions

int main() {
double x{3.14};
std::cout << std::sin( x ) << std::endl;
std::cout << std::sin( 2*x + 1 ) << std::endl;

return 0;
}
Arrays as parameters
• Recalling these images:
• Suppose main() has three local variables
• The memory for these variables is on the stack
Arrays as parameters
• If main() calls f(...), the arguments are evaluated and
copied to the appropriate locations reserved for the
parameters
Arrays as parameters
• When f(...) is called, additional space for any local
variable for f(...) is also reserved on the stack
• Inside f(...), you can modify the parameters and local variables, but when the function exits, those changes are lost
Arrays as parameters
• We can write a function that accepts an array as a
parameter
int main();
double average( double data[4] );

int main() {
// drone speed in m/s
double speeds[4]{178.2, 182.5, 187.1, 191.6};

std::cout << average( speeds ) << std::endl;

return 0;
}

double average( double data[4] ) {
double sum{0.0};

for ( std::size_t k{0}; k < 4; ++k ) {
sum += data[k];
}

return sum/4.0;
}
Arrays as parameters
• But what is copied to the parameter?
int main();
void print_array( double array[] );

int main() {
// drone speed in m/s
double speeds[4]{178.2, 182.5, 187.1, 191.6};
std::cout << "Inside main: " << speeds << std::endl;
print_array( speeds );
return 0;
}

void print_array( double array[] ) {


std::cout << "Inside print_array: " << array << std::endl;
}

Output:
Inside main: 0x7fff3428d430
Inside print_array: 0x7fff3428d430
Arrays as parameters
• When main() calls print_array(...), it copies the
value of 'speeds' to the location of the parameter
int main() {
// drone speed in m/s
double speeds[4]{178.2, 182.5, 187.1, 191.6};
std::cout << "Inside main: " << speeds << std::endl;
print_array( speeds );
return 0;
}
Arrays as parameters
• Problem: what if we don’t know the capacity of the array a
priori?
double average( double data[4] );

double average( double data[4] ) {


double sum{0.0};

for ( std::size_t k{0}; k < 4; ++k ) {


sum += data[k];
}

return sum/4.0;
}
Arrays as parameters
• We will accept an array of any capacity, as long as its entries are of type double
• We can separately pass the capacity:
double average( double data[], std::size_t capacity );

double average( double data[], std::size_t capacity ) {


double sum{0.0};

for ( std::size_t k{0}; k < capacity; ++k ) {


sum += data[k];
}

return sum/capacity;
}
Arrays as parameters
• We can now call this average as follows:
int main();
double average( double data[], std::size_t capacity );

int main() {
// drone speed in m/s
double speeds[4]{178.2, 182.5, 187.1, 191.6};

std::cout << average( speeds, 4 ) << std::endl;

return 0;
}
Arrays as parameters
• Suppose we author and then call this function:
void initialize_array( double array[], std::size_t capacity );

// Set all the entries of the array to 0.0
void initialize_array( double array[], std::size_t capacity ) {
for ( std::size_t k{0}; k < capacity; ++k ) {
array[k] = 0.0;
}
}
Arrays as parameters
• When we call this function
int main() {
// drone speed in m/s
double speeds[4]{178.2, 182.5, 187.1, 191.6};
initialize_array( speeds, 4 );
return 0;
}

the address of the array is copied to the parameter


• When, inside initialize_array(...), we assign to array[0], this changes the original array entry speeds[0]
Arrays as parameters
• Thus, the output of
// drone speed in m/s
double speeds[4]{178.2, 182.5, 187.1, 191.6};

for ( std::size_t k{0}; k < 4; ++k ) {
    std::cout << "speeds[" << k << "] = " << speeds[k]
              << " m/s" << std::endl;
}

std::cout << std::endl;
initialize_array( speeds, 4 );

for ( std::size_t k{0}; k < 4; ++k ) {
    std::cout << "speeds[" << k << "] = " << speeds[k]
              << " m/s" << std::endl;
}

is:
speeds[0] = 178.2 m/s
speeds[1] = 182.5 m/s
speeds[2] = 187.1 m/s
speeds[3] = 191.6 m/s

speeds[0] = 0 m/s
speeds[1] = 0 m/s
speeds[2] = 0 m/s
speeds[3] = 0 m/s
Exceeding array bounds
• The array
double data[5]{3.7, 4.0, 2.9, 8.6, 1.5};
has entries data[0] through data[4]

• Problem: What will happen if you try to access or assign to data[-1] or data[5] or even data[299792458]?
• In practice: It will just look in the corresponding location in memory (formally, the behaviour is undefined)…

• Question: What is there?


• Answer: Other data including, but not limited to other
local variables and other arrays
Exceeding array bounds
• Well, here is the memory:
double data[5]{3.7, 4.0, 2.9, 8.6, 1.5};
Exceeding array bounds
int main() {
double lengths_of_beetles[5]{3.7, 4.0, 2.9, 8.6, 1.5}; // mm
int account_balances[4]{5923423, 234232, 52351, 2343232}; // cents

return 0;
}

void f() {
// 'x' is uninitialized
double x;
std::cout << "f: The unitialized local variable x = " << x << std::endl;
x = 3.14;
std::cout << "f: The assigned local variable x = " << x << std::endl;
}
Exceeding array bounds
int main() {
double data[5]{3.7, 4.0, 2.9, 8.6, 1.5};

f();
std::cout << "main: data[-5] = " << data[-5] << std::endl;
std::cout << "main: Assigning data[-5] the value 2.71828..." << std::endl;
data[-5] = 2.71828;
f();
std::cout << "main: data[-5] = " << data[-5] << std::endl;

return 0;
}

void f() {
// 'x' is uninitialized
double x;
std::cout << "f: The unitialized local variable x = " << x << std::endl;
x = 3.14;
std::cout << "f: The assigned local variable x = " << x << std::endl;
}
Exceeding array bounds
• One possible output (the behaviour is undefined, so results vary with the compiler and platform):
f: The unitialized local variable x = 6.9167e-310
f: The assigned local variable x = 3.14
main: data[-5] = 3.14
main: Assigning data[-5] the value 2.71828...
f: The unitialized local variable x = 2.71828
f: The assigned local variable x = 3.14
main: data[-5] = 3.14
Exceeding array bounds
• How about this program?
#include <iostream>

int main();

int main() {
double data[10]{3.7, 4.0, 2.9, 8.6, 1.5};

std::cout << data[299792458] << std::endl;

return 0;
}

Output:
Segmentation fault (core dumped)
or some other catastrophic error…
– The program execution is terminated
Exceeding array bounds
• The most common error:
void initialize( double array[], std::size_t capacity );

void initialize( double array[], std::size_t capacity ) {


for ( std::size_t k{1}; k <= capacity; ++k ) {
array[k] = 0.0;
}
}

• Forgetting that an array of capacity 32 has entries indexed from 0 to 31 is one of the most significant issues for novice programmers
Exceeding array bounds
• Given this program:
int main() {
double data[5]{3.7, 4.0, 2.9, 8.6, 1.5};

initialize( data, 5 );
return 0;
}
• The initialized memory for the array data is here

• We don’t know what is at data[5]


Exceeding array bounds
• Given this program:
int main() {
double data[5]{3.7, 4.0, 2.9, 8.6, 1.5};

initialize( data, 5 );
return 0;
}

where initialize(...) executes:
for ( std::size_t k{1}; k <= capacity; ++k ) {
array[k] = 0.0;
}

• After we call initialize(...), we have:


• We just overwrote something…
Summary
• Following this lesson, you now
• Understand how to declare an array as a local variable and
initialize its entries
• Know how to access and assign to array entries
• Know that array entries can be treated like local variables or parameters of the same type
• Know that arrays themselves cannot be used in arithmetic or logical expressions
• Can step through an array with a for loop
• Know that array variables are assigned the address in memory
where that array is stored
• Understand that arrays, if passed as arguments to a function,
simply pass that address
• Know that changing an array entry of a parameter changes the argument
• Know that accessing entries outside the array bounds is dangerous
