02 Bits Ints
Today: Bits, Bytes, and Integers
Representing information as bits
Bit-level manipulations
Integers
▪ Representation: unsigned and signed
▪ Conversion, casting
▪ Expanding, truncating
▪ Addition, negation, multiplication, shifting
Summary
Binary Representations
[Figure: signal voltage over time encoding the bits 0 1 0; voltage levels marked at 0.0V, 0.5V, 2.8V, and 3.3V]
Encoding Byte Values
Byte = 8 bits
▪ Binary: 00000000₂ to 11111111₂
▪ Decimal: 0₁₀ to 255₁₀
▪ Hexadecimal: 00₁₆ to FF₁₆
  ▪ Base 16 number representation
  ▪ Use characters ‘0’ to ‘9’ and ‘A’ to ‘F’
  ▪ Write FA1D37B₁₆ in C as
    – 0xFA1D37B
    – 0xfa1d37b
Hex   Decimal   Binary
0     0         0000
1     1         0001
2     2         0010
3     3         0011
4     4         0100
5     5         0101
6     6         0110
7     7         0111
8     8         1000
9     9         1001
A     10        1010
B     11        1011
C     12        1100
D     13        1101
E     14        1110
F     15        1111
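The three notations in the table can be produced from one byte value. The sketch below is illustrative only; `byte_to_binary` and `print_byte` are hypothetical helper names, not part of the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write the 8 bits of `byte` into `out` as '0'/'1' characters,
   most significant bit first. `out` needs room for 8 chars + NUL.
   (Hypothetical helper, introduced for illustration.) */
void byte_to_binary(unsigned char byte, char out[9]) {
    assert(out != NULL);
    for (int i = 0; i < 8; i++) {
        /* Bit 7 goes to out[0], bit 0 goes to out[7]. */
        out[i] = ((byte >> (7 - i)) & 1) ? '1' : '0';
    }
    out[8] = '\0';
}

/* Print one byte in hexadecimal, decimal, and binary. */
void print_byte(unsigned char byte) {
    char bits[9];
    byte_to_binary(byte, bits);
    printf("hex 0x%02X  decimal %3u  binary %s\n",
           (unsigned)byte, (unsigned)byte, bits);
}
```

Calling `print_byte(0xFA)` prints the byte in all three forms. Hex literals in C are case-insensitive, so `0xFA1D37B` and `0xfa1d37b` denote the same value.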
Byte-Oriented Memory Organization
Machine Words
Machine Has “Word Size”
▪ Nominal size of integer-valued data
  ▪ Including addresses
▪ Most current machines use 32-bit (4-byte) words
  ▪ Limits addresses to 4GB
  ▪ Becoming too small for memory-intensive applications
▪ High-end systems use 64-bit (8-byte) words
  ▪ Potential address space ≈ 1.8 × 10^19 bytes
  ▪ x86-64 machines support 48-bit addresses: 256 Terabytes
▪ Machines support multiple data formats
  ▪ Fractions or multiples of word size
  ▪ Always integral number of bytes
Word-Oriented Memory Organization
Addresses Specify Byte Locations
▪ Address of first byte in word
▪ Addresses of successive words differ by 4 (32-bit) or 8 (64-bit)
[Figure: bytes at addresses 0000 through 0015; 32-bit words start at 0000, 0004, 0008, 0012; 64-bit words start at 0000 and 0008]
Data Representations
Sizes in bytes
C Data Type    Typical 32-bit   Intel IA32   x86-64
char           1                1            1
short          2                2            2
int            4                4            4
long           4                4            8
long long      8                8            8
float          4                4            4
double         8                8            8
long double    8                10/12        10/16
pointer        4                4            8
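To see which column of the table a given machine matches, the sizes can be queried with `sizeof`. This is a minimal sketch; `print_type_sizes` is a hypothetical name, and the printed values depend on the compiler and platform.

```c
#include <assert.h>
#include <stdio.h>

/* Print the size in bytes of each C type listed in the table,
   as seen by the compiler building this code. */
void print_type_sizes(void) {
    /* The C standard guarantees sizeof(char) == 1; the rest vary. */
    assert(sizeof(char) == 1);
    printf("char        %zu\n", sizeof(char));
    printf("short       %zu\n", sizeof(short));
    printf("int         %zu\n", sizeof(int));
    printf("long        %zu\n", sizeof(long));
    printf("long long   %zu\n", sizeof(long long));
    printf("float       %zu\n", sizeof(float));
    printf("double      %zu\n", sizeof(double));
    printf("long double %zu\n", sizeof(long double));
    printf("pointer     %zu\n", sizeof(void *));
}
```

On a typical x86-64 Linux build, `long` and pointers would be expected to report 8 bytes; on IA32 they would report 4.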
Byte Ordering
How should bytes within a multi-byte word be ordered in
memory?
Conventions
▪ Big Endian: Sun, PPC Mac, Internet
▪ Least significant byte has highest address
▪ Little Endian: x86
▪ Least significant byte has lowest address
Byte Ordering Example
Big Endian
▪ Least significant byte has highest address
Little Endian
▪ Least significant byte has lowest address
Example
▪ Variable x has 4-byte representation 0x01234567
▪ Address given by &x is 0x100
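The layout of `0x01234567` in memory can be inspected directly. The sketch below copies the value's bytes into an array, lowest address first; `bytes_in_memory` and `is_little_endian` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy the in-memory bytes of a 32-bit value into out[0..3],
   lowest address first. */
void bytes_in_memory(uint32_t x, unsigned char out[4]) {
    assert(out != NULL);
    memcpy(out, &x, sizeof x);
}

/* Return 1 if the least significant byte sits at the lowest address
   (little endian), 0 otherwise. */
int is_little_endian(void) {
    unsigned char b[4];
    bytes_in_memory(0x01234567u, b);
    return b[0] == 0x67;
}
```

On a little-endian machine such as x86, the array holds 67 45 23 01 (addresses 0x100 through 0x103 in the slide's example); on a big-endian machine it holds 01 23 45 67.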
Reading Byte-Reversed Listings
Disassembly
▪ Text representation of binary machine code
▪ Generated by program that reads the machine code
Example Fragment
Address Instruction Code Assembly Rendition
8048365: 5b pop %ebx
8048366: 81 c3 ab 12 00 00 add $0x12ab,%ebx
804836c: 83 bb 28 00 00 00 00 cmpl $0x0,0x28(%ebx)
Deciphering Numbers
▪ Value: 0x12ab
▪ Pad to 32 bits: 0x000012ab
▪ Split into bytes: 00 00 12 ab
▪ Reverse: ab 12 00 00
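The deciphering steps can also be run in the other direction: given the four bytes as they appear in the listing, reassemble the value. A minimal sketch, with `le_bytes_to_u32` as a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Combine four bytes stored in little-endian order (least significant
   byte first) into a 32-bit value. */
uint32_t le_bytes_to_u32(const unsigned char b[4]) {
    assert(b != NULL);
    return (uint32_t)b[0]
         | (uint32_t)b[1] << 8
         | (uint32_t)b[2] << 16
         | (uint32_t)b[3] << 24;
}
```

Applied to the bytes `ab 12 00 00` from the `add` instruction, this yields `0x000012ab`, the immediate shown in the assembly rendition.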
Examining Data Representations
Code to Print Byte Representation of Data
▪ Casting pointer to unsigned char * creates byte array
typedef unsigned char *pointer;
Printf directives:
%p: Print pointer
%x: Print Hexadecimal
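The body of `show_bytes` did not survive in this text. The following is a sketch of how such a routine could look, built only from the `pointer` typedef and the two printf directives listed above; the exact original may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef unsigned char *pointer;

/* Print each byte of the object starting at `start`, one per line,
   as "<address> 0x<byte value in hex>". */
void show_bytes(pointer start, size_t len) {
    assert(start != NULL);
    for (size_t i = 0; i < len; i++) {
        /* %p expects a void *, so the byte address is cast explicitly. */
        printf("%p\t0x%.2x\n", (void *)(start + i), start[i]);
    }
    printf("\n");
}
```

Because the object is viewed through `unsigned char *`, the routine works for any data type: pass the object's address and its `sizeof`.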
show_bytes Execution Example
int a = 15213;
printf("int a = 15213;\n");
show_bytes((pointer) &a, sizeof(int));
Result (Linux):
int a = 15213;
0x11ffffcb8 0x6d
0x11ffffcb9 0x3b
0x11ffffcba 0x00
0x11ffffcbb 0x00
Representing Integers
Decimal: 15213
Binary: 0011 1011 0110 1101
Hex: 3 B 6 D
Boolean Algebra
Developed by George Boole in 19th Century
▪ Algebraic representation of logic
▪ Encode “True” as 1 and “False” as 0
And
▪ A&B = 1 when both A=1 and B=1
Or
▪ A|B = 1 when either A=1 or B=1
Not
▪ ~A = 1 when A=0
Exclusive-Or (Xor)
▪ A^B = 1 when either A=1 or B=1, but not both
Application of Boolean Algebra
Applied to Digital Systems by Claude Shannon
▪ 1937 MIT Master’s Thesis
▪ Reason about networks of relay switches
▪ Encode closed switch as 1, open switch as 0
[Figure: relay network with two parallel branches: A in series with ~B, and ~A in series with B]
▪ Connection when A&~B | ~A&B = A^B
General Boolean Algebras
Operate on Bit Vectors
▪ Operations applied bitwise
  01101001      01101001      01101001
& 01010101    | 01010101    ^ 01010101    ~ 01010101
  01000001      01111101      00111100      10101010
All of the Properties of Boolean Algebra Apply
Representing & Manipulating Sets
Representation
▪ Width w bit vector represents subsets of {0, …, w–1}
▪ a_j = 1 if j ∈ A
    01101001    { 0, 3, 5, 6 }
    76543210    (bit positions)
    01010101    { 0, 2, 4, 6 }
    76543210    (bit positions)
Operations
▪ &  Intersection           01000001    { 0, 6 }
▪ |  Union                  01111101    { 0, 2, 3, 4, 5, 6 }
▪ ^  Symmetric difference   00111100    { 2, 3, 4, 5 }
▪ ~  Complement             10101010    { 1, 3, 5, 7 }
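The set operations above map one-to-one onto C's bitwise operators. The sketch below uses an 8-bit vector for subsets of {0, …, 7}; the type name `bitset8` and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t bitset8;  /* bit j set means element j is in the set */

/* Membership test: is element j in set s? */
int set_contains(bitset8 s, int j) {
    assert(j >= 0 && j < 8);
    return (s >> j) & 1;
}

bitset8 set_intersection(bitset8 a, bitset8 b) { return a & b; }
bitset8 set_union(bitset8 a, bitset8 b)        { return a | b; }
bitset8 set_symdiff(bitset8 a, bitset8 b)      { return a ^ b; }

/* The cast trims the result back to 8 bits after integer promotion. */
bitset8 set_complement(bitset8 a)              { return (bitset8)~a; }
```

With A = 0x69 ({0, 3, 5, 6}) and B = 0x55 ({0, 2, 4, 6}), these functions reproduce the four results listed on the slide.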
Bit-Level Operations in C
Operations &, |, ~, ^ Available in C
▪ Apply to any “integral” data type
▪ long, int, short, char, unsigned
▪ View arguments as bit vectors
▪ Arguments applied bit-wise
Examples (Char data type)
▪ ~0x41 → 0xBE
  ▪ ~01000001₂ → 10111110₂
▪ ~0x00 → 0xFF
  ▪ ~00000000₂ → 11111111₂
▪ 0x69 & 0x55 → 0x41
  ▪ 01101001₂ & 01010101₂ → 01000001₂
▪ 0x69 | 0x55 → 0x7D
  ▪ 01101001₂ | 01010101₂ → 01111101₂
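The char examples can be checked in C, with one caveat: operands narrower than `int` are promoted before the operator is applied, so the result must be converted back to a char-sized type to match the slide. A small sketch, with `not8` and `check_char_examples` as hypothetical names:

```c
#include <assert.h>

/* Bitwise NOT restricted to 8 bits. The cast discards the upper bits
   that integer promotion introduces. */
unsigned char not8(unsigned char x) {
    return (unsigned char)~x;
}

/* Verify the slide's examples for the char data type. */
void check_char_examples(void) {
    assert(not8(0x41) == 0xBE);
    assert(not8(0x00) == 0xFF);
    assert((0x69 & 0x55) == 0x41);
    assert((0x69 | 0x55) == 0x7D);
    /* Without the cast, ~0x41 is evaluated as an int and equals -66. */
    assert(~0x41 == -66);
}
```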
Contrast: Logic Operations in C
Contrast to Logical Operators
▪ &&, ||, !
▪ View 0 as “False”
▪ Anything nonzero as “True”
▪ Always return 0 or 1
▪ Early termination
Examples (char data type)
▪ !0x41 → 0x00
▪ !0x00 → 0x01
▪ !!0x41 → 0x01
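The difference between the logical and bitwise operators, and the value of early termination, can be illustrated as follows. All function names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

int logical_and(int a, int b) { return a && b; }  /* always 0 or 1 */
int bitwise_and(int a, int b) { return a & b; }   /* bit-by-bit result */
int logical_not(int a)        { return !a; }      /* always 0 or 1 */

/* Early termination: *p is evaluated only when p is non-null,
   so a null pointer is never dereferenced. */
int is_nonempty(const char *p) {
    return p && *p;
}

/* Verify the slide's examples. */
void check_logic_examples(void) {
    assert(logical_not(0x41) == 0x00);
    assert(logical_not(0x00) == 0x01);
    assert(logical_not(logical_not(0x41)) == 0x01);
}
```

For the same operands the two families can disagree: `0x69 && 0x55` is 1, while `0x69 & 0x55` is 0x41.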
Shift Operations
Left Shift: x << y
▪ Shift bit-vector x left y positions
  – Throw away extra bits on left
▪ Fill with 0’s on right
Right Shift: x >> y
▪ Shift bit-vector x right y positions
▪ Throw away extra bits on right
▪ Logical shift (Log.): fill with 0’s on left
▪ Arithmetic shift (Arith.): replicate most significant bit on left
Undefined Behavior
▪ Shift amount < 0 or ≥ word size
Examples
  Argument x     01100010
  << 3           00010000
  Log. >> 2      00011000
  Arith. >> 2    00011000

  Argument x     10100010
  << 3           00010000
  Log. >> 2      00101000
  Arith. >> 2    11101000
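The shift examples can be reproduced on 8-bit values. In C, `>>` on an unsigned operand is a logical shift, while `>>` on a negative signed operand is implementation-defined, so this sketch builds the arithmetic shift by hand. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Left shift within 8 bits: bits shifted past bit 7 are discarded. */
uint8_t shl8(uint8_t x, unsigned k) {
    assert(k < 8);  /* shift amounts >= width are not meaningful here */
    return (uint8_t)(x << k);
}

/* Logical right shift: vacated high bits are filled with 0. */
uint8_t shr8_logical(uint8_t x, unsigned k) {
    assert(k < 8);
    return (uint8_t)(x >> k);
}

/* Arithmetic right shift: vacated high bits copy the original MSB. */
uint8_t shr8_arith(uint8_t x, unsigned k) {
    assert(k < 8);
    uint8_t shifted = (uint8_t)(x >> k);
    /* Mask with the top k bits set, used only when the MSB is 1. */
    uint8_t fill = (x & 0x80) ? (uint8_t)(0xFF << (8 - k)) : 0;
    return (uint8_t)(shifted | fill);
}
```

The two right shifts agree whenever the most significant bit is 0 and differ only for operands whose most significant bit is 1.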
Encoding Integers
Unsigned
$$B2U(X) = \sum_{i=0}^{w-1} x_i \cdot 2^i$$
Two’s Complement
$$B2T(X) = -x_{w-1} \cdot 2^{w-1} + \sum_{i=0}^{w-2} x_i \cdot 2^i$$
Sign Bit
▪ For 2’s complement, most significant bit indicates sign
▪ 0 for nonnegative
▪ 1 for negative
Encoding Example (Cont.)
x = 15213: 00111011 01101101
y = -15213: 11000100 10010011
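The two encodings can be evaluated bit by bit, directly following the B2U and B2T definitions. In this sketch the function names `b2u` and `b2t` mirror the slide's notation, but the C signatures (a bit pattern held in a `uint32_t` plus a width `w`) are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* B2U: interpret the low w bits of `bits` as an unsigned integer. */
uint64_t b2u(uint32_t bits, int w) {
    assert(w >= 1 && w <= 32);
    uint64_t sum = 0;
    for (int i = 0; i < w; i++) {
        sum += (uint64_t)((bits >> i) & 1) << i;   /* x_i * 2^i */
    }
    return sum;
}

/* B2T: interpret the low w bits of `bits` as two's complement. */
int64_t b2t(uint32_t bits, int w) {
    assert(w >= 1 && w <= 32);
    int64_t sum = 0;
    for (int i = 0; i < w - 1; i++) {
        sum += (int64_t)((bits >> i) & 1) << i;    /* x_i * 2^i */
    }
    /* The sign bit carries weight -2^(w-1). */
    sum -= (int64_t)((bits >> (w - 1)) & 1) << (w - 1);
    return sum;
}
```

For the 16-bit pattern of y (hex C493), `b2t` gives -15213 while `b2u` gives 50323; the two differ by exactly 2^16.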
Values for Different Word Sizes
        W = 8    16         32                64
UMax    255      65,535     4,294,967,295     18,446,744,073,709,551,615
TMax    127      32,767     2,147,483,647     9,223,372,036,854,775,807
TMin    -128     -32,768    -2,147,483,648    -9,223,372,036,854,775,808
Observations
▪ |TMin| = TMax + 1
  ▪ Asymmetric range
▪ UMax = 2 * TMax + 1
C Programming
▪ #include <limits.h>
▪ Declares constants, e.g.,
  ▪ ULONG_MAX
  ▪ LONG_MAX
  ▪ LONG_MIN
▪ Values platform specific
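The two observations can be checked against the `<limits.h>` constants on the current platform. A minimal sketch, assuming a two's complement `long` with no padding bits; the function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Print the platform-specific limits for long. */
void print_long_limits(void) {
    printf("ULONG_MAX = %lu\n", ULONG_MAX);
    printf("LONG_MAX  = %ld\n", LONG_MAX);
    printf("LONG_MIN  = %ld\n", LONG_MIN);
}

/* Check the slide's observations for the long type. */
void check_observations(void) {
    /* |TMin| = TMax + 1, written so that no intermediate overflows. */
    assert(-(LONG_MIN + 1) == LONG_MAX);
    /* UMax = 2 * TMax + 1 */
    assert(ULONG_MAX == 2UL * (unsigned long)LONG_MAX + 1UL);
}
```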
Unsigned & Signed Numeric Values
X       B2U(X)   B2T(X)
0000    0        0
0001    1        1
0010    2        2
0011    3        3
0100    4        4
0101    5        5
0110    6        6
0111    7        7
1000    8        –8
1001    9        –7
1010    10       –6
1011    11       –5
1100    12       –4
1101    13       –3
1110    14       –2
1111    15       –1
Equivalence
▪ Same encodings for nonnegative values
Uniqueness
▪ Every bit pattern represents unique integer value
▪ Each representable integer has unique bit encoding
Can Invert Mappings
▪ U2B(x) = B2U^{-1}(x)
  ▪ Bit pattern for unsigned integer
▪ T2B(x) = B2T^{-1}(x)
  ▪ Bit pattern for two’s comp integer
Mapping Between Signed & Unsigned
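[Figure: bit weights from position w–1 down to 0; every weight is positive for ux, while for x the weight at position w–1 is negative]
$$ux = \begin{cases} x, & x \ge 0 \\ x + 2^w, & x < 0 \end{cases}$$
Large negative weight becomes large positive weight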
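The mapping can be written arithmetically for any width w. In this sketch `t2u` follows the case formula for ux, and `u2t` is its inverse; the C signatures are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* T2U for w-bit values: negative x maps to x + 2^w. */
uint64_t t2u(int64_t x, int w) {
    assert(w >= 1 && w <= 32);
    int64_t two_w = (int64_t)1 << w;
    return (uint64_t)(x < 0 ? x + two_w : x);
}

/* U2T for w-bit values: values above TMax map to ux - 2^w. */
int64_t u2t(uint64_t ux, int w) {
    assert(w >= 1 && w <= 32);
    int64_t two_w = (int64_t)1 << w;
    int64_t tmax = (two_w >> 1) - 1;
    return ux > (uint64_t)tmax ? (int64_t)ux - two_w : (int64_t)ux;
}
```

For w = 4, `t2u(-1, 4)` is 15 and `t2u(-8, 4)` is 8, which matches the B2U and B2T columns for the bit patterns 1111 and 1000.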
Conversion Visualized
2’s Comp. → Unsigned
▪ Ordering Inversion
▪ Negative → Big Positive
[Figure: 2’s complement range (TMin, …, –2, –1, 0, …, TMax) mapped onto the unsigned range (0, …, TMax, TMax + 1, …, UMax – 1, UMax)]
Signed vs. Unsigned in C
Constants
▪ By default, considered to be signed integers
▪ Unsigned if they have “U” as a suffix
0U, 4294967259U
Casting
▪ Explicit casting between signed & unsigned same as U2T and T2U
int tx, ty;
unsigned ux, uy;
tx = (int) ux;
uy = (unsigned) ty;
Casting Surprises
Expression Evaluation
▪ If there is a mix of unsigned and signed in single expression,
signed values implicitly cast to unsigned
▪ Including comparison operations <, >, ==, <=, >=
▪ Examples for W = 32: TMIN = -2,147,483,648, TMAX = 2,147,483,647
Constant1        Constant2            Relation   Evaluation
0                0U                   ==         unsigned
-1               0                    <          signed
-1               0U                   >          unsigned
2147483647       -2147483647-1        >          signed
2147483647U      -2147483647-1        <          unsigned
-1               -2                   >          signed
(unsigned) -1    -2                   >          unsigned
2147483647       2147483648U          <          unsigned
2147483647       (int) 2147483648U    >          signed
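Several rows of the table can be confirmed with small comparison functions. This sketch assumes a 32-bit `int`; the function names are hypothetical, and compilers may warn about the mixed comparisons, which is exactly the behavior being illustrated.

```c
#include <assert.h>

/* Both operands signed: ordinary signed comparison. */
int less_signed(int a, int b) { return a < b; }

/* Both operands unsigned: ordinary unsigned comparison. */
int less_unsigned(unsigned a, unsigned b) { return a < b; }

/* Mixed operands: `a` is implicitly converted to unsigned first. */
int less_mixed(int a, unsigned b)    { return a < b; }
int greater_mixed(int a, unsigned b) { return a > b; }

/* Verify rows of the table (assumes 32-bit int). */
void check_casting_surprises(void) {
    assert(less_signed(-1, 0) == 1);      /* -1 < 0   (signed)   */
    assert(less_mixed(-1, 0U) == 0);      /* -1 < 0U is false    */
    assert(greater_mixed(-1, 0U) == 1);   /* -1 > 0U  (unsigned) */
}
```

The surprise is the second and third checks: once converted to unsigned, -1 becomes UMax and compares greater than 0U.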
Code Security Example
/* Kernel memory region holding user-accessible data */
#define KSIZE 1024
char kbuf[KSIZE];
void getstuff() {
char mybuf[MSIZE];
copy_from_kernel(mybuf, MSIZE);
    printf("%s\n", mybuf);
}
Malicious Usage
/* Declaration of library function memcpy */
void *memcpy(void *dest, const void *src, size_t n);
void getstuff() {
char mybuf[MSIZE];
copy_from_kernel(mybuf, -MSIZE);
. . .
}
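The definition of `copy_from_kernel` is not present in this text, so the following is only a sketch of the kind of bug the slide points at, under the assumption that the routine clamps a signed `int` length to KSIZE and then passes it to `memcpy`. The names `clamped_len_signed` and `copy_from_kernel_safe` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define KSIZE 1024
static char kbuf[KSIZE];

/* Assumed buggy check: minimum of KSIZE and maxlen, computed as
   signed ints. A negative maxlen passes through unchanged. */
int clamped_len_signed(int maxlen) {
    return KSIZE < maxlen ? KSIZE : maxlen;
}

/* Safer sketch: the length is unsigned throughout, so the clamp to
   KSIZE cannot be bypassed with a negative value. The caller must
   still provide a buffer of at least min(KSIZE, maxlen) bytes. */
size_t copy_from_kernel_safe(void *user_dest, size_t maxlen) {
    assert(user_dest != NULL);
    size_t len = KSIZE < maxlen ? KSIZE : maxlen;
    memcpy(user_dest, kbuf, len);
    return len;
}
```

With a signed check, a negative length such as -528 survives the clamp, and converting it to the `size_t` parameter of `memcpy` turns it into a value far larger than KSIZE.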
Summary
Casting Signed ↔ Unsigned: Basic Rules
Bit pattern is maintained
But reinterpreted
Can have unexpected effects: adding or subtracting 2^w
Sign Extension
Task:
▪ Given w-bit signed integer x
▪ Convert it to w+k-bit integer with same value
Rule:
▪ Make k copies of sign bit:
▪ X′ = x_{w–1}, …, x_{w–1}, x_{w–1}, x_{w–2}, …, x_0
  (k copies of the MSB, followed by the original w bits)
[Figure: w-bit X expanded to (k+w)-bit X′ by replicating the MSB k times]
Sign Extension Example
short int x = 15213;
int ix = (int) x;
short int y = -15213;
int iy = (int) y;
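The effect of these casts can be checked at the bit level. The sketch below uses fixed-width types so that the 16-bit and 32-bit sizes are explicit rather than assumed; `sign_extend_bits` is a hypothetical helper that performs the "k copies of the sign bit" rule by hand.

```c
#include <assert.h>
#include <stdint.h>

/* Widening a signed value: the compiler sign-extends. */
int32_t sign_extend16(int16_t x) { return (int32_t)x; }

/* Widening an unsigned value: the compiler zero-extends. */
uint32_t zero_extend16(uint16_t x) { return (uint32_t)x; }

/* Manual sign extension of the low w bits of x to 32 bits. */
uint32_t sign_extend_bits(uint32_t x, int w) {
    assert(w >= 1 && w <= 32);
    uint32_t sign = 1u << (w - 1);
    uint32_t mask = (w == 32) ? 0xFFFFFFFFu : ((1u << w) - 1u);
    x &= mask;
    /* Flipping the sign bit and then subtracting it fills the upper
       bits with copies of the original sign bit (modulo 2^32). */
    return (x ^ sign) - sign;
}
```

For x = 15213 (hex 3B6D) the widened pattern is 00003B6D; for y = -15213 (hex C493) it is FFFFC493.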
Summary:
Expanding, Truncating: Basic Rules
Expanding (e.g., short int to int)
▪ Unsigned: zeros added
▪ Signed: sign extension
▪ Both yield expected result
Negation: Complement & Increment
Claim: Following Holds for 2’s Complement
~x + 1 == -x
Complement
▪ Observation: ~x + x == 1111…111 == -1
     x    10011101
+   ~x    01100010
    -1    11111111
Complete Proof?
Complement & Increment Examples
x = 15213
Decimal Hex Binary
x 15213 3B 6D 00111011 01101101
~x -15214 C4 92 11000100 10010010
~x+1 -15213 C4 93 11000100 10010011
y -15213 C4 93 11000100 10010011
x=0
Decimal Hex Binary
0 0 00 00 00000000 00000000
~0 -1 FF FF 11111111 11111111
~0+1 0 00 00 00000000 00000000
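The claim can be tested exhaustively for 16-bit values. This sketch works on unsigned bit patterns so that no signed overflow occurs; `negate16` and `check_negation` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Negate a 16-bit pattern by complementing and incrementing. */
uint16_t negate16(uint16_t x) {
    return (uint16_t)(~x + 1u);
}

/* For every 16-bit pattern, x plus its negation wraps to 0. */
void check_negation(void) {
    for (uint32_t x = 0; x <= 0xFFFFu; x++) {
        uint16_t neg = negate16((uint16_t)x);
        assert((uint16_t)(x + neg) == 0);
    }
}
```

For x = 15213 (hex 3B6D) the result is hex C493, the pattern for -15213, and for x = 0 the increment carries out of the top bit and leaves 0.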
Unsigned Addition
Operands: w bits           u, v
True Sum: w+1 bits         u + v
Discard Carry: w bits      UAdd_w(u, v)
$$UAdd_w(u,v) = \begin{cases} u + v, & u + v < 2^w \\ u + v - 2^w, & u + v \ge 2^w \end{cases}$$
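The case analysis for unsigned addition can be sketched for an arbitrary width w by computing the true sum in a wider type and then discarding the carry. The function name `uadd` and its signature are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* w-bit unsigned addition. Both operands must already fit in w bits.
   If `carry` is non-null, it receives 1 when the true sum needed
   w+1 bits and 0 otherwise. */
uint32_t uadd(uint32_t u, uint32_t v, int w, int *carry) {
    assert(w >= 1 && w <= 32);
    uint64_t two_w = (uint64_t)1 << w;
    assert(u < two_w && v < two_w);
    uint64_t true_sum = (uint64_t)u + v;   /* fits in w+1 bits */
    int overflow = true_sum >= two_w;
    if (carry != NULL) {
        *carry = overflow;
    }
    /* Discarding the carry is the same as subtracting 2^w. */
    return (uint32_t)(overflow ? true_sum - two_w : true_sum);
}
```

For w = 8, adding 200 and 100 gives a true sum of 300, which exceeds 2^8 = 256, so the result is 300 - 256 = 44. C's own unsigned arithmetic behaves the same way at the type's native width.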