2 Data - Information - Storage v2

Information Storage
CSE 311 – Computer Organization

Assoc. Prof. Ahmed Fares
What is a bit?
• All data stored in computer systems (hard drives, memory, SD cards, etc.) is
stored as binary digits (bits)
• A bit represents one of two states, “on/off”, “true/false”, “1/0”
• How that bit is stored depends on the medium
• Magnetic (hard drive, floppy disk)
• Electronic (RAM, Registers)
• Optical (CD / DVD / Punch Cards)
• Regardless of how it’s stored, a bit takes on the value of either 0 or 1
• Why use bits? The hardware for storing and performing computations on bits is
simple and reliable
Bytes and Words
• A bit is not very much data, so we usually group a bunch of bits together into logical groupings
• A byte is a group of 8 bits:
• 01100110
• 01100011
• 00110010
• 10101010
• How many unique bytes are there?
• 2⁸ = 256, so a byte can represent a set of up to 256 items
• Bytes are still too small to be the basic size of data for a computer, so we use word size instead
• Indicates the size of pointer data (memory address size)
• For the past 20 years or so the basic word size of most computers was 32 bits (4 GB address space)
• Today newer machines have a word size of 64 bits (16 exabytes address space)
• 0101010010110101010011001101101101101101110000100110001111110100
• Looking at a string of 64 bits is somewhat overwhelming and not a great way of transmitting information
• We’ll see a way to represent words more compactly later
Representing Numbers: Decimals
• One of the first things we’d like to do is represent numbers in binary
• But first a quick review of decimal numbers to help us understand binary
numbers
• Decimal numbers are known as a base 10 system
• Numbers are represented by the ten symbols: 0-9
• A decimal number has a one’s place (10⁰), a ten’s place (10¹), a hundred’s place
(10²), a thousand’s place (10³), and so on
• Example: decimal number 224 = 2×10² + 2×10¹ + 4×10⁰
Binary (base 2) Number System
• Binary is base 2
• Numbers are represented by the symbols 0 and 1
• A binary number has a one’s place (2⁰), a two’s place (2¹), a four’s place (2²), an
eight’s place (2³), and so on
• Used to represent unsigned integers
• Example: 0101₂ = 5₁₀
Reading Binary
• With practice, you will soon become comfortable
reading binary numbers up to 255
• All you need to do is add combinations of:
• 1, 2, 4, 8, 16, 32, 64, and 128
• Just like decimal numbers, the least significant
digit (or least significant bit, LSB) is on the right
• In the number 123, the rightmost digit (3) is in the
one’s place, etc.
• Example: for the binary number 1010
• There is a 1 in the 2’s place and a 1 in the 8’s place, so the
value is 2 + 8 = 10
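The add-the-place-values rule can be sketched in C. This is a minimal illustration, not part of the original slides: the function name `binary_to_value` is hypothetical, and the input is assumed to contain only the characters '0' and '1'.

```c
#include <stddef.h>

/* Sum the place values (1, 2, 4, 8, ...) of every '1' digit,
   scanning from the rightmost (least significant) bit. */
unsigned binary_to_value(const char *bits)
{
    size_t len = 0;
    while (bits[len] != '\0')
        len++;

    unsigned value = 0;
    unsigned place = 1;              /* 2^0 for the rightmost digit */
    for (size_t i = len; i > 0; i--) {
        if (bits[i - 1] == '1')
            value += place;          /* this column contributes */
        place *= 2;                  /* next column to the left */
    }
    return value;
}
```

For example, `binary_to_value("1010")` adds 2 and 8 and returns 10.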
Converting from decimal to binary
• Example: 42₁₀
• 32 is the largest power of two <= 42, so we have a one in the 32’s
column and we subtract 32 from 42, leaving 10
• 16 > 10, so we have a zero in the 16’s column
• 8 <= 10, so we have a one in the 8’s column and we subtract 8 from 10, leaving 2
• 4 > 2, so we have a zero in the 4’s column
• 2 <= 2, so we have a one in the 2’s column and we subtract 2 from 2, leaving 0
• 1 > 0, so we have a zero in the 1’s column
• Putting it all together: 42₁₀ = 101010₂
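The subtract-the-largest-power procedure can be sketched in C for a single byte. This is only an illustration under stated assumptions: the name `to_binary8` is hypothetical, the value is limited to 8 bits, and the caller supplies a 9-character buffer.

```c
#include <stdint.h>

/* Write the 8-bit binary form of n into out (8 digits plus '\0'). */
void to_binary8(uint8_t n, char out[9])
{
    unsigned remaining = n;
    unsigned power = 128;            /* largest column of a byte: 2^7 */
    for (int i = 0; i < 8; i++) {
        if (power <= remaining) {    /* column fits: emit 1 and subtract */
            out[i] = '1';
            remaining -= power;
        } else {                     /* column too large: emit 0 */
            out[i] = '0';
        }
        power /= 2;                  /* move one column to the right */
    }
    out[8] = '\0';
}
```

Calling `to_binary8(42, buf)` leaves the string `00101010` in `buf`, which is 101010 padded to a full byte.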
Hexadecimal (base 16) Number System
• Hexadecimal (hex) is a base 16 representation
• We use the letters A-F as the extra “digits” so we count:
• 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F
• A through F stand for the values 10 through 15
• A hexadecimal number has a one’s place (16⁰), a sixteen’s place
(16¹), a two-hundred-fifty-six’s place (16²), and so on
• 0x54B54CDB6DC263F4
• Each hexadecimal digit represents how many bits?
• 4
• How many hexadecimal digits are in a byte?
• 2
• How many hexadecimal digits to represent a 64-bit number?
• 16

Decimal  Binary  Hex
0        0000    0
1        0001    1
2        0010    2
3        0011    3
4        0100    4
5        0101    5
6        0110    6
7        0111    7
8        1000    8
9        1001    9
10       1010    A
11       1011    B
12       1100    C
13       1101    D
14       1110    E
15       1111    F
Hexadecimal Conversion
• Mapping to and from binary and hex is straightforward
because base-16 is also a power of 2
• Binary to Hex:
Starting from the right-hand side, group the bits in sets of four
Convert each set of 4 bits to its hex digit
Example:
10001101010101000101
1000 1101 0101 0100 0101 -- Group into sets of 4 bits
8    D    5    4    5    -- Convert each set into a hex digit
0x8D545
• Hex to Binary:
Convert each hex digit into 4 bits:
0xF7B36
1111 0111 1011 0011 0110
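The group-by-four rule maps directly onto shifting and masking. The sketch below is an illustration only: the name `to_hex64` is hypothetical, and it always emits 16 digits (with leading zeros) for a 64-bit value.

```c
#include <stdint.h>

/* Write x as 16 hex digits, one per 4-bit group,
   most significant group first. */
void to_hex64(uint64_t x, char out[17])
{
    static const char digits[] = "0123456789ABCDEF";
    for (int i = 0; i < 16; i++) {
        int shift = 60 - 4 * i;              /* 60, 56, ..., 4, 0 */
        out[i] = digits[(x >> shift) & 0xF]; /* isolate one 4-bit group */
    }
    out[16] = '\0';
}
```

Calling `to_hex64(0x8D545, buf)` leaves `000000000008D545` in `buf`: the five groups from the example above, preceded by eleven zero groups.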
Data Sizes in C (Typical)
• The sizes of the basic data types can vary based on compiler and machine
settings
• These are typical values on an x86-64 system

C Declaration   Size in Bytes (64-bit)
char            1
short           2
int             4
long            8
char *          8
float           4
double          8
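Because these sizes depend on the compiler and platform, it can help to print them on the machine at hand. The sketch below assumes a C99 compiler (for the `%zu` format); the function name `print_sizes` is hypothetical. The output may differ from the table: for example, 64-bit Windows compilers typically use a 4-byte `long`.

```c
#include <stdio.h>

/* Print the size in bytes of each basic type for the current compiler. */
void print_sizes(void)
{
    printf("char    %zu\n", sizeof(char));
    printf("short   %zu\n", sizeof(short));
    printf("int     %zu\n", sizeof(int));
    printf("long    %zu\n", sizeof(long));
    printf("char *  %zu\n", sizeof(char *));
    printf("float   %zu\n", sizeof(float));
    printf("double  %zu\n", sizeof(double));
}
```

Only `sizeof(char) == 1` is guaranteed by the language; every other line reflects a choice made by the compiler and target.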
Storing Multi-Byte Data
• Memory is byte addressable
• Every byte of memory has an address
• Think of memory as a large array with
the address as the index in the array
• For multi-byte data
• The address specifies starting byte
location of the data in memory
• The rest of the data is in the increasing
memory addresses that follow
• Example
• The Integer at address 4 (0100) contains
the four bytes: 31 76 D9 5C

Address  Byte  Integer
0000     94    ADDR = 0000
0001     53
0010     7F
0011     EA
0100     31    ADDR = 0100
0101     76
0110     D9
0111     5C
1000     AB    ADDR = 1000
1001     83
1010     75
1011     BB
1100     28    ADDR = 1100
1101     39
1110     4E
1111     05
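The "memory as a large array" view can be modeled directly. In this sketch the array `memory` holds the sixteen example bytes from the slide, and `integer_bytes` is a hypothetical helper that assumes `addr` is at most 12 so that four bytes are available.

```c
#include <stdint.h>
#include <string.h>

/* The 16 example bytes, indexed by address 0..15. */
static const uint8_t memory[16] = {
    0x94, 0x53, 0x7F, 0xEA, 0x31, 0x76, 0xD9, 0x5C,
    0xAB, 0x83, 0x75, 0xBB, 0x28, 0x39, 0x4E, 0x05
};

/* Copy the 4 bytes of the integer that starts at addr into out.
   The address names only the first byte; the rest follow it. */
void integer_bytes(unsigned addr, uint8_t out[4])
{
    memcpy(out, &memory[addr], 4);   /* addr, addr+1, addr+2, addr+3 */
}
```

With `addr` equal to 4 the helper returns the bytes 31 76 D9 5C, matching the example.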
Byte Ordering
• There are two conventions for the layout of multi-byte objects
• Big Endian and Little Endian
• Example
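• 4-byte integer of 0x01234567 is located at memory address 0x100
• This value exists in memory locations 0x100, 0x101, 0x102, 0x103
• Big endian stores the bytes as 01 23 45 67 (most significant byte at 0x100)
• Little endian stores the bytes as 67 45 23 01 (least significant byte at 0x100)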
Byte Ordering Takeaways
• x86-64 uses little endian
• When reading little-endian data from left to right in increasing memory order:
• The bytes will be in reverse order from how the number is written
• Within each byte, the bits are always written from MSB (most significant bit) to LSB (least
significant bit) in both endian conventions
• Little endian systems will always have the end (least significant) byte at the smallest (littlest)
address
• Big endian systems will always have the end (least significant) byte at the largest (biggest)
address
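Byte order can be observed by copying an integer's bytes out in increasing address order. This is a minimal sketch; the names `bytes_in_memory` and `is_little_endian` are hypothetical, and the check only distinguishes the two conventions described here.

```c
#include <stdint.h>
#include <string.h>

/* Copy the bytes of value into out in increasing address order. */
void bytes_in_memory(uint32_t value, uint8_t out[4])
{
    memcpy(out, &value, sizeof value);
}

/* Returns 1 if the least significant byte sits at the lowest address. */
int is_little_endian(void)
{
    uint8_t b[4];
    bytes_in_memory(0x01234567, b);
    return b[0] == 0x67;
}
```

On an x86-64 machine, `bytes_in_memory(0x01234567, b)` fills `b` with 67 45 23 01; a big-endian machine would produce 01 23 45 67.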
Representing Strings
• There is no native type for strings in C
• Instead, strings are represented as an array of characters
(char)
• Terminated by the null (the literal value 0) character
• Characters
• 1 byte represented by an ASCII (American Standard
Code for Information Interchange) encoding
• Arrays
• Data of the same type ordered in contiguous
memory
• We’ll talk more about arrays later
• Example: string “Hello” at address 4
• Note the string takes up 6 bytes of memory
because of the null termination byte

Address  Byte  Char
0000     94
0001     53
0010     7F
0011     EA
0100     48    ‘H’
0101     65    ‘e’
0110     6C    ‘l’
0111     6C    ‘l’
1000     6F    ‘o’
1001     00    \0
1010     75
1011     BB
1100     28
1101     39
1110     4E
1111     05
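The extra byte for the terminator is easy to confirm in C. In this sketch `string_storage_bytes` is a hypothetical helper; it relies on the standard `strlen`, which counts characters up to but not including the null byte.

```c
#include <string.h>

/* Bytes occupied by a C string: its characters plus the terminating 0. */
size_t string_storage_bytes(const char *s)
{
    return strlen(s) + 1;
}
```

For "Hello", `strlen` reports 5 characters and the helper returns 6. Declaring `char h[] = "Hello";` likewise gives `sizeof h == 6`.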
ASCII (American Standard Code for Information Interchange) Table
Representing Code
• Binary programs have a well-defined format
• Linux uses the ELF binary format
• Programs loaded into memory are represented as a stream of bytes
written in machine language
00000000000006ca <main>:
 6ca:  55            push  %rbp
 6cb:  48 89 e5      mov   %rsp,%rbp
 6ce:  48 83 ec 20   sub   $0x20,%rsp
 6d2:  89 7d ec      mov   %edi,-0x14(%rbp)
 6d5:  48 89 75 e0   mov   %rsi,-0x20(%rbp)
 6d9:  83 7d ec 02   cmpl  $0x2,-0x14(%rbp)
 6dd:  74 13         je    6f2 <main+0x28>
Boolean Algebra
• Developed by George Boole in the 19th century
• Algebraic representation of logic
• Encode “True” as 1 and “False” as 0
A  B   AND (A&B)   OR (A|B)   XOR, exclusive OR (A^B)
0  0       0           0           0
0  1       0           1           1
1  0       0           1           1
1  1       1           1           0

A   NOT (~A)
0       1
1       0
Boolean Properties
• Where A, B, and C are Boolean values (either 0 or 1)
• Commutative Law:
• A | B == B | A
• A & B == B & A
• Associative Law:
• A | (B | C) == (A | B) | C
• A & (B & C) == (A & B) & C
• Distributive Law:
• A & (B | C) == (A & B) | (A & C)
• A | (B & C) == (A | B) & (A | C)
• De Morgan’s Theorem:
• ~(A & B) == ~A | ~B
• ~(A | B) == ~A & ~B
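Since each variable has only two possible values, the laws above can be checked by trying every combination. This is a brute-force sketch, not a proof technique from the slides; `not1` and `boolean_laws_hold` are hypothetical names, and `not1` masks the result of `~` so that it stays 0 or 1.

```c
/* NOT restricted to a single bit, so the result stays 0 or 1. */
static int not1(int x) { return ~x & 1; }

/* Returns 1 if every law listed above holds for all 0/1 inputs, else 0. */
int boolean_laws_hold(void)
{
    for (int a = 0; a <= 1; a++) {
        for (int b = 0; b <= 1; b++) {
            for (int c = 0; c <= 1; c++) {
                int ok =
                    (a | b) == (b | a) &&                      /* commutative */
                    (a & b) == (b & a) &&
                    (a | (b | c)) == ((a | b) | c) &&          /* associative */
                    (a & (b & c)) == ((a & b) & c) &&
                    (a & (b | c)) == ((a & b) | (a & c)) &&    /* distributive */
                    (a | (b & c)) == ((a | b) & (a | c)) &&
                    not1(a & b) == (not1(a) | not1(b)) &&      /* De Morgan */
                    not1(a | b) == (not1(a) & not1(b));
                if (!ok)
                    return 0;
            }
        }
    }
    return 1;
}
```

The function returns 1, since all eight identities hold for each of the eight input combinations.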
Bit Vectors and Bit-Level Operations in C
• A bit vector is a string of zeros and ones of some fixed length w
• C supports bitwise Boolean operations on bit vectors
• AND (&), OR (|), NOT (~), and XOR (^)
• Can be used on any integral data type (char, short, int, long)
• View arguments as bit vectors
• Operation is applied bit-wise
• Example:
• char a = 0x69; char b = 0x55;
• a & b; // evaluates to 0x41
• a | b; // evaluates to 0x7D
• a ^ b; // evaluates to 0x3C
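The same operations can be wrapped in small functions for experimentation. This sketch uses `uint8_t` instead of `char`, an assumption made here because the signedness of plain `char` is implementation-defined; the function names are hypothetical. The cast in `not8` matters because C promotes the operand to `int` before applying `~`.

```c
#include <stdint.h>

uint8_t and8(uint8_t a, uint8_t b) { return a & b; }
uint8_t or8(uint8_t a, uint8_t b)  { return a | b; }
uint8_t xor8(uint8_t a, uint8_t b) { return a ^ b; }

/* ~a is computed on a promoted int; the cast keeps only the low 8 bits. */
uint8_t not8(uint8_t a) { return (uint8_t)~a; }
```

With `a = 0x69` and `b = 0x55`, these return 0x41, 0x7D, and 0x3C, and `not8(0x69)` returns 0x96.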
Shift Operations in C
• Left Shift: x << y
• Shift bit-vector x left y positions
• Throw away extra bits on left
• Fill with 0’s on right
• Right Shift: x >> y
• Shift bit-vector x right y positions
• Throw away extra bits on right
• Logical shift
• Fill with 0’s on left
• Arithmetic shift
• Replicate most significant bit on left
• Undefined Behavior
• Shift amount < 0 or ≥ data size

Argument x    01100010
<< 3          00010000
Log. >> 2     00011000
Arith. >> 2   00011000

Argument x    10100010
<< 3          00010000
Log. >> 2     00101000
Arith. >> 2   11101000
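In C, `>>` on an unsigned value is a logical shift, while `>>` on a negative signed value is implementation-defined (most compilers shift arithmetically). The sketch below therefore builds the arithmetic shift by hand for an 8-bit value. The function names are hypothetical and the shift amount `n` is assumed to be between 0 and 7.

```c
#include <stdint.h>

/* Left shift: bits pushed past bit 7 are discarded by the cast. */
uint8_t shift_left(uint8_t x, unsigned n)
{
    return (uint8_t)(x << n);
}

/* Logical right shift: zeros enter on the left. */
uint8_t shift_right_logical(uint8_t x, unsigned n)
{
    return (uint8_t)(x >> n);
}

/* Arithmetic right shift: copies of the old bit 7 enter on the left. */
uint8_t shift_right_arith(uint8_t x, unsigned n)
{
    uint8_t shifted = (uint8_t)(x >> n);
    if (x & 0x80)                                /* most significant bit set */
        shifted |= (uint8_t)(0xFF << (8 - n));   /* fill the top n bits with 1s */
    return shifted;
}
```

For the argument 10100010 (0xA2), the three functions with the shift amounts from the slide return 0x10, 0x28, and 0xE8, which are the bit patterns 00010000, 00101000, and 11101000.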
Masking Operations
• Goal: modify some bits of a bit vector while leaving the other bits unchanged
• Apply a mask to a value to change only certain bits
• Example: value 0xAA (10101010), mask 0x0F (00001111)
• Setting bits to 1 (OR operator)
• 0xAA | 0x0F // evaluates to 0xAF
• Wherever the mask is 1, that bit will be set to 1
• Wherever the mask is 0, the bit will be unchanged
• Clearing bits to 0 (AND operator)
• 0xAA & 0x0F // evaluates to 0x0A
• Wherever the mask is 0, that bit will be cleared to 0
• Wherever the mask is 1, that bit will remain the same.
• Flipping bits (XOR operator)
• 0xAA ^ 0x0F // evaluates to 0xA5
• Wherever the mask is 1, that bit will be flipped
• Wherever the mask is 0, that bit will remain the same
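The three masking operations can be written as one-line helpers. The names below are hypothetical. `keep_bits` follows the AND convention used above (bits survive where the mask is 1); `clear_bits` is a common variant, added here as an assumption, that clears the bits where the mask is 1 by inverting the mask first.

```c
#include <stdint.h>

/* OR: force to 1 every bit where mask is 1. */
uint8_t set_bits(uint8_t value, uint8_t mask)   { return value | mask; }

/* AND: keep bits where mask is 1, clear bits where mask is 0. */
uint8_t keep_bits(uint8_t value, uint8_t mask)  { return value & mask; }

/* AND with the inverted mask: clear every bit where mask is 1. */
uint8_t clear_bits(uint8_t value, uint8_t mask) { return value & (uint8_t)~mask; }

/* XOR: flip every bit where mask is 1. */
uint8_t flip_bits(uint8_t value, uint8_t mask)  { return value ^ mask; }
```

With value 0xAA (10101010) and mask 0x0F (00001111), `set_bits` gives 0xAF, `keep_bits` gives 0x0A, `clear_bits` gives 0xA0, and `flip_bits` gives 0xA5 (10100101).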
Extracting Bits
• Goal: get the value for some subset of bits from a bit vector
• Can do this with a combination of shifting and masking
• Example: get the middle four bits out of a byte

• char a = 0x63; // 01100011
• a = a >> 2; // 00011000
• a = a & 0xF; // 00001000
• Can extract any contiguous group of bits by varying the size of the shift and
the masking bits
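The shift-then-mask pattern generalizes to any field within a byte. In this sketch `extract_bits` is a hypothetical helper; `low` is the position of the field's lowest bit (bit 0 is the LSB), `count` is the field width, and the caller is assumed to keep `low` within 0..7 and `count` within 1..8.

```c
#include <stdint.h>

/* Return `count` bits of x, starting at bit position `low`. */
uint8_t extract_bits(uint8_t x, unsigned low, unsigned count)
{
    uint8_t mask = (uint8_t)((1u << count) - 1);  /* `count` ones on the right */
    return (uint8_t)((x >> low) & mask);          /* shift down, then mask */
}
```

`extract_bits(0x63, 2, 4)` reproduces the example: 01100011 shifted right by 2 is 00011000, and masking with 0xF leaves 00001000 (0x08).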
Logic Operations in C
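• No Boolean type in original C (C99 added _Bool, available as bool through <stdbool.h>)
• View 0 as “False”
• Anything nonzero as “True”
• Logical operators always return 0 (False) or 1 (True)
• Early termination (short-circuit evaluation): the right operand is skipped once the left operand decides the result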
• Logical operators
• &&, ||, ! (logical AND, OR, NOT)
• Examples (char data type)
• !0x41 -> 0x00
• !0x00 -> 0x01
• !!0x41 -> 0x01
• 0x69 && 0x55 -> 0x01
• 0x69 || 0x55 -> 0x01
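Early termination is what makes a guarded pointer test safe, and `!!` is a common way to normalize a value to 0 or 1. The sketch below illustrates both; the function names are hypothetical.

```c
#include <stddef.h>

/* Returns 1 if p points to a nonzero value, 0 otherwise.
   *p is evaluated only when p is not NULL, because && stops
   as soon as its left operand is false. */
int points_to_nonzero(const int *p)
{
    return p != NULL && *p != 0;
}

/* Map zero to 0 and anything nonzero to 1. */
int to_boolean(int x)
{
    return !!x;
}
```

Note the contrast with the bitwise operators: `0x69 && 0x55` evaluates to 1, while `0x69 & 0x55` evaluates to 0x41.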