2 Data - Information - Storage v2


Information Storage

CSE 311 – Computer Organization


Assoc. Prof. Ahmed Fares
What is a bit?
• All data stored in computer systems (hard drives, memory, SD cards, etc.) is
stored as binary digits (bits)
• A bit represents one of two states, “on/off”, “true/false”, “1/0”
• How that bit is stored depends on the medium
• Magnetic (hard drive, floppy disk)
• Electronic (RAM, Registers)
• Optical (CD / DVD / Punch Cards)

• Regardless of how it’s stored, a bit takes on the value of either 0 or 1
• Why use bits? The hardware for storing and performing computations on bits is
simple and reliable
Bytes and Words
• A single bit holds very little data, so we usually group bits together into larger logical units
• A byte is a group of 8 bits:
• 01100110
• 01100011
• 00110010
• 10101010
• How many unique bytes are there?
• 2⁸ = 256, so a byte can represent a set of up to 256 items
• Bytes are still too small to be the basic unit of data for a computer, so machines work with a larger unit, the word
• The word size indicates the size of pointer data (memory address size)
• For the past 20 years or so the basic word size of most computers was 32 bits (4 GB address space)
• Today newer machines have a word size of 64 bits (16 exabytes address space)
• 0101010010110101010011001101101101101101110000100110001111110100
• Looking at a string of 64 bits is somewhat overwhelming and not a great way of transmitting information
• We’ll see a way to represent words more compactly later

Representing Numbers: Decimals
• One of the first things we’d like to do is represent numbers in binary
• But first a quick review of decimal numbers to help us understand binary
numbers
• Decimal numbers are known as a base 10 system
• Numbers are represented by the ten symbols: 0-9
• A decimal number has a one’s place (10⁰), a ten’s place (10¹), a hundred’s place
(10²), a thousand’s place (10³) and so on
• Example: decimal number 224 = 2×10² + 2×10¹ + 4×10⁰

Binary (base 2) Number System
• Binary is base 2
• Numbers are represented by the symbols 0 and 1
• A binary number has a one’s place (2⁰), a two’s place (2¹), a four’s place (2²), an
eight’s place (2³), and so on
• Used to represent unsigned integers
• Example: 0101₂ = 5₁₀

Reading Binary
• With practice, you will soon become comfortable
reading binary numbers up to 255
• All you need to do is add combinations of:
• 1, 2, 4, 8, 16, 32, 64, and 128

• Just like decimal numbers, the least significant digit (or least significant bit, LSB) is on the right
• In the number 123, the rightmost digit (3) is in the
one’s place, etc.
• Example: for the binary number 1010
• There is a 1 in the 2’s place and a 1 in the 8’s place, so the
value is 2 + 8 = 10
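
The place-value reading described above can be sketched in C. This is a minimal illustration rather than code from the course: the function name `binary_to_unsigned` is an assumption, and the sketch treats any character other than '1' as a 0 bit.

```c
#include <string.h>

/* Read a string of '0'/'1' characters as an unsigned binary number.
 * Walks from the rightmost character (the one's place) to the left,
 * doubling the place value at each step and adding it whenever the
 * bit is 1. Any character other than '1' counts as a 0 bit. */
unsigned binary_to_unsigned(const char *bits)
{
    unsigned value = 0;
    unsigned place = 1;                 /* 2^0: the one's place */

    for (size_t i = strlen(bits); i > 0; i--) {
        if (bits[i - 1] == '1') {
            value += place;
        }
        place *= 2;                     /* move to the next place value */
    }
    return value;
}
```

For example, `binary_to_unsigned("1010")` adds the 2's place and the 8's place and returns 10.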

Converting from decimal to binary
• Example: 42₁₀
• 32 is the largest power of two <= 42, so we know we have a one in the 32’s
column and we subtract 32 from 42, leaving 10
• 16 > 10, so we have a zero in the 16’s column
• 8 <= 10, so we have a one in the 8’s column and we subtract 8 from 10, leaving 2
• 4 > 2, so we have a zero in the 4’s column
• 2 <= 2, so we have a one in the 2’s column and we subtract 2 from 2, leaving 0
• 1 > 0, so we have a zero in the 1’s column
• Putting it all together: 42₁₀ = 101010₂
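
The subtraction procedure above can be sketched in C for values that fit in one byte. The function name `to_binary8` and the fixed 8-digit output are assumptions made for this illustration; because the sketch always starts at the 128's column, it produces leading zeros that the worked example omits.

```c
#include <stddef.h>

/* Write the 8-digit binary form of n into out, which must have room
 * for 9 characters (8 digits plus the terminating null byte).
 * For each column from 128 down to 1: if the column's value fits in
 * what is left of n, emit a 1 and subtract it; otherwise emit a 0. */
void to_binary8(unsigned char n, char out[9])
{
    unsigned place = 128;               /* largest power of two in a byte */

    for (size_t i = 0; i < 8; i++) {
        if (place <= n) {
            out[i] = '1';
            n = (unsigned char)(n - place);
        } else {
            out[i] = '0';
        }
        place /= 2;                     /* move one column to the right */
    }
    out[8] = '\0';
}
```

Calling `to_binary8(42, buf)` fills `buf` with "00101010", the same digits as the example with two leading zeros.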

Hexadecimal (base 16) Number System
• Hexadecimal (hex) is a base 16 representation
• We use the letters A-F as the extra “digits” so we count:
• 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F
• A = 10, B = 11, C = 12, D = 13, E = 14, F = 15
• A hexadecimal number has a one’s place (16⁰), a sixteen’s place (16¹), a two-hundred-fifty-six’s place (16²), and so on
• 0x54B54CDB6DC263F4
• Each hexadecimal digit represents how many bits?
• 4
• How many hexadecimal digits are in a byte?
• 2
• How many hexadecimal digits to represent a 64-bit number?
• 16

Decimal  Binary  Hex
0        0000    0
1        0001    1
2        0010    2
3        0011    3
4        0100    4
5        0101    5
6        0110    6
7        0111    7
8        1000    8
9        1001    9
10       1010    A
11       1011    B
12       1100    C
13       1101    D
14       1110    E
15       1111    F
Hexadecimal Conversion
• Mapping to and from binary and hex is straightforward because base-16 is also a power of 2
• Binary to Hex:
Starting from the right-hand side, group the bits in sets of four
Convert each set of 4 bits to its hex digit
Example:
10001101010101000101
1000 1101 0101 0100 0101 -- Group into sets of 4 bits
8    D    5    4    5    -- Convert each set into a hex digit
0x8D545
• Hex to Binary
Convert each hex digit into 4 bits:
0xF7B36
1111 0111 1011 0011 0110
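
The binary-to-hex grouping can be sketched in C as follows. The function name `binary_to_hex` and its string-based interface are assumptions made for this illustration; the caller is assumed to supply an output buffer that is large enough.

```c
#include <string.h>

/* Convert a string of '0'/'1' characters to uppercase hex digits.
 * out must have room for (strlen(bits) + 3) / 4 + 1 characters. */
void binary_to_hex(const char *bits, char *out)
{
    static const char digits[] = "0123456789ABCDEF";
    size_t n = strlen(bits);
    size_t ndigits = (n + 3) / 4;       /* one hex digit per group of 4 bits */
    size_t pos = 0;                     /* next character of bits to read */

    for (size_t d = 0; d < ndigits; d++) {
        /* Groups are formed from the right, so only the leftmost
         * group can be shorter than 4 bits. */
        size_t len = (d == 0 && n % 4 != 0) ? n % 4 : 4;
        unsigned group = 0;

        for (size_t i = 0; i < len; i++) {
            group = group * 2 + (bits[pos++] == '1');
        }
        out[d] = digits[group];
    }
    out[ndigits] = '\0';
}
```

With the input "10001101010101000101" the function writes "8D545"; with "101010" (decimal 42) it writes "2A", since the leftmost group holds only the two bits "10".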
Data Sizes in C (Typical)
• The sizes of the basic data types can vary based on compiler and machine
settings
• These are typical values on an x86-64 system

C Declaration   Size in Bytes (64-bit)
char            1
short           2
int             4
long            8
char *          8
float           4
double          8
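
Since these sizes depend on the compiler and machine, one way to check them on a particular system is to ask the compiler with `sizeof`. The sketch below is an illustration only; the function name `print_type_sizes` is an assumption, and the numbers it prints may differ from the typical values listed above.

```c
#include <stdio.h>

/* Print the size in bytes of each basic type, as seen by the
 * compiler and machine this file is built for. */
void print_type_sizes(void)
{
    printf("char    %zu\n", sizeof(char));
    printf("short   %zu\n", sizeof(short));
    printf("int     %zu\n", sizeof(int));
    printf("long    %zu\n", sizeof(long));
    printf("char *  %zu\n", sizeof(char *));
    printf("float   %zu\n", sizeof(float));
    printf("double  %zu\n", sizeof(double));
}
```

Only `sizeof(char) == 1` is fixed by the language; the other values are choices made by the platform.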
Storing Multi-Byte Data
• Memory is byte addressable
• Every byte of memory has an address
• Think of memory as a large array with the address as the index in the array
• For multi-byte data
• The address specifies starting byte location of the data in memory
• The rest of the data is in the increasing memory addresses that follow
• Example
• The Integer at address 4 (0100) contains the four bytes: 31 76 D9 5C

Integer       Bytes  Address
ADDR = 0000   94     0000
              53     0001
              7F     0010
              EA     0011
ADDR = 0100   31     0100
              76     0101
              D9     0110
              5C     0111
ADDR = 1000   AB     1000
              83     1001
              75     1010
              BB     1011
ADDR = 1100   28     1100
              39     1101
              4E     1110
              05     1111
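
The "memory as a large array" view can be sketched directly in C. This is a toy model rather than real memory access: the array `memory` holds the 16 example bytes from this slide, and the name `read_four_bytes` is an assumption made for the illustration.

```c
#include <stdint.h>
#include <string.h>

/* The 16 example bytes; the array index plays the role of the address. */
static const uint8_t memory[16] = {
    0x94, 0x53, 0x7F, 0xEA,             /* addresses 0000-0011 */
    0x31, 0x76, 0xD9, 0x5C,             /* addresses 0100-0111 */
    0xAB, 0x83, 0x75, 0xBB,             /* addresses 1000-1011 */
    0x28, 0x39, 0x4E, 0x05              /* addresses 1100-1111 */
};

/* Copy the four bytes that start at addr into out, lowest address
 * first. The caller must keep addr in the range 0..12. */
void read_four_bytes(unsigned addr, uint8_t out[4])
{
    memcpy(out, &memory[addr], 4);
}
```

Reading at address 4 yields the bytes 31 76 D9 5C, matching the example. How those four bytes combine into a single integer value depends on the byte ordering convention.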
Byte Ordering
• There are two conventions for the layout of multi-byte objects
• Big Endian and Little Endian
• Example
• 4-byte integer of 0x01234567 is located at memory address 0x100
• This value exists in memory locations 0x100, 0x101, 0x102, 0x103
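
For this example, a big endian machine stores the bytes 01 23 45 67 at addresses 0x100 through 0x103, while a little endian machine stores 67 45 23 01. The sketch below shows one way to observe the layout on the machine running the code; the function names are assumptions made for this illustration.

```c
#include <stdint.h>
#include <string.h>

/* Copy the bytes of value into out in increasing address order,
 * exactly as they are laid out in memory on this machine. */
void bytes_in_memory_order(uint32_t value, uint8_t out[4])
{
    memcpy(out, &value, sizeof value);
}

/* Return 1 if the least significant byte sits at the lowest address
 * (little endian), 0 otherwise. */
int is_little_endian(void)
{
    uint8_t b[4];
    bytes_in_memory_order(0x01234567u, b);
    return b[0] == 0x67;
}
```

Using `memcpy` into a byte array avoids pointer casts and keeps the inspection well defined in C.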

Byte Ordering Takeaways
• x86-64 uses little endian
• When reading data from left to right in increasing memory order:
• The bytes will be in reverse order from how the number is written
• Within each byte, the bits are always written from MSB (most significant bit) to LSB (least
significant bit) in both endian conventions
• Little endian systems will always have the end (least significant) byte at the smallest (littlest)
address
• Big endian systems will always have the end (least significant) byte at the largest (biggest)
address

Representing Strings
• There is no native type for strings in C
• Instead, strings are represented as an array of characters (char)
• Terminated by the null (the literal value 0) character
• Characters
• 1 byte represented by an ASCII (American Standard Code for Information Interchange) encoding
• Arrays
• Data of the same type ordered in contiguous memory
• We’ll talk more about arrays later
• Example: string “Hello” at address 4
• Note the string takes up 6 bytes of memory because of the null termination byte

Char  Bytes  Address
      94     0000
      53     0001
      7F     0010
      EA     0011
‘H’   48     0100
‘e’   65     0101
‘l’   6C     0110
‘l’   6C     0111
‘o’   6F     1000
\0    00     1001
      75     1010
      BB     1011
      28     1100
      39     1101
      4E     1110
      05     1111
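
The storage cost of the null terminator can be checked with a short C sketch. The function name `string_storage_bytes` is an assumption made for this illustration.

```c
#include <stddef.h>

/* Count the bytes a C string occupies in memory, including the
 * terminating null byte that marks its end. */
size_t string_storage_bytes(const char *s)
{
    size_t n = 0;

    while (s[n] != '\0') {              /* scan until the null terminator */
        n++;
    }
    return n + 1;                       /* +1 for the terminator itself */
}
```

For "Hello" the function returns 6, which agrees with `sizeof("Hello")`. On a system that uses ASCII, the first byte of that string is 0x48, as in the example.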
ASCII (American Standard Code for Information Interchange) Table

Representing Code
• Binary programs have a well-defined format
• Linux uses the ELF binary format
• Programs loaded into memory are represented as a stream of bytes
written in machine language
00000000000006ca <main>:
 6ca:   55                      push   %rbp
 6cb:   48 89 e5                mov    %rsp,%rbp
 6ce:   48 83 ec 20             sub    $0x20,%rsp
 6d2:   89 7d ec                mov    %edi,-0x14(%rbp)
 6d5:   48 89 75 e0             mov    %rsi,-0x20(%rbp)
 6d9:   83 7d ec 02             cmpl   $0x2,-0x14(%rbp)
 6dd:   74 13                   je     6f2 <main+0x28>

Boolean Algebra
• Developed by George Boole in the 19th century
• Algebraic representation of logic
• Encode “True” as 1 and “False” as 0
AND, OR, and XOR (Exclusive OR)
A  B  A&B  A|B  A^B
0  0   0    0    0
0  1   0    1    1
1  0   0    1    1
1  1   1    1    0

NOT
A  ~A
0   1
1   0

Boolean Properties
• Where A, B, and C are Boolean values (either 0 or 1)
• Commutative Law:
• A | B == B | A
• A & B == B & A
• Associative Law:
• A | (B | C) == (A | B) | C
• A & (B & C) == (A & B) & C
• Distributive Law:
• A & (B | C) == (A & B) | (A & C)
• A | (B & C) == (A | B) & (A | C)
• De Morgan’s Theorem:
• ~(A & B) == ~A | ~B
• ~(A | B) == ~A & ~B
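
Because each variable has only two possible values, these laws can be checked by trying every combination. The sketch below does that in C; the function name `boolean_laws_hold` is an assumption made for this illustration, and NOT is written as `x ^ 1` so the result stays 0 or 1 (applying `~` to an `int` would flip all the higher bits as well).

```c
/* Return 1 if every law listed above holds for all 0/1 values of
 * A, B, and C, and 0 as soon as any check fails. */
int boolean_laws_hold(void)
{
    for (int a = 0; a <= 1; a++) {
        for (int b = 0; b <= 1; b++) {
            for (int c = 0; c <= 1; c++) {
                int not_a = a ^ 1;      /* one-bit NOT */
                int not_b = b ^ 1;

                /* Commutative */
                if ((a | b) != (b | a)) return 0;
                if ((a & b) != (b & a)) return 0;
                /* Associative */
                if ((a | (b | c)) != ((a | b) | c)) return 0;
                if ((a & (b & c)) != ((a & b) & c)) return 0;
                /* Distributive */
                if ((a & (b | c)) != ((a & b) | (a & c))) return 0;
                if ((a | (b & c)) != ((a | b) & (a | c))) return 0;
                /* De Morgan */
                if (((a & b) ^ 1) != (not_a | not_b)) return 0;
                if (((a | b) ^ 1) != (not_a & not_b)) return 0;
            }
        }
    }
    return 1;
}
```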

Bit Vectors and Bit-Level Operations in C
• A bit vector is a string of zeros and ones of some fixed length w
• C supports bitwise Boolean operations on bit vectors
• AND (&), OR (|), NOT (~), and XOR (^)
• Can be used on any integral data type (char, short, int, long)
• View arguments as bit vectors
• Operation is applied bit-wise
• Example:
• char a = 0x69; char b = 0x55;
• a & b; // evaluates to 0x41
• a | b; // evaluates to 0x7D
• a ^ b; // evaluates to 0x3C
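
The same operations can be wrapped in small C functions for experimentation. This sketch uses `uint8_t` instead of `char` so that the signedness of `char` does not affect the results; the function names are assumptions made for this illustration. The casts are needed because C promotes 8-bit operands to `int` before applying the operator.

```c
#include <stdint.h>

/* Bitwise operations on 8-bit vectors. Each result is cast back to
 * uint8_t to discard the extra bits introduced by integer promotion. */
uint8_t and8(uint8_t a, uint8_t b) { return (uint8_t)(a & b); }
uint8_t or8(uint8_t a, uint8_t b)  { return (uint8_t)(a | b); }
uint8_t xor8(uint8_t a, uint8_t b) { return (uint8_t)(a ^ b); }
uint8_t not8(uint8_t a)            { return (uint8_t)~a; }
```

With a = 0x69 and b = 0x55, these return 0x41, 0x7D, and 0x3C for AND, OR, and XOR, and `not8(0x69)` returns 0x96.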

Shift Operations in C
• Left Shift: x << y
• Shift bit-vector x left y positions
• Throw away extra bits on left
• Fill with 0’s on right
• Right Shift: x >> y
• Shift bit-vector x right y positions
• Throw away extra bits on right
• Logical shift
• Fill with 0’s on left
• Arithmetic shift
• Replicate most significant bit on left
• Undefined Behavior
• Shift amount < 0 or ≥ data size

Argument x    01100010
<< 3          00010000
Log. >> 2     00011000
Arith. >> 2   00011000

Argument x    10100010
<< 3          00010000
Log. >> 2     00101000
Arith. >> 2   11101000
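
The three kinds of shift can be sketched on 8-bit vectors as follows. The function names are assumptions made for this illustration. The arithmetic shift is built from a logical shift plus an explicit sign fill, since the result of `>>` on a negative signed value is left to the compiler in C; all three functions assume a shift amount y between 0 and 7.

```c
#include <stdint.h>

/* Left shift: bits pushed past bit 7 are dropped, zeros enter on the right. */
uint8_t shl8(uint8_t x, unsigned y)
{
    return (uint8_t)(x << y);
}

/* Logical right shift: zeros enter on the left. */
uint8_t shr8_logical(uint8_t x, unsigned y)
{
    return (uint8_t)(x >> y);
}

/* Arithmetic right shift: copies of the original most significant
 * bit enter on the left. */
uint8_t shr8_arithmetic(uint8_t x, unsigned y)
{
    uint8_t shifted = (uint8_t)(x >> y);

    if (x & 0x80) {
        /* The top bit was 1, so set the y vacated bits on the left. */
        shifted = (uint8_t)(shifted | (0xFF << (8 - y)));
    }
    return shifted;
}
```

For x = 10100010 (0xA2), `shr8_logical(x, 2)` gives 00101000 (0x28) and `shr8_arithmetic(x, 2)` gives 11101000 (0xE8), matching the second example table.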
Masking Operations
• Goal: modify some bits of a bit vector while leaving the other bits unchanged
• Apply a mask to a value to change only certain bits
• Example: value 0xAA (10101010), mask 0x0F (00001111)
• Setting bits to 1 (OR operator)
• 0xAA | 0x0F // evaluates to 0xAF
• Wherever the mask is 1, that bit will be set to 1
• Wherever the mask is 0, the bit will be unchanged
• Clearing bits to 0 (AND operator)
• 0xAA & 0x0F // evaluates to 0x0A
• Wherever the mask is 0, that bit will be cleared to 0
• Wherever the mask is 1, that bit will remain the same.
• Flipping bits (XOR operator)
• 0xAA ^ 0x0F // evaluates to 0xA5
• Wherever the mask is 1, that bit will be flipped
• Wherever the mask is 0, that bit will remain the same
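
The three masking operations can be sketched as small C helpers. The function names are assumptions made for this illustration. `clear_bits` is an extra variant not taken from the slide: it clears the bits where the mask is 1, which is a common way to phrase the same AND-based idea.

```c
#include <stdint.h>

/* OR: bits that are 1 in mask become 1; all others are unchanged. */
uint8_t set_bits(uint8_t value, uint8_t mask)
{
    return (uint8_t)(value | mask);
}

/* AND: bits that are 0 in mask become 0; all others are unchanged. */
uint8_t keep_bits(uint8_t value, uint8_t mask)
{
    return (uint8_t)(value & mask);
}

/* AND with the inverted mask: bits that are 1 in mask become 0. */
uint8_t clear_bits(uint8_t value, uint8_t mask)
{
    return (uint8_t)(value & ~mask);
}

/* XOR: bits that are 1 in mask are flipped; all others are unchanged. */
uint8_t flip_bits(uint8_t value, uint8_t mask)
{
    return (uint8_t)(value ^ mask);
}
```

With value 0xAA and mask 0x0F these return 0xAF, 0x0A, 0xA0, and 0xA5 respectively.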

Extracting Bits
• Goal: get the value for some subset of bits from a bit vector

• Can do this with a combination of shifting and masking

• Example: get the middle four bits out of a byte


• char a = 0x63; // 01100011
• a = a >> 2; // 00011000
• a = a & 0xF; // 00001000

• Can extract any contiguous group of bits by varying the size of the shift and
the masking bits
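
The shift-then-mask pattern generalizes to any contiguous field. Below is a minimal sketch for 8-bit values; the function name `extract_bits` and its parameters `low` and `count` are assumptions made for this illustration.

```c
#include <stdint.h>

/* Return count contiguous bits of x, starting at bit position low
 * (bit 0 is the least significant bit).
 * Assumes 1 <= count <= 8 and low <= 7. */
uint8_t extract_bits(uint8_t x, unsigned low, unsigned count)
{
    unsigned mask = (1u << count) - 1u; /* count ones in the low bits */

    return (uint8_t)((x >> low) & mask);
}
```

`extract_bits(0x63, 2, 4)` reproduces the example above and returns 0x08; changing `low` and `count` selects a different field.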
Logic Operations in C
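• No Boolean type in original C (C99 added _Bool, with bool available through <stdbool.h>)
• View 0 as “False”
• Anything nonzero as “True”
• Logical operators always return 0 (False) or 1 (True)
• Early termination (short-circuit evaluation): && and || evaluate the right operand only if the left operand does not already decide the result
• Logical operators
• &&, ||, ! (logical AND, OR, NOT)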
• Examples (char data type)
• !0x41 -> 0x00
• !0x00 -> 0x01
• !!0x41 -> 0x01
• 0x69 && 0x55 -> 0x01
• 0x69 || 0x55 -> 0x01
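
Early termination can be observed by counting how many operands actually get evaluated. The sketch below is an illustration only; the counter `calls` and the function names are assumptions, not part of any library.

```c
/* Number of times counted() has run since the last reset. */
static int calls;

/* Pass value through unchanged while recording that it was evaluated. */
static int counted(int value)
{
    calls++;
    return value;
}

/* How many operands does left && right evaluate? */
int operands_evaluated_by_and(int left, int right)
{
    calls = 0;
    (void)(counted(left) && counted(right));
    return calls;
}

/* How many operands does left || right evaluate? */
int operands_evaluated_by_or(int left, int right)
{
    calls = 0;
    (void)(counted(left) || counted(right));
    return calls;
}
```

`operands_evaluated_by_and(0, 1)` returns 1 because a false left operand already decides the result of `&&`; likewise `operands_evaluated_by_or(1, 0)` returns 1. In the remaining cases both operands are evaluated.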
