
Chapter 5

Exploiting Memory Hierarchy


So Far in CDA 4205
C Programs → RISC-V Assembly → Datapath → Caches → Memory

C program:

    #include <stdlib.h>

    int fib(int n) {
        if (n < 2) return n;   /* base case, needed for the recursion to terminate */
        return fib(n-1) + fib(n-2);
    }

RISC-V assembly:

    ld   x5, 4(x10)
    addi x6, x5, 3
    beq  x6, x7, foo

Datapath

Caches

Memory
What Happens at Boot?
When the computer switches ON, the CPU executes instructions from some start address (stored in Flash ROM, memory mapped into the address space).
PC = 0x2000 (some default value): the code at this address copies the firmware into regular memory and jumps into it.


What Happens at Boot?
1. BIOS*: Find a storage device and load the first sector (block of data)
2. Bootloader (stored on, e.g., disk): Load the OS kernel from disk into a location in memory and jump into it
3. OS Boot: Initialize services, drivers, etc.
4. Init: Launch an application that waits for input in a loop (e.g., Terminal/Desktop/...)

*BIOS: Basic Input Output System


Principle of Locality
Programs access a small proportion of their address space at any time
Temporal locality
If a data location is referenced, it will tend to be referenced again soon (i.e., the program accesses the same set of memory locations for a period of time)
Better to store frequently accessed values nearer the CPU
e.g., instructions in a loop

Spatial locality
If a data location is referenced, data locations with nearby addresses will tend to be referenced soon
Useful to pre-load data that is close (in address) to other recently accessed data
e.g., sequential instruction access, array data
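
A toy C function (my illustration, not from the slides) showing both kinds of locality:

    /* sum and i are reused on every iteration (temporal locality);
       a[0], a[1], ... are touched at consecutive addresses (spatial locality). */
    int sum_array(const int *a, int n) {
        int sum = 0;
        for (int i = 0; i < n; i++)
            sum += a[i];
        return sum;
    }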
Great Idea #3: Principle of Locality/Memory Hierarchy
Taking Advantage of Locality
Memory hierarchy
Store everything on disk
Copy recently accessed (and nearby) items from disk to
smaller DRAM memory
Main memory
Copy more recently accessed (and nearby) items from
DRAM to smaller SRAM memory
Cache memory attached to CPU
Memory Hierarchy Levels
Block (aka line): unit of copying
May be multiple words

Upper level: closer to the processor, smaller, and faster than the lower level, since the upper level uses more expensive technology.

If accessed data is present in the upper level
Hit: access satisfied by upper level
Hit ratio: hits/accesses
If accessed data is absent
Miss: block copied from lower level
Time taken: miss penalty
Miss ratio: misses/accesses = 1 − hit ratio
Then accessed data supplied from upper level
Memory Hierarchy Levels
By implementing the memory system as a hierarchy, the program has the illusion of a memory that is as large as the largest level of the hierarchy but can be accessed as if it were all built from the fastest memory.

Flash memory has replaced disks in many personal mobile devices, and may lead to a new level in the storage hierarchy for desktop and server computers.
SRAM Technology
SRAMs are simply integrated circuits that are memory arrays with (usually) a single access port that can provide either a read or a write. SRAMs have a fixed access time to any piece of information, though the read and write access times may differ.

SRAMs don't need to refresh, so the access time is very close to the cycle time. SRAMs typically use six to eight transistors per bit to prevent the information from being disturbed when it is read. SRAM needs only minimal power to retain the charge in standby mode.

In the past, most PCs and server systems used separate SRAM chips for their primary, secondary, or even tertiary caches.

Today, thanks to Moore's Law, all levels of caches are integrated onto the processor chip, so the market for separate SRAM chips has nearly evaporated.
DRAM Technology
Data stored as a charge in a capacitor
Single transistor used to access the charge
Must periodically be refreshed
Read contents and write back
Advanced DRAM Organization
Bits in a DRAM are organized as a rectangular array
DRAM accesses an entire row
Burst mode: supply successive words from a row with
reduced latency
Double data rate (DDR) DRAM
Get twice as much bandwidth based on the clock rate and
the data width
Quad data rate (QDR) DRAM
Separate DDR inputs and outputs
DRAM Performance Factors
Row buffer
Allows several words to be read and refreshed in parallel

Synchronous DRAM
Allows for consecutive accesses in bursts without needing
to send each address
Improves bandwidth

DRAM banking
Allows simultaneous access to multiple DRAMs
Improves bandwidth
Flash Types
NOR flash: bit cell like a NOR gate
Random read/write access
Used for instruction memory in embedded systems

NAND flash: bit cell like a NAND gate
Denser (bits/area), but block-at-a-time access
Cheaper per GB
Not suitable for direct RAM or disk replacement

Wear levelling: remap data to less-used blocks
What About SSD?
Made with transistors
Nothing mechanical that turns
Fast access to all locations, regardless of address
Still much slower than registers, DRAM
Read/write blocks, not bytes
Potential reliability issues
Memory Terms
Memory hierarchy: A structure that uses multiple levels of memories; as the distance from the processor increases, the size of the memories and the access time both increase.
Block (or line): The minimum unit of information that can be either present or not present in a cache.
Hit rate: The fraction of memory accesses found in a level of the memory hierarchy.
Miss rate: The fraction of memory accesses not found in a level of the memory hierarchy.
Hit time: The time required to access a level of the memory hierarchy, including the time needed to determine whether the access is a hit or a miss.
Miss penalty: The time required to fetch a block into a level of the memory hierarchy from the lower level, including the time to access the block, transmit it from one level to the other, insert it in the level that experienced the miss, and then pass the block to the requestor.
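
These terms combine into the standard average memory access time figure, AMAT = hit time + miss rate × miss penalty (standard in the textbook, though not stated on this slide). A minimal sketch in C:

    #include <stdio.h>

    /* AMAT = hit time + miss rate * miss penalty */
    static double amat(double hit_time, double miss_rate, double miss_penalty) {
        return hit_time + miss_rate * miss_penalty;
    }

    int main(void) {
        /* illustrative numbers: 1 ns hit time, 5% miss rate, 100 ns miss penalty */
        printf("AMAT = %.1f ns\n", amat(1.0, 0.05, 100.0));  /* prints 6.0 */
        return 0;
    }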
The Basics of Caches
Cache: represents the level of the memory hierarchy between the
processor and main memory in the first commercial computer to have
this extra level. The memories in the Datapath are simply replaced by
caches.
Cache Memory
The level of the memory hierarchy closest to the CPU
Given accesses X1, …, Xn−1, Xn (references)

How do we know if the data is present?
Where do we look?

If each word can go in exactly one place in the cache, then it is straightforward to find the word if it is in the cache.

The simplest way to assign a location in the cache for each word in memory is to assign the cache location based on the address of the word in memory.
Direct Mapped Cache
A cache structure in which each memory location is mapped to exactly one location in the cache.
Location determined by address
Direct mapped: only one choice
(Block address) modulo (#Blocks in cache)
#Blocks is a power of 2
Use low-order address bits

Because there are eight words in the cache, an address X maps to the direct-mapped cache word X modulo 8.

The low-order log2(8) = 3 bits are used as the cache index.

Addresses 00001two, 01001two, 10001two, and 11001two all map to entry 001two of the cache, while addresses 00101two, 01101two, 10101two, and 11101two all map to entry 101two of the cache.
Tags and Valid Bits
Valid bit: A field in the tables of a memory hierarchy that indicates that the associated block in the hierarchy contains valid data.

Tag: A field in a table used for a memory hierarchy that contains the address information required to identify whether the associated block in the hierarchy corresponds to a requested word.

How do we know which particular block is stored in a cache location?
Store block address as well as the data
Actually, only need the high-order bits
Called the tag
What if there is no data in a location?
Valid bit: 1 = present, 0 = not present
Initially 0
Address Subdivision
This cache holds 1024 words, or 4 KiB.

The tag from the cache is compared against the upper portion of the address to determine whether the entry in the cache corresponds to the requested address.

Because the cache has 2^10 (or 1024) words and a block size of one word, 10 bits are used to index the cache, leaving 32 − 10 − 2 = 20 bits to be compared against the tag.

The size of the tag field follows from N = K + M + 2 bits:
N = length of the virtual address
K bits: tag field in each cache entry
M bits: middle of the address that points to one cache entry (the index)
2 bits (byte offset): not used to address data in the cache

If the tag and upper 20 bits of the address are equal and the valid bit is on, then the request hits in the cache, and the word is supplied to the processor. Otherwise, a miss occurs.
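
A small C sketch of this address split for the 4 KiB, one-word-block cache (the helper names are my own):

    #include <stdint.h>

    #define OFFSET_BITS 2    /* byte offset within a 4-byte word (not used to index data) */
    #define INDEX_BITS  10   /* 2^10 = 1024 one-word blocks */

    static inline uint32_t cache_index(uint32_t addr) {
        return (addr >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1);
    }

    static inline uint32_t cache_tag(uint32_t addr) {
        return addr >> (OFFSET_BITS + INDEX_BITS);   /* upper 32 - 10 - 2 = 20 bits */
    }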
Initial state of the cache after power-on
8 blocks, 1 word/block, direct mapped

The cache is initially empty, with all valid bits (the V entry in the cache) turned off (N). Since the cache is empty, several of the first references are misses.

Index  V  Tag  Data
000    N
001    N
010    N
011    N
100    N
101    N
110    N
111    N
After handling a miss of address (10110two)
The processor requests the following addresses: 10110two (miss), 11010two (miss), 10110two (hit), 11010two (hit), 10000two (miss), 00011two (miss), 10000two (hit), 10010two (miss), and 10000two (hit).

Word addr  Binary addr  Hit/miss  Cache block
22         10 110       Miss      110

Index  V  Tag  Data
000    N
001    N
010    N
011    N
100    N
101    N
110    Y  10   Mem[10110]
111    N
After handling a miss of address (11010two)
The cache contents are shown after each miss in the sequence has been handled. When address 10010two (18) is referenced, the entry for address 11010two (26) must be replaced, and a reference to 11010two will cause a subsequent miss. The tag field will contain only the upper portion of the address. The full address of a word contained in cache block i with tag field j for this cache is j × 8 + i, or equivalently the concatenation of the tag field j and the index i. For example, in the final cache state shown later, index 010two has tag 10two and corresponds to address 10010two.

Word addr  Binary addr  Hit/miss  Cache block
26         11 010       Miss      010

Index  V  Tag  Data
000    N
001    N
010    Y  11   Mem[11010]
011    N
100    N
101    N
110    Y  10   Mem[10110]
111    N
After hits of addresses (10110two) and (11010two)
The cache state is unchanged by these hits.

Word addr  Binary addr  Hit/miss  Cache block
22         10 110       Hit       110
26         11 010       Hit       010

Index  V  Tag  Data
000    N
001    N
010    Y  11   Mem[11010]
011    N
100    N
101    N
110    Y  10   Mem[10110]
111    N
After handling misses of addresses (10000two) and (00011two)

Word addr  Binary addr  Hit/miss  Cache block
16         10 000       Miss      000
3          00 011       Miss      011
16         10 000       Hit       000

Index  V  Tag  Data
000    Y  10   Mem[10000]
001    N
010    Y  11   Mem[11010]
011    Y  00   Mem[00011]
100    N
101    N
110    Y  10   Mem[10110]
111    N
After handling a miss of address (10010two)

Word addr  Binary addr  Hit/miss  Cache block
18         10 010       Miss      010

Index  V  Tag  Data
000    Y  10   Mem[10000]
001    N
010    Y  10   Mem[10010]
011    Y  00   Mem[00011]
100    N
101    N
110    Y  10   Mem[10110]
111    N
Summary of the Reference Sequence

Decimal address  Binary address  Hit or miss   Assigned cache block (where found or placed)
22               10110two        miss (5.6b)   (10110two mod 8) = 110two
26               11010two        miss (5.6c)   (11010two mod 8) = 010two
22               10110two        hit           (10110two mod 8) = 110two
26               11010two        hit           (11010two mod 8) = 010two
16               10000two        miss (5.6d)   (10000two mod 8) = 000two
3                00011two        miss (5.6e)   (00011two mod 8) = 011two
16               10000two        hit           (10000two mod 8) = 000two
18               10010two        miss (5.6f)   (10010two mod 8) = 010two
16               10000two        hit           (10000two mod 8) = 000two
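
As a cross-check, a short C simulation (my own sketch, not from the slides) of the 8-block direct-mapped cache reproduces the hit/miss column above:

    #include <stdio.h>
    #include <stdbool.h>

    int main(void) {
        bool valid[8] = {false};
        unsigned tag[8] = {0};
        unsigned seq[] = {22, 26, 22, 26, 16, 3, 16, 18, 16};  /* word addresses */
        for (int i = 0; i < 9; i++) {
            unsigned index = seq[i] % 8;  /* low-order 3 bits          */
            unsigned t     = seq[i] / 8;  /* remaining high-order bits */
            bool hit = valid[index] && tag[index] == t;
            if (!hit) { valid[index] = true; tag[index] = t; }  /* fill on miss */
            printf("addr %2u -> index %u: %s\n", seq[i], index, hit ? "hit" : "miss");
        }
        return 0;
    }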
Block Size Considerations
Larger blocks should reduce miss rate
Due to spatial locality

But in a fixed-sized cache
Larger blocks → fewer of them
More competition → increased miss rate
Larger blocks → pollution

Larger miss penalty
Can override benefit of reduced miss rate
Early restart and critical-word-first can help
Cache Misses
A request for data from the cache that cannot be filled because
the data is not present in the cache.
On cache hit, CPU proceeds normally
On cache miss
Stall the CPU pipeline
Fetch block from next level of hierarchy
Instruction cache miss
Restart instruction fetch
Data cache miss
Complete data access
Write Policy
Write-through
Update both upper and lower levels
Simplifies replacement, but may require write buffer
Write-back
Update upper level only
Update lower level when block is replaced
Need to keep more state
Virtual memory
Only write-back is feasible, given disk write latency
Write-Through
A scheme in which writes always update both the cache and the next lower level of the memory hierarchy, ensuring that data is always consistent between the two.
On data-write hit, could just update the block in cache
But then cache and memory would be inconsistent
Write through: also update memory
But makes writes take longer
Solution: write buffer
Holds data waiting to be written to memory
CPU continues immediately
Only stalls on write if write buffer is already full
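
A minimal write-through-plus-write-buffer sketch in C (the 4-entry buffer and toy memory are my illustrative assumptions; the cache update itself is elided):

    #define BUF_SIZE 4
    typedef struct { unsigned addr, data; } WriteReq;
    static WriteReq buf[BUF_SIZE];
    static int head = 0, count = 0;
    static unsigned memory[1024];   /* toy word-addressed backing memory */

    static void drain_one(void) {   /* retire the oldest buffered write */
        WriteReq r = buf[head];
        memory[r.addr % 1024] = r.data;
        head = (head + 1) % BUF_SIZE;
        count--;
    }

    static void cpu_write(unsigned addr, unsigned data) {
        /* write-through: update the cache block here (elided), then buffer
           the memory update so the CPU need not wait for slow memory */
        while (count == BUF_SIZE)   /* stall only if the buffer is full */
            drain_one();
        buf[(head + count) % BUF_SIZE] = (WriteReq){addr, data};
        count++;                    /* CPU continues immediately */
    }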
Write-Back
Alternative: On data-write hit, just update the block in cache
Keep track of whether each block is dirty

When a dirty block is replaced
Write it back to memory
Can use a write buffer to allow the replacing block to be read first
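
A sketch of the dirty-bit bookkeeping on replacement (the block layout and memory model are my own illustrative assumptions):

    #include <stdbool.h>
    #include <string.h>

    #define WORDS_PER_BLOCK 4
    typedef struct {
        bool valid, dirty;
        unsigned tag;
        unsigned data[WORDS_PER_BLOCK];
    } CacheBlock;

    static unsigned memory[1 << 16];   /* toy word-addressed backing memory */

    /* Replace the victim in blk: write it back only if dirty, then fetch the
       new block. old_base/new_base are word addresses of each block's first word. */
    static void replace_block(CacheBlock *blk, unsigned old_base, unsigned new_base) {
        if (blk->valid && blk->dirty)
            memcpy(&memory[old_base], blk->data, sizeof blk->data);  /* write-back */
        memcpy(blk->data, &memory[new_base], sizeof blk->data);      /* fetch */
        blk->tag   = new_base / WORDS_PER_BLOCK;  /* simplistic block-number tag */
        blk->valid = true;
        blk->dirty = false;   /* clean until the CPU writes it again */
    }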
Write Allocation
What should happen on a write miss?
Alternatives for write-through
Allocate on miss: fetch the block
Write around: don't fetch the block
Since programs often write a whole block before reading it (e.g., initialization)
For write-back
Usually fetch the block
Associative Caches
Fully associative
A cache structure in which a block can be placed in any location in the cache

Set associative
A cache that has a fixed number of locations (at least two) where each block can be
placed.

Direct Mapped - There is a direct mapping from any block address in memory to a single
location in the upper level of the hierarchy.
Associative Cache Example
In direct-mapped placement, there is only one cache block where memory block 12 can be found, and that block is given by (12 modulo 8) = 4.

In a two-way set-associative cache, there would be four sets, and memory block 12 must be in set (12 modulo 4) = 0; the memory block could be in either element of the set.

In a fully associative placement, the memory block for block address 12 can appear in any of the eight cache blocks.
Spectrum of Associativity
For a cache with 8 entries
The total size of the cache in blocks is equal to the number of sets times the associativity.

Thus, for a fixed cache size, increasing the associativity decreases the number of sets while increasing the number of elements per set. With eight blocks, an eight-way set-associative cache is the same as a fully associative cache.
Associativity Example
Compare 4-block caches: direct mapped, 2-way set associative, fully associative
Block access sequence: 0, 8, 0, 6, 8

Block address  Cache block
0              (0 modulo 4) = 0
6              (6 modulo 4) = 2
8              (8 modulo 4) = 0

Direct mapped

Block address  Cache index  Hit/miss  Cache content after access (indexes 0..3)
0              0            miss      [0]=Mem[0]
8              0            miss      [0]=Mem[8]
0              0            miss      [0]=Mem[0]
6              2            miss      [0]=Mem[0], [2]=Mem[6]
8              0            miss      [0]=Mem[8], [2]=Mem[6]
Associativity Example
2-way set associative

Block address  Cache set
0              (0 modulo 2) = 0
6              (6 modulo 2) = 0
8              (8 modulo 2) = 0

Block address  Cache index  Hit/miss  Set 0 content after access
0              0            miss      Mem[0]
8              0            miss      Mem[0], Mem[8]
0              0            hit       Mem[0], Mem[8]
6              0            miss      Mem[0], Mem[6]
8              0            miss      Mem[8], Mem[6]
(Set 1 remains empty.)

Fully associative

Block address  Hit/miss  Cache content after access
0              miss      Mem[0]
8              miss      Mem[0], Mem[8]
0              hit       Mem[0], Mem[8]
6              miss      Mem[0], Mem[8], Mem[6]
8              hit       Mem[0], Mem[8], Mem[6]
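
For the 2-way case, a compact C simulation (my own sketch; it assumes LRU replacement within each set, consistent with the table above):

    #include <stdio.h>

    int main(void) {
        int valid[2][2] = {{0}}, tag[2][2] = {{0}};
        int lru[2] = {0};                    /* lru[s]: which way to evict next */
        int seq[] = {0, 8, 0, 6, 8};
        for (int i = 0; i < 5; i++) {
            int set = seq[i] % 2, t = seq[i] / 2;
            int way = -1;
            for (int w = 0; w < 2; w++)      /* search both ways of the set */
                if (valid[set][w] && tag[set][w] == t) way = w;
            int hit = (way >= 0);
            if (!hit) {                      /* miss: fill the LRU way */
                way = lru[set];
                valid[set][way] = 1;
                tag[set][way] = t;
            }
            lru[set] = 1 - way;              /* the other way is now LRU */
            printf("block %d -> set %d: %s\n", seq[i], set, hit ? "hit" : "miss");
        }
        return 0;
    }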
Set Associative Cache Organization
The comparators determine which element of the selected set (if any) matches the tag. The output of the comparators is used to select the data from one of the four blocks of the indexed set, using a multiplexor with a decoded select signal.

In some implementations, the Output enable signals on the data portions of the cache RAMs can be used to select the entry in the set that drives the output. The Output enable signal comes from the comparators, causing the element that matches to drive the data outputs. This organization eliminates the need for the multiplexor.
Block Placement
Determined by associativity
Direct mapped (1-way associative)
One choice for placement
n-way set associative
n choices within a set
Fully associative
Any location
Higher associativity reduces miss rate
Increases complexity, cost, and access time
Finding a Block

Associativity          Location method                        Tag comparisons
Direct mapped          Index                                  1
n-way set associative  Set index, then search entries         n
                       within the set
Fully associative      Search all entries                     #entries
                       Full lookup table                      0

Hardware caches
Reduce comparisons to reduce cost
Virtual memory
Full table lookup makes full associativity feasible
Benefit in reduced miss rate
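
The "search entries within the set" row corresponds to a loop like this C sketch (a software illustration; real hardware performs the n comparisons in parallel, one comparator per way):

    /* Compare the request tag against every way of the selected set:
       n tag comparisons for an n-way set-associative cache. */
    int find_way(const unsigned tags[], const int valid[], int n, unsigned req_tag) {
        for (int w = 0; w < n; w++)
            if (valid[w] && tags[w] == req_tag)
                return w;      /* hit: index of the matching way */
        return -1;             /* miss */
    }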
Cache Design Trade-offs

Design change           Effect on miss rate         Negative performance effect
Increase cache size     Decrease capacity misses    May increase access time
Increase associativity  Decrease conflict misses    May increase access time
Increase block size     Decrease compulsory misses  Increases miss penalty. For very
                                                    large block size, may increase
                                                    miss rate due to pollution.

Terminology:
Compulsory miss (aka cold-start miss): first time a block is accessed
Capacity miss: replaced block is later accessed again; occurs due to finite cache size
Conflict miss (aka collision miss): due to competition for entries in a set; would not occur in a fully associative cache of the same size
Interface Signals

CPU ↔ Cache              Cache ↔ Memory
Read/Write               Read/Write
Valid                    Valid
Address (32 bits)        Address (32 bits)
Write Data (32 bits)     Write Data (128 bits)
Read Data (32 bits)      Read Data (128 bits)
Ready                    Ready

Multiple cycles per access