Slot05 CH04 CacheMemory 35 Slides

Chapter 4 Cache Memory


William Stallings, Computer Organization and Architecture, 9th Edition
Objectives

CLO4: Describe in detail the essential elements of computer organisation, including the internal bus, memory, input/output (I/O) organizations and interfacing standards, and discuss how these elements function.
Objectives (continued)

 How are the internal memory elements of a computer structured?

 After studying this chapter, you should be able to:
 Present an overview of the main characteristics of computer memory systems and the use of a memory hierarchy.
 Describe the basic concepts and intent of cache memory.
 Discuss the key elements of cache design.
 Distinguish among direct mapping, associative mapping, and set-associative mapping.
 Explain the reasons for using multiple levels of cache.
 Understand the performance implications of multiple levels of memory.
Contents

4.1- Computer Memory Systems Overview
4.2- Cache Memory Principles
4.3- Elements of Cache Design
Questions that must be answered:

1. List the characteristics of a component in a computer's memory system.
2. What are the differences among sequential access, direct access, and random access?
3. What is the general relationship among access time, memory cost, and capacity?
4. What is cache memory? (Refer to Figure 4.3.)
5. How does the principle of locality relate to the use of multiple memory levels?
6. Explain the key elements of cache design.
7. Distinguish among direct mapping, associative mapping, and set-associative mapping.
8. A memory system has a single 20-line cache using direct mapping. Which cache line will be used when main memory block 1024 is accessed?
9. For a direct-mapped cache, a main memory address is viewed as consisting of three fields. List and define the three fields. (Refer to the textbook.)
10. For an associative cache, a main memory address is viewed as consisting of two fields. List and define the two fields. (Refer to the textbook.)
4.1- Computer Memory Systems Overview

 Characteristics of Memory Systems
 The Memory Hierarchy

Key Characteristics of Computer Memory Systems (table)
Characteristics of Memory Systems

 Location
 Refers to whether memory is internal or external to the computer
 Internal memory is often equated with main memory
 The processor requires its own local memory, in the form of registers
 Cache is another form of internal memory
 External memory consists of peripheral storage devices that are accessible to the processor via I/O controllers

 Capacity
 Memory capacity is typically expressed in terms of bytes

 Unit of transfer
 For internal memory, the unit of transfer is equal to the number of electrical lines into and out of the memory module
Method of Accessing Units of Data

 Sequential access (tape)
 Memory is organized into units of data called records
 Access must be made in a specific linear sequence
 Access time is variable

 Direct access (disk)
 Involves a shared read-write mechanism
 Individual blocks or records have a unique address based on physical location
 Access time is variable
 More details: next slides

 Random access (main memory)
 Each addressable location in memory has a unique, physically wired-in addressing mechanism
 The time to access a given location is independent of the sequence of prior accesses and is constant
 Any location can be selected at random and directly addressed and accessed
 Main memory and some cache systems are random access

 Associative (cache)
 A word is retrieved based on a portion of its contents rather than its address
 Each location has its own addressing mechanism, and retrieval time is constant independent of location or prior access patterns
 Cache memories may employ associative access
Method of Accessing Units of Data: Direct Access on Disks

 The location of each sector is identified by a unique number
 T1: seek time — time to move the head to the accessed track
 T2: rotational delay — time for the disk to rotate until the head is positioned at the beginning of the accessed sector
 T3: transfer time — time for the disk to rotate past the entire accessed sector
 Access time = T1 + T2 + T3
 Different sectors are thus accessed with different access times (access time is variable)
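The three components above can be combined in a short sketch. The drive parameters below (seek time, rotation speed, sector and track sizes) are illustrative values, not figures from the slides:

```python
def disk_access_time_ms(seek_ms, rpm, sector_bytes, track_bytes):
    """Estimate the average time to access one sector: T1 + T2 + T3."""
    rotation_ms = 60_000 / rpm                      # one full revolution
    t2 = rotation_ms / 2                            # average rotational delay: half a revolution
    t3 = rotation_ms * sector_bytes / track_bytes   # transfer: sector's fraction of a revolution
    return seek_ms + t2 + t3

# Hypothetical drive: 4 ms average seek, 7200 rpm, 512-byte sectors, 500 sectors per track
t = disk_access_time_ms(4.0, 7200, 512, 500 * 512)  # about 8.2 ms
```

Note that T2 dominates here: at 7200 rpm the average half-revolution alone costs more than the seek for short seeks, which is why access time varies with where the data happens to be.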
Method of Accessing Units of Data: Random Access

CPU → address bus → decoder → main memory

 The time to access a given location is independent of the sequence of prior accesses and is constant.
 The electrical signals on the address bus at the decoder's input activate exactly one memory word → the activation time is the same for every word.
Capacity and Performance

 The two most important characteristics of memory are capacity and performance.

 Three performance parameters are used:
 Access time (latency)
 For random-access memory, the time it takes to perform a read or write operation
 For non-random-access memory, the time it takes to position the read-write mechanism at the desired location
 Memory cycle time
 Access time plus any additional time required before a second access can commence
 Additional time may be required for transients to die out on signal lines or to regenerate data if they are read destructively
 Concerned with the system bus, not the processor
 Transfer rate
 The rate at which data can be transferred into or out of a memory unit
 For random-access memory it is equal to 1/(cycle time)
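For random-access memory the transfer-rate bullet reduces to simple arithmetic; a minimal sketch (the 10 ns cycle time and 64-bit bus width are illustrative, not values from the slides):

```python
def transfer_rate_bytes_per_s(cycle_time_ns, bus_width_bytes):
    """Transfer rate for random-access memory: 1/(cycle time), scaled by bus width."""
    accesses_per_s = 1 / (cycle_time_ns * 1e-9)   # 1 / cycle time
    return accesses_per_s * bus_width_bytes

# Hypothetical module: 10 ns cycle time, 8-byte (64-bit) data bus
rate = transfer_rate_bytes_per_s(10, 8)           # 8e8 bytes/s = 800 MB/s
```

Because the formula uses cycle time rather than access time, a memory whose accesses are fast but whose recovery time is long still has a low transfer rate.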
Memory

 The most common forms are:
 Semiconductor memory
 Magnetic surface memory
 Optical
 Magneto-optical

 Several physical characteristics of data storage are important:
 Volatile memory
 Information decays naturally or is lost when electrical power is switched off
 Nonvolatile memory
 Once recorded, information remains without deterioration until deliberately changed
 No electrical power is needed to retain information
 Magnetic-surface memories are nonvolatile
 Semiconductor memory may be either volatile or nonvolatile
 Nonerasable memory
 Cannot be altered, except by destroying the storage unit
 Semiconductor memory of this type is known as read-only memory (ROM)

 For random-access memory, the organization is a key design issue
 Organization refers to the physical arrangement of bits to form words
Memory Hierarchy

 Design constraints on a computer's memory can be summed up by three questions:
 How much (capacity)? How fast (performance)? How expensive (cost)?

 There is a trade-off among capacity, access time, and cost:
 Faster access time → greater cost per bit
 Greater capacity → smaller cost per bit
 Greater capacity → slower access time

 The way out of the memory dilemma is not to rely on a single memory component or technology, but to employ a memory hierarchy: one large, cheap, low-speed memory plus one or more small, fast, more expensive memories (cache memory).
Memory Hierarchy (figure): the hierarchy pyramid — moving up the levels, memory becomes faster and more expensive per bit; moving down, capacity grows and cost per bit falls.
4.2- Cache Memory Principles

 What is cache?
 Cache and Main Memory

What is a Cache?

 Cache: a small, expensive, high-speed memory located between the CPU and RAM (which has a large capacity but is cheaper and slower).
 The CPU does not access main memory directly → an MMU (Memory Management Unit) is needed to transfer content between the caches and main memory.
Cache/Main Memory Structure

 A program in main memory is divided into blocks of the same size. A cache line fits exactly one memory block.
 Each cache line also includes a tag that identifies which particular block is currently being stored.
 (See the slide notes for more explanation of the relationship/mapping between cache and memory.)
4.3- Elements of Cache Design

Overview of cache design parameters
Cache Addresses: Virtual Address

 Virtual memory
 A facility that allows programs to address memory from a logical point of view, without regard to the amount of main memory physically available. Only the small parts of a program that are currently needed are loaded into main memory at a time, so a large program can run even though memory is smaller than the program.
 When virtual memory is used, the address fields of machine instructions contain virtual addresses
 For reads from and writes to main memory, a hardware memory management unit (MMU) translates each virtual address into a physical address in main memory

An instruction <opcode, addr> originally resides in main memory (the addr part is a memory address). The CPU fetches the instruction from the cache, and the instruction's data is also in the cache, so the original addr component must be adjusted by the MMU into an address suitable for the cache (hence the name virtual address).

Logical and Physical Caches (figure)

 Logical (virtual) cache: the MMU has already converted the memory address into a suitable cache address, so the CPU accesses the cache directly. Since the CPU accesses the cache, the address must match the cache's own addressing.
 Physical cache: the MMU has not yet converted the memory address into a suitable cache address, so when the CPU accesses the cache, it must ask the MMU to recompute a suitable address.
Mapping Function: Memory ↔ Cache

 Because there are fewer cache lines than main memory blocks, an algorithm is needed for mapping main memory blocks into cache lines
 A mapping specifies the relationship between cache lines and blocks in main memory
 Three techniques can be used:
 Direct
 The simplest technique
 Maps each block of main memory into only one possible cache line
 Associative
 Permits each main memory block to be loaded into any line of the cache
 The cache control logic interprets a memory address simply as a Tag and a Word field
 To determine whether a block is in the cache, the cache control logic must simultaneously examine every line's Tag for a match
 Set associative
 A compromise that exhibits the strengths of both the direct and associative approaches while reducing their disadvantages
 Read by yourself
Direct Mapping

 Block j is always loaded into line i = j mod m, where m is the number of cache lines.
 Advantage: simple. The Tag field needs only a few bits → high storage efficiency.
 Disadvantage: each block has one fixed cache line; if two frequently used blocks map to the same line, they are continually swapped between the cache and main memory → time-consuming.
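The i = j mod m rule is a single line of code. This sketch also answers question 8 from the earlier slide, under the assumption that blocks and lines are numbered from 0:

```python
def direct_mapped_line(block_number, num_lines):
    """Direct mapping: block j always goes to line i = j mod m."""
    return block_number % num_lines

# Question 8: a 20-line direct-mapped cache, main memory block 1024
line = direct_mapped_line(1024, 20)   # 1024 mod 20 = 4
```

Blocks 4, 24, 44, … all compete for the same line 4, which is exactly the thrashing risk noted above.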
Associative Mapping

 A block in main memory can be loaded into any line of the cache.
 Advantage: when a block must be replaced, only one line is affected → only one line is written out to memory before a new block is loaded into the cache → low replacement cost, and a free choice of victim line.
 Disadvantage: the Tag field needs more bits → lower storage efficiency of the cache.
Set-Associative Mapping

 Cache lines are divided into subsets (sets)
 A block can be loaded into any line of one given set
 A compromise that exhibits the strengths of both the direct and associative approaches while reducing their disadvantages
 Example: 2 lines per set
 2-way set-associative mapping
 A given block can be in one of the 2 lines of exactly one set

 Read the textbook for more information about cache organization
Replacement Algorithms

 How is a line chosen for loading a new block?
 Two situations:
 Cache hit: the accessed address exists in the cache
 Cache miss: the accessed address does not exist in the cache; the memory block containing it must be loaded into the cache

 Once the cache has been filled, when a new block is brought into the cache, one of the existing blocks must be replaced
 For direct mapping there is only one possible line for any particular block, so no choice is possible
 For the associative and set-associative techniques, a replacement algorithm is needed
 To achieve high speed, the algorithm must be implemented in hardware

(See the slide notes for further explanation.)
The four most common replacement algorithms are:

 Least recently used (LRU)
 Most effective
 Replace the block in the set that has been in the cache longest with no reference to it
 Because of its simplicity of implementation, LRU is the most popular replacement algorithm

 First-in-first-out (FIFO)
 Replace the block in the set that has been in the cache longest
 Easily implemented as a round-robin or circular buffer technique

 Least frequently used (LFU)
 Replace the block in the set that has experienced the fewest references
 Could be implemented by associating a counter with each line

 Random
 Pick a victim line at random from among the candidate lines
 Simulation studies show performance only slightly inferior to the usage-based algorithms above
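The LRU policy above can be simulated in a few lines. A sketch for a fully associative cache; the reference string in the example is made up for illustration:

```python
from collections import OrderedDict

def simulate_lru(block_refs, num_lines):
    """Count hits and misses for a fully associative cache with LRU replacement."""
    cache = OrderedDict()              # keys: resident block numbers, in recency order
    hits = misses = 0
    for b in block_refs:
        if b in cache:
            hits += 1
            cache.move_to_end(b)       # b becomes the most recently used
        else:
            misses += 1
            if len(cache) == num_lines:
                cache.popitem(last=False)   # evict the least recently used block
            cache[b] = True
    return hits, misses

# Hypothetical reference string on a 3-line cache
hits, misses = simulate_lru([1, 2, 3, 1, 4, 1, 2], 3)   # -> (2, 5)
```

Swapping `popitem(last=False)` for `popitem()` would evict the most recently used block instead, which makes this a handy testbed for comparing policies on a given reference string.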
Write Policy
(the mechanism for writing cache contents out to memory when a line is replaced)

 When a block that is resident in the cache is to be replaced, there are two cases to consider:
 If the old block in the cache has not been altered, it may be overwritten with a new block without first writing out the old block
 If at least one write operation has been performed on a word in that line of the cache, main memory must be updated by writing the line of cache out to the block of memory before bringing in the new block

 There are two problems to contend with:
 More than one device may have access to main memory
 A more complex problem occurs when multiple processors are attached to the same bus and each processor has its own local cache: if a word is altered in one cache, it could conceivably invalidate a word in other caches
Write Through and Write Back
(how to write efficiently)

 Write through
 All write operations are made to main memory as well as to the cache
 (The CPU updates the data in the cache and delegates updating the data in main memory to the MMU)
 Advantage: the simplest technique
 Disadvantage: it generates substantial (heavy) memory traffic and may create a bottleneck
Write Through and Write Back (continued)

 Write back
 Updates are made only in the cache
 Only the data in the cache is updated, and the line is marked as modified (dirty); the line is written out to its memory block later, at a suitable time
 Advantage: minimizes memory writes
 Disadvantages:
 Portions of main memory are invalid, and hence accesses by I/O modules can be allowed only through the cache
 This makes for complex circuitry and a potential bottleneck
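The dirty-mark bookkeeping of write back can be sketched as a toy model (not a full cache); here `memory` is a plain dict standing in for main memory, and the line is a dict with hypothetical `block`/`data`/`dirty` fields:

```python
def replace_line(line, new_block, memory):
    """Write-back replacement: write the old contents out only if the line is dirty."""
    writes_to_memory = 0
    if line["dirty"]:
        memory[line["block"]] = line["data"]   # deferred write of the modified line
        writes_to_memory = 1
    line["block"], line["data"], line["dirty"] = new_block, None, False
    return writes_to_memory

memory = {}
line = {"block": 7, "data": "modified", "dirty": True}
n = replace_line(line, 9, memory)   # old block 7 is written back before the swap
```

With write through, by contrast, `memory` would already be up to date and `replace_line` would never need to write at all — the traffic is paid earlier, on every store.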
Line Size: Capacity of a Line

 Principle of locality: at any given time, data items that are near each other tend to be accessed together.
 Line size = block size.
 As the line size increases, each miss brings more nearby words into the cache → the hit ratio increases at first → higher performance.
 When the line size becomes too big, the cost of writing lines out to main memory increases, and newly fetched words become less likely to be used than the words they displaced.
 The relationship between block size and hit ratio is complex, depending on the locality characteristics of a particular program, and no definitive optimum value has been found. A size of from 8 to 128 bytes seems reasonably close to optimum.
Multilevel Caches

 As logic density has increased, it has become possible to have a cache on the same chip as the processor
 The on-chip cache reduces the processor's external bus activity, speeds up execution, and increases overall system performance
 When the requested instruction or data is found in the on-chip cache, the bus access is eliminated
 On-chip cache accesses will complete appreciably faster than would even zero-wait-state bus cycles
 During this period the bus is free to support other transfers

 Two-level cache:
 Internal cache designated as level 1 (L1)
 External cache designated as level 2 (L2)

 The potential savings due to the use of an L2 cache depend on the hit rates in both the L1 and L2 caches

 The use of multilevel caches complicates all of the design issues related to caches, including size, replacement algorithm, and write policy
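The "potential savings" point can be made concrete with the usual two-level average-access-time calculation. All latencies and hit rates below are illustrative, not measurements from the textbook:

```python
def avg_access_time(t_l1, t_l2, t_mem, h1, h2):
    """Two-level average access time: AMAT = tL1 + (1 - h1) * (tL2 + (1 - h2) * tmem).
    h2 is the L2 hit rate for references that miss in L1."""
    return t_l1 + (1 - h1) * (t_l2 + (1 - h2) * t_mem)

# Hypothetical: L1 = 1 ns, L2 = 5 ns, memory = 60 ns, h1 = 0.9, h2 = 0.8
t = avg_access_time(1, 5, 60, 0.9, 0.8)   # 1 + 0.1 * (5 + 0.2 * 60) = 2.7 ns
```

Setting h2 = 0 in the same formula shows what the L1-only system would cost (1 + 0.1 × 65 = 7.5 ns here), which is exactly the savings the L2 hit rate buys.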
Hit Ratio (L1 & L2) for 8-Kbyte and 16-Kbyte L1 caches (figure)
Unified Versus Split Caches

 It has become common to split the cache: one cache dedicated to instructions and one dedicated to data (only data needs updates). Both exist at the same level, typically as two L1 caches.

 Advantages of a unified cache:
 Higher hit rate — it balances the load between instruction and data fetches automatically
 Only one cache needs to be designed and implemented

 The trend is toward split caches at L1 and unified caches for higher levels

 Advantages of a split cache:
 Eliminates cache contention between the instruction fetch/decode unit and the execution unit (which accesses data)
 Important in pipelining (where the output of one processing stage is the input of the next)
Summary — Chapter 4: Cache Memory

 Characteristics of Memory Systems
 Location
 Capacity
 Unit of transfer
 Memory Hierarchy
 How much?
 How fast?
 How expensive?
 Cache memory principles
 Elements of cache design
 Cache addresses
 Cache size
 Mapping function
 Replacement algorithms
 Write policy
 Line size
 Number of caches
