Cache Overview
- vardhamana.hegde@wipro.com
Agenda
• Introduction
• Cache structure
• Cache Organization
• Principle of locality
• Hit and Miss!
• Block placement
• Block identification
• Block replacement
• Interaction policies with main memory
• Cache coherency
• MESI protocol
• Some terms
• Benefits of larger cache – the Xeon case
• Comparing Intel Processors
• Cache in AMD64
• References
Introduction
• Pronounced as – “cash”
• It is also a memory!
• Contains the most recently accessed pieces of main memory
• Slower memories are the bottleneck in processors
• Benefits?
– For a typical desktop application on a Pentium with a 16 Kbyte cache, the cache contains about 90% of the addresses requested by the processor!
• Basic Model
– Block Placement
• Where should a block be placed in a cache?
– Block Identification
• How is a block found if it is in the cache?
– Block Replacement
• Which block should be replaced on a cache miss?
– Interaction Policies with Main Memory
• What happens on reads and writes to the cache?
• In reality? The memory hierarchy:
– Registers (FF)
– Cache (SRAM)
– Main Memory (DRAM)
– Virtual Memory (Disk)
– Storage (Disk/Tape)
– Moving down the hierarchy, access time increases and cost per byte decreases: the top levels are faster/smaller/costlier/power hungry/less dense, the bottom levels bigger/slower/cheaper/less power/denser
Cache Structure
• Cache Page
– “equal” pieces of main memory
– Size is dependent on cache size
• Cache Line
– Smaller pieces of a cache page
– Size is determined by both processor design and cache design
– In the Pentium, a cache line is 32 bytes
Principle of Locality
• Temporal Locality
– Recently accessed items are likely to be accessed in the near future
• Spatial Locality
– Items whose addresses are near one another tend to be referenced close together in time
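A minimal C sketch of both kinds of locality (the array size and stride are illustrative): the first loop reuses sum on every iteration (temporal) and walks consecutive addresses (spatial); the strided loop touches a new cache line on nearly every access.

    #include <stdio.h>

    #define N (1 << 20)

    static int data[N];

    int main(void) {
        long sum = 0;

        /* Spatial + temporal locality: consecutive elements share a
           cache line, and sum is reused on every iteration. */
        for (int i = 0; i < N; i++)
            sum += data[i];

        /* Poor spatial locality: a stride of 16 ints (64 bytes) lands
           on a new cache line on nearly every access. */
        for (int s = 0; s < 16; s++)
            for (int i = s; i < N; i += 16)
                sum += data[i];

        printf("%ld\n", sum);
        return 0;
    }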
Hit and Miss!
• Cache Miss – a memory access where the cache does not contain the information requested
Block Placement
• Fully Associative
– A block of data can be placed anywhere in the cache
– Main memory and cache memory are both divided into lines of equal size
– Provides the best performance – store the line anywhere!
– Complexity in implementing
• Determining whether the data is present or not
• Need to compare the address against the tag RAM (done in parallel) within the timing requirements
– Hence used in caches of smaller size – typically less than 4K
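A software sketch of a fully associative lookup (the structure and sizes below are illustrative, not any particular processor's): hardware compares every tag in parallel, and the sequential scan shows why that comparator cost grows with cache size.

    #include <stdint.h>
    #include <stdbool.h>

    #define NUM_LINES  128         /* illustrative size */
    #define LINE_BYTES 32          /* Pentium-style 32-byte line */

    struct line {
        bool     valid;
        uint32_t tag;              /* full block number: addr / LINE_BYTES */
        uint8_t  data[LINE_BYTES];
    };

    static struct line cache[NUM_LINES];

    /* Returns the matching line, or NULL on a miss. Hardware performs
       all NUM_LINES tag comparisons in parallel. */
    struct line *fa_lookup(uint32_t addr) {
        uint32_t tag = addr / LINE_BYTES;
        for (int i = 0; i < NUM_LINES; i++)
            if (cache[i].valid && cache[i].tag == tag)
                return &cache[i];
        return NULL;
    }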
• Direct Mapped
– A block of data can be placed in exactly one location in the cache
– Equivalent to one-way set associative
– Main memory = n × (cache size): memory is divided into n pages, each the size of the cache
– Least complex
– Less flexible for jump-type instructions – lower performance
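The mapping is simple arithmetic; a sketch in C with assumed parameters (16 Kbyte cache, 32-byte lines, hence 512 lines):

    #include <stdint.h>

    #define LINE_BYTES  32
    #define CACHE_BYTES (16 * 1024)
    #define NUM_LINES   (CACHE_BYTES / LINE_BYTES)   /* 512 lines */

    /* A block can live in exactly one line: its block number
       modulo the number of lines. */
    uint32_t dm_index(uint32_t addr) {
        uint32_t block = addr / LINE_BYTES;
        return block % NUM_LINES;
    }

    /* The tag stored with the line is the rest of the block number. */
    uint32_t dm_tag(uint32_t addr) {
        return (addr / LINE_BYTES) / NUM_LINES;
    }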
• Set Associative
– A block of data can be placed in a restricted “set” of places in the cache
– A combination of the “Fully Associative” and “Direct Mapped” schemes
– Cache is divided into equal “cache ways”
– Cache Page = Cache Way
– Each Cache Way is Direct Mapped
Block Identification
• Directory
– Address tags – checked to match the block address
• Checked in parallel for speed of operation
• Address from CPU = f (block offset, index, tag)
– Control bits – indicate that the content of a block of data is valid
• Address layout: | Tag | Index | Block Offset | – Tag and Index together form the Block Address
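A sketch of that address split for a set-associative directory check, with assumed parameters (4 ways, 128 sets, 32-byte lines): the index selects a set, the tags of all ways in the set are compared (in parallel in hardware), and the offset picks the byte within the line.

    #include <stdint.h>
    #include <stdbool.h>

    #define WAYS       4
    #define NUM_SETS   128          /* illustrative: 16 KB / (4 * 32 B) */
    #define LINE_BYTES 32

    struct way { bool valid; uint32_t tag; uint8_t data[LINE_BYTES]; };
    static struct way sets[NUM_SETS][WAYS];

    /* addr = | tag | index (7 bits) | offset (5 bits) | */
    bool sa_hit(uint32_t addr, uint8_t *out) {
        uint32_t offset = addr % LINE_BYTES;
        uint32_t index  = (addr / LINE_BYTES) % NUM_SETS;
        uint32_t tag    = (addr / LINE_BYTES) / NUM_SETS;

        for (int w = 0; w < WAYS; w++) {   /* parallel in hardware */
            if (sets[index][w].valid && sets[index][w].tag == tag) {
                *out = sets[index][w].data[offset];
                return true;
            }
        }
        return false;
    }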
Interaction Policies with Main Memory
• Basic Algorithm
• Read policies
– Look Aside
• Less expensive
• Better response to a cache miss
• Processor cannot access the cache when another bus master is accessing the main memory
– Look Through (see the sketch after this list)
• More complex
• Processor runs out of the cache, leaving the memory bus free
• Memory access on a cache miss is slower
• Good when
– cache hits are higher, and
– there are other bus masters
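A minimal direct-mapped sketch of the look-through read path (the sizes and the flat memory array are assumptions of the model): the cache is consulted first, and main memory is touched only on a miss.

    #include <stdint.h>
    #include <string.h>

    #define LINE_BYTES 32
    #define NUM_LINES  512

    struct line { int valid; uint32_t tag; uint8_t data[LINE_BYTES]; };
    static struct line cache[NUM_LINES];
    static uint8_t memory[1 << 20];   /* stand-in main memory */

    /* Look-through: the cache is checked first; main memory is read
       only on a miss, and the fetched line fills the cache.
       addr is assumed to fall inside the stand-in memory array. */
    uint8_t read_byte(uint32_t addr) {
        uint32_t block = addr / LINE_BYTES;
        struct line *l = &cache[block % NUM_LINES];
        if (!(l->valid && l->tag == block)) {    /* miss: slow path */
            memcpy(l->data, &memory[block * LINE_BYTES], LINE_BYTES);
            l->valid = 1;
            l->tag = block;
        }
        return l->data[addr % LINE_BYTES];       /* hit path */
    }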
Cache Coherency
• Other protocols
– MSI – PowerPC 755
– MESI (Illinois) – Pentium class
– MOESI – UltraSPARC and AMD64
• MESI + Owned – the owning cache holds the valid copy and has to supply it to other caches
– Berkeley
– Firefly
– Futurebus+
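A reduced sketch of MESI snooping in C (only the two bus events that change state in this model are shown; write-backs and data supply are left as comments):

    /* Reduced MESI model: how one cache's line state reacts to
       snooped traffic from other bus masters. */
    typedef enum { INVALID, SHARED, EXCLUSIVE, MODIFIED } mesi_t;

    /* Another master reads an address this cache holds. */
    mesi_t snoop_read(mesi_t s) {
        switch (s) {
        case MODIFIED:   /* write back (or supply the data), then demote */
        case EXCLUSIVE:
            return SHARED;
        default:
            return s;    /* SHARED and INVALID are unchanged */
        }
    }

    /* Another master writes an address this cache holds: the local
       copy becomes stale, so it is invalidated (a MODIFIED line is
       written back first). */
    mesi_t snoop_write(mesi_t s) {
        (void)s;
        return INVALID;
    }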
Some Terms
• Cache Hit
– a memory access that finds the data in the cache = f (size, fetch rate, locality of reference)
• Cache Miss
– a memory access where the cache does not contain the information requested
• Main memory
– Physical memory such as RAM and ROM (but not cache memory) that is installed in a particular computer system
• Physical memory
– Actual memory, consisting of main memory and cache
• Virtual Address cache
– High speed buffer between CPU and MMU
– Uses virtual address to decide the presence of data in the cache
– MMU translation can be avoided
– Susceptible to cache aliasing problems
• Snoop
– to watch the address lines of memory transactions to check whether the cache contains the addressed data
• Snarf
– taking the data from the data lines – to update and maintain consistency
• Dirty data
– Data held in cache that is more recent than the copy held in main memory
• Stale Data
– Data held in cache that is out of date because the copy in main memory has been modified but the cache has not
• Flush
– when used with a cache: write the line back if modified, then invalidate it – “flush the cache line”
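A sketch of that operation, assuming a per-line dirty bit and a tag that stores the full block number:

    #include <stdint.h>

    #define LINE_BYTES 32

    struct line { int valid, dirty; uint32_t tag; uint8_t data[LINE_BYTES]; };

    /* Stand-in for a bus write of one full line. */
    static void memory_write_line(uint32_t line_addr, const uint8_t *d) {
        (void)line_addr; (void)d;
    }

    /* "Flush the cache line": write back only if modified, then invalidate. */
    void flush_line(struct line *l) {
        if (l->valid && l->dirty)
            memory_write_line(l->tag * LINE_BYTES, l->data);
        l->valid = 0;
        l->dirty = 0;
    }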
• Write Merging
– Blocks are often larger than a machine “word”
– Writes to words of the same block are merged into one buffer entry, to save write buffer space and memory traffic
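A write-buffer sketch with assumed geometry (4 entries, 32-byte blocks, 4-byte words): a write that falls in a block already buffered is merged into that entry instead of taking a new slot.

    #include <stdint.h>
    #include <string.h>

    #define BLOCK_BYTES 32
    #define BUF_ENTRIES 4

    struct wb_entry {
        int      used;
        uint32_t block_addr;   /* aligned to BLOCK_BYTES */
        uint32_t word_mask;    /* which 4-byte words are valid */
        uint8_t  data[BLOCK_BYTES];
    };
    static struct wb_entry wbuf[BUF_ENTRIES];

    /* Buffer a 4-byte write; merge into an existing entry when the
       word falls in a block that is already waiting to be written. */
    int buffer_write(uint32_t addr, uint32_t value) {
        uint32_t block = addr & ~(uint32_t)(BLOCK_BYTES - 1);
        uint32_t word  = (addr % BLOCK_BYTES) / 4;

        for (int i = 0; i < BUF_ENTRIES; i++) {
            if (wbuf[i].used && wbuf[i].block_addr == block) {
                memcpy(&wbuf[i].data[word * 4], &value, 4);
                wbuf[i].word_mask |= 1u << word;   /* merged: no new slot */
                return 1;
            }
        }
        for (int i = 0; i < BUF_ENTRIES; i++) {
            if (!wbuf[i].used) {
                wbuf[i].used = 1;
                wbuf[i].block_addr = block;
                wbuf[i].word_mask = 1u << word;
                memcpy(&wbuf[i].data[word * 4], &value, 4);
                return 1;
            }
        }
        return 0;   /* buffer full: the processor must stall */
    }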
• Prefetch
– An increase in line size increases conflict misses but reduces compulsory misses – a tradeoff?
– A prefetch buffer holds an additional line beyond the one most recently fetched
– Dedicated instructions exist for prefetch
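A sequential-prefetch sketch with an assumed one-entry prefetch buffer: on a demand fetch of line n, line n + 1 is fetched as well, so a sequential access pattern finds it ready.

    #include <stdint.h>

    #define LINE_BYTES 32

    /* Stand-in for starting a line fetch on the memory bus. */
    static void memory_fetch_line(uint32_t line_addr) { (void)line_addr; }

    static uint32_t pf_next;   /* one-entry prefetch buffer (line address) */
    static int      pf_valid;

    /* On a demand fetch of line n, also fetch line n + 1 into the
       prefetch buffer so a sequential pattern finds it ready. */
    void demand_fetch(uint32_t addr) {
        uint32_t line = addr & ~(uint32_t)(LINE_BYTES - 1);

        if (!(pf_valid && pf_next == line))   /* not already prefetched */
            memory_fetch_line(line);

        memory_fetch_line(line + LINE_BYTES); /* prefetch the next line */
        pf_next = line + LINE_BYTES;
        pf_valid = 1;
    }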
• Non-Blocking Cache
– Used with superscalar processors
– No need to wait if one of the pipelines faces a miss
– Interdependencies must be maintained correctly
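A sketch of the bookkeeping this needs (the miss table below, often called MSHRs, is an assumption of the model): outstanding misses are recorded so independent accesses can proceed, and the processor stalls only when the table fills or an access depends on an in-flight miss.

    #include <stdint.h>

    #define MAX_OUTSTANDING 4

    /* One record per in-flight miss. */
    static uint32_t pending[MAX_OUTSTANDING];
    static int      npending;

    /* A dependent access must wait while its block is in flight. */
    int miss_in_flight(uint32_t block) {
        for (int i = 0; i < npending; i++)
            if (pending[i] == block)
                return 1;
        return 0;
    }

    /* Record a miss and keep the pipeline running; returns 0 when the
       table is full and the processor must stall after all. */
    int start_miss(uint32_t block) {
        if (npending == MAX_OUTSTANDING)
            return 0;
        pending[npending++] = block;
        return 1;
    }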
• Synonym Problem
– Virtual memory maps multiple logical locations to the same physical memory
– Different logical copies of the same data can sit in the cache at the same time
– The CPU gets unexpected values
– An invalidation can hit one alias while a stale copy survives
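The aliasing arithmetic, under assumed sizes (4 Kbyte pages, an 8 Kbyte direct-mapped virtually indexed cache, 32-byte lines; the two virtual addresses are illustrative): two virtual addresses with the same page offset can still land on different cache lines.

    #include <stdio.h>
    #include <stdint.h>

    #define PAGE_BYTES  4096
    #define CACHE_BYTES 8192   /* larger than a page: aliasing risk */
    #define LINE_BYTES  32

    int main(void) {
        /* Two virtual pages mapped to the same physical page. */
        uint32_t va1 = 0x00010000, va2 = 0x00023000;

        uint32_t line1 = (va1 / LINE_BYTES) % (CACHE_BYTES / LINE_BYTES);
        uint32_t line2 = (va2 / LINE_BYTES) % (CACHE_BYTES / LINE_BYTES);

        /* Different indexes => two cached copies of one physical byte. */
        printf("synonym lines: %u vs %u\n", line1, line2);
        return 0;
    }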
References
- An Overview of Cache – http://www.intel.com/design/intarch/papers/cache6.htm
- Memory Hierarchy Design – http://www.cs.iastate.edu/~prabhu/Tutorial/CACHE/mem_title.html
- Memory Hierarchy in Cache Based Systems – http://www.sun.com/blueprints/1102/817-0742.pdf
- Cache coherency issues for real time multiprocessing – http://www.embedded.com/97/feat9702.htm