CEN468 Lab 3 V2
Lab Experiment 3
Building a Cache Memory with Quartus II and Altera DE2
Board
Prepared By:
Eng. Maha Yaghi & Eng. Gasm El Bary
Objectives
1. Understand the principles of cache memory and its role in computer architecture.
2. Implement a basic cache memory system with read, write, and replacement functionality in VHDL.
3. Design and analyze the Least Recently Used (LRU) replacement policy for managing cache data.
4. Create a multi-level cache hierarchy with L1 and L2 caches connected to main memory.
5. Test and validate the cache system on the Altera DE2 FPGA board, observing cache hits, misses, and data retrieval
patterns.
Required Equipment
(a) Altera DE2 Board [1]
(b) Altera Quartus II Software [2]
Introduction
In this lab, we will explore the essential concepts and functions of cache memory by designing and testing cache systems on
the Altera DE2 FPGA board. Cache memory is a critical component in computer architecture, significantly enhancing the
speed and efficiency of data access by temporarily storing frequently used data close to the processor. By implementing a
two-level cache hierarchy, we will observe how data is managed between L1, L2, and main memory to optimize performance
in data retrieval and processing tasks.
The DE2 board features the Cyclone II FPGA chip together with numerous peripherals such as
switches, LEDs, and displays, enabling us to simulate and visualize the behavior of different cache levels in real time.
The DE2 board allows for the testing of cache hit and miss scenarios, the impact of replacement policies, and the efficiency
of cache hierarchy.
1 Cache Memory Overview
Cache memory operates as a high-speed intermediary between the CPU and main memory. It holds copies of frequently
accessed data, allowing the CPU to retrieve this information faster than if it had to access the main memory directly.
Most systems implement multiple levels of cache:
• L1 Cache: The primary cache, closest to the CPU, offering the fastest access speeds. L1 cache has limited capacity,
storing only essential data.
• L2 Cache: A secondary cache with a larger capacity than L1 but with slightly slower access. It serves as a backup,
holding data that may not be available in L1.
When the CPU requests data, it first checks the L1 cache, followed by L2 if the data is not found in L1. If the
data is absent in both caches (a cache miss), it is fetched from main memory. Replacement policies, such as the Least
Recently Used (LRU) policy, help determine which data to retain or replace when the cache reaches capacity, ensuring
that frequently accessed data remains in the cache for quick retrieval.
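The lookup order described above can be written down as a priority check. The following VHDL fragment is an illustrative sketch only; the signal names (idx, L1_cache, L2_valid, main_memory, and so on) are assumptions for this sketch, not part of the lab code yet:

```vhdl
-- Illustrative lookup priority: L1 first, then L2, then main memory.
-- idx is assumed to be the integer index decoded from the address.
if L1_valid(idx) = '1' then
    Data_out <= L1_cache(idx);          -- L1 hit: fastest path
elsif L2_valid(idx) = '1' then
    Data_out <= L2_cache(idx);          -- L2 hit: also promote the line to L1
    L1_cache(idx) <= L2_cache(idx);
    L1_valid(idx) <= '1';
else
    Data_out <= main_memory(idx);       -- Miss in both caches: fetch from memory
    L1_cache(idx) <= main_memory(idx);  -- Fill both levels on the way back
    L2_cache(idx) <= main_memory(idx);
    L1_valid(idx) <= '1';
    L2_valid(idx) <= '1';
end if;
```

Exercise 3 implements this flow in full; the point here is only the priority order of the three checks.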
This lab is organized into three exercises, each focusing on different aspects of cache memory:
• Exercise 1: Direct-Mapped Cache - In this exercise, we will implement a simple cache with read and write operations to
observe the fundamental behavior of a cache. This exercise will demonstrate basic cache data storage and retrieval.
• Exercise 2: LRU Replacement Policy - The second exercise introduces the Least Recently Used (LRU)
replacement policy in a two-line cache. Here, we will explore how the LRU policy manages data when the cache
reaches full capacity.
• Exercise 3: L1 and L2 Cache with Memory Access - In the final exercise, we will implement a dual-level
cache system with both L1 and L2 caches, along with access to main memory. This exercise allows us to simulate
a realistic cache hierarchy, observe cache misses, and validate the memory access flow.
Through these exercises, students will gain practical insights into cache operation, replacement policies, and the role
of cache hierarchy in minimizing data access times. This lab provides a hands-on understanding of cache memory as a
foundational concept in modern computer architecture, reinforcing the importance of caching in achieving efficient data
management and system performance.
Figure 3: Default view of Quartus II
2. On the first page of the Wizard, specify the working directory of the project, then select a name. Note that the
selected name must match the top-level design entity name (Example: lab3test).
3. On the second page, the Wizard asks for any source files from other designs and projects that should be imported
into the current project. Since no such files have to be included for this experiment, click Next.
4. The third page asks for the target device on which the circuit will be synthesized. Check the Altera chip
on the board and note down the device family and number. In the Device family option, scroll to the
device family of your board. Then, in the Available devices list, scroll to the device number as shown in Figure 5.
Click Finish to create the project.
5. To open a VHDL design file, select File from the main menu, then New → VHDL File as shown in Figure 6.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity lab3test is
port (
    clk      : in  std_logic;
    enable   : in  std_logic;
    read     : in  std_logic;
    write    : in  std_logic;
    address  : in  std_logic_vector(1 downto 0);
    data_in  : in  std_logic_vector(3 downto 0);
    data_out : out std_logic_vector(3 downto 0);
    hit      : out std_logic
);
end lab3test;
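The process in the next listing refers to cache and valid signals that must be declared in the architecture body. A minimal sketch of those declarations, assuming a four-line cache of 4-bit words directly indexed by the 2-bit address (the type name and initial values here are assumptions chosen to match the process code):

```vhdl
architecture Behavioral of lab3test is
    -- Four cache lines of 4-bit data, indexed directly by the 2-bit address
    type cache_array is array (0 to 3) of std_logic_vector(3 downto 0);
    signal cache : cache_array := (others => (others => '0'));
    -- One valid bit per line; all lines start invalid
    signal valid : std_logic_vector(3 downto 0) := (others => '0');
begin
    -- read/write process from the next listing goes here
end Behavioral;
```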
process(clk)
begin
    if rising_edge(clk) then
        if enable = '1' then
            if read = '1' then
                if valid(to_integer(unsigned(address))) = '1' then
                    data_out <= cache(to_integer(unsigned(address)));
                    hit <= '1'; -- Cache hit
                else
                    data_out <= (others => 'Z'); -- High impedance for miss
                    hit <= '0'; -- Cache miss
                end if;
            elsif write = '1' then
                cache(to_integer(unsigned(address))) <= data_in;
                valid(to_integer(unsigned(address))) <= '1'; -- Mark as valid
                hit <= '1';
            end if;
        end if;
    end if;
end process;
end Behavioral;
Input/Output    Pin
clk             KEY0
enable          SW0
read            SW2
write           SW3
address[1]      SW5
address[0]      SW4
data_in[3]      SW9
data_in[2]      SW8
data_in[1]      SW7
data_in[0]      SW6
data_out[3]     LEDR9
data_out[2]     LEDR8
data_out[1]     LEDR7
data_out[0]     LEDR6
hit             LEDG5
4 Exercise 2: Implementing Least Recently Used (LRU) Cache Replacement Policy
This exercise focuses on the implementation of a Least Recently Used (LRU) replacement policy in a two-line cache.
LRU is a common replacement strategy where the cache replaces the least recently accessed data when a new entry needs
to be stored in a full cache. This exercise will demonstrate how to track and manage data to keep frequently accessed
entries available for fast retrieval.
Follow the steps below to implement the LRU policy in a two-line fully associative cache. The cache will determine which line
to replace based on recent usage.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity lab3test is
port(
    Clock     : in  std_logic;
    Reset     : in  std_logic;
    Enable    : in  std_logic;
    Read      : in  std_logic;
    Write     : in  std_logic;
    Address   : in  std_logic_vector(1 downto 0);
    Data_in   : in  std_logic_vector(1 downto 0);
    Data_out  : out std_logic_vector(1 downto 0);
    Cache_hit : out std_logic
);
end lab3test;
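The step list that follows refers to cache data, tags, valid bits, and an lru_tracker. One plausible set of architecture declarations for a two-line cache of 2-bit words (all names and initial values here are assumptions chosen to match the steps):

```vhdl
architecture Behavioral of lab3test is
    -- Two cache lines of 2-bit data
    type line_array is array (0 to 1) of std_logic_vector(1 downto 0);
    signal cache_data  : line_array := (others => (others => '0'));
    -- Stored address (tag) for each line
    signal cache_tag   : line_array := (others => (others => '0'));
    -- One valid bit per line
    signal cache_valid : std_logic_vector(1 downto 0) := "00";
    -- '0' => line 0 is least recently used, '1' => line 1 is
    signal lru_tracker : std_logic := '0';
begin
    -- read/write/replacement process from the next step goes here
end Behavioral;
```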
7
4. Create a process for read, write, and LRU replacement operations
Implement a process block with a sensitivity list containing Clock and Reset.
• When Reset is '1', clear all cache data, tags, and valid bits, and set the lru_tracker to '0'.
• On a rising clock edge:
– If Enable is '1':
∗ If Write is '1':
(a) Check if there is an empty (invalid) line in the cache:
· Write data to the first invalid line and store the address as the tag.
· Update cache_valid to mark the line as valid and set the lru_tracker to point to the next line.
(b) If both lines are valid, use lru_tracker to identify the least recently used line:
· Write data to the identified line, update the tag, and toggle the lru_tracker.
∗ If Read is '1':
(a) Check if the requested data is present in either cache line by comparing tags.
(b) If a match is found, output data from the corresponding cache line and set Cache_hit to '1'.
(c) If no match is found, set Cache_hit to '0' to indicate a cache miss.
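The steps above can be sketched as a process body. This is an illustrative sketch, assuming the signal names cache_data, cache_tag, cache_valid, and lru_tracker introduced earlier; it also marks a line as most recently used on a read hit, which is what makes the policy LRU rather than FIFO:

```vhdl
process(Clock, Reset)
begin
    if Reset = '1' then
        cache_valid <= "00";                     -- invalidate both lines
        lru_tracker <= '0';
    elsif rising_edge(Clock) then
        if Enable = '1' then
            if Write = '1' then
                if cache_valid(0) = '0' then     -- line 0 is free
                    cache_data(0)  <= Data_in;
                    cache_tag(0)   <= Address;
                    cache_valid(0) <= '1';
                    lru_tracker    <= '1';       -- point to the next line
                elsif cache_valid(1) = '0' then  -- line 1 is free
                    cache_data(1)  <= Data_in;
                    cache_tag(1)   <= Address;
                    cache_valid(1) <= '1';
                    lru_tracker    <= '0';
                else                             -- both valid: replace LRU line
                    if lru_tracker = '0' then
                        cache_data(0) <= Data_in;
                        cache_tag(0)  <= Address;
                    else
                        cache_data(1) <= Data_in;
                        cache_tag(1)  <= Address;
                    end if;
                    lru_tracker <= not lru_tracker;  -- toggle the tracker
                end if;
            elsif Read = '1' then
                if cache_valid(0) = '1' and cache_tag(0) = Address then
                    Data_out    <= cache_data(0);
                    Cache_hit   <= '1';
                    lru_tracker <= '1';          -- line 1 is now least recently used
                elsif cache_valid(1) = '1' and cache_tag(1) = Address then
                    Data_out    <= cache_data(1);
                    Cache_hit   <= '1';
                    lru_tracker <= '0';          -- line 0 is now least recently used
                else
                    Cache_hit <= '0';            -- cache miss
                end if;
            end if;
        end if;
    end if;
end process;
```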
end process;
end Behavioral;
Input/Output    Pin
Clock           KEY0
Reset           SW0
Enable          SW1
Read            SW2
Write           SW3
Address[1]      SW5
Address[0]      SW4
Data_in[1]      SW9
Data_in[0]      SW8
Data_out[1]     LEDR9
Data_out[0]     LEDR8
Cache_hit       LEDG7
ii. Set Data_in = "01" (SW9 low, SW8 high).
iii. Press Clock to perform the write.
Expected Outcome: Since both cache lines are occupied, the least recently used line (the second cache line in
this case) should be replaced with the new data. Data_in ("01") should be written to the second cache line,
replacing the old data, and the lru_tracker should update.
(f) Test Cache Miss and LRU Replacement with Read Operation
i. Set Read = 1 and Address = "11" (SW5 and SW4 high) to read from an address not currently in the
cache.
ii. Press Clock to perform the read.
Expected Outcome: Since the address is not in either cache line, Cache_hit should be off, indicating a
cache miss. After this read, one cache line (determined by the lru_tracker) should be replaced with data
for Address = "11".
5 Exercise 3: Implementing Multi-Level Cache with L1, L2, and Main Memory Access
This exercise extends the cache design to include a multi-level cache system with both L1 and L2 caches, along with
access to main memory. When data is requested, the system first checks L1, then L2, and if the data is not present in
either cache, it is retrieved from main memory. This approach mirrors a real-world memory hierarchy, improving system
performance by maintaining frequently accessed data in faster, smaller caches.
The system in this exercise will simulate the behavior of a multi-level cache with a focus on understanding cache hits
and misses at different levels.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity lab3test is
port(
    Clock         : in  std_logic;
    Reset         : in  std_logic;
    Enable        : in  std_logic;
    Read          : in  std_logic;
    Write         : in  std_logic;
    Address       : in  std_logic_vector(1 downto 0);
    Data_in       : in  std_logic_vector(1 downto 0);
    Data_out      : out std_logic_vector(1 downto 0);
    L1_Cache_hit  : out std_logic;
    L2_Cache_hit  : out std_logic;
    Memory_access : out std_logic
);
end lab3test;
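The code excerpts that follow refer to L1 and L2 cache arrays, valid bits, a main memory, and two helper signals. One plausible set of architecture declarations (the types, names, and memory initialization here are assumptions based on the names used in the excerpts):

```vhdl
architecture Behavioral of lab3test is
    -- Four entries of 2-bit data, indexed directly by the 2-bit address
    type mem_array is array (0 to 3) of std_logic_vector(1 downto 0);
    signal L1_cache, L2_cache : mem_array := (others => (others => '0'));
    signal L1_valid, L2_valid : std_logic_vector(3 downto 0) := (others => '0');
    -- Main memory preloaded with sample data so read misses return visible values
    signal main_memory : mem_array := ("00", "01", "10", "11");
    -- Helper signals used by the read logic in the excerpts
    signal temp_data_out  : std_logic_vector(1 downto 0);
    signal update_L1_next : std_logic := '0';
begin
    -- read/write process containing the excerpts goes here
end Behavioral;
```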
L2_cache(to_integer(unsigned(Address))) <= Data_in;
L1_valid(to_integer(unsigned(Address))) <= '1';
L2_valid(to_integer(unsigned(Address))) <= '1';
L1_Cache_hit <= '0';
L2_Cache_hit <= '0';
Memory_access <= '0';
update_L1_next <= '0';
...
else
    -- Miss in both caches: access main memory
    Data_out <= main_memory(to_integer(unsigned(Address)));
    temp_data_out <= main_memory(to_integer(unsigned(Address)));
    Memory_access <= '1';
    L1_cache(to_integer(unsigned(Address))) <= main_memory(to_integer(unsigned(Address)));
    L2_cache(to_integer(unsigned(Address))) <= main_memory(to_integer(unsigned(Address)));
    L1_valid(to_integer(unsigned(Address))) <= '1';
    L2_valid(to_integer(unsigned(Address))) <= '1';
    L1_Cache_hit <= '0';
    L2_Cache_hit <= '0';
    update_L1_next <= '0';
end if;
end if;
end if;
Input/Output    Pin
Clock           KEY0
Reset           SW0
Enable          SW1
Read            SW2
Write           SW3
Address[1]      SW5
Address[0]      SW4
Data_in[1]      SW9
Data_in[0]      SW8
Data_out[1]     LEDR9
Data_out[0]     LEDR8
L1_Cache_hit    LEDG7
L2_Cache_hit    LEDG6
Memory_access   LEDG5
Expected Outcome: The data should now be in the L1 cache, resulting in an L1 cache hit. The L1_Cache_hit
LED should light up, while the L2_Cache_hit and Memory_access LEDs remain off. Data_out should display "10",
retrieved directly from the L1 cache.
References
[1] Altera DE2 Board. 2018. [Online]. Available: https://www.cl.cam.ac.uk/teaching/1011/ECAD+Arch/background/DE2.html