This document provides instructions for Assignment 3 of CS 161. It includes questions about analyzing the performance of a MIPS pipeline for different code optimizations including resolving hazards, forwarding, unrolling loops, and out-of-order execution. It also asks about the optimizations performed by the gcc compiler at different optimization levels and measuring the performance and size of compiled executables.
CS 161: Assignment 3
Due at 11:59PM on May 14, 2014
1. Consider the following control and datapath, with the datapath latencies given in Table 1. Assume that the registers and data memory are edge-triggered and that all hardware components can work concurrently when there are no data dependences.
a. To avoid being on the critical path, what is the maximum time to generate each of the following control signals: MemRead, ALUOp, MemWrite, ALUSrc, RegWrite?
b. Given the control signal generation times in Table 2 and the datapath latencies in Table 1, derive the exact time to execute R-type, Load, Store, Branch, and Jump instructions.
RegDst  Jump   Branch  MemRead  MemtoReg  ALUOp  MemWrite  ALUSrc  RegWrite
500ps   500ps  450ps   200ps    450ps     200ps  500ps     100ps   500ps
Table 2: Control signal generation times
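The timing reasoning in parts a and b can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a controlled unit (for example, a mux) produces a valid output once its latest input, data or control, has arrived, plus its own latency, and both arrival times are measured from the same reference point. The function names, the data arrival time, and the mux latency are hypothetical placeholders and are not taken from Table 1; only the ALUSrc value comes from Table 2.

```python
def output_ready_ps(data_arrival_ps, control_arrival_ps, unit_latency_ps):
    """Time at which a controlled unit's output is valid.

    The unit cannot start before its latest input (data or control) arrives.
    """
    return max(data_arrival_ps, control_arrival_ps) + unit_latency_ps


def control_slack_ps(data_arrival_ps, control_arrival_ps):
    """Slack of the control signal relative to the data input.

    A slack of zero or more means the control signal is not on the
    critical path; a negative slack means it delays the unit.
    """
    return data_arrival_ps - control_arrival_ps


# Hypothetical placeholder values, NOT taken from Table 1.
DATA_ARRIVAL_PS = 300
MUX_LATENCY_PS = 20

# ALUSrc generation time from Table 2.
ALUSRC_PS = 100

print(output_ready_ps(DATA_ARRIVAL_PS, ALUSRC_PS, MUX_LATENCY_PS))  # 320
print(control_slack_ps(DATA_ARRIVAL_PS, ALUSRC_PS))                 # 200
```

With these placeholder numbers the control signal arrives 200ps before the data, so it adds nothing to the path. Repeating the calculation with a 500ps signal would give a negative slack, which is the situation part a asks you to avoid.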
2. Consider executing the following assembly code on the five-stage MIPS pipeline model (IF, ID, EX, ME, WB):
a. Assume there is only one memory port, data forwarding is not implemented, and a branch instruction stalls until the end of the WB stage. Complete the following pipeline execution diagram for one iteration. What is the CPI, assuming an infinite number of iterations?
b. Indicate all data dependences and their types (i.e., RAW, WAR, or WAW).
c. Assume structural hazards are resolved, data forwarding is implemented, and the branch result is available at the end of the ID stage. Complete the following pipeline execution diagram for one iteration. What is the CPI, assuming an infinite number of iterations?
d. Assume structural hazards are resolved, data forwarding is implemented, the branch result is available at the end of the ID stage, and you may reorder the code to avoid pipeline stalls. Complete the following pipeline execution diagram for one iteration. What is the CPI, assuming an infinite number of iterations?
                1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18
lw $t0, 0($s1)  IF ID EX ME WB
e. Assume structural hazards are resolved, data forwarding is implemented, the branch result is available at the end of the ID stage, the loop is unrolled twice with unnecessary loop overhead eliminated, and you may reorder the code to avoid pipeline stalls. Complete the following pipeline execution diagram for one unrolled iteration (i.e., two original iterations). What is the CPI, assuming an infinite number of iterations?
                1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18
lw $t0, 0($s1)  IF ID EX ME WB
f. Assume at most two instructions (i.e., one ALU/branch and one load/store) can be issued each cycle, data accesses to and from memory do not interfere with instruction fetch, data forwarding is implemented, branch prediction is perfect (i.e., branches do not stall), and you may reorder the code within one iteration to avoid pipeline stalls. Complete the execution diagram for one iteration. What is the CPI, assuming an infinite number of iterations?
g. Assume at most two instructions (i.e., one ALU/branch and one load/store) can be issued each cycle, data accesses to and from memory do not interfere with instruction fetch, data forwarding is implemented, branch prediction is perfect (i.e., branches do not stall), the loop is unrolled four times with unnecessary loop overhead eliminated, and you may reorder the instructions to avoid pipeline stalls. Complete the execution diagram for one unrolled iteration (i.e., four original iterations). What is the CPI, assuming an infinite number of iterations?
Cycle  ALU/branch  Load/store
1
2
3
4
5
6
7
8
9
10
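As an illustration of the dependence classification asked for in part b, the following Python sketch lists RAW, WAR, and WAW dependences within a single loop iteration. The loop body used here is a hypothetical example and is not the assignment's code: only its first instruction, lw $t0, 0($s1), appears in the handout, and the other four instructions, the label Loop, and the function name find_dependences are assumptions made for demonstration. The sketch looks at register dependences only and ignores dependences carried across iterations or through memory.

```python
# Each instruction: (text, destination register or None, source registers).
# Hypothetical loop body; only the first instruction appears in the handout.
LOOP = [
    ("lw   $t0, 0($s1)",       "$t0", ["$s1"]),
    ("add  $t0, $t0, $s2",     "$t0", ["$t0", "$s2"]),
    ("sw   $t0, 0($s1)",       None,  ["$t0", "$s1"]),
    ("addi $s1, $s1, -4",      "$s1", ["$s1"]),
    ("bne  $s1, $zero, Loop",  None,  ["$s1", "$zero"]),
]


def find_dependences(instrs):
    """Return (earlier, later, register, kind) tuples for one iteration.

    A dependence from i to j on a register is reported only if no
    instruction between them overwrites that register.
    """
    deps = []
    for j, (_, dest_j, srcs_j) in enumerate(instrs):
        for i in range(j):
            _, dest_i, srcs_i = instrs[i]
            # Registers written strictly between instructions i and j.
            between = {instrs[k][1] for k in range(i + 1, j)}
            if dest_i is not None and dest_i not in between:
                if dest_i in srcs_j:
                    deps.append((i, j, dest_i, "RAW"))  # j reads what i wrote
                if dest_i == dest_j:
                    deps.append((i, j, dest_i, "WAW"))  # both write the register
            if dest_j is not None and dest_j in srcs_i and dest_j not in between:
                deps.append((i, j, dest_j, "WAR"))      # j overwrites what i read
    return deps


for i, j, reg, kind in find_dependences(LOOP):
    print(f"{kind}: {LOOP[i][0].strip()}  ->  {LOOP[j][0].strip()}  on {reg}")
```

Only the RAW dependences can stall an in-order five-stage pipeline like the one in this question; WAR and WAW are name dependences that matter once instructions are reordered or issued out of order.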
3. What optimizations does the gcc compiler perform when you compile your code with the flags -O0, -O1, -O2, and -O3, respectively? Compile the C code we provided in your Assignment2-Answer with each of these four flags and report the size (in bytes) of each executable produced. Run the executables and compare their performance. Explain why the performance differs.
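One possible way to collect the sizes and run times for this question is a small Python script that drives gcc. This is a sketch under assumptions: gcc is on the PATH, the program needs no command-line arguments or input, and the source file name passed in (for example, assignment2.c) and the output names prog_O0 through prog_O3 are hypothetical. The helper names build_command and compare_levels are likewise illustrative.

```python
import os
import subprocess
import time

LEVELS = ("-O0", "-O1", "-O2", "-O3")


def build_command(source, level):
    """Return the gcc command line and the executable name for one level."""
    exe = "prog" + level.replace("-", "_")  # e.g. "-O2" -> "prog_O2"
    return ["gcc", level, source, "-o", exe], exe


def compare_levels(source, runs=5):
    """Compile at each level; return {level: (size_bytes, best_seconds)}."""
    results = {}
    for level in LEVELS:
        cmd, exe = build_command(source, level)
        subprocess.run(cmd, check=True)      # stop if compilation fails
        size = os.path.getsize(exe)          # executable size in bytes
        best = float("inf")
        for _ in range(runs):
            start = time.perf_counter()
            subprocess.run([os.path.join(".", exe)],
                           stdout=subprocess.DEVNULL)
            best = min(best, time.perf_counter() - start)
        results[level] = (size, best)        # fastest of the timed runs
    return results
```

Calling compare_levels("assignment2.c") would return one (size, time) pair per flag. Taking the fastest of several runs reduces noise from other processes; the measured time includes process start-up, so very short programs may show little difference between levels.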