CSN-221 Pipelines-Quiz: Enrollment No.: 18114031 Name - Hemil Panchiwala
Pipelines-Quiz
Enrollment No.: 18114031
Name – Hemil Panchiwala
Answer:
a) Speedup = (CPU time for single-stage machine) / (CPU time for pipelined machine) = 5
b) CPI for the pipelined machine = base CPI + (fraction of instructions incurring the penalty) x (penalty cycles)
= 1 + 0.4 x 1 = 1.4
Speedup = 5 / 1.4 = 3.57
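The arithmetic in part b) can be checked with a short sketch. The function name and its parameters are illustrative only; the values 5 (ideal speedup), 0.4 (fraction of instructions incurring the penalty) and 1 (penalty cycles) are the ones used in the answer above.

```python
def pipelined_speedup(ideal_speedup, penalty_fraction, penalty_cycles, base_cpi=1.0):
    """Speedup over the unpipelined machine once penalty cycles are counted."""
    # Effective CPI = base CPI + fraction of penalised instructions x penalty.
    effective_cpi = base_cpi + penalty_fraction * penalty_cycles
    # The ideal speedup is scaled down by the CPI increase.
    return ideal_speedup * base_cpi / effective_cpi


print(round(pipelined_speedup(5, 0.4, 1), 2))  # 3.57
```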
a) 1) l1 – l2 : RAW for R1
2) l2 – l3 : RAW for R1
3) l4 – l5 : RAW for R2
4) l5 – l6 : RAW for R4
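A check for RAW hazards between adjacent instructions can be sketched as below. The register usage in `program` is hypothetical: it was chosen only so that the check has something to find, and it is not the instruction sequence from the quiz. The function name is likewise an assumption.

```python
# Hypothetical (destination, sources) register usage per instruction.
# This is NOT the quiz's instruction sequence; it only illustrates the check.
program = [
    ("R1", ["R5"]),  # instruction 1
    ("R1", ["R1"]),  # instruction 2
    ("R6", ["R1"]),  # instruction 3
    ("R2", ["R7"]),  # instruction 4
    ("R4", ["R2"]),  # instruction 5
    ("R8", ["R4"]),  # instruction 6
]


def adjacent_raw_hazards(program):
    """Return (producer, consumer, register) for each back-to-back RAW hazard."""
    hazards = []
    for i in range(1, len(program)):
        prev_dest, _ = program[i - 1]
        _, sources = program[i]
        # RAW: the instruction reads a register the previous one writes.
        if prev_dest in sources:
            hazards.append((i, i + 1, prev_dest))
    return hazards


print(adjacent_raw_hazards(program))
# [(1, 2, 'R1'), (2, 3, 'R1'), (4, 5, 'R2'), (5, 6, 'R4')]
```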
Cycle Fetch Decode Execute Memory Write
1 1
2 2 1
3 2 1
4 2 1
5 3 2 1
6 3 2
7 3 2
8 4 3 2
9 5 4 3
10 5 4 3
11 5 4 3
12 6 5 4
13 6 5
14 6 5
15 6 5
16 6
Therefore, it takes 16 cycles to finish this sequence properly using a pipeline with no forwarding.
c)
Cycle Fetch Decode Execute Memory Write
1 1
2 2 1
3 3 2 1
4 3 2 1
5 4 3 2 1
6 5 4 3 2
7 6 5 4 3 2
8 6 5 4 3
9 6 5 4
Therefore, it takes 9 cycles to finish this sequence using a pipeline with forwarding.
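Here, due to forwarding, a result is passed straight from the end of the execute stage to the
execute stage of the next instruction that needs it. The special case of load-use still needs a
stall, because the loaded value is not available until the end of the memory stage.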
Stage:    F       D      A       M       W
Latency:  210 ps  90 ps  110 ps  240 ps  50 ps
if the single cycle processor has a CPI of 1 and the pipelined processor
achieves a CPI of 1.2?
d. If the processor must be implemented with a 3-stage pipeline, some of the
existing 5 stages must be combined (assume that the existing 5 stages
cannot be split). Which of the existing five stages (F, D, A, M, W) should be
placed into which stage of the 3-stage pipeline to minimize the resulting
clock cycle time?
Stage-1:
Stage-2:
Stage-3:
e. If the processor is to be implemented with a 6-stage pipeline, but the
design effort and time to market are such that there is only enough time to
split one of the five existing (F, D, A, M, W) stages into two new stages,
which stage would you choose to split?
Answer:
a)
If we implement this using a single-cycle approach:
Cycle time = sum of the individual stage latencies
= 210 + 90 + 110 + 240 + 50 = 700 ps
b)
If we implement this using a 5-stage pipeline:
Cycle time = max(individual stage latency) + 20 ps
= max(210, 90, 110, 240, 50) + 20
= 240 + 20 = 260 ps
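The two cycle times from parts a) and b) can be reproduced with a minimal sketch. The names `LATENCY_PS`, `OVERHEAD_PS` and the two helper functions are illustrative; the latencies and the 20 ps term are the ones used in the answers above.

```python
# Stage latencies in picoseconds, as given in the question.
LATENCY_PS = {"F": 210, "D": 90, "A": 110, "M": 240, "W": 50}
OVERHEAD_PS = 20  # the 20 ps added per pipeline stage in the answer above


def single_cycle_time(latencies):
    """One long cycle must cover every stage in sequence."""
    return sum(latencies.values())


def pipelined_cycle_time(latencies, overhead):
    """The slowest stage, plus the per-stage overhead, sets the clock."""
    return max(latencies.values()) + overhead


print(single_cycle_time(LATENCY_PS))                  # 700
print(pipelined_cycle_time(LATENCY_PS, OVERHEAD_PS))  # 260
```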
c)
Speedup = (CPU time for single-cycle processor) / (CPU time for pipelined processor)
= (CPI x cycle time)single-cycle / (CPI x cycle time)pipelined
= (1 x 700) / (1.2 x 260)
= 700 / 312 = 2.2436
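As a quick check of part c), the ratio can be evaluated directly. This is only a sketch: the function name is an assumption, and the inputs are the CPIs from the question (1 and 1.2) and the cycle times computed in parts a) and b) (700 ps and 260 ps).

```python
def speedup(cpi_old, cycle_old_ps, cpi_new, cycle_new_ps):
    """Ratio of time per instruction, old design over new design."""
    return (cpi_old * cycle_old_ps) / (cpi_new * cycle_new_ps)


print(round(speedup(1.0, 700, 1.2, 260), 4))  # 2.2436
```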
d)
Stage 1: F (210 ps)
Stage 2: D + A (90 + 110 = 200 ps)
Stage 3: M + W (240 + 50 = 290 ps)
Cycle time = max(210, 200, 290) + 20
= 290 + 20 = 310 ps
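The choice in part d) can be confirmed by trying every grouping. The sketch below assumes that only adjacent stages may be merged (pipeline order is preserved) and that the same 20 ps term applies to each merged stage; `best_grouping` is a hypothetical helper name.

```python
from itertools import combinations

# Stage names and latencies in picoseconds, in pipeline order.
STAGES = [("F", 210), ("D", 90), ("A", 110), ("M", 240), ("W", 50)]
OVERHEAD_PS = 20


def best_grouping(stages, groups, overhead):
    """Try every way to cut the stage list into `groups` contiguous parts."""
    n = len(stages)
    best = None
    for cuts in combinations(range(1, n), groups - 1):
        bounds = (0, *cuts, n)
        parts = [stages[a:b] for a, b in zip(bounds, bounds[1:])]
        # Cycle time is set by the slowest merged stage plus the overhead.
        cycle = max(sum(lat for _, lat in part) for part in parts) + overhead
        names = ["+".join(name for name, _ in part) for part in parts]
        if best is None or cycle < best[0]:
            best = (cycle, names)
    return best


print(best_grouping(STAGES, 3, OVERHEAD_PS))  # (310, ['F', 'D+A', 'M+W'])
```

Of the six possible contiguous groupings, the next best is F+D, A, M+W at 320 ps.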
e)
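Splitting stage M would be the best choice, since at 240 ps it has the highest latency and so sets
the clock cycle time. Dividing it shortens the longest stage, which reduces the clock cycle time
and hence increases performance. Assuming M can be split evenly into two 120 ps stages, F (210 ps)
becomes the longest stage and the cycle time drops from 260 ps to 210 + 20 = 230 ps.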
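The effect of splitting each candidate stage can be compared with a short sketch. It assumes, purely for illustration, that the chosen stage divides into two equal halves and that the 20 ps term from part b) still applies; `cycle_after_split` is a hypothetical helper name.

```python
# Stage latencies in picoseconds, as given in the question.
LATENCY_PS = {"F": 210, "D": 90, "A": 110, "M": 240, "W": 50}
OVERHEAD_PS = 20


def cycle_after_split(latencies, stage, overhead):
    """Cycle time if `stage` is split into two equal halves (an assumption)."""
    halves = [latencies[stage] / 2] * 2
    others = [lat for name, lat in latencies.items() if name != stage]
    return max(others + halves) + overhead


for stage in LATENCY_PS:
    print(stage, cycle_after_split(LATENCY_PS, stage, OVERHEAD_PS))
```

Under these assumptions only the split of M lowers the cycle time (to 230 ps); splitting any other stage leaves M as the bottleneck at 260 ps.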