TEST BANK FOR Computer Architecture A Quantitative Approach 4th Edition By John L. Hennessy
L.1 Chapter 1 Solutions
L.2 Chapter 2 Solutions
L.3 Chapter 3 Solutions
L.4 Chapter 4 Solutions
L.5 Chapter 5 Solutions
L.6 Chapter 6 Solutions
Appendix L
Solutions to Case Study Exercises
L.1 Chapter 1 Solutions
Case Study 1: Chip Fabrication Cost
1.1 a. \( \text{Yield} = \left(1 + \frac{0.75 \times 1.99}{4.0}\right)^{-4} = 0.28 \)
b. \( \text{Yield} = \left(1 + \frac{0.75 \times 3.80}{4.0}\right)^{-4} = 0.12 \)
c. The Sun Niagara is substantially larger, since it places 8 cores on a chip rather
than 1.
1.2 a. \( \text{Yield} = \left(1 + \frac{0.30 \times 3.89}{4.0}\right)^{-4} = 0.36 \)
\( \text{Dies per wafer} = \frac{\pi \times (30/2)^2}{3.89} - \frac{\pi \times 30}{\sqrt{2 \times 3.89}} = 182 - 33.8 = 148 \)
\( \text{Cost per die} = \frac{\$500}{148 \times 0.36} = \$9.38 \)
b. \( \text{Yield} = \left(1 + \frac{0.7 \times 1.86}{4.0}\right)^{-4} = 0.32 \)
\( \text{Dies per wafer} = \frac{\pi \times (30/2)^2}{1.86} - \frac{\pi \times 30}{\sqrt{2 \times 1.86}} = 380 - 48.9 = 331 \)
\( \text{Cost per die} = \frac{\$500}{331 \times 0.32} = \$4.72 \)
c. $9.38 × .4 = $3.75
d. Selling price = ($9.38 + $3.75) × 2 = $26.26
Profit = $26.26 – $4.72 = $21.54
e. Rate of sale = 3 × 500,000 = 1,500,000/month
Profit = 1,500,000 × $21.54 = $32,310,000
$1,000,000,000/$32,310,000 = 31 months
1.3 a. \( \text{Yield} = \left(1 + \frac{0.75 \times 3.80/8}{4.0}\right)^{-4} = 0.71 \)
Prob of error = 1 – 0.71 = 0.29
b. Prob of one defect = \( 0.29 \times 0.71^{7} \times 8 = 0.21 \)
Prob of two defects = \( 0.29^{2} \times 0.71^{6} \times 28 = 0.30 \)
Prob of one or two = 0.21 + 0.30 = 0.51
c. \( 0.71^{8} = 0.06 \) (now we see why this method is inaccurate!)
d. 0.51 ⁄ 0.06 = 8.5
e. x × $150 + 8.5x × $100 – (9.5x × $80) – 9.5x × $1.50 = $200,000,000
x = 885,938 8-core chips, 8,416,390 chips total
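The yield, dies-per-wafer, and cost-per-die arithmetic in 1.2 can be checked with a short Python sketch. The function names are illustrative, not from the text. The sketch assumes a 30 cm wafer, a $500 wafer cost, and alpha = 4.0, as in the worked numbers, and it rounds the intermediate yield and die count before dividing, as the solution does. `prob_k_bad_cores` is the binomial term behind the one-defect and two-defect probabilities in 1.3 b.

```python
import math


def die_yield(defects_per_cm2, die_area_cm2, alpha=4.0):
    """Yield model used in 1.1-1.3: (1 + defects * area / alpha) ** -alpha."""
    return (1 + defects_per_cm2 * die_area_cm2 / alpha) ** -alpha


def dies_per_wafer(wafer_diameter_cm, die_area_cm2):
    """Wafer area over die area, minus the dies lost along the wafer edge."""
    gross = math.pi * (wafer_diameter_cm / 2) ** 2 / die_area_cm2
    edge_loss = math.pi * wafer_diameter_cm / math.sqrt(2 * die_area_cm2)
    return gross - edge_loss


def cost_per_die(wafer_cost, dies, yield_fraction):
    """Wafer cost spread over the dies that actually work."""
    return wafer_cost / (dies * yield_fraction)


def prob_k_bad_cores(k, p_bad, cores=8):
    """Binomial probability that exactly k of the cores are defective."""
    return math.comb(cores, k) * p_bad ** k * (1 - p_bad) ** (cores - k)


# 1.2 a: old Power5 (3.89 cm^2 die, 0.30 defects per cm^2)
old_yield = round(die_yield(0.30, 3.89), 2)
old_dies = round(dies_per_wafer(30, 3.89))
old_cost = round(cost_per_die(500, old_dies, old_yield), 2)

# 1.2 b: new chip (1.86 cm^2 die, 0.7 defects per cm^2)
new_yield = round(die_yield(0.7, 1.86), 2)
new_dies = round(dies_per_wafer(30, 1.86))
new_cost = round(cost_per_die(500, new_dies, new_yield), 2)

print(old_yield, old_dies, old_cost)
print(new_yield, new_dies, new_cost)
```

Without the intermediate rounding the costs shift by a few cents, so the rounded figures are best read as approximate.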
Case Study 2: Power Consumption in Computer Systems
1.4 a. .70x = 79 + 2 × 3.7 + 2 × 7.9
x = 146
b. 4.0 W × .4 + 7.9 W × .6 = 6.34 W
c. The 7200 rpm drive spends 60 s reading/seeking and 40 s idle for a particular job.
The 5400 rpm disk requires 4/3 × 60 s, or 80 s, to do the same thing. Therefore,
it is idle 20% of the time.
1.5 a. \( \frac{14\ \text{kW}}{79\ \text{W} + 2.3\ \text{W} + 7.0\ \text{W}} = 158 \)
b. \( \frac{14\ \text{kW}}{79\ \text{W} + 2.3\ \text{W} + 2 \times 7.0\ \text{W}} = 146 \)
c. \( \text{Failure rate} = \frac{1}{9 \times 10^{6}} + 8 \times \frac{1}{4500} + \frac{1}{3 \times 10^{4}} = \frac{1 + 8 \times 2000 + 300}{9 \times 10^{6}} = \frac{16301}{9 \times 10^{6}} \)
\( \text{MTTF} = \frac{1}{\text{Failure rate}} = \frac{9 \times 10^{6}}{16301} = 552 \text{ hours} \)
1.6 a. See Figure L.1.
b. Sun Fire T2000
c. More expensive servers can be more compact, allowing more computers to be
stored in the same amount of space. Because real estate is so expensive, this
is a huge concern. Also, power may not be the same for both systems. It can
cost more to purchase a chip that is optimized for lower power consumption.
          Sun Fire T2000   IBM x346
SPECjbb   213              91.2
SPECweb   42.4             9.93
Figure L.1 Power/performance ratios.
1.7 a. 50%
b. \( \frac{\text{Power new}}{\text{Power old}} = \frac{(V \times 0.50)^{2} \times (F \times 0.50)}{V^{2} \times F} = 0.5^{3} = 0.125 \)
c. \( 0.70 = (1 - x) + x/2 \); x = 60%
d. \( \frac{\text{Power new}}{\text{Power old}} = \frac{(V \times 0.70)^{2} \times (F \times 0.50)}{V^{2} \times F} = 0.7^{2} \times 0.5 = 0.245 \)
Case Study 3: The Cost of Reliability (and Failure) in Web Servers
1.8 a. 14 days × $1.4 million⁄day = $19.6 million
$4 billion – $19.6 million = $3.98 billion
b. Increase in total revenue: 4.8/3.9 = 1.23
In the fourth quarter, the rough estimate would be a loss of 1.23 × $19.6 million
= $24.1 million.
c. Losing $1.4 million × .50 = $700,000 per day. This pays for $700,000/$7,500
= 93 computers per day.
d. It depends on how the 2.6 million visitors are counted.
If the 2.6 million visitors are not unique, but are actually visitors each day
summed across a month: 2.6 million × 8.4 = 21.84 million transactions per
month. $5.38 × 21.84 million = $117 million per month.
If the 2.6 million visitors are assumed to visit every day: 2.6 million × 8.4 ×
31 = 677 million transactions per month. $5.38 × 677 million = $3.6 billion
per month, which is clearly not the case, or else their online service would not
make money.
1.9 a. \( \text{FIT} = 10^{9} / \text{MTTF} \)
\( \text{MTTF} = 10^{9} / \text{FIT} = 10^{9} / 100 = 10{,}000{,}000 \) hours
b. \( \text{Availability} = \frac{\text{MTTF}}{\text{MTTF} + \text{MTTR}} = \frac{10^{7}}{10^{7} + 24} \) = about 100%
1.10 Using the simplifying assumption that all failures are independent, we sum the
failure rates of all of the computers:
\( \text{Failure rate} = 1000 \times 10^{-7} = 10^{-4} = \frac{10^{5}}{10^{9}} \), so \( \text{FIT} = 10^{5} \); therefore \( \text{MTTF} = \frac{10^{9}}{10^{5}} = 10^{4} \) hours
1.11 a. Assuming that we do not repair the computers, we wait until 3,334
computers have failed.
3,334 × 10,000,000 = 33,340,000,000 hours
b. Total cost of the decision: $1,000 × 10,000 computers = $10 million
Expected benefit of the decision: Avoid a day of downtime for every
33,340,000,000 hours of uptime. This would save us $1.4 million every
3,858,000 years. This would definitely not be worth it.
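The FIT, MTTF, and availability conversions in 1.9 and 1.10 follow one pattern, sketched below in Python. The helper names are illustrative rather than taken from the text. The sketch assumes MTTF and MTTR are both measured in hours and that failures are independent, so per-machine failure rates simply add.

```python
def mttf_from_fit(fit):
    """FIT counts failures per 10^9 hours, so MTTF (hours) = 10^9 / FIT."""
    return 1e9 / fit


def fit_from_failure_rate(failures_per_hour):
    """Convert a failure rate in failures/hour to FIT."""
    return failures_per_hour * 1e9


def availability(mttf_hours, mttr_hours):
    """Fraction of time the system is up."""
    return mttf_hours / (mttf_hours + mttr_hours)


# 1.9: one computer rated at 100 FIT, with a 24-hour repair time
single_mttf = mttf_from_fit(100)
single_availability = availability(single_mttf, 24)

# 1.10: 1000 such computers, failure rates summed
cluster_rate = sum([1 / single_mttf] * 1000)
cluster_fit = fit_from_failure_rate(cluster_rate)
cluster_mttf = 1 / cluster_rate

print(single_mttf, single_availability)
print(cluster_fit, cluster_mttf)
```

The single-machine availability comes out just under 1, which is why the answer to 1.9 b reads "about 100%".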
Case Study 4: Performance
1.12 a. See Figure L.2.
b. See Figure L.3.
c. The arithmetic mean of the original performance shows that the Athlon 64 X2
4800+ is the fastest processor.
The arithmetic mean of the normalized processors shows that Processor X is
the fastest processor.
d. Single processors: .05
Dual processors: 1.17
e. Solutions will vary.
Chip                 Memory performance   Dhrystone performance
Athlon 64 X2 4800+   1.14                 1.36
Pentium EE 840       1.08                 1.24
Pentium D 820        1                    1
Athlon 64 X2 3800+   0.98                 1.13
Pentium 4            0.91                 0.5
Athlon 64 3000+      0.98                 0.5
Pentium 4 570        1.17                 0.74
Widget X             2.33                 0.33
Figure L.2 Performance of several processors normalized to the Pentium D 820.
Chip                 Arithmetic mean   Arithmetic mean of normalized
Athlon 64 X2 4800+   12070.5           1.25
Pentium EE 840       11060.5           1.16
Pentium D 820        9110              1
Athlon 64 X2 3800+   10035             1.05
Pentium 4            5176              0.95
Athlon 64 3000+      5290.5            0.95
Pentium 4 570        7355.5            0.77
Processor X          6000              1.33
Figure L.3 Arithmetic mean of several processors.
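As a rough check on how the entries in Figures L.2 and L.3 are formed, the Python sketch below normalizes raw scores to a reference chip and then averages them. The raw memory and Dhrystone scores are the ones quoted in 1.13. Treating 3,000 and 15,220 as the Pentium D 820's scores is an assumption, suggested by their mean (9110) matching Figure L.3. The `normalize` helper is illustrative.

```python
from statistics import mean

# (memory score, Dhrystone score); the Pentium D 820 pair is an assumption
raw = {
    "Athlon 64 X2 4800+": (3423, 20718),
    "Pentium D 820": (3000, 15220),
}
reference = raw["Pentium D 820"]


def normalize(scores, ref):
    """Divide each benchmark score by the reference chip's score."""
    return tuple(s / r for s, r in zip(scores, ref))


for chip, scores in raw.items():
    norm = normalize(scores, reference)
    print(chip, mean(scores), [round(n, 2) for n in norm], round(mean(norm), 2))
```

For the Athlon 64 X2 4800+ this gives a raw mean of 12070.5, normalized scores of 1.14 and 1.36, and a normalized mean of 1.25, in line with the two figures.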
f. Dual processors gain in CPU performance (exhibited by the Dhrystone performance),
but they do not necessarily increase in memory performance. This
makes sense because, although they are doubling the processing power, dual
processors do not change the memory hierarchy very much. Benchmarks that
exercise the memory often do not fit in the size of the cache, so doubling the
cache does not help the memory benchmarks substantially. In some applications,
however, they could gain substantially due to the increased cache available.
1.13 a. Pentium 4 570: .4 × 3,501 + .6 × 11,210 = 8,126
Athlon 64 X2 4800+: .4 × 3,423 + .6 × 20,718 = 13,800
b. 20,718/7,621 = 2.7
c. x × 3,501 + (1 – x) × 11,210 = x × 3,000 + (1 – x) × 15,220
x = .89
.89/.11 = 8x ratio of memory to processor computation
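The weighted means in 1.13 a and the break-even weight in 1.13 c can be reproduced with a small Python sketch. The function names are illustrative. The closed form in `break_even_memory_weight` comes from setting the two weighted scores equal and solving for the memory weight x.

```python
def weighted_score(memory, dhrystone, memory_weight):
    """Weighted mean: memory_weight on memory, the remainder on Dhrystone."""
    return memory_weight * memory + (1 - memory_weight) * dhrystone


def break_even_memory_weight(chip_a, chip_b):
    """Memory weight x at which both chips have the same weighted score.

    From x*a_mem + (1-x)*a_dhry = x*b_mem + (1-x)*b_dhry.
    """
    (a_mem, a_dhry), (b_mem, b_dhry) = chip_a, chip_b
    return (b_dhry - a_dhry) / ((a_mem - a_dhry) - (b_mem - b_dhry))


# 1.13 a: 40% memory, 60% Dhrystone
p4_570 = weighted_score(3501, 11210, 0.4)
athlon_x2 = weighted_score(3423, 20718, 0.4)

# 1.13 c: the two score pairs from the equation in the text
x = break_even_memory_weight((3501, 11210), (3000, 15220))

print(round(p4_570), round(athlon_x2), round(x, 2))
```

Dividing x by 1 – x then gives the roughly 8-to-1 ratio of memory to processor computation.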
1.14 a. Amdahl’s Law: \( \frac{1}{0.6 + 0.4/2} = 1.25 \)x speedup
b. Amdahl’s Law: \( \frac{1}{0.01 + 0.99/2} = 1.98 \)x speedup
c. Amdahl’s Law: \( \frac{1}{0.2 + 0.8 \times (0.6 + 0.4/2)} = 1.19 \)x speedup
d. Amdahl’s Law: \( \frac{1}{0.8 + 0.2 \times (0.01 + 0.99/2)} = 1.11 \)x speedup
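The four Amdahl's Law results in 1.14 can be checked with a one-function Python sketch. The name `amdahl_speedup` is illustrative. Parts c and d are treated here as nested applications: the speedup of the parallelizable application is fed in as the enhancement factor for the fraction of total time that application occupies.

```python
def amdahl_speedup(enhanced_fraction, enhancement_factor):
    """Overall speedup when enhanced_fraction of the time runs enhancement_factor faster."""
    return 1 / ((1 - enhanced_fraction) + enhanced_fraction / enhancement_factor)


a = amdahl_speedup(0.40, 2)   # 40% of the work parallelizable on 2 processors
b = amdahl_speedup(0.99, 2)   # 99% parallelizable
c = amdahl_speedup(0.80, a)   # the application from (a) takes 80% of total time
d = amdahl_speedup(0.20, b)   # the application from (b) takes 20% of total time

print(round(a, 2), round(b, 2), round(c, 2), round(d, 2))
```

Nesting the calls is algebraically the same as the single expressions in the text, since 0.8 × (0.6 + 0.4/2) equals 0.8 divided by the inner speedup.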
L.2 Chapter 2 Solutions
Case Study 1: Exploring the Impact of Microarchitectural Techniques
2.1 The baseline performance (in cycles, per loop iteration) of the code sequence in
Figure 2.35, if no new instruction’s execution could be initiated until the previous
instruction’s execution had completed, is 37, as shown in Figure L.4. How did I
come up with that number? Each instruction requires one clock cycle of execution
(a clock cycle in which that instruction, and only that instruction, is occupying
the execution units; since every instruction must execute, the loop will take at
least that many clock cycles). To that base number, we add the extra latency
cycles. Don’t forget the branch shadow cycle.
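The counting rule described in 2.1 can be sketched in Python. The four-instruction loop below is hypothetical and is not the code of Figure 2.35. The +3 for LD and +4 for MULTD echo the stalls mentioned in 2.2, while the +1 values for SD and for the branch shadow are assumptions made only for illustration.

```python
# (opcode, extra latency cycles) for a hypothetical loop body
loop = [
    ("LD", 3),
    ("MULTD", 4),
    ("SD", 1),    # assumed latency
    ("BNZ", 1),   # assumed branch shadow cycle
]


def serial_cycles(instructions):
    """Cycles per iteration when nothing starts until its predecessor finishes."""
    execute = len(instructions)                      # one cycle each
    extra = sum(latency for _, latency in instructions)
    return execute + extra


print(serial_cycles(loop))
```

Applied to the real instruction list and latency table, the same sum should give the per-iteration baseline the text quotes.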
2.2 How many cycles would the loop body in the code sequence in Figure 2.35
require if the pipeline detected true data dependencies and only stalled on those,
rather than blindly stalling everything just because one functional unit is busy?
The answer is 27, as shown in Figure L.5. Remember, the point of the extra
latency cycles is to allow an instruction to complete whatever actions it needs, in
order to produce its correct output. Until that output is ready, no dependent
instructions can be executed. So the first LD must stall the next instruction for
three clock cycles. The MULTD produces a result for its successor, and therefore
must stall 4 more clocks, and so on.