TEST BANK FOR Pattern Classification 2nd Edition By David G. Stork
1 Introduction
2 Bayesian decision theory
  Problem Solutions
  Computer Exercises
3 Maximum likelihood and Bayesian parameter estimation
  Problem Solutions
  Computer Exercises
4 Nonparametric techniques
  Problem Solutions
  Computer Exercises
5 Linear discriminant functions
  Problem Solutions
  Computer Exercises
6 Multilayer neural networks
  Problem Solutions
  Computer Exercises
7 Stochastic methods
  Problem Solutions
  Computer Exercises
8 Nonmetric methods
  Problem Solutions
  Computer Exercises
9 Algorithm-independent machine learning
  Problem Solutions
  Computer Exercises
10 Unsupervised learning and clustering
  Problem Solutions
  Computer Exercises
Sample final exams and solutions
Worked examples
Errata and emendations in the text
  First and second printings
  Fifth printing
Chapter 1
Introduction
Problem Solutions
There are neither problems nor computer exercises in Chapter 1.
Chapter 2
Bayesian decision theory
Problem Solutions
Section 2.1
1. Equation 7 in the text states
P(error|x) = min[P(ω1|x), P(ω2|x)].
(a) We assume, without loss of generality, that for a given x we have
P(ω2|x) ≥ P(ω1|x), and thus P(error|x) = P(ω1|x). We have, moreover, the
normalization condition P(ω1|x) = 1 − P(ω2|x). Together these imply
P(ω2|x) ≥ 1/2, or 2P(ω2|x) ≥ 1, and
\[ 2P(\omega_2|x)P(\omega_1|x) \ge P(\omega_1|x) = P(\text{error}|x). \]
This is true at every x, and hence the integrals obey
\[ \int 2P(\omega_2|x)P(\omega_1|x)\,dx \ge \int P(\text{error}|x)\,dx. \]
In short, 2P(ω2|x)P(ω1|x) provides an upper bound for P(error|x).
(b) From part (a), we have that P(ω2|x) ≥ 1/2, but for α < 2 it need not be as
large as 1/α, which is what the bound αP(ω1|x)P(ω2|x) ≥ P(ω1|x) would require.
Take as an example α = 4/3 and P(ω1|x) = 0.4, and hence P(ω2|x) = 0.6. In this
case, P(error|x) = 0.4. Moreover, we have
αP(ω1|x)P(ω2|x) = 4/3 × 0.6 × 0.4 = 0.32 < P(error|x).
Thus αP(ω1|x)P(ω2|x) does not provide an upper bound for all values of P(ω1|x).
(c) Let P(error|x) = P(ω1|x). Because P(ω2|x) ≤ 1, for all x we have
\[ P(\omega_2|x)P(\omega_1|x) \le P(\omega_1|x) = P(\text{error}|x) \]
and hence
\[ \int P(\omega_2|x)P(\omega_1|x)\,dx \le \int P(\text{error}|x)\,dx, \]
and we have a lower bound.
(d) The solution to part (b) also applies here.
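The bounds in Problem 1 can be checked numerically. The following is a minimal sketch, not part of the original solution; the function names and the grid of posterior values are our own choices for illustration.

```python
# Numerical check of the bounds in Problem 1 for a two-category posterior.
# p1 stands for P(w1|x); P(w2|x) = 1 - p1 by normalization.

def conditional_error(p1):
    """P(error|x) = min[P(w1|x), P(w2|x)] (Eq. 7 in the text)."""
    return min(p1, 1.0 - p1)

def scaled_product(p1, alpha):
    """alpha * P(w1|x) * P(w2|x)."""
    return alpha * p1 * (1.0 - p1)

grid = [i / 1000 for i in range(1001)]  # posterior values 0.000 .. 1.000
eps = 1e-12                             # tolerance for floating-point round-off

# Part (a): alpha = 2 gives an upper bound at every grid point.
upper_ok = all(scaled_product(p, 2.0) >= conditional_error(p) - eps for p in grid)
# Part (c): alpha = 1 gives a lower bound at every grid point.
lower_ok = all(scaled_product(p, 1.0) <= conditional_error(p) + eps for p in grid)
# Part (b): alpha = 4/3 fails as an upper bound at P(w1|x) = 0.4.
counter = scaled_product(0.4, 4.0 / 3.0)

print(upper_ok, lower_ok, counter, conditional_error(0.4))
```

A grid check of this kind illustrates the inequalities but does not replace the argument above, which holds for every value of the posterior.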
Section 2.2
2. We are given that the density is of the form p(x|ωi) = k exp[−|x − ai|/bi].
(a) We seek k so that the function is normalized, as required by a true density. We
integrate this function, set it to 1.0,
\[ k\left[\int_{-\infty}^{a_i} \exp[(x - a_i)/b_i]\,dx + \int_{a_i}^{\infty} \exp[-(x - a_i)/b_i]\,dx\right] = 1, \]
which yields 2bik = 1 or k = 1/(2bi). Note that the normalization is independent
of ai, which corresponds to a shift along the axis and is hence indeed irrelevant
to normalization. The distribution is therefore written
\[ p(x|\omega_i) = \frac{1}{2b_i}\, e^{-|x - a_i|/b_i}. \]
(b) The likelihood ratio can be written directly:
\[ \frac{p(x|\omega_1)}{p(x|\omega_2)} = \frac{b_2}{b_1} \exp\left[-\frac{|x - a_1|}{b_1} + \frac{|x - a_2|}{b_2}\right]. \]
(c) For the case a1 = 0, a2 = 1, b1 = 1 and b2 = 2, the likelihood ratio is
\[ \frac{p(x|\omega_1)}{p(x|\omega_2)} = \begin{cases} 2e^{(x+1)/2} & x \le 0 \\ 2e^{(1-3x)/2} & 0 < x \le 1 \\ 2e^{(-x-1)/2} & x > 1, \end{cases} \]
as shown in the figure.
[Figure: the likelihood ratio p(x|ω1)/p(x|ω2) plotted against x.]
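The results of Problem 2 can be cross-checked with a short script. This is a sketch under our own assumptions: the helper names, the midpoint-rule integrator, the test parameters a = 1, b = 2 for the normalization check, and the sample points are all illustrative and do not come from the text.

```python
import math

def laplace_density(x, a, b):
    """p(x|w_i) = exp(-|x - a| / b) / (2 b), with b > 0, from part (a)."""
    return math.exp(-abs(x - a) / b) / (2.0 * b)

def likelihood_ratio(x, a1, b1, a2, b2):
    """p(x|w1) / p(x|w2), the general form from part (b)."""
    return (b2 / b1) * math.exp(-abs(x - a1) / b1 + abs(x - a2) / b2)

def piecewise_ratio(x):
    """Closed form from part (c) for a1 = 0, b1 = 1, a2 = 1, b2 = 2."""
    if x <= 0:
        return 2.0 * math.exp((x + 1) / 2)
    if x <= 1:
        return 2.0 * math.exp((1 - 3 * x) / 2)
    return 2.0 * math.exp((-x - 1) / 2)

def integrate(f, lo, hi, n=200_000):
    """Midpoint rule; adequate here, not a general-purpose integrator."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

# Normalization check for one arbitrary choice of (a, b); the tails beyond
# +-60 are negligible for these parameters.
area = integrate(lambda x: laplace_density(x, 1.0, 2.0), -60.0, 60.0)

# The general ratio and the piecewise form should agree at every sample point.
xs = [-2.0, -0.5, 0.0, 0.3, 1.0, 1.7]
max_gap = max(abs(likelihood_ratio(x, 0, 1, 1, 2) - piecewise_ratio(x)) for x in xs)

print(area, max_gap)
```

The sample points deliberately include the breakpoints x = 0 and x = 1, where the piecewise branches must join continuously.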
Section 2.3
3. We are to use the standard zero-one classification cost, that is, λ11 = λ22 = 0
and λ12 = λ21 = 1.
(a) We have the priors P(ω1) and P(ω2) = 1 − P(ω1). The Bayes risk is given by
Eqs. 12 and 13 in the text:
\[ R(P(\omega_1)) = P(\omega_1)\int_{\mathcal{R}_2} p(x|\omega_1)\,dx + (1 - P(\omega_1))\int_{\mathcal{R}_1} p(x|\omega_2)\,dx. \]
To obtain the prior with the minimum risk, we take the derivative with respect
to P(ω1) and set it to 0, that is
\[ \frac{d}{dP(\omega_1)} R(P(\omega_1)) = \int_{\mathcal{R}_2} p(x|\omega_1)\,dx - \int_{\mathcal{R}_1} p(x|\omega_2)\,dx = 0, \]
which gives the desired result:
\[ \int_{\mathcal{R}_2} p(x|\omega_1)\,dx = \int_{\mathcal{R}_1} p(x|\omega_2)\,dx. \]
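(b) This solution is not always unique, as shown in this simple counterexample. Let
P(ω1) = P(ω2) = 0.5 and
\[ p(x|\omega_1) = \begin{cases} 1 & -0.5 \le x \le 0.5 \\ 0 & \text{otherwise} \end{cases} \qquad p(x|\omega_2) = \begin{cases} 1 & 0 \le x \le 1 \\ 0 & \text{otherwise.} \end{cases} \]
It is easy to verify that the decision region R1 = [−0.5, 0.25] and the
alternative choice R1 = [0, 0.5], with R2 the complement of R1 in each case,
both satisfy the equation in part (a): the two integrals equal 0.25 for the
first choice and 0.5 for the second. Thus the solution is not unique.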
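The counterexample in part (b) can be verified mechanically. The sketch below is our own illustration; the helper names are hypothetical, and it relies on both densities being uniform with unit height, so that probability mass equals interval length.

```python
def overlap(a, b, c, d):
    """Length of the intersection of the intervals [a, b] and [c, d]."""
    return max(0.0, min(b, d) - max(a, c))

def minimax_sides(r1):
    """Return (integral of p(x|w1) over R2, integral of p(x|w2) over R1).

    p(x|w1) is uniform on [-0.5, 0.5] and p(x|w2) is uniform on [0, 1].
    R1 = r1 is a single interval and R2 is its complement.
    """
    lo, hi = r1
    mass1_in_r1 = overlap(lo, hi, -0.5, 0.5)  # unit-height density: mass = length
    mass2_in_r1 = overlap(lo, hi, 0.0, 1.0)
    return 1.0 - mass1_in_r1, mass2_in_r1

first = minimax_sides((-0.5, 0.25))
second = minimax_sides((0.0, 0.5))
print(first, second)
```

Both candidate regions make the two sides of the minimax condition equal, although the common value differs between them.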
4. Consider the minimax criterion for a two-category classification problem.
(a) The total risk is the integral over the two regions Ri of the posteriors times
their costs:
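\[ R = \int_{\mathcal{R}_1} [\lambda_{11}P(\omega_1)p(x|\omega_1) + \lambda_{12}P(\omega_2)p(x|\omega_2)]\,dx + \int_{\mathcal{R}_2} [\lambda_{21}P(\omega_1)p(x|\omega_1) + \lambda_{22}P(\omega_2)p(x|\omega_2)]\,dx. \]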
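We use the normalization of each density over the two regions,
\[ \int_{\mathcal{R}_2} p(x|\omega_2)\,dx = 1 - \int_{\mathcal{R}_1} p(x|\omega_2)\,dx, \qquad \int_{\mathcal{R}_1} p(x|\omega_1)\,dx = 1 - \int_{\mathcal{R}_2} p(x|\omega_1)\,dx, \]
together with P(ω2) = 1 − P(ω1), and regroup to find: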
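\[ \begin{aligned}
R &= \lambda_{22} + \lambda_{12}\int_{\mathcal{R}_1} p(x|\omega_2)\,dx - \lambda_{22}\int_{\mathcal{R}_1} p(x|\omega_2)\,dx \\
&\quad + P(\omega_1)\Big[(\lambda_{11} - \lambda_{22}) - \lambda_{11}\int_{\mathcal{R}_2} p(x|\omega_1)\,dx - \lambda_{12}\int_{\mathcal{R}_1} p(x|\omega_2)\,dx \\
&\qquad\qquad + \lambda_{21}\int_{\mathcal{R}_2} p(x|\omega_1)\,dx + \lambda_{22}\int_{\mathcal{R}_1} p(x|\omega_2)\,dx\Big] \\
&= \lambda_{22} + (\lambda_{12} - \lambda_{22})\int_{\mathcal{R}_1} p(x|\omega_2)\,dx \\
&\quad + P(\omega_1)\Big[(\lambda_{11} - \lambda_{22}) + (\lambda_{21} - \lambda_{11})\int_{\mathcal{R}_2} p(x|\omega_1)\,dx + (\lambda_{22} - \lambda_{12})\int_{\mathcal{R}_1} p(x|\omega_2)\,dx\Big].
\end{aligned} \]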
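As a sanity check on the regrouping in part (a), one can compare the two-region form of the risk with the form that is linear in P(ω1). The sketch below does this numerically. The Gaussian densities, the decision threshold and the loss values are hypothetical choices made only for this check, and the regrouped expression is written with the coefficient (λ21 − λ11) on the integral of p(x|ω1) over R2, which is what the substitution yields.

```python
import math

def normal_cdf(x, mu, sigma):
    """Gaussian CDF via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def risk_direct(prior1, lam, m1_r1, m2_r1):
    """Risk as the sum of the two region integrals.

    m1_r1 and m2_r1 are the masses of p(x|w1) and p(x|w2) inside R1.
    """
    (l11, l12), (l21, l22) = lam
    prior2 = 1.0 - prior1
    in_r1 = l11 * prior1 * m1_r1 + l12 * prior2 * m2_r1
    in_r2 = l21 * prior1 * (1.0 - m1_r1) + l22 * prior2 * (1.0 - m2_r1)
    return in_r1 + in_r2

def risk_regrouped(prior1, lam, m1_r1, m2_r1):
    """Risk written as a constant plus P(w1) times a slope."""
    (l11, l12), (l21, l22) = lam
    m1_r2 = 1.0 - m1_r1
    constant = l22 + (l12 - l22) * m2_r1
    slope = (l11 - l22) + (l21 - l11) * m1_r2 - (l12 - l22) * m2_r1
    return constant + prior1 * slope

LOSS = ((0.1, 2.0), (1.0, 0.3))          # hypothetical (l11, l12), (l21, l22)
THRESHOLD = 0.8                          # hypothetical boundary: R1 = (-inf, 0.8)
M1_R1 = normal_cdf(THRESHOLD, 0.0, 1.0)  # assumed p(x|w1) = N(0, 1)
M2_R1 = normal_cdf(THRESHOLD, 2.0, 1.5)  # assumed p(x|w2) = N(2, 1.5^2)

priors = [i / 20 for i in range(21)]
max_gap = max(
    abs(risk_direct(p, LOSS, M1_R1, M2_R1) - risk_regrouped(p, LOSS, M1_R1, M2_R1))
    for p in priors
)
print(max_gap)
```

Because the decision regions are held fixed, the regrouped form makes the linear dependence of the risk on P(ω1) explicit.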
(b) Consider an arbitrary prior 0 < P∗(ω1) < 1, and assume the decision boundary
has been set so as to achieve the minimal (Bayes) error for that prior. If one holds
the same decision boundary, but changes the prior probabilities (i.e., P(ω1) in
the figure), then the error changes linearly, as given by the formula in part (a).
The true Bayes error, however, must be less than or equal to that (linearly
bounded) value, since one has the freedom to change the decision boundary at
each value of P(ω1). Moreover, we note that the Bayes error is 0 at P(ω1) = 0
and at P(ω1) = 1, since the Bayes decision rule under those conditions is to
always decide ω2 or ω1, respectively, and this gives zero error. Thus the curve
of Bayes error rate is concave down for all prior probabilities.
[Figure: the error E(P(ω1)) plotted against the prior P(ω1) from 0 to 1, with the point P*(ω1) marked.]
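The argument in part (b) can be illustrated numerically. The following sketch assumes two equal-variance Gaussian class-conditional densities, N(0, 1) and N(2, 1), which are not specified in the text; for that case the Bayes rule is a single threshold with a known closed form. The prior P* = 0.3 is likewise an arbitrary choice.

```python
import math

MU1, MU2, SIGMA = 0.0, 2.0, 1.0  # assumed class-conditional Gaussians

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def error_at_threshold(prior1, t):
    """Error when deciding w1 for x < t and w2 otherwise; linear in prior1."""
    miss1 = 1.0 - phi((t - MU1) / SIGMA)  # w1 samples falling in R2
    miss2 = phi((t - MU2) / SIGMA)        # w2 samples falling in R1
    return prior1 * miss1 + (1.0 - prior1) * miss2

def bayes_threshold(prior1):
    """Bayes boundary for equal-variance Gaussians, valid for 0 < prior1 < 1."""
    log_odds = math.log(prior1 / (1.0 - prior1))
    return 0.5 * (MU1 + MU2) + SIGMA**2 / (MU2 - MU1) * log_odds

def bayes_error(prior1):
    if prior1 <= 0.0 or prior1 >= 1.0:
        return 0.0  # always deciding the certain class makes no errors
    return error_at_threshold(prior1, bayes_threshold(prior1))

p_star = 0.3
t_star = bayes_threshold(p_star)  # boundary tuned for P*, then held fixed
grid = [i / 100 for i in range(101)]

# The fixed-boundary (linear) error never falls below the Bayes error.
bounded = all(bayes_error(p) <= error_at_threshold(p, t_star) + 1e-12 for p in grid)
# Midpoint test for concavity on the grid.
concave = all(
    bayes_error(grid[i]) >= 0.5 * (bayes_error(grid[i - 1]) + bayes_error(grid[i + 1])) - 1e-12
    for i in range(1, 100)
)
print(bounded, concave, bayes_error(0.0), bayes_error(1.0))
```

The fixed-boundary line touches the Bayes error curve at P* and lies on or above it elsewhere, which is the geometric content of the concavity claim.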
(c) According to the general minimax equation in part (a), for our case (i.e., λ11 =
λ22 = 0 and λ12 = λ21 = 1) the decision boundary is chosen to satisfy
\[ \int_{\mathcal{R}_2} p(x|\omega_1)\,dx = \int_{\mathcal{R}_1} p(x|\omega_2)\,dx. \]
We assume that a single decision point suffices, and thus we seek to find x∗ such
that
\[ \int_{-\infty}^{x^*} N(\mu_1, \sigma_1^2)\,dx = \int_{x^*}^{\infty} N(\mu_2, \sigma_2^2)\,dx, \]
where, as usual, N(μi, σi²) denotes a Gaussian. We assume for definiteness and
without loss of generality that μ2 > μ1, and that the single decision point lies
between the means. Recall the definition of an error function, given by Eq. 96
in the Appendix of the text, that is,
\[ \operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt. \]
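The minimax decision point for two Gaussians can also be located numerically with the error function. The sketch below is not the derivation from the text: it applies bisection to the equal-error condition (the mass of p(x|ω1) above x* equals the mass of p(x|ω2) below x*), assumes μ2 > μ1 with a single decision point between the means, and uses hypothetical parameter values. The closed form used as a cross-check follows from the symmetry of the Gaussian CDF and is our own addition.

```python
import math

def normal_cdf(x, mu, sigma):
    """Gaussian CDF written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def error_gap(x, mu1, sigma1, mu2, sigma2):
    """(w1 mass above x) minus (w2 mass below x); zero at the minimax point."""
    return (1.0 - normal_cdf(x, mu1, sigma1)) - normal_cdf(x, mu2, sigma2)

def minimax_point(mu1, sigma1, mu2, sigma2, tol=1e-12):
    """Bisection on [mu1, mu2]; assumes mu2 > mu1 and a single decision point."""
    lo, hi = mu1, mu2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if error_gap(mid, mu1, sigma1, mu2, sigma2) > 0.0:
            lo = mid  # more w1 error than w2 error: move the boundary right
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical parameters chosen for illustration only.
mu1, sigma1, mu2, sigma2 = 0.0, 1.0, 2.0, 0.5
x_star = minimax_point(mu1, sigma1, mu2, sigma2)

# Cross-check: equal tail masses require (x - mu1)/sigma1 = -(x - mu2)/sigma2.
closed_form = (mu1 * sigma2 + mu2 * sigma1) / (sigma1 + sigma2)
print(x_star, closed_form)
```

Bisection is used here only because the gap function is monotone between the means; any one-dimensional root finder would serve equally well.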