32 Theory: Probabilistic Classifiers
Lemma 2.14 For any $0 \le \varepsilon \le H(P_m^P)$, let $S_\varepsilon \subseteq S$ be the set of all samples
claim:
Claim 1 Let $P_m$ be a product distribution defined by the marginals $P_i$ over
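$$
\begin{aligned}
D(P \,\|\, P_m^P) &= \sum_{x} P(x) \log P(x) - \sum_{x} P(x) \log P_m^P(x) \\
&= -H(P) - \sum_{x_1, \dots, x_n} P(x_1, \dots, x_n) \log P_m^P(x_1, \dots, x_n) \\
&= -H(P) - \sum_{i} \sum_{x_i} \{\log P_i(x_i)\} \sum_{(x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n) \in \mathcal{X}^{n-1}} P(x_1, \dots, x_n) \\
&= -H(P) - \sum_{i} \sum_{x_i} P_i(x_i) \log P_i(x_i) \\
&= -H(P) + \sum_{i} H(P_i) = H(P_m^P) - H(P).
\end{aligned}
$$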
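The identity derived above, that the divergence between a joint distribution and its induced product distribution equals the sum of the marginal entropies minus the joint entropy, can be checked numerically. The following is a minimal sketch, assuming NumPy is available and natural logarithms are used; the helper names (`entropy`, `marginals`, `induced_product`, `kl`) are illustrative and do not come from the text.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in nats; zero-probability cells contribute 0."""
    p = np.asarray(p, dtype=float).ravel()
    nz = p > 0
    return float(-np.sum(p[nz] * np.log(p[nz])))

def marginals(joint):
    """Marginal P_i for each axis of an n-dimensional joint table."""
    n = joint.ndim
    return [joint.sum(axis=tuple(j for j in range(n) if j != i))
            for i in range(n)]

def induced_product(joint):
    """Product distribution built from the marginals of `joint`."""
    prod = np.ones_like(joint)
    for i, m in enumerate(marginals(joint)):
        shape = [1] * joint.ndim
        shape[i] = m.size          # broadcast P_i along axis i
        prod = prod * m.reshape(shape)
    return prod

def kl(p, q):
    """D(p || q) in nats, assuming q > 0 wherever p > 0."""
    nz = p > 0
    return float(np.sum(p[nz] * np.log(p[nz] / q[nz])))

# Hypothetical example: a random joint distribution over three variables.
rng = np.random.default_rng(0)
joint = rng.random((2, 3, 2))
joint /= joint.sum()

lhs = kl(joint, induced_product(joint))
rhs = sum(entropy(m) for m in marginals(joint)) - entropy(joint)
assert np.isclose(lhs, rhs)
```

The two sides agree up to floating-point error for any joint table, since the argument only uses the fact that the product distribution factorizes over the marginals.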
Density of Distributions 33
(2.34)

and by integrating over the range [0, ε] we get Theorem 2.12.

5.1
In the previous section, Theorem 2.12 was phrased in terms of the number of sample sets that share the same marginal distribution and thus yield the same classifier. We now prove a similar result directly in terms of the number of joint distributions at a certain distance from their induced product distribution. We assume a fixed resolution τ for the representation of real numbers; two real numbers are indistinguishable if their difference is smaller than τ.
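To make the counting concrete, the sketch below enumerates joint distributions over two binary variables at a fixed resolution and counts how many lie within a given divergence of their induced product distribution. It rests on an assumption not stated in the text: resolution τ is modelled by restricting every probability to a multiple of τ = 1/k for an integer k. The function names are hypothetical, and the counting argument in the text may discretize differently.

```python
import math
from itertools import product

def kl_to_product(p11, p10, p01, p00):
    """D(P || P_m) in nats for a 2x2 joint; P_m is built from P's marginals."""
    pa, pb = p11 + p10, p11 + p01
    cells = [(p11, pa * pb), (p10, pa * (1 - pb)),
             (p01, (1 - pa) * pb), (p00, (1 - pa) * (1 - pb))]
    # Cells with p == 0 contribute nothing to the divergence.
    return sum(p * math.log(p / q) for p, q in cells if p > 0)

def grid_joints(k):
    """All 2x2 joints whose entries are multiples of tau = 1/k (assumption)."""
    for a, b, c in product(range(k + 1), repeat=3):
        d = k - a - b - c
        if d >= 0:
            yield a / k, b / k, c / k, d / k

def count_within(k, eps):
    """Number of grid joints with D(P || P_m) <= eps."""
    return sum(1 for p in grid_joints(k) if kl_to_product(*p) <= eps)

# Usage: the count grows with eps until it covers the whole grid.
total = sum(1 for _ in grid_joints(10))
counts = [count_within(10, eps) for eps in (0.01, 0.1, 10.0)]
```

With k = 10 the grid holds C(13, 3) = 286 joint distributions, and a large enough ε includes all of them, because the divergence to the induced product distribution is bounded by the sum of the marginal entropies.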
(2.35)

Consider a joint probability distribution $P \in \mathcal{P}$ over $\mathcal{X} \times \mathcal{X}$ with marginals $p_a = P(X_1 = 1)$ and $p_b = P(X_2 = 1)$. Define $\delta = P_{11} - p_a p_b$. By simple algebra we get
P = [P11 | P10 | P01 | P00 |