[Figure 1 annotations: "Call these mismodelling errors bias"; "model noise and bias"]

ECE 830 Fall 2011 Statistical Signal Processing instructor: R. Nowak Lecture 1: Elements of Statistical Signal Processing

Figure 1: Statistical Signal Processing: Measurement, modeling and inference.


How do we model noise and bias, these uncertain errors? We need a calculus for uncertainty, and among many that have been proposed and used, the probabilistic framework appears to be the most successful, and in many situations it is physically plausible as well.

2 A Detection Example

H0 : x_i = s_i + ϵ_i , i = 1, . . . , n
H1 : x_i = −s_i + ϵ_i , i = 1, . . . , n

How well does {s_i} match {x_i}? How well does {−s_i} match {x_i}? This comparison can be made by computing a function of the data. Functions of data are called statistics. A natural statistic in this problem [...]

[...] To quantify the performance of this test we need a mathematical model for the errors {ϵ_i}.
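As an illustration of how a statistic and an error model fit together, here is a minimal sketch. It assumes a correlation statistic t = Σ x_i s_i with a threshold at zero, and independent Gaussian errors; both are assumptions made for this example rather than choices stated in the text, and the names `correlation_statistic`, `decide`, and `simulate_error_rate` are hypothetical.

```python
import numpy as np

def correlation_statistic(x, s):
    """Assumed statistic: inner product of the data with the known signal."""
    return float(np.dot(x, s))

def decide(x, s):
    """Return 0 if {s_i} matches the data better, 1 if {-s_i} does."""
    return 0 if correlation_statistic(x, s) >= 0 else 1

def simulate_error_rate(s, sigma, trials, rng):
    """Estimate the error rate of the test under assumed Gaussian errors."""
    errors = 0
    for _ in range(trials):
        truth = int(rng.integers(0, 2))           # pick a hypothesis at random
        sign = 1.0 if truth == 0 else -1.0
        x = sign * s + sigma * rng.standard_normal(s.shape)
        errors += decide(x, s) != truth
    return errors / trials

rng = np.random.default_rng(0)
s = np.sin(2 * np.pi * 0.05 * np.arange(50))      # an arbitrary known signal
error_rate = simulate_error_rate(s, sigma=1.0, trials=2000, rng=rng)
print(f"empirical error rate: {error_rate:.3f}")
```

Changing `sigma` or the error distribution changes the error rate, which is why the performance of the test cannot be stated without a model for the errors.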

2.1 A Parameter Estimation Example

x_k = A sin(ωk + φ) + w_k , k = 1, . . . , n

In this case the parameter vector is θ = [A, ω, φ]. Specifying a probability distribution for the noise terms {w_k} (say Gaussian) would yield a probability distribution p(x|θ) for our data. Given the data x = [x_1, x_2, . . . , x_n], how would you estimate the values of the parameters?
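One possible answer, sketched here under the assumption that the noise is independent, zero-mean Gaussian with a common variance: maximizing p(x|θ) is then equivalent to minimizing the sum of squared residuals. Since A sin(ωk + φ) = a sin(ωk) + b cos(ωk) with a = A cos φ and b = A sin φ, the fit is linear in (a, b) for each fixed ω, so a grid search over ω is enough. The function name `fit_sinusoid`, the grid, and the simulated values are all hypothetical choices for illustration.

```python
import numpy as np

def fit_sinusoid(x, omegas):
    """Least-squares fit of A sin(omega k + phi) over a grid of frequencies."""
    k = np.arange(1, len(x) + 1)
    best = None
    for omega in omegas:
        # For fixed omega the model is linear in a = A cos(phi), b = A sin(phi).
        basis = np.column_stack([np.sin(omega * k), np.cos(omega * k)])
        coef, *_ = np.linalg.lstsq(basis, x, rcond=None)
        rss = float(np.sum((x - basis @ coef) ** 2))
        if best is None or rss < best[0]:
            best = (rss, omega, coef)
    _, omega, (a, b) = best
    return float(np.hypot(a, b)), float(omega), float(np.arctan2(b, a))

# Simulated data with made-up parameter values and Gaussian noise.
rng = np.random.default_rng(1)
n = 200
A_true, omega_true, phi_true = 2.0, 0.3, 0.7
k = np.arange(1, n + 1)
x = A_true * np.sin(omega_true * k + phi_true) + 0.5 * rng.standard_normal(n)

omega_grid = np.linspace(0.01, np.pi - 0.01, 2000)
A_hat, omega_hat, phi_hat = fit_sinusoid(x, omega_grid)
print(f"A = {A_hat:.2f}, omega = {omega_hat:.3f}, phi = {phi_hat:.2f}")
```

The accuracy of the frequency estimate is limited by the grid spacing; a finer grid or a local refinement around the best grid point would reduce that error.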

2.2 A Signal Estimation Example

Now imagine you are working with geneticists to develop a diagnostic tool to predict whether patients have a certain disease. The tool is to be based on genomic data from the patient. For example, suppose that a microarray experiment is used to measure the levels of gene expression in the patient. For each of m genes we have an expression level (which reflects the amount of protein that gene is producing). Let x denote an m×1 vector of the expression levels and let y denote a binary variable indicating whether or not the patient has the disease.

The goal is to design a tool to predict the value of y from x. If we had a joint probability model p(x, y), then we could use it to design the predictor. But suppose that instead of a model we have only training data in the form of pairs {(x_i, y_i)}_{i=1}^n from a set of randomly selected human subjects. Our tool will be a function that combines a new test vector x and the training data {(x_i, y_i)}_{i=1}^n in order to predict the value of y corresponding to x. A simple approach to this problem is to find the x_i that is “closest” in some sense