Exact Sampling distributions

11.            EXACT SAMPLING DISTRIBUTIONS (CHI SQUARE DISTRIBUTION)

 

11.1          CHI-SQUARE VARIATE:

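The square of a standard normal variate is known as a chi-square variate with 1 degree of freedom.

Thus if $X \sim N(\mu, \sigma^2)$, then

$$Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$$

and $Z^2 = \left(\frac{X - \mu}{\sigma}\right)^2$ is a chi-square variate with 1 degree of freedom.

In general, if $X_i$ $(i = 1, 2, \ldots, n)$ are n independent normal variates with mean $\mu_i$ and variance $\sigma_i^2$ $(i = 1, 2, \ldots, n)$, then

$$\chi^2 = \sum_{i=1}^{n} \left(\frac{X_i - \mu_i}{\sigma_i}\right)^2$$

is a chi-square variate with n degrees of freedom.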
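The definition can be checked by simulation. The sketch below is only an illustration: the means, standard deviations, seed and sample size are made-up values. It standardizes independent normal draws, squares them and adds them up; the sample mean and variance of the result should then land near n and 2n, the values derived in 11.3.1.

```python
import random
import statistics

def chi_square_draw(mus, sigmas, rng):
    """One value of sum(((X_i - mu_i) / sigma_i) ** 2) for independent normal X_i."""
    total = 0.0
    for mu, sigma in zip(mus, sigmas):
        x = rng.gauss(mu, sigma)          # X_i ~ N(mu_i, sigma_i^2)
        total += ((x - mu) / sigma) ** 2  # square of the standardized variate
    return total

rng = random.Random(0)            # fixed seed so the run is repeatable
mus = [1.0, -2.0, 0.5, 3.0]       # illustrative means (n = 4)
sigmas = [1.0, 2.0, 0.5, 4.0]     # illustrative standard deviations
draws = [chi_square_draw(mus, sigmas, rng) for _ in range(20000)]

sample_mean = statistics.fmean(draws)
sample_var = statistics.variance(draws)
print(sample_mean, sample_var)    # expected to be close to 4 and 8
```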

 

11.2          DERIVATION OF THE CHI-SQUARE DISTRIBUTION:

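First method: method of moment generating function

If $X_i$ $(i = 1, 2, \ldots, n)$ are independent $N(\mu_i, \sigma_i^2)$, we want the distribution of

$$\chi^2 = \sum_{i=1}^{n} \left(\frac{X_i - \mu_i}{\sigma_i}\right)^2 = \sum_{i=1}^{n} U_i^2, \quad \text{where } U_i = \frac{X_i - \mu_i}{\sigma_i}$$

Since the $X_i$'s are independent, the $U_i$'s are also independent. Therefore

$$M_{\chi^2}(t) = M_{\sum U_i^2}(t) = \prod_{i=1}^{n} M_{U_i^2}(t) = \left[M_{U_i^2}(t)\right]^n$$

since the $U_i$'s are identically distributed as $N(0, 1)$.

Now

$$M_{U_i^2}(t) = E\left[e^{tU_i^2}\right] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{tu^2} e^{-u^2/2} \, du = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-(1 - 2t)u^2/2} \, du = (1 - 2t)^{-1/2}, \quad t < \tfrac{1}{2}$$

so that

$$M_{\chi^2}(t) = (1 - 2t)^{-n/2}$$

which is the moment generating function of a Gamma variate with parameters $\tfrac{1}{2}$ and $\tfrac{n}{2}$.

Hence, by the uniqueness theorem of moment generating functions,

$$\chi^2 = \sum_{i=1}^{n} \left(\frac{X_i - \mu_i}{\sigma_i}\right)^2$$

is a Gamma variate with parameters $\tfrac{1}{2}$ and $\tfrac{n}{2}$, so that

$$f(\chi^2) = \frac{1}{2^{n/2} \, \Gamma(n/2)} \, e^{-\chi^2/2} \, (\chi^2)^{(n/2) - 1}, \quad 0 \le \chi^2 < \infty$$

which is the required probability density function of the chi-square distribution with n degrees of freedom.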
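As a sanity check on the density, the following sketch evaluates the standard chi-square density $f(x) = e^{-x/2} x^{n/2 - 1} / (2^{n/2}\Gamma(n/2))$ and integrates it numerically. The midpoint rule, the upper limit of 100 and the choice n = 5 are assumptions made only for this illustration.

```python
import math

def chi2_pdf(x, n):
    """Chi-square density with n degrees of freedom, for x > 0."""
    return math.exp(-x / 2) * x ** (n / 2 - 1) / (2 ** (n / 2) * math.gamma(n / 2))

def midpoint_integral(f, a, b, steps=100000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

n = 5
# The tail beyond x = 100 is negligible for n = 5, so [0, 100] stands in for [0, infinity).
area = midpoint_integral(lambda x: chi2_pdf(x, n), 0.0, 100.0)
mean = midpoint_integral(lambda x: x * chi2_pdf(x, n), 0.0, 100.0)
print(area, mean)  # area should be close to 1, mean close to n
```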

 

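11.3          MOMENT GENERATING FUNCTION OF $\chi^2$ DISTRIBUTION:

Let $X \sim \chi^2_{(n)}$. Then

$$M_X(t) = E\left(e^{tX}\right) = \int_0^{\infty} e^{tx} f(x) \, dx = \frac{1}{2^{n/2} \, \Gamma(n/2)} \int_0^{\infty} e^{-(1 - 2t)x/2} \, x^{(n/2) - 1} \, dx = (1 - 2t)^{-n/2}, \quad |2t| < 1$$

which is the required moment generating function of a $\chi^2$ variate with n degrees of freedom.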

 

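11.3.1       Cumulant generating function of $\chi^2$ distribution:

If $X \sim \chi^2_{(n)}$, then $M_X(t) = (1 - 2t)^{-n/2}$, $|2t| < 1$, so that

$$K_X(t) = \log M_X(t) = -\frac{n}{2} \log(1 - 2t) = \frac{n}{2} \left[2t + \frac{(2t)^2}{2} + \frac{(2t)^3}{3} + \frac{(2t)^4}{4} + \cdots\right]$$

$k_1$ = coefficient of $t$ in $K(t)$ = $n$

$k_2$ = coefficient of $\frac{t^2}{2!}$ in $K(t)$ = $2n$

$k_3$ = coefficient of $\frac{t^3}{3!}$ in $K(t)$ = $8n$

$k_4$ = coefficient of $\frac{t^4}{4!}$ in $K(t)$ = $48n$

In general,

$k_r$ = coefficient of $\frac{t^r}{r!}$ in $K(t)$ = $n \, 2^{r-1} (r - 1)!$

Hence

Mean = $k_1 = n$

Variance = $\mu_2 = k_2 = 2n$

$\mu_3 = k_3 = 8n$

$\mu_4 = k_4 + 3k_2^2 = 48n + 12n^2$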
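The general cumulant formula $k_r = n \, 2^{r-1}(r-1)!$ is easy to tabulate. The helper names below are hypothetical; the sketch simply evaluates the formula and converts the first four cumulants into the mean and central moments listed above.

```python
import math

def chi2_cumulant(r, n):
    """r-th cumulant of a chi-square variate with n d.f.: n * 2**(r-1) * (r-1)!"""
    return n * 2 ** (r - 1) * math.factorial(r - 1)

def chi2_moments(n):
    """Mean and central moments mu2, mu3, mu4 obtained from the first four cumulants."""
    k1, k2, k3, k4 = (chi2_cumulant(r, n) for r in range(1, 5))
    return {"mean": k1, "mu2": k2, "mu3": k3, "mu4": k4 + 3 * k2 ** 2}

moments = chi2_moments(5)
print(moments)  # {'mean': 5, 'mu2': 10, 'mu3': 40, 'mu4': 540}
```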

 

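11.3.2       Limiting form of $\chi^2$ distribution for large degrees of freedom:

If $X \sim \chi^2_{(n)}$, then

$$M_X(t) = (1 - 2t)^{-n/2}, \quad |2t| < 1$$

The moment generating function of the standard variate $Z = \frac{X - n}{\sqrt{2n}}$ is given by

$$M_Z(t) = e^{-nt/\sqrt{2n}} \, M_X\left(\frac{t}{\sqrt{2n}}\right) = e^{-t\sqrt{n/2}} \left(1 - t\sqrt{\frac{2}{n}}\right)^{-n/2}$$

or

$$K_Z(t) = \log M_Z(t) = -t\sqrt{\frac{n}{2}} - \frac{n}{2} \log\left(1 - t\sqrt{\frac{2}{n}}\right) = -t\sqrt{\frac{n}{2}} + \frac{n}{2}\left[t\sqrt{\frac{2}{n}} + \frac{t^2}{2} \cdot \frac{2}{n} + \frac{t^3}{3}\left(\frac{2}{n}\right)^{3/2} + \cdots\right] = \frac{t^2}{2} + O(n^{-1/2})$$

where $O(n^{-1/2})$ are terms containing $n^{1/2}$ and higher powers of n in the denominator. Hence

$$\lim_{n \to \infty} K_Z(t) = \frac{t^2}{2}, \quad \text{that is,} \quad \lim_{n \to \infty} M_Z(t) = e^{t^2/2}$$

which is the moment generating function of a standard normal variate. Hence, by the uniqueness theorem of moment generating functions, Z is asymptotically normal. In other words, the standard $\chi^2$ variate tends to the standard normal variate as n → ∞. Thus, the $\chi^2$ distribution tends to the normal distribution for large degrees of freedom.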

In practice, for n ≥ 30 the normal approximation to the $\chi^2$ distribution is fairly good. So whenever n ≥ 30, we use the normal probability tables for testing the significance of the value of $\chi^2$. That is why, in the tables given, the significant values of $\chi^2$ have been tabulated only up to n = 30.
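To see how close the normal approximation is at n = 30, the sketch below compares an exact chi-square probability with the value obtained by treating $(X - n)/\sqrt{2n}$ as standard normal. The chi-square distribution function is computed here from the power series of the regularized incomplete gamma function; that routine, its name and the number of series terms are choices made for this illustration, not something taken from the text.

```python
import math

def chi2_cdf(x, n, terms=500):
    """P(X <= x) for a chi-square variate with n d.f., using the series
    gamma(a, z) = z**a * exp(-z) * sum_k z**k / (a (a+1) ... (a+k)), with a = n/2, z = x/2."""
    if x <= 0:
        return 0.0
    a, z = n / 2, x / 2
    term = 1.0 / a
    total = term
    for k in range(1, terms):
        term *= z / (a + k)
        total += term
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))

def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n, x = 30, 43.773                      # 43.773 is the tabulated 5% point for 30 d.f.
exact = chi2_cdf(x, n)
approx = normal_cdf((x - n) / math.sqrt(2 * n))
print(exact, approx)                   # exact is about 0.95; approx is slightly larger
```

The remaining gap between the two printed values gives a feel for what "fairly good" means at n = 30.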

 

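11.3.3       Characteristic function of $\chi^2$ distribution:

If $X \sim \chi^2_{(n)}$, then

$$\varphi_X(t) = E\left(e^{itX}\right) = \int_0^{\infty} e^{itx} f(x) \, dx = (1 - 2it)^{-n/2}$$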

 

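11.3.4       Mode and skewness of $\chi^2$ distribution:

If $X \sim \chi^2_{(n)}$, then

$$f(x) = \frac{1}{2^{n/2} \, \Gamma(n/2)} \, e^{-x/2} \, x^{(n/2) - 1}, \quad 0 \le x < \infty \qquad (1)$$

The mode of the distribution is the solution of

$f'(x) = 0$ and $f''(x) < 0$

Logarithmic differentiation with respect to x in (1) gives:

$$\frac{f'(x)}{f(x)} = -\frac{1}{2} + \left(\frac{n}{2} - 1\right)\frac{1}{x} = \frac{n - 2 - x}{2x}$$

Since $f(x) \ne 0$, $f'(x) = 0$ implies $x = n - 2$.

It can be easily seen that at the point x = (n-2), $f''(x) < 0$.

Hence the mode of the chi-square distribution with n degrees of freedom (n > 2) is (n-2).

Also, Karl Pearson's coefficient of skewness is given by

$$\text{Skewness} = \frac{\text{mean} - \text{mode}}{\text{standard deviation}} = \frac{n - (n - 2)}{\sqrt{2n}} = \sqrt{\frac{2}{n}}$$

Since Pearson's coefficient of skewness is greater than zero for n ≥ 1, the $\chi^2$ distribution is positively skewed. Further, since the skewness is inversely proportional to the square root of the degrees of freedom, the distribution rapidly tends to symmetry as the degrees of freedom increase, and consequently, as n → ∞, the chi-square distribution tends to the normal distribution.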
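The mode and the skewness coefficient can be checked numerically. The sketch below assumes the standard chi-square density and locates its maximum on a grid (step 0.01, an arbitrary choice) for n = 6, then evaluates Pearson's coefficient (mean − mode)/standard deviation with mean n, mode n − 2 and variance 2n.

```python
import math

def chi2_pdf(x, n):
    """Chi-square density with n degrees of freedom, for x > 0."""
    return math.exp(-x / 2) * x ** (n / 2 - 1) / (2 ** (n / 2) * math.gamma(n / 2))

def pearson_skewness(n):
    """(mean - mode) / standard deviation, using mean n, mode n - 2 and variance 2n."""
    return (n - (n - 2)) / math.sqrt(2 * n)

n = 6
grid = [i * 0.01 for i in range(1, 2001)]            # x from 0.01 to 20.00
grid_mode = max(grid, key=lambda x: chi2_pdf(x, n))   # grid point with the largest density
print(grid_mode)                                      # close to n - 2 = 4

for dof in (2, 8, 32, 128):
    print(dof, pearson_skewness(dof))                 # shrinks like sqrt(2 / n)
```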

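11.3.5       Additive property of $\chi^2$ variates:

The sum of independent chi-square variates is also a $\chi^2$ variate. More precisely, if $X_i$ $(i = 1, 2, \ldots, k)$ are independent $\chi^2$ variates with $n_i$ degrees of freedom respectively, then the sum $\sum_{i=1}^{k} X_i$ is also a chi-square variate with $\sum_{i=1}^{k} n_i$ degrees of freedom.

Proof:

We have

$$M_{X_i}(t) = (1 - 2t)^{-n_i/2}, \quad i = 1, 2, \ldots, k$$

The moment generating function of the sum $\sum_{i=1}^{k} X_i$ is given by

$$M_{\sum X_i}(t) = \prod_{i=1}^{k} M_{X_i}(t) = (1 - 2t)^{-n_1/2} (1 - 2t)^{-n_2/2} \cdots (1 - 2t)^{-n_k/2} = (1 - 2t)^{-(n_1 + n_2 + \cdots + n_k)/2}$$

since the $X_i$'s are independent. This is the moment generating function of a $\chi^2$ variate with $(n_1 + n_2 + \cdots + n_k)$ degrees of freedom. Hence, by the uniqueness theorem of moment generating functions, $\sum_{i=1}^{k} X_i$ is a $\chi^2$ variate with $\sum_{i=1}^{k} n_i$ degrees of freedom.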
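The key step of the proof is that the product of the individual moment generating functions $(1 - 2t)^{-n_i/2}$ collapses into one of the same form with the degrees of freedom added. A small numerical check, with arbitrarily chosen degrees of freedom and an arbitrary t < 1/2, might look like this:

```python
import math

def chi2_mgf(t, n):
    """Moment generating function (1 - 2t)**(-n/2) of a chi-square variate, valid for t < 1/2."""
    if t >= 0.5:
        raise ValueError("the m.g.f. exists only for t < 1/2")
    return (1 - 2 * t) ** (-n / 2)

dofs = [2, 3, 7]   # illustrative degrees of freedom n_1, n_2, n_3
t = 0.1            # any value below 1/2 would do

product_of_mgfs = math.prod(chi2_mgf(t, n) for n in dofs)
mgf_of_sum = chi2_mgf(t, sum(dofs))
print(product_of_mgfs, mgf_of_sum)  # the two values agree up to rounding
```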

 

11.4          CHI-SQUARE PROBABILITY CURVE:

We get from 11.3.4

$$f'(x) = \frac{n - 2 - x}{2x} \, f(x) \qquad (*)$$

Since x > 0 and f(x), being the probability density function, is always non-negative, we get from (*)

$f'(x) < 0$ if (n-2) ≤ 0,

for all values of x. Thus the probability curve for 1 and 2 degrees of freedom is monotonically decreasing. When n > 2,

$f'(x) > 0$ if x < (n-2)

$f'(x) = 0$ if x = (n-2)

$f'(x) < 0$ if x > (n-2)

This implies that for n > 2, f(x) is monotonically increasing for 0 < x < (n-2) and monotonically decreasing for (n-2) < x < ∞, while at x = n-2 it attains its maximum value.

For n ≥ 1, as x increases, f(x) decreases rapidly and finally tends to zero as x → ∞. Thus for n > 1, the probability curve is positively skewed towards higher values of x. Moreover, the x-axis is an asymptote to the curve.

 

11.5          CONDITIONS FOR THE VALIDITY OF CHI-SQUARE TEST:

The chi-square test is an approximate test for large values of n. For the validity of the chi-square test of 'goodness of fit' between theory and experiment, the following conditions must be satisfied:

(i)                   The sample observations should be independent.

(ii)                 Constraints on the cell frequencies, if any, should be linear, e.g., ∑ni = ∑λi or ∑Oi = ∑Ei.

(iii)                N, the total frequency, should be reasonably large, say, greater than 50.

(iv)               No theoretical cell frequency should be less than 5. The chi-square distribution is essentially a continuous distribution, but it cannot maintain its character of continuity if a cell frequency is less than 5. If any theoretical cell frequency is less than 5, then for the application of the chi-square test it is pooled with the preceding or succeeding frequency so that the pooled frequency is more than 5, and the degrees of freedom lost in pooling are adjusted for.
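Condition (iv) can be sketched in code. The routine below is one possible pooling rule, not a prescribed one: it walks through the cells from left to right, merges each cell into a running group until the group's expected frequency reaches the minimum (taken here as 5), and folds any leftover cells at the end into the last group. The function name and the data are made up for the illustration.

```python
def pool_small_cells(observed, expected, minimum=5.0):
    """Merge adjacent cells until every pooled expected frequency is at least `minimum`."""
    pooled_obs, pooled_exp = [], []
    acc_o = acc_e = 0.0
    for o, e in zip(observed, expected):
        acc_o += o
        acc_e += e
        if acc_e >= minimum:              # group is large enough: close it
            pooled_obs.append(acc_o)
            pooled_exp.append(acc_e)
            acc_o = acc_e = 0.0
    if acc_o > 0 or acc_e > 0:            # leftover cells at the end
        if pooled_exp:                    # pool them with the preceding group
            pooled_obs[-1] += acc_o
            pooled_exp[-1] += acc_e
        else:                             # nothing to pool with: keep as a single cell
            pooled_obs.append(acc_o)
            pooled_exp.append(acc_e)
    return pooled_obs, pooled_exp

observed = [2, 8, 20, 15, 4, 1]
expected = [3.0, 9.0, 18.0, 14.0, 4.5, 1.5]
pooled_obs, pooled_exp = pool_small_cells(observed, expected)
print(pooled_obs, pooled_exp)  # [10.0, 20.0, 15.0, 5.0] [12.0, 18.0, 14.0, 6.0]
```

Here six cells become four, so the degrees of freedom for the test are counted from the four pooled cells rather than the original six.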

 

 

11.6 LINEAR TRANSFORMATION:

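Let us suppose that the given set of variables $X' = (x_1, x_2, \ldots, x_n)$ is transformed to a new set of variables $Y' = (y_1, y_2, \ldots, y_n)$ by means of the linear transformation:

$$y_1 = a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n$$

$$y_2 = a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n$$

$$\vdots$$

$$y_n = a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n$$

That is,

$$y_i = \sum_{j=1}^{n} a_{ij}x_j, \quad i = 1, 2, \ldots, n$$

In matrix notation, this system of linear equations can be expressed symbolically as

$Y = AX$

where $Y = (y_1, y_2, \ldots, y_n)'$ and $X = (x_1, x_2, \ldots, x_n)'$ are $n \times 1$ column vectors and $A = (a_{ij})$ is the $n \times n$ matrix of coefficients.

From matrix theory, we know that the system has a unique solution iff $|A| \ne 0$. In other words, we can express X uniquely in terms of Y if A is non-singular, and the solution is given by

$X = A^{-1}Y$

where $A^{-1}$ is the inverse of the square matrix A.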

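The linear transformation defined above is said to be orthogonal if

$$Y'Y = X'X, \quad \text{that is,} \quad X'A'AX = X'X \text{ for every } X, \quad \text{so that} \quad A'A = I_n$$

that is, if A is an orthogonal matrix.

More elaborately,

$$\sum_{i=1}^{n} y_i^2 = \sum_{i=1}^{n} x_i^2 \qquad (**)$$

for every set of variables $x_1, x_2, \ldots, x_n$.

If we write

$$y_i = \sum_{j=1}^{n} a_{ij}x_j, \quad i = 1, 2, \ldots, n$$

then (**) implies that

$$\sum_{i=1}^{n} a_{ij}a_{ik} = \delta_{jk}$$

where $\delta_{jk}$ is the Kronecker delta, so that

$$\delta_{jk} = \begin{cases} 1, & j = k \\ 0, & j \ne k \end{cases}$$

Hence it follows that A is an orthogonal matrix.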
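A plane rotation is a convenient example of an orthogonal transformation. The sketch below (the angle and the vector are arbitrary) checks the column condition $\sum_i a_{ij}a_{ik} = \delta_{jk}$ and confirms that the sum of squares is unchanged by Y = AX.

```python
import math

def mat_vec(a, x):
    """Compute y = A x for a square matrix stored as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

def is_orthogonal(a, tol=1e-12):
    """Check sum_i a[i][j] * a[i][k] == delta_jk for every pair of columns j, k."""
    n = len(a)
    for j in range(n):
        for k in range(n):
            dot = sum(a[i][j] * a[i][k] for i in range(n))
            if abs(dot - (1.0 if j == k else 0.0)) > tol:
                return False
    return True

theta = 0.7  # arbitrary rotation angle in radians
A = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]
x = [3.0, -4.0]
y = mat_vec(A, x)

sum_sq_x = sum(v * v for v in x)
sum_sq_y = sum(v * v for v in y)
print(is_orthogonal(A), sum_sq_x, sum_sq_y)  # the two sums of squares agree up to rounding
```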

 

11.7 APPLICATIONS OF CHI-SQUARE DISTRIBUTION:

Chi square distribution has a large number of applications in statistics, some of which are enumerated below:

(i)                   To test whether the hypothetical value of the population variance is $\sigma^2 = \sigma_0^2$

(ii)                 To test the goodness of fit

(iii)                To test the independence of attributes

(iv)               To test the homogeneity of independent estimates of the population variance.

(v)                 To combine various probabilities obtained from independent experiments to give a single test of significance.

(vi)               To test the homogeneity of independent estimates of the population correlation coefficient.

 

11.7.1 Chi-square test for population variance:

Suppose we want to test whether a random sample $x_i$ $(i = 1, 2, \ldots, n)$ has been drawn from a normal population with a specified variance $\sigma^2 = \sigma_0^2$.

Under the null hypothesis that the population variance is $\sigma^2 = \sigma_0^2$, the statistic

$$\chi^2 = \sum_{i=1}^{n} \frac{(x_i - \bar{x})^2}{\sigma_0^2}$$

follows the chi-square distribution with (n-1) degrees of freedom.

By comparing the calculated value with the tabulated value of $\chi^2$ for (n-1) degrees of freedom at a certain level of significance, we may retain or reject the null hypothesis.
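The test statistic is the sum of squared deviations from the sample mean divided by the hypothesized variance. A minimal sketch, with made-up data and a made-up value of the hypothesized variance, could be:

```python
def variance_chi_square(sample, sigma0_sq):
    """Return (statistic, degrees of freedom) for testing sigma^2 = sigma0_sq."""
    n = len(sample)
    mean = sum(sample) / n
    sum_sq_dev = sum((x - mean) ** 2 for x in sample)  # sum of (x_i - x_bar)^2
    return sum_sq_dev / sigma0_sq, n - 1

sample = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative observations
stat, dof = variance_chi_square(sample, sigma0_sq=4.0)
print(stat, dof)                    # 8.0 7
```

The value of `stat` would then be compared with the tabulated $\chi^2$ value for `dof` degrees of freedom at the chosen level of significance.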

 

11.7.2 Chi-square test of goodness of fit:

A very powerful test for testing the significance of the discrepancy between theory and experiment was given by Prof. Karl Pearson in 1900 and is known as the "Chi-Square test of goodness of fit". It enables us to find out whether the deviation of the experiment from theory is just by chance or is really due to the inadequacy of the theory to fit the observed data.

If $O_i$ $(i = 1, 2, \ldots, n)$ is a set of observed frequencies and $E_i$ $(i = 1, 2, \ldots, n)$ is the corresponding set of expected frequencies, then Karl Pearson's chi-square, given by

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}, \quad \sum_{i=1}^{n} O_i = \sum_{i=1}^{n} E_i$$

follows the chi-square distribution with (n-1) degrees of freedom.
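Pearson's statistic, $\sum (O_i - E_i)^2 / E_i$, takes only a few lines to compute. The example data are hypothetical: 60 throws of a die, tested against the theory that all six faces are equally likely.

```python
def pearson_chi_square(observed, expected):
    """Return (statistic, degrees of freedom) for Pearson's goodness-of-fit test."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of cells")
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1

observed = [8, 12, 9, 11, 14, 6]   # hypothetical counts of faces 1 to 6
expected = [10.0] * 6              # 60 throws of a fair die
stat, dof = pearson_chi_square(observed, expected)
print(stat, dof)                   # about 4.2 with 5 degrees of freedom
```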

 

11.8 YATES CORRECTION:

In a 2 × 2 contingency table, the number of degrees of freedom is (2-1)(2-1) = 1. If any one of the theoretical cell frequencies is less than 5, then the use of the pooling method for the chi-square test results in a chi-square with zero degrees of freedom, which is meaningless. In this case we apply a correction due to F. Yates, which is usually known as "Yates' correction for continuity". This consists in adding 0.5 to the cell frequency which is less than 5 and then adjusting the remaining cell frequencies accordingly. The chi-square test of goodness of fit is then applied without the pooling method.

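For a 2 × 2 contingency table,

a    b
c    d

we have, with N = a + b + c + d,

$$\chi^2 = \frac{N(ad - bc)^2}{(a + c)(b + d)(a + b)(c + d)}$$

According to Yates' correction, as explained above, we subtract (1/2) from a and d and add (1/2) to b and c so that the marginal totals are not disturbed at all. Thus, the corrected value of $\chi^2$ is given as

$$\chi^2 = \frac{N\left[\left(a - \tfrac{1}{2}\right)\left(d - \tfrac{1}{2}\right) - \left(b + \tfrac{1}{2}\right)\left(c + \tfrac{1}{2}\right)\right]^2}{(a + c)(b + d)(a + b)(c + d)}$$

Numerator = $N\left[(ad - bc) - \tfrac{1}{2}(a + b + c + d)\right]^2 = N\left[(ad - bc) - \tfrac{N}{2}\right]^2$

Hence, writing the difference in absolute value so that the correction always reduces it,

$$\chi^2 = \frac{N\left(|ad - bc| - \tfrac{N}{2}\right)^2}{(a + c)(b + d)(a + b)(c + d)}$$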
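A short sketch of the 2 × 2 computation, with and without the continuity correction, is given below. It uses the usual closed form $N(|ad - bc| - N/2)^2 / [(a+b)(c+d)(a+c)(b+d)]$; flooring the corrected difference at zero is an extra safeguard assumed here, and the cell counts are invented.

```python
def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square for the 2 x 2 table [[a, b], [c, d]], optionally with Yates' correction."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Continuity correction: shrink |ad - bc| by N/2 (not below zero; an assumption here).
        diff = max(diff - n / 2, 0.0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

a, b, c, d = 10, 4, 3, 13          # illustrative cell frequencies
uncorrected = chi_square_2x2(a, b, c, d, yates=False)
corrected = chi_square_2x2(a, b, c, d, yates=True)
print(uncorrected, corrected)      # the corrected value is the smaller of the two
```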

 

11.9 BRANDT AND SNEDECOR FORMULA FOR 2 × k CONTINGENCY TABLE:

Let the observations $a_{ij}$ $(i = 1, 2;\ j = 1, 2, \ldots, k)$ be arranged in a 2 × k contingency table as follows:

A        A1     A2     ...    Ai     ...    Ak     Total
B1       a11    a12    ...    a1i    ...    a1k    m1
B2       a21    a22    ...    a2i    ...    a2k    m2
Total    n1     n2     ...    ni     ...    nk     N

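Under the hypothesis of independence of attributes, we have

$$\chi^2 = \sum_{i=1}^{k} \left[\frac{\{a_{1i} - E(a_{1i})\}^2}{E(a_{1i})} + \frac{\{a_{2i} - E(a_{2i})\}^2}{E(a_{2i})}\right]$$

with (2-1)(k-1) = (k-1) degrees of freedom, where

$$E(a_{1i}) = \frac{m_1 n_i}{N}$$

and

$$E(a_{2i}) = \frac{m_2 n_i}{N}$$

But $a_{1i} + a_{2i} = n_i$ and $m_1 + m_2 = N$, so that $a_{2i} - E(a_{2i}) = -\{a_{1i} - E(a_{1i})\}$. Hence

$$\chi^2 = \sum_{i=1}^{k} \left(a_{1i} - \frac{m_1 n_i}{N}\right)^2 \left[\frac{N}{m_1 n_i} + \frac{N}{m_2 n_i}\right] = \frac{N^2}{m_1 m_2} \sum_{i=1}^{k} \frac{1}{n_i}\left(a_{1i} - \frac{m_1 n_i}{N}\right)^2 = \frac{N^2}{m_1 m_2} \left[\sum_{i=1}^{k} \frac{a_{1i}^2}{n_i} - \frac{m_1^2}{N}\right]$$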
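The Brandt and Snedecor shortcut, $\chi^2 = \frac{N^2}{m_1 m_2}\left[\sum_i a_{1i}^2/n_i - m_1^2/N\right]$, should agree with the cell-by-cell Pearson computation on the same 2 × k table. The sketch below checks this on invented frequencies; the function names are hypothetical.

```python
def brandt_snedecor(row1, row2):
    """Chi-square for a 2 x k table via the Brandt-Snedecor shortcut; returns (stat, d.f.)."""
    col_totals = [u + v for u, v in zip(row1, row2)]   # n_i
    m1, m2 = sum(row1), sum(row2)
    total = m1 + m2                                    # N
    bracket = sum(u * u / n_i for u, n_i in zip(row1, col_totals)) - m1 ** 2 / total
    return total ** 2 / (m1 * m2) * bracket, len(row1) - 1

def pearson_2xk(row1, row2):
    """Same statistic computed cell by cell as sum of (observed - expected)^2 / expected."""
    m1, m2 = sum(row1), sum(row2)
    total = m1 + m2
    stat = 0.0
    for u, v in zip(row1, row2):
        n_i = u + v
        e1, e2 = m1 * n_i / total, m2 * n_i / total    # expected under independence
        stat += (u - e1) ** 2 / e1 + (v - e2) ** 2 / e2
    return stat

row1 = [12, 20, 18]   # illustrative a_1i
row2 = [8, 30, 12]    # illustrative a_2i
shortcut, dof = brandt_snedecor(row1, row2)
direct = pearson_2xk(row1, row2)
print(shortcut, direct, dof)   # both statistics are about 4.0, with 2 degrees of freedom
```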

 

11.10 BARTLETT’S TEST FOR HOMOGENEITY OF SEVERAL INDEPENDENT ESTIMATES OF THE SAME POPULATION VARIANCE:

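Let

$$s_i^2 = \frac{1}{n_i - 1} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2, \quad i = 1, 2, \ldots, k$$

be the unbiased estimate of the population variance, obtained from the ith sample $X_{ij}$ $(j = 1, 2, \ldots, n_i)$ and based on $\nu_i = (n_i - 1)$ degrees of freedom, all the k samples being independent.

Under the null hypothesis that the samples come from the same population with variance $\sigma^2$, that is, the independent estimates $s_i^2$ $(i = 1, 2, \ldots, k)$ of $\sigma^2$ are homogeneous, Bartlett proved that the statistic

$$\chi^2 = \frac{\sum_{i=1}^{k} \nu_i \log_e\left(S^2 / s_i^2\right)}{1 + \frac{1}{3(k - 1)}\left[\sum_{i=1}^{k} \frac{1}{\nu_i} - \frac{1}{\nu}\right]}$$

where

$$S^2 = \frac{1}{\nu} \sum_{i=1}^{k} \nu_i s_i^2, \quad \nu = \sum_{i=1}^{k} \nu_i$$

follows the chi-square distribution with (k-1) degrees of freedom.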
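Bartlett's statistic in its usual form, $\sum \nu_i \log_e(S^2/s_i^2)$ divided by $1 + \frac{1}{3(k-1)}\left[\sum 1/\nu_i - 1/\nu\right]$ with $S^2$ the pooled variance, can be sketched as follows. The samples are invented, and the function assumes every sample has at least two observations and a non-zero variance.

```python
import math

def bartlett_statistic(samples):
    """Return (statistic, degrees of freedom) for Bartlett's test on k independent samples."""
    k = len(samples)
    nus, variances = [], []
    for s in samples:
        n_i = len(s)
        mean = sum(s) / n_i
        variances.append(sum((x - mean) ** 2 for x in s) / (n_i - 1))  # unbiased s_i^2
        nus.append(n_i - 1)                                            # nu_i
    nu = sum(nus)
    pooled = sum(v * s2 for v, s2 in zip(nus, variances)) / nu         # S^2
    numerator = sum(v * math.log(pooled / s2) for v, s2 in zip(nus, variances))
    correction = 1 + (sum(1 / v for v in nus) - 1 / nu) / (3 * (k - 1))
    return numerator / correction, k - 1

equal_spread = [[1, 2, 3], [11, 12, 13], [5, 6, 7]]     # every sample variance equals 1
unequal_spread = [[1, 2, 3], [0, 5, 10], [4, 4.5, 5]]   # sample variances 1, 25 and 0.25
stat_equal, dof = bartlett_statistic(equal_spread)
stat_unequal, _ = bartlett_statistic(unequal_spread)
print(stat_equal, stat_unequal, dof)  # zero when all variances agree, positive otherwise
```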

 

11.11 NON-CENTRAL CHI-SQUARE DISTRIBUTION:

The chi-square distribution defined as the sum of the squares of independent standard normal variates is often referred to as the central chi-square distribution. The distribution of the sum of the squares of independent normal variates, each having unit variance but with possibly non-zero means, is known as the non-central chi-square distribution. Thus if $X_i$ $(i = 1, 2, \ldots, n)$ are independent $N(\mu_i, 1)$ random variables, then

$$\chi'^2 = \sum_{i=1}^{n} X_i^2$$

has the non-central chi-square distribution with n degrees of freedom. Intuitively, this distribution would seem to depend upon the n parameters $\mu_1, \mu_2, \ldots, \mu_n$, but it will be seen that it depends on these parameters only through the non-centrality parameter

$$\lambda = \frac{1}{2} \sum_{i=1}^{n} \mu_i^2$$

(some texts define $\lambda$ without the factor $\tfrac{1}{2}$), and we write $\chi'^2 \sim \chi'^2(n, \lambda)$.
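A quick simulation illustrates how the non-zero means shift the distribution. Since $E(X_i^2) = 1 + \mu_i^2$ for $X_i \sim N(\mu_i, 1)$, the mean of the sum of squares is $n + \sum \mu_i^2$, whichever convention is used for the non-centrality parameter. The means, seed and sample size below are arbitrary.

```python
import random
import statistics

def noncentral_draw(mus, rng):
    """One value of sum(X_i ** 2) with independent X_i ~ N(mu_i, 1)."""
    return sum(rng.gauss(mu, 1.0) ** 2 for mu in mus)

rng = random.Random(1)     # fixed seed so the run is repeatable
mus = [1.0, 2.0, 0.0]      # illustrative means: n = 3, sum of mu_i^2 = 5
draws = [noncentral_draw(mus, rng) for _ in range(20000)]

sample_mean = statistics.fmean(draws)
theoretical_mean = len(mus) + sum(mu ** 2 for mu in mus)
print(sample_mean, theoretical_mean)  # the sample mean should be close to 8
```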

 

 

