Statistics Hypothesis Testing

Sample: Hypothesis Testing Paper

One and Two (or more) Sample Hypothesis Testing Paper. Using data from one of the data sets available through the “Data Sets” link on your page, develop one business research question from which you will formulate a research hypothesis to test one population parameter and another to test two (or more) population parameters. Formulate both a numerical and a verbal hypothesis statement for each research issue.

Perform hypothesis tests using the five-step model. Describe and interpret the results of the test, both in statistical terms and in conversational English. Include appropriate descriptive statistics.

Solution:
Research question: To find whether there is a significant difference between wins and salary of the baseball players.

There are two leagues denoted as
1 if American League and
0 if National League
We have separated the data set as
Data set
American League:

Salary Salary -mil Wins
123505125.0 123.5 95.0
208306817.0 208.3 95.0
55425762.0 55.4 88.0
73914333.0 73.9 74.0
97725322.0 97.7 95.0
41502500.0 41.5 93.0
75178000.0 75.2 99.0
[salary not shown] 45.7 80.0
56186000.0 56.2 83.0
29679067.0 29.7 67.0
55849000.0 55.8 79.0
69092000.0 69.1 71.0
87754334.0 87.8 69.0
36881000.0 36.9 56.0

Claim: There is a significant difference between wins and salary-mil of the baseball players in the American League.
Hypotheses:
Null Hypothesis:
Numerical Null Hypothesis:
H0: μ1 − μ2 = 0, where μ1 is the mean salary-mil and μ2 is the mean number of wins
Verbal Null Hypothesis:
There is no significant difference between wins and salary-mil of the baseball players in the American League.
Alternative Hypothesis:
Numerical Alternative Hypothesis:
H1: μ1 − μ2 ≠ 0
Verbal Alternative Hypothesis:
There is a significant difference between wins and salary-mil of the baseball players in the American League.
Level of Significance:
α = 0.05
Decision rule:
If the p-value is greater than the given level of significance, we fail to reject the null hypothesis. Otherwise we reject the null hypothesis.
Test Statistic:
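t = (x̄1 − x̄2 − d0) / (sp √(1/n1 + 1/n2))
where sp² = ((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2) is the pooled variance and d0 = 0 is the hypothesized difference.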

Using MegaStat in Microsoft Excel Add-Ins:
Add-Ins → MegaStat → Hypothesis tests → Compare two independent groups

Hypothesis Test: Independent Groups (t-test, pooled variance)

Salary -mil Wins
75.479 81.71 mean
45.930 13.07 std. dev.
14 14 n
26 df
-6.2357 difference (Salary -mil - Wins)
1,140.1793 pooled variance
33.7665 pooled std. dev.
12.7626 standard error of difference
0 hypothesized difference
-0.49 t
.6292 p-value (two-tailed)

The test statistic value is -0.49.

The p value for the test statistic is 0.6292.

Conclusion:
Since the p-value of the test statistic is greater than the 0.05 level of significance, we fail to reject the null hypothesis H0 at the 5% level of significance. Hence, we conclude that there is no significant difference between wins and salary-mil of the baseball players in the American League.
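
The pooled-variance figures in the American League output can be checked from the summary statistics alone. The following is a minimal Python sketch, not the MegaStat procedure itself: the function names are illustrative, and the p-value is obtained by numerically integrating the t density with Simpson's rule, which is an assumption made here to keep the example free of external libraries.

```python
import math


def pooled_t_test(mean1, sd1, n1, mean2, sd2, n2, d0=0.0):
    """Pooled-variance t statistic and degrees of freedom from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2 - d0) / std_err
    return t, df


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value using Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)


# Summary statistics from the American League output: Salary -mil vs. Wins
t_stat, df = pooled_t_test(75.479, 45.930, 14, 81.71, 13.07, 14)
p_value = two_tailed_p(t_stat, df)
print(f"t = {t_stat:.2f}, df = {df}, p = {p_value:.4f}")
```

Because the inputs are the rounded means and standard deviations, the last digit of the p-value may differ slightly from the MegaStat output.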
Research question: To find whether there is a significant difference between wins and salary of the baseball players.

Data set
National League:

Salary Salary -mil Wins
86457302.0 86.5 90.0
62329166.0 62.3 77.0
76799000.0 76.8 89.0
61892583.0 61.9 73.0
101305821.0 101.3 83.0
38133000.0 38.1 67.0
83039000.0 83.0 71.0
63290833.0 63.3 82.0
48581500.0 48.6 81.0
90199500.0 90.2 75.0
92106833.0 92.1 100.0
60408834.0 60.4 83.0
95522000.0 95.5 88.0
39934833.0 39.9 81.0
87032933.0 87.0 79.0
48155000.0 48.2 67.0

Claim: There is a significant difference between wins and salary-mil of the baseball players in the National League.
Hypotheses:
Null Hypothesis:
Numerical Null Hypothesis:
H0: μ1 − μ2 = 0, where μ1 is the mean salary-mil and μ2 is the mean number of wins
Verbal Null Hypothesis:
There is no significant difference between wins and salary-mil of the baseball players in the National League.
Alternative Hypothesis:
Numerical Alternative Hypothesis:
H1: μ1 − μ2 ≠ 0
Verbal Alternative Hypothesis:
There is a significant difference between wins and salary-mil of the baseball players in the National League.
Level of Significance:
α = 0.05
Decision rule:
If the p-value is greater than the given level of significance, we fail to reject the null hypothesis. Otherwise we reject the null hypothesis.
Test Statistic:
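t = (x̄1 − x̄2 − d0) / (sp √(1/n1 + 1/n2))
where sp² = ((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2) is the pooled variance and d0 = 0 is the hypothesized difference.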

Using MegaStat in Microsoft Excel Add-Ins:
Add-Ins → MegaStat → Hypothesis tests → Compare two independent groups

Hypothesis Test: Independent Groups (t-test, pooled variance)

Salary -mil Wins
70.949 80.375 mean
20.669 8.831 std. dev.
16 16 n
30 df
-9.4257 difference (Salary -mil - Wins)
252.5883 pooled variance
15.8930 pooled std. dev.
5.6190 standard error of difference
0 hypothesized difference
-1.68 t
.1038 p-value (two-tailed)

The test statistic value is -1.68.
The p value for the test statistic is 0.1038.

Conclusion:
Since the p-value of the test statistic is greater than the 0.05 level of significance, we fail to reject the null hypothesis H0 at the 5% level of significance. Hence, we conclude that there is no significant difference between wins and salary-mil of the baseball players in the National League.

Regression analysis:
The general multiple regression model is given by
y = β0 + β1x1 + β2x2 + ... + βkxk + ε
where y is the dependent variable,
the xi's are the independent variables,
β0 is the true constant (intercept),
βi is the true coefficient associated with the ith independent variable,
ε is the error term, which models the unsystematic error in y.
The above model can be written in matrix form as
Y = Xβ + ε
The general goal of multiple regression is to determine which independent (explanatory) variables should be included in the model.
We first test each coefficient βi, where i = 1, 2, ..., k, to determine whether that individual parameter should be dropped from the model.
Next we test the goodness of fit of the model.

Hypothesis Tests:
H0: βi = 0
H1: βi ≠ 0
Procedure:
First we estimate the model as
ŷ = b0 + b1x1 + b2x2 + ... + bkxk
where bi is the estimated value of βi and i = 0, 1, ..., k.

For Testing Each βi:
The test statistic is given by
t = bi / s(bi)
where s(bi) is the standard error of the estimated coefficient bi. Under H0 this statistic has a t distribution with n − k − 1 degrees of freedom.

Goodness of fit test:
To assess goodness of fit we compute R², which lies between 0 and 1. The closer R² is to 1, the larger the share of the variation in the dependent variable that the model explains, i.e. the better the model fits the data.

Dependent variable:
X7- Wins
Independent variables:
X2- League
X3- Built
X4- Size
X5- Surface
X6- Salary- mil
X8- Attendance
X9- Batting
X10- ERA
X11- HR
X12- Error
X13- SB

Using MegaStat in Microsoft Excel Add-Ins:
Add-Ins → MegaStat → Correlation/Regression → Regression analysis

Regression Analysis

R² 0.857
Adjusted R² 0.770 n 30
R 0.926 k 11
Std. Error 5.200 Dep. Var. Wins

ANOVA table

Source SS df  MS F p-value
Regression 2,917.2794 11 265.2072 9.81 1.64E-05
Residual 486.7206 18 27.0400
Total 3,404.0000 29
Regression output confidence interval
variables coefficients std. error t (df=18) p-value 95% lower 95% upper
Intercept 74.6634 133.9145 0.558 .5840 -206.6805 356.0073
League -1.2494 2.3275 -0.537 .5980 -6.1392 3.6404
Built -0.0274 0.0558 -0.491 .6291 -0.1447 0.0899
Size -0.00000401 0.00020556 -0.019 .9847 -0.00043588 0.00042787
Surface 0.5761 4.3135 0.134 .8952 -8.4863 9.6384
Salary -mil 0.0411 0.0667 0.615 .5462 -0.0992 0.1813
Attendance -0.00000085 0.00000317 -0.267 .7923 -0.00000750 0.00000581
Batting 447.7443 200.5131 2.233 .0385 26.4819 869.0067
ERA -13.6362 2.4171 -5.642 2.37E-05 -18.7143 -8.5581
HR 0.0930 0.0338 2.755 .0130 0.0221 0.1639
Error -0.1601 0.1246 -1.285 .2151 -0.4218 0.1017
SB 0.0152 0.0361 0.422 .6777 -0.0605 0.0910
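
The t ratios and confidence limits in the regression output above can be reproduced from each coefficient and its standard error. The following is a minimal Python sketch for three of the rows. The critical value 2.100922 (two-tailed, 5%, 18 degrees of freedom) is not printed in the output and is supplied here as an assumption, and the function names are illustrative.

```python
# (coefficient, standard error) pairs copied from the regression output above
coefficients = {
    "Batting": (447.7443, 200.5131),
    "ERA": (-13.6362, 2.4171),
    "HR": (0.0930, 0.0338),
}

# Assumed: two-tailed 5% critical value of the t distribution with 18 df
T_CRIT_18 = 2.100922


def t_ratio(coef, std_err):
    """Test statistic for H0: coefficient = 0."""
    return coef / std_err


def confidence_interval(coef, std_err, t_crit):
    """Lower and upper limits: coefficient -/+ t_crit * standard error."""
    margin = t_crit * std_err
    return coef - margin, coef + margin


for name, (coef, se) in coefficients.items():
    low, high = confidence_interval(coef, se, T_CRIT_18)
    print(f"{name}: t = {t_ratio(coef, se):.3f}, 95% CI = ({low:.4f}, {high:.4f})")
```

Since the table shows rounded coefficients and standard errors, a recomputed ratio can differ from the printed one in the last digit, as it does for HR.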

The regression equation is
Wins = 74.6634 - 1.2494 League - 0.0274 Built - 0.00000401 Size + 0.5761 Surface + 0.0411 Salary -mil - 0.00000085 Attendance + 447.7443 Batting -13.6362 ERA + 0.0930 HR -0.1601 Error + 0.0152 SB


The adjusted R² value (0.770) is high, so the model fits the data well. However, the p-values for x2, x3, x4, x5, x6, x8, x12 and x13 are greater than 0.05, so these coefficients are not significant at the 5% level. A good overall fit combined with many insignificant coefficients may indicate multicollinearity among the predictors. We therefore drop these variables and regress x7 on x9, x10 and x11.
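
The fit measures for the full model can be recomputed from the sums of squares in its ANOVA table. The following is a small Python sketch using the standard definitions of R², adjusted R² and the F statistic; the function name is illustrative and not part of MegaStat.

```python
def regression_summary(ss_regression, ss_residual, n, k):
    """Return R^2, adjusted R^2 and the F statistic from ANOVA sums of squares."""
    ss_total = ss_regression + ss_residual
    r_squared = ss_regression / ss_total
    # Adjusted R^2 penalizes the fit for the number of predictors k
    adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)
    # F = mean square regression / mean square residual
    f_stat = (ss_regression / k) / (ss_residual / (n - k - 1))
    return r_squared, adj_r_squared, f_stat


# Sums of squares from the ANOVA table of the 11-predictor model
r2, adj_r2, f_stat = regression_summary(2917.2794, 486.7206, n=30, k=11)
print(f"R2 = {r2:.3f}, adjusted R2 = {adj_r2:.3f}, F = {f_stat:.2f}")
```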

Regression Analysis: x7 versus x9, x10, x11

Dependent variable:
X7- Wins
Independent variables:
X9- Batting
X10- ERA
X11- HR
Using MegaStat in Microsoft Excel Add-Ins:
Add-Ins → MegaStat → Correlation/Regression → Regression analysis

Regression Analysis
R² 0.810
Adjusted R² 0.788 n 30
R 0.900 k 3
Std. Error 4.988 Dep. Var. Wins

ANOVA table

Source SS df MS F p-value
Regression 2,757.1594 3 919.0531 36.94 1.60E-09
Residual 646.8406 26 24.8785
Total 3,404.0000 29
Regression output confidence interval
variables coefficients std. error t (df=26) p-value 95% lower 95% upper
Intercept 1.8499 35.0214 0.053 .9583 -70.1376 73.8374
Batting 492.4490 140.3025 3.510 .0017 204.0532 780.8449
ERA -15.9575 1.6753 -9.525 5.78E-10 -19.4011 -12.5139
HR 0.1035 0.0289 3.582 .0014 0.0441 0.1628

The regression equation is
Wins = 1.8499 + 492.4490 Batting -15.9575 ERA + 0.1035 HR


Here the p-values of all three slope coefficients are less than 0.05, i.e. the coefficients βi are significant at the 5% level of significance (the intercept, with p = .9583, is not). R² falls only slightly after dropping the variables, from 0.857 to 0.810, and adjusted R² rises from 0.770 to 0.788, so the reduced model is a good fit.
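
As an illustration of how the reduced equation is used, the following Python sketch evaluates it at the sample means of Batting, ERA and HR reported in the descriptive statistics table (0.26443, 4.2847 and 167.23). The function name is hypothetical. A least-squares line with an intercept passes through the point of means, so the prediction should come out close to the mean number of wins, 81.

```python
def predict_wins(batting, era, hr):
    """Predicted wins from the reduced regression equation."""
    return 1.8499 + 492.4490 * batting - 15.9575 * era + 0.1035 * hr


# Evaluate the equation at the sample means of the three predictors
wins_at_means = predict_wins(batting=0.26443, era=4.2847, hr=167.23)
print(f"Predicted wins at the sample means: {wins_at_means:.1f}")
```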

Correlation:
Research question: To find whether salary is related to the attendance of the baseball teams.
There are two leagues denoted as
1 if American League and
0 if National League
We have separated the data set as
Data set
American League:

Salary -mil Attendance
123.5 2,847,798
208.3 4,090,440
55.4 2,108,818
73.9 2,623,904
97.7 3,404,636
41.5 2,014,220
75.2 2,342,804
45.7 2,014,995
56.2 2,034,243
29.7 1,141,915
55.8 2,525,259
69.1 2,024,505
87.8 2,724,859
36.9 1,371,181

Using MegaStat in Microsoft Excel Add-Ins:
Add-Ins → MegaStat → Correlation/Regression → Correlation Matrix

Correlation Matrix
Salary -mil Attendance
Salary -mil 1.000
Attendance .895 1.000
14 sample size

The correlation coefficient between salary-mil and attendance is 0.895. There is a strong positive correlation between the variables.

Null Hypothesis:
H0: ρ=0
H0: “no linear relationship” between the variables.
Alternative Hypothesis:
H1: ρ≠0
H1:“ linear relationship” between the variables.

Level of significance:
α = 0.05
Critical value:
At the 5% level of significance, the two-tailed critical value of the t distribution with v = 14 − 2 = 12 degrees of freedom is 2.178813.
Test statistic:
Under H0, the statistic
t = r √(n − 2) / √(1 − r²)
has a t distribution with v = n − 2 degrees of freedom.
r = 0.895 and n = 14
t = 0.895 × √12 / √(1 − 0.895²)
t = 0.895 × 3.4641 / √0.198975
t = 3.1004 / 0.4461
t ≈ 6.95

Conclusion:
Since the test statistic is greater than the critical value, we reject the null hypothesis at the 5% level of significance. Hence we conclude that there is a linear relationship between the variables salary-mil and attendance.
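
The significance test for the American League correlation can be sketched in a few lines of Python. The function name is illustrative; the values r = 0.895, n = 14 and the critical value 2.178813 are taken from the text above.

```python
import math


def correlation_t(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


t_al = correlation_t(0.895, 14)
critical_al = 2.178813  # two-tailed 5% critical value, 12 df (from the text)
reject_al = abs(t_al) > critical_al
print(f"t = {t_al:.2f}, reject H0: {reject_al}")
```

The same function applies to any other sample by passing its own r and n and comparing against the critical value for n − 2 degrees of freedom.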

Data set
National League:

Salary -mil Attendance
86.5 2,520,904
62.3 2,059,327
76.8 2,805,060
61.9 1,923,254
101.3 2,827,549
38.1 1,817,245
83 3,603,680
63.3 2,869,787
48.6 2,730,352
90.2 3,181,020
92.1 3,542,271
60.4 1,852,608
95.5 2,665,304
39.9 2,211,323
87 3,100,092
48.2 1,914,385

Using MegaStat in Microsoft Excel Add-Ins:
Add-Ins → MegaStat → Correlation/Regression → Correlation Matrix

Correlation Matrix
Salary -mil Attendance
Salary -mil 1.000
Attendance .693 1.000
16 sample size

The correlation coefficient between salary-mil and attendance is 0.693. There is a moderately strong positive correlation between the variables.
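
The National League coefficient can be recomputed directly from the sixteen data pairs listed above. The following is a minimal Python sketch of the Pearson correlation formula; the function and variable names are illustrative, and the salary figures are the rounded salary-mil values, so the result may differ from MegaStat's in later decimal places.

```python
import math

# National League data pairs, in the order listed in the data set above
salary_mil = [86.5, 62.3, 76.8, 61.9, 101.3, 38.1, 83.0, 63.3,
              48.6, 90.2, 92.1, 60.4, 95.5, 39.9, 87.0, 48.2]
attendance = [2520904, 2059327, 2805060, 1923254, 2827549, 1817245,
              3603680, 2869787, 2730352, 3181020, 3542271, 1852608,
              2665304, 2211323, 3100092, 1914385]


def pearson_r(xs, ys):
    """Pearson correlation: Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r_nl = pearson_r(salary_mil, attendance)
print(f"r = {r_nl:.3f}")
```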
Null Hypothesis:
H0: ρ=0
H0: “no linear relationship” between the variables.
Alternative Hypothesis:
H1: ρ≠0
H1:“ linear relationship” between the variables.
Level of significance:
α = 0.05
Critical value:
At the 5% level of significance, the two-tailed critical value of the t distribution with v = 16 − 2 = 14 degrees of freedom is 2.144787.
Test statistic:
Under H0, the statistic
t = r √(n − 2) / √(1 − r²)
has a t distribution with v = n − 2 degrees of freedom.
r = 0.693 and n = 16
t = 0.693 × √14 / √(1 − 0.693²)
t = 0.693 × 3.7417 / √0.519751
t = 2.5930 / 0.7209
t ≈ 3.60

Conclusion:
Since the test statistic is greater than the critical value, we reject the null hypothesis at the 5% level of significance. Hence we conclude that there is a linear relationship between the variables salary-mil and attendance.

Descriptive Statistics:
Using MegaStat in Microsoft Excel Add-Ins:
Add-Ins → MegaStat → Descriptive Statistics

Salary -mil Wins Attendance Batting ERA HR Error SB
count 30 30 30 30 30 30 30 30
mean 73.064 81.000 2,496,457.93 0.26443 4.2847 167.23 102.00 85.50
sample variance 1,171.965 117.379 452,766,738,769.44 0.00005 0.3206 1,225.29 130.34 1,075.43
sample standard deviation 34.234 10.834 672,879.44 0.00728 0.5662 35.00 11.42 32.79
minimum 29.679067 56 1141915 0.252 3.49 117 86 31
maximum 208.30682 100 4090440 0.281 5.49 260 125 161
range 178.62775 44 2948525 0.029 2 143 39 130
1st quartile 50.293 73.250 2,017,372.50 0.25900 3.7875 136.75 92.50 65.25
median 66.191 81.000 2,523,081.50 0.26400 4.2000 164.00 102.50 76.00
3rd quartile 87.574 88.750 2,842,735.75 0.27000 4.5500 190.50 108.75 101.25
interquartile range 37.281 15.500 825,363.25 0.01100 0.7625 53.75 16.25 36.00
mode #N/A 95.000 #N/A 0.27000 3.6100 130.00 106.00 45.00

The descriptive statistics for all 30 teams are given in the above table.
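
The Wins column of the table can be reproduced with Python's standard `statistics` module, using the 30 win totals from the two league data sets. This is a sketch rather than the MegaStat computation: the `inclusive` quantile method is assumed here because it uses the same interpolation as Excel's inclusive quartiles, and it requires Python 3.8 or later.

```python
import statistics

wins = [
    # American League (14 teams)
    95, 95, 88, 74, 95, 93, 99, 80, 83, 67, 79, 71, 69, 56,
    # National League (16 teams)
    90, 77, 89, 73, 83, 67, 71, 82, 81, 75, 100, 83, 88, 81, 79, 67,
]

mean = statistics.mean(wins)
variance = statistics.variance(wins)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(wins)      # sample standard deviation
q1, median, q3 = statistics.quantiles(wins, n=4, method="inclusive")

print(f"count = {len(wins)}, mean = {mean:.3f}")
print(f"sample variance = {variance:.3f}, sample std. dev. = {std_dev:.3f}")
print(f"Q1 = {q1}, median = {median}, Q3 = {q3}, IQR = {q3 - q1}")
```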

Inference for our research:

  • From the comparison of two independent groups, we found no significant difference between wins and salary-mil of the baseball players in either the American League or the National League.
  • From the regression analysis, we obtained the following equation for predicting wins: Wins = 1.8499 + 492.4490 Batting - 15.9575 ERA + 0.1035 HR
  • From the correlation analysis, we found that a linear relationship exists between salary-mil and attendance in the American League.
  • From the correlation analysis, we found that a linear relationship exists between salary-mil and attendance in the National League.
