PHP4006 Advanced Statistics Assignment
Section A (Answer ALL questions)
Answer each question as TRUE or FALSE in the answer book provided.
 The null hypothesis of the Shapiro-Wilk test can be given as “The sample comes from a normally distributed population”.
 Outliers are influential cases.
 SPSS has a nonparametric equivalent to a Factorial ANOVA for a completely randomized design.
 If p = .01, you can conclude that there is a 1% chance the null hypothesis is true.
 The odds of an event are the ratio of the probability of an event happening to the probability of the event not happening.
 The dependent/outcome variable in a logistic regression is a binary variable.
 In a Principal Component Analysis, the first component accounts for more of the variability in the data than the second component.
 A MANOVA assumes that the variables in each group are normally distributed.
 A loglinear model with two factors is equivalent to a contingency table analysis.
 An ANCOVA allows there to be errors in the measured covariate.
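As a small illustration of the odds statement in the list above, the sketch below computes odds from a probability. The function name `odds` is a hypothetical helper chosen for this example and is not part of the assignment.

```python
def odds(p: float) -> float:
    """Return the odds p / (1 - p) for a probability p with 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    # Probability of the event happening divided by the probability of it not happening.
    return p / (1.0 - p)


# A probability of 0.75 corresponds to odds of 3 to 1.
print(odds(0.75))
```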
Section B (Answer THREE questions)
 You are planning a study of the effectiveness of two therapies for treating depression. Group A will receive the current therapy and Group B will receive a new and improved therapy. During the investigation all patients will be assessed three times over a total of 12 weeks of treatment. You are trying to decide whether the new therapy is more effective than the current therapy.
 a) State the name and design of the statistical test you intend to use. (3 marks)
 b) State the null hypotheses of your test chosen in (a). (3 marks)
 c) Explain what the significance level of the chosen test is. (2 marks)
 d) State the assumptions that the chosen test requires to hold. (6 marks)
 e) Describe how you would assess whether the assumptions hold. (6 marks)
 f) State two other analyses you would consider using if the assumptions do not hold. (5 marks)
 Write short notes describing the type of data, the question to be answered and your understanding of how each of the following statistical techniques works:
 i. Factor Analysis (6 marks)
 ii. Path Analysis (6 marks)
 iii. Structural Equation Modelling (6 marks)
Then describe a dataset and justify the use of ONE of the above techniques. DO NOT use a dataset used in the lectures. (7 marks)
 You have studied the impact of cognitive fatigue on behavioural impulsivity. You measured impulsivity using a decision-making task where a higher score represents less impulsivity. 60 participants were allocated to one of three conditions: a control condition and two conditions inducing cognitive fatigue (Mild or High). After the treatment, participants completed the decision-making task. The output for the one-way independent ANOVA is below (assumptions were met).
 Calculate the omega-squared effect size for the differences in impulsivity scores across conditions from the information below. (9 marks)
 Calculate the Pearson’s r effect size for the statistically significant contrast result using the information below. (7 marks)
 Explain some advantages of using effect sizes in addition to measures of statistical significance. (5 marks)
 Briefly summarise the main findings of the data below. (4 marks)
Descriptives
Impulsivity
                                                 95% Confidence Interval for Mean
          N    Mean     Std. Deviation  Std. Error  Lower Bound  Upper Bound  Minimum  Maximum
Control   20   100.80   8.817           1.972       96.67        104.93       79       114
Mild      20   85.05    11.009          2.462       79.90        90.20        65       100
High      20   81.10    6.601           1.476       78.01        84.19        64       96
Total     60   88.98    12.319          1.590       85.80        92.17        64       114
ANOVA
Impulsivity
                 Sum of Squares  df   Mean Square  F       Sig.
Between Groups   4345.033        2    2172.517     26.874  .000
Within Groups    4607.950        57   80.841
Total            8952.983        59
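
One common way to obtain omega-squared from a one-way ANOVA table is ω² = (SS_between − df_between × MS_within) / (SS_total + MS_within). The sketch below applies that formula to the figures in the ANOVA table above; the function name `omega_squared` is a hypothetical helper, and the choice of this particular formula is an assumption about what the question expects.

```python
def omega_squared(ss_between: float, df_between: int,
                  ss_total: float, ms_within: float) -> float:
    """Omega-squared for a one-way independent ANOVA."""
    return (ss_between - df_between * ms_within) / (ss_total + ms_within)


# Values copied from the ANOVA table above.
w2 = omega_squared(ss_between=4345.033, df_between=2,
                   ss_total=8952.983, ms_within=80.841)
print(round(w2, 3))
```

With these inputs the result is roughly 0.46.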




Contrast Coefficients
            Fatigue Condition
Contrast    Control  Mild  High
1           2        -1    -1
2           0        1     -1
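
The contrast coefficients can be checked against the group means in the Descriptives table. The sketch below assumes the weights are (2, -1, -1) for contrast 1 and (0, 1, -1) for contrast 2; the signs are an assumption, chosen because they reproduce the contrast values of 35.45 and 3.95 reported in the Contrast Tests output. The helper `contrast_value` is hypothetical.

```python
# Group means taken from the Descriptives table.
means = {"Control": 100.80, "Mild": 85.05, "High": 81.10}

# Assumed contrast weights (signs inferred from the reported contrast values).
weights_1 = {"Control": 2, "Mild": -1, "High": -1}  # control vs. both fatigue groups
weights_2 = {"Control": 0, "Mild": 1, "High": -1}   # mild vs. high fatigue


def contrast_value(weights: dict, group_means: dict) -> float:
    """Weighted sum of group means for one planned contrast."""
    return sum(weights[g] * group_means[g] for g in group_means)


value_1 = contrast_value(weights_1, means)
value_2 = contrast_value(weights_2, means)
print(round(value_1, 2), round(value_2, 2))
```

Each set of weights sums to zero, as a planned contrast requires.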

Contrast Tests
Impulsivity
                                  Contrast  Value of Contrast  Std. Error  t      df      Sig. (2-tailed)
Assume equal variances            1         35.45              4.925       7.198  57      .000
                                  2         3.95               2.843                      .170
Does not assume equal variances   1         35.45              4.877       7.268  37.957  .000
                                  2         3.95               2.870       1.376  31.096  .179
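
A t statistic from a contrast can be converted to a Pearson's r effect size with r = sqrt(t² / (t² + df)). The sketch below applies this to contrast 1 under the equal-variances row (t = 7.198, df = 57); the function name `r_from_t` is a hypothetical helper, and using the equal-variances row is an assumption based on the statement that the ANOVA assumptions were met.

```python
import math


def r_from_t(t: float, df: float) -> float:
    """Convert a t statistic and its degrees of freedom to Pearson's r."""
    return math.sqrt(t ** 2 / (t ** 2 + df))


# Contrast 1, equal variances assumed (values from the Contrast Tests table).
r = r_from_t(t=7.198, df=57)
print(round(r, 2))
```

With these inputs r is roughly 0.69.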

 The data analysed below is on glucose control in diabetic patients. Good control is measured by a low value of Glucose in the blood.
Response
G   Glucose in the blood
Predictors
K   Knowledge of the illness
F   Measure of attribution called fatalistic externalism
D   Duration of the illness in months
S   Length of schooling: 0 – less than 13 years, 1 – more than 13 years

The output below is taken from the use of a forced entry regression and a forward regression to predict Glucose from Knowledge, Fatalism, Duration and Schooling.
 How many diabetic patients were in the study? (1 mark)
 Which predictor variable does the forced entry regression suggest is the best predictor of Glucose? Quote the p
 Which predictor variable does the forward regression suggest is the best predictor of Glucose? Quote the p
(2 marks)
 Explain why the best predictor chosen by the forward regression does not have to be the same as that suggested by the forced entry regression. (4 marks)
 From the forced entry regression, state the model equation and predict the glucose level of a person with Duration = 141, Fatalism = 19, Knowledge = 36 and Schooling = 1. (10 marks)
 Using the forward regression output, what conclusions do you come to about the use of the 4 predictors to predict blood Glucose level?
 Why, from looking at the definitions of the four predictor variables, would you perform further analyses, and what would they be? (4 marks)
Regression
Variables Entered/Removed^b
Model  Variables Entered  Variables Removed  Method
       K, D, S, F^a
a. All requested variables entered.
b. Dependent Variable: G
Model Summary
Model  R  R Square  Adjusted R Square  Std. Error of the Estimate
1                                      18.198
a. Predictors: (Constant), K, D, S, F
ANOVA^b
Model          Sum of Squares  df  Mean Square  F      Sig.
1 Regression   5557.055        4   1389.264     4.195  .004^a
  Residual     20863.710       63  331.170
  Total        26420.765       67
a. Predictors: (Constant), K, D, S, F
b. Dependent Variable: G
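
Several summary quantities follow directly from a regression ANOVA table: the sample size is the total degrees of freedom plus one, R² is the regression sum of squares over the total sum of squares, and the standard error of the estimate is the square root of the residual mean square. The sketch below works these out from the forced entry ANOVA table above; the variable names are illustrative only.

```python
import math

# Values copied from the forced entry ANOVA table.
ss_regression, df_regression = 5557.055, 4
ss_residual, df_residual = 20863.710, 63
ss_total, df_total = 26420.765, 67

# Total df is n - 1, so the number of cases is df_total + 1.
n = df_total + 1

# Proportion of variance in G explained by the four predictors.
r_squared = ss_regression / ss_total

# Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - k - 1).
adjusted_r_squared = 1 - (1 - r_squared) * df_total / df_residual

# Standard error of the estimate is the square root of the residual mean square.
std_error_estimate = math.sqrt(ss_residual / df_residual)

print(n, round(r_squared, 3), round(adjusted_r_squared, 3), round(std_error_estimate, 3))
```

The computed standard error of the estimate agrees with the 18.198 shown in the Model Summary.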
Coefficients^a
                Unstandardized Coefficients  Standardized Coefficients
Model           B         Std. Error         Beta                       t      Sig.
1 (Constant)    130.385                                                 6.983  .000
  F             .028      .485               .008                       .057   .955
  D             .053      .028               .225                       1.890  .063
  S             11.283    4.850              .284                       2.327  .023
  K             .731      .364               .267                       2.008  .049
a. Dependent Variable: G
Regression
Variables Entered/Removed^a
Model  Variables Entered  Variables Removed  Method
1      K                  .                  Forward (Criterion: Probability-of-F-to-enter <= .050)
a. Dependent Variable: G
Model Summary
Model  R  R Square  Adjusted R Square  Std. Error of the Estimate
1                                      18.810
a. Predictors: (Constant), K
ANOVA^b
Model          Sum of Squares  df  Mean Square  F      Sig.
1 Regression   3067.774        1   3067.774     8.670  .004^a
  Residual     23352.990       66  353.833
  Total        26420.765       67
a. Predictors: (Constant), K
b. Dependent Variable: G
Coefficients^a
                Unstandardized Coefficients  Standardized Coefficients
Model           B         Std. Error         Beta                       t       Sig.
1 (Constant)    126.438                                                 11.056  .000
  K             .932      .317               .341                       2.945   .004
a. Dependent Variable: G
Excluded Variables^b
                                                       Collinearity Statistics
Model   Beta In   t      Sig.   Partial Correlation    Tolerance
1 F     .023^a    .173   .863   .021                   .759
  D     .162^a    1.401  .166   .171                   .988
  S     .231^a    1.920  .059   .232                   .889
a. Predictors in the Model: (Constant), K
b. Dependent Variable: G


