PHP4006 Advanced Statistics Assignment

Section A (Answer ALL questions)

Answer each question as TRUE or FALSE in the answer book provided.

The null hypothesis of the Shapiro-Wilk test can be given as “The sample comes from a normally distributed population”.
Outliers are influential cases.
SPSS has a non-parametric equivalent to a Factorial ANOVA for a completely randomized design.
If p = .01, you can conclude that there is a 1% chance the null hypothesis is true.
The odds of an event are the ratio of the probability of an event happening to the probability of the event not happening.
The dependent/outcome variable in a logistic regression is a binary variable.
In a Principal Component Analysis, the first component accounts for more of the variability in the data than the second component.
A MANOVA assumes that the variables in each group are normally distributed.
A loglinear model with two factors is equivalent to a contingency table analysis.
An ANCOVA allows there to be errors in the measured covariate.

Section B (Answer THREE questions)

You are planning a study of the effectiveness of two therapies for treating depression. Group A will receive the current therapy and Group B will receive a new and improved therapy. During the investigation all patients will be assessed three times over a total of 12 weeks of treatment. You are trying to decide whether the new therapy is more effective than the current therapy.
a) State the name and design of the statistical test you intend to use. (3 marks)
b) State the null hypotheses of your test chosen in (a). (3 marks)
c) Explain what the significance level of the chosen test is. (2 marks)
d) State the assumptions that the chosen test requires to hold. (6 marks)
e) Describe how you would assess whether the assumptions hold. (6 marks)
f) State two other analyses you would consider using if the assumptions do not hold. (5 marks)

Write short notes describing the type of data, question to be answered and your understanding of how each of the following statistical techniques works:
i. Factor Analysis (6 marks)
ii. Path Analysis (6 marks)
iii. Structural Equation Modelling (6 marks)

Then describe a dataset and justify the use of ONE of the above techniques. DO NOT use a dataset used in the lectures. (7 marks)

You have studied the impact of cognitive fatigue on behavioural impulsivity. You measured impulsivity using a decision-making task where a higher score represents less impulsivity. 60 participants were allocated to one of three conditions: a control condition and two conditions inducing cognitive fatigue (Mild or High). After the treatment, participants completed the decision-making task. The output for the One-way independent ANOVA is below (assumptions were met).

Calculate the Omega-squared effect size for the differences in impulsivity scores across conditions from the information below. (9 marks)
Calculate the Pearson’s r effect size for the statistically significant contrast using the information below. (7 marks)
Explain some advantages of using effect sizes in addition to measures of statistical significance. (5 marks)
Briefly summarise the main findings of the data below. (4 marks)

Descriptives

Impulsivity

	N	Mean	Std. Deviation	Std. Error	95% Confidence Interval for Mean: Lower Bound	95% Confidence Interval for Mean: Upper Bound	Minimum	Maximum
Control	20	100.80	8.817	1.972	96.67	104.93	79	114
Mild	20	85.05	11.009	2.462	79.90	90.20	65	100
High	20	81.10	6.601	1.476	78.01	84.19	64	96
Total	60	88.98	12.319	1.590	85.80	92.17	64	114

ANOVA

Impulsivity

	Sum of Squares	df	Mean Square	F	Sig.
Between Groups	4345.033	2	2172.517	26.874	.000
Within Groups	4607.950	57	80.841
Total	8952.983	59
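
The entries in the ANOVA table above are linked by simple arithmetic: each mean square is its sum of squares divided by its degrees of freedom, and F is the ratio of the two mean squares. The following is a minimal Python sketch that re-derives those entries from the printed figures. The variable names are illustrative and are not part of the SPSS output.

```python
# Figures copied from the one-way ANOVA table (Impulsivity).
ss_between, df_between = 4345.033, 2
ss_within, df_within = 4607.950, 57
ss_total, df_total = 8952.983, 59

# Mean square = sum of squares / degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F = between-groups mean square / within-groups mean square.
f_ratio = ms_between / ms_within

# The between and within rows should add up to the total row.
ss_check = ss_between + ss_within
df_check = df_between + df_within

print(f"MS between = {ms_between:.3f}")
print(f"MS within  = {ms_within:.3f}")
print(f"F          = {f_ratio:.3f}")
print(f"SS between + SS within = {ss_check:.3f} (table total: {ss_total})")
print(f"df between + df within = {df_check} (table total: {df_total})")
```

Agreement with the printed table, up to rounding, suggests the figures were transcribed correctly.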

Contrast Coefficients

	Fatigue Condition
Contrast	Control	Mild	High
1	-2	1	1
2	0	-1	1

Contrast Tests

Impulsivity

	Contrast	Value of Contrast	Std. Error	t	df	Sig. (2-tailed)
Assume equal variances	1	-35.45	4.925	-7.198		.000
Assume equal variances	2	-3.95	2.843			.170
Does not assume equal variances	1	-35.45	4.877	-7.268	37.957	.000
Does not assume equal variances	2	-3.95	2.870	-1.376	31.096	.179
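
The contrast values in the Contrast Tests output follow from the contrast coefficients and the group means in the Descriptives table. The following is a minimal Python sketch that reproduces the value of each contrast and its standard error for the "assume equal variances" rows. It assumes the standard pooled-variance formula for the standard error, the square root of (within-groups mean square × the sum of squared coefficients divided by group size). The function and variable names are illustrative only.

```python
import math

# Group means and group size from the Descriptives table.
means = {"Control": 100.80, "Mild": 85.05, "High": 81.10}
n_per_group = 20

# Within-groups mean square from the ANOVA table.
ms_within = 80.841

# Coefficients from the Contrast Coefficients table.
contrasts = {
    1: {"Control": -2, "Mild": 1, "High": 1},
    2: {"Control": 0, "Mild": -1, "High": 1},
}


def contrast_value(weights):
    """Weighted sum of the group means."""
    return sum(weights[group] * means[group] for group in means)


def contrast_se(weights):
    """Pooled-variance standard error of the contrast (equal variances assumed)."""
    return math.sqrt(ms_within * sum(w ** 2 / n_per_group for w in weights.values()))


for label, weights in contrasts.items():
    value = contrast_value(weights)
    se = contrast_se(weights)
    print(f"Contrast {label}: value = {value:.2f}, std. error = {se:.3f}")
```

Contrast 1 compares the control condition with the two fatigue conditions combined. Contrast 2 compares the Mild and High conditions with each other.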

The data analysed below are on glucose control in diabetic patients. Good control is measured by a low value of Glucose in the blood.

response	G	Glucose in the blood
predictors	K	Knowledge of the illness
	F	Measure of attribution called fatalistic externalism
	D	Duration of the illness in months
	S	Length of schooling 0 – less than 13 years, 1 – more than 13 years

The output below is taken from the use of a forced entry regression and a forward regression to predict Glucose from Knowledge, Fatalism, Duration and Schooling.

How many diabetic patients were in the study? (1 mark)
Which predictor variable does the forced entry regression suggest is the best predictor of Glucose? Quote the p-value. (- marks)
Which predictor variable does the forward regression suggest is the best predictor of Glucose? Quote the p-value. (2 marks)
Explain why the best predictor chosen by the forward regression does not have to be the same as that suggested by the forced entry regression. (4 marks)
From the forced entry regression, state the model equation and predict the glucose level of a person with Duration = 141, Fatalism = 19, Knowledge = 36 and Schooling = 1. (10 marks)
Using the forward regression output, what conclusions do you come to about the use of the 4 predictors to predict blood Glucose level? (- marks)
Why, from looking at the definitions of the four predictor variables, would you perform further analyses, and what would they be? (4 marks)

Regression

Variables Entered/Removed^b

Model Variables Entered Variables Removed Method

K, D, S, F

a. All requested variables entered.
b. Dependent Variable: G

Model Summary

Model R R Square Adjusted R Square Std. Error of the Estimate

18.198

a. Predictors: (Constant), K, D, S, F

ANOVA^b

Model Sum of Squares df Mean Square F Sig.

1 Regression 5557.055 4 1389.264 4.195 .004^a

Residual 20863.710 63 331.170

Total 26420.765 67

a. Predictors: (Constant), K, D, S, F
b. Dependent Variable: G
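
The quantities reported in a regression Model Summary are tied to the regression ANOVA table by standard identities: R Square is the regression sum of squares divided by the total sum of squares, and the standard error of the estimate is the square root of the residual mean square. The following is a minimal Python sketch applying those identities to the forced entry ANOVA figures. The variable names are illustrative, and the results are derived here from the printed sums of squares rather than copied from SPSS.

```python
import math

# Figures copied from the forced entry regression ANOVA table.
ss_regression, df_regression = 5557.055, 4
ss_residual, df_residual = 20863.710, 63
ss_total, df_total = 26420.765, 67

# Mean squares and the overall F ratio.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_ratio = ms_regression / ms_residual

# R Square: proportion of total variation explained by the model.
r_squared = ss_regression / ss_total

# Adjusted R Square: 1 - (residual mean square / total mean square).
adjusted_r_squared = 1 - ms_residual / (ss_total / df_total)

# Standard error of the estimate: square root of the residual mean square.
std_error_estimate = math.sqrt(ms_residual)

print(f"F                          = {f_ratio:.3f}")
print(f"R Square                   = {r_squared:.3f}")
print(f"Adjusted R Square          = {adjusted_r_squared:.3f}")
print(f"Std. Error of the Estimate = {std_error_estimate:.3f}")
```

The same identities apply to the forward regression ANOVA table, with its own sums of squares and degrees of freedom substituted.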

Coefficients^a

Model B Std. Error Beta t Sig.

1 (Constant) 130.385 6.983 .000

F .028 .485 .008 .057 .955

D -.053 .028 -.225 -1.890 .063

S -11.283 4.850 -.284 -2.327 .023

K -.731 .364 -.267 -2.008 .049

a. Dependent Variable: G

Regression

Variables Entered/Removed^a

Model Variables Entered Variables Removed Method

1 K . Forward (Criterion: Probability-of-F-to-enter <= .050)

a. Dependent Variable: G

Model Summary

Model R R Square Adjusted R Square Std. Error of the Estimate

18.810

a. Predictors: (Constant), K

ANOVA^b

Model Sum of Squares df Mean Square F Sig.

1 Regression 3067.774 1 3067.774 8.670 .004^a

Residual 23352.990 66 353.833

Total 26420.765 67

a. Predictors: (Constant), K
b. Dependent Variable: G

Coefficients^a

Model B Std. Error Beta t Sig.

1 (Constant) 126.438 11.056 .000

K -.932 .317 -.341 -2.945 .004

a. Dependent Variable: G

Excluded Variables^b

Model	Beta In	t	Sig.	Partial Correlation	Collinearity Statistics: Tolerance
1 F	-.023^a	-.173	.863	-.021	.759
D	-.162^a	-1.401	.166	-.171	.988
S	-.231^a	-1.920	.059	-.232	.889

a. Predictors in the Model: (Constant), K
b. Dependent Variable: G
