Computer applications assignment problem
COMPUTER APPLICATIONS ASSIGNMENT PROBLEM
Question# 01
The data has been extracted using the provided link for the variables under consideration. The data source is Australian Bureau of Statistics. Screen Shots of first and last ten rows of data is as follows (Abs.gov.au, 2018).
Question#02 a
Using “DATA ANALYSIS” tool pack of MS Excel, Descriptive statistics summary output has been generated and is listed below.
Retail turnover per capita  Current Prices ; Total (State) ; 
FINAL CONSUMPTION EXPENDITURE: Current prices ;  
Mean 
1971.332143 
Mean 
121985.5143 
Standard Error 
65.3066091 
Standard Error 
5970.069937 
Median 
1878.8 
Median 
102630 
Mode 
#N/A 
Mode 
#N/A 
Standard Deviation 
772.7182196 
Standard Deviation 
70638.82012 
Sample Variance 
597093.4469 
Sample Variance 
4989842908 
Kurtosis 
1.224812391 
Kurtosis 
1.026668164 
Skewness 
0.152557367 
Skewness 
0.519449274 
Range 
2809.3 
Range 
241940 
Minimum 
733.6 
Minimum 
29251 
Maximum 
3542.9 
Maximum 
271191 
Sum 
275986.5 
Sum 
17077972 
Count 
140 
Count 
140 
Confidence Level (95.0%) 
129.1227708 
Confidence Level (95.0%) 
11803.88912 
According to the descriptive statistics summary generated through MS Office, the measures of central tendency for the provided data can be explained as the arithmetic mean for final consumption expenditures is found to be equal to 121985.5143 $ while the Mean Retail turnover per capita is found to be equal to 1971.332143 $. The median is found to be equal to $102630 final consumption expense while it is found to be equal to $ 1878.8 for per capita retail turn over. Data is found to be positively skewed. The variation of data as shown by output is found to be very high for Final Consumption Expenditure while relatively low for Per Capita Retail Turn Over showing that there is high diversion in data points from the average value obtained. This can also be seen through Histogram.
Figure 1: Histogram Final Consumption Expenditure
This is the Histogram for Final Consumption Expenditure showing skewness of the data while the mean value is found to be in the smallest class of the distribution showing that the data is highly dispersed from the average calculated (Altman, 1980).
Figure 2: Histogram Retail turnover per Capita
It can be seen that the data is highly dispersed and is positively skewed while the variation from mean is also evident to be high as the mean calculated in the fourth class while the data is highly dispersed around this value.
Question#02 b
In order to see the association between the variables under consideration, a scatter plot has been generated using MS Excel. It is given below.
The relationship between the considered variable is assessed while considering the outcome impact that can be generated if Per Capita Retail Turn Over is changed therefore it is considered to be the Independent Variable or “X” while the “Total Consumption Expenditure” is considered to be the Dependent Variable or “Y”. The Scatter plot is drawn to assess the nature of association which is found to be positive as an increase in Retail Turnover results in an increase in Total Consumption Expenditures. Here the “Scatter Plot” has provided the estimated model for the association of variables.
This equation also shows the β to be positive further reinforcing that both variables are positively correlated.
Question 3a
In order to perform simple linear regression, “Data Analysis” tool has been employed of MS Excel. The output for regression is as follows.
Regression Statistics  
Multiple R 
0.972728226 
R Square 
0.946200201 
Adjusted R Square 
0.945810347 
Standard Error 
16443.77857 
Observations 
140 
This is the first table generated and it shows the correlation coefficient termed ere as “Multiple R” which is found to be equal to 0.9727 showing a highly positive relationship between the variables under study. The coefficient of determination represented in the output as R^{2 }(R squared) is found to be 0.9462 depicting that the model is a good fit and almost 95% values fit in the model and almost 95% variations in Final Consumption are caused by the Per Capita Retail Turn Over.
ANOVA  

df 
SS 
MS 
F 
Significance F 
Regression 
1 
6.56273E+11 
6.56E+11 
2427.065345 
1.84889E89 
Residual 
138 
37314903827 
2.7E+08  
Total 
139 
6.93588E+11 
The next table generated by MS Excel is ANOVA. The ANOVA tests the significance of association under the following hypothesis
Null Hypothesis or H0: β≠0
Alternate Hypothesis or Ha: β=0
The test statistic in this case is F which is found to be 2427.06534 and the chance of incurring a value greater or equal to 2427.064 is less than 0 approximately. It can be concluded that coefficient is greater than zero.
Coefficients 
Standard Error 
t Stat 
Pvalue 
Lower 95% 
Upper 95% 
Lower 95.0% 
Upper 95.0%  
Intercept 
53311.13118 
3819.993082 
13.9558 
3.7113E28 
60864.41692 
45757.8 
60864.4 
45757.8 
X Variable 1 
88.92293777 
1.804982799 
49.26525 
1.84889E89 
85.35393894 
92.49194 
85.35394 
92.49194 
The third table tests the significance of Y intercept and slope of the line using t statistic. The tests have been conducted and obtained p values are found to be extremely smaller than α (level of significance). The regression coefficient is found to be 88.92 which mean that one unit change in X will change Y by 88.92 times keeping other things constant.
Queston#2b
The residuals depict the difference between actual values of Y for certain X value and the calculated Y values for that certain point using the linear equation. Using the “Data Analysis” tool pack, the residuals have been calculated and the first rows are shown in the picture below.
Furthermore a scatter plot has been generated using residuals and the independent variable using Data Analysis to assess the fitness of the model.
Figure 3: Residual plot
The residual plot illustrates a reasonably non random pattern as initial and the last values are positive while the middle values are negative. This ushaped graph depicts that the simple linear model utilized is providing an unreasonable fit to the given data.
Autocorrelation refers to statistical illustration of the level of resemblance among a specified time series and a delayed account of itself over succeeding intervals of time (Golberg & Cho, 2004). Since the data employed is time series, there are much chances of autocorrelation to exist, therefore, Durbin Watson test has been conducted using MS Excel. The results obtained show positive correlation to exist amongst the residuals. The Durbin Watson test statistic is found to be equal to 0.9354.
DurbinWatson Calculations  
Sum of Squared Difference of Errors 
130.0191 
Sum of Squared Errors 
139.0000 
DurbinWatson Statistic 
0.9354 
4. Hypothesis Testing
In order to ascertain the significance of association between the variables under consideration, hypothesis testing is performed using the sample correlation coefficient denoted by “r”. This will help in concluding if the sample coefficient is a significant predictor of population coefficient of correlation denoted by “ρ”.
Hypothesis
The following null and alternated hypothesis are drawn for the purpose
Null hypothesis: H0: ρ = 0
Alternate hypothesis: Ha: ρ ≠ 0
Significance Level
In order to test our quoted hypothesis, 0.05 level of significance or α has been selected.
Decision Rule
In order to reject the hypothesis, we need to have a pvalue that is fairly greater than the level of significance chosen.
Test Statistic
In order to check the significance t test has been conducted using the given formula.
In the formula N shows the sample size and r is the sample’s correlation coefficient
The t statistic is found to be equal to 49.2669. Now in order to conclude we need to get the p value associated with the calculated t statistic. The p value has been found to be equal to 0.00001 fairly lesser than our significance level.
Conclusion
Since the obtained p value is lesser than the selected significance level, we cannot reject our H_{0} and thus it can be concluded that that the final consumption expenditure is positively correlated with the per capita retail turnover of the whole population.
Using the regression outputs we will assess the fitness of the model.
R Square 0.946200201
Adjusted R Square 0.945810347
The R square and the R adjusted values are somewhat similar in the given case and are strongly showing that the model is a good fit to the provided data. As approximately 94% of the variations in Final Consumption expenditure is explained through this model.
References
Abs.gov.au. (2018). [Online] Available at: http://www.abs.gov.au/AUSSTATS/abs@.nsf/viewcontent?readform&view=ProductsbyCatalogue&Action=expandwithheader&Num=1 [Accessed 24 Sep. 2018].
Altman, D. G. (1980). “Statistics and ethics in medical research. VI  Presentation of results”, British Medical Journal, 281(6254), pp.15421544.
Golberg, M. A. & Cho, H. A. (2004). Introduction to Regression Analysis.WIT Press.