# Computer applications assignment problem

COMPUTER APPLICATIONS ASSIGNMENT PROBLEM

Question# 01

The data has been extracted using the provided link for the variables under consideration. The data source is Australian Bureau of Statistics. Screen Shots of first and last ten rows of data is as follows (Abs.gov.au, 2018).

Question#02 a

Using “DATA ANALYSIS” tool pack of MS Excel, Descriptive statistics summary output has been generated and is listed below.

 Retail turnover per capita - Current Prices ; Total (State) ; FINAL CONSUMPTION EXPENDITURE: Current prices ; Mean 1971.332143 Mean 121985.5143 Standard Error 65.3066091 Standard Error 5970.069937 Median 1878.8 Median 102630 Mode #N/A Mode #N/A Standard Deviation 772.7182196 Standard Deviation 70638.82012 Sample Variance 597093.4469 Sample Variance 4989842908 Kurtosis -1.224812391 Kurtosis -1.026668164 Skewness 0.152557367 Skewness 0.519449274 Range 2809.3 Range 241940 Minimum 733.6 Minimum 29251 Maximum 3542.9 Maximum 271191 Sum 275986.5 Sum 17077972 Count 140 Count 140 Confidence Level (95.0%) 129.1227708 Confidence Level (95.0%) 11803.88912

According to the descriptive statistics summary generated through MS Office, the measures of central tendency for the provided data can be explained as the arithmetic mean for final consumption expenditures is found to be equal to 121985.5143 \$ while the Mean Retail turnover per capita is found to be equal to 1971.332143 \$. The median is found to be equal to \$102630 final consumption expense while it is found to be equal to \$ 1878.8 for per capita retail turn over. Data is found to be positively skewed. The variation of data as shown by output is found to be very high for Final Consumption Expenditure while relatively low for Per Capita Retail Turn Over showing that there is high diversion in data points from the average value obtained. This can also be seen through Histogram.

Figure 1: Histogram Final Consumption Expenditure

This is the Histogram for Final Consumption Expenditure showing skewness of the data while the mean value is found to be in the smallest class of the distribution showing that the data is highly dispersed from the average calculated (Altman, 1980).

Figure 2: Histogram Retail turnover per Capita

It can be seen that the data is highly dispersed and is positively skewed while the variation from mean is also evident to be high as the mean calculated in the fourth class while the data is highly dispersed around this value.

Question#02 b

In order to see the association between the variables under consideration, a scatter plot has been generated using MS Excel. It is given below.

The relationship between the considered variable is assessed while considering the outcome impact that can be generated if Per Capita Retail Turn Over is changed therefore it is considered to be the Independent Variable or “X” while the “Total Consumption Expenditure” is considered to be the Dependent Variable or “Y”. The Scatter plot is drawn to assess the nature of association which is found to be positive as an increase in Retail Turnover results in an increase in Total Consumption Expenditures. Here the “Scatter Plot” has provided the estimated model for the association of variables.

This equation also shows the β to be positive further reinforcing that both variables are positively correlated.

Question 3a

In order to perform simple linear regression, “Data Analysis” tool has been employed of MS Excel. The output for regression is as follows.

 Regression Statistics Multiple R 0.972728226 R Square 0.946200201 Adjusted R Square 0.945810347 Standard Error 16443.77857 Observations 140

This is the first table generated and it shows the correlation coefficient termed ere as “Multiple R” which is found to be equal to 0.9727 showing a highly positive relationship between the variables under study. The coefficient of determination represented in the output as R2 (R squared) is found to be 0.9462 depicting that the model is a good fit and almost 95% values fit in the model and almost 95% variations in Final Consumption are caused by the Per Capita Retail Turn Over.

 ANOVA df SS MS F Significance F Regression 1 6.56273E+11 6.56E+11 2427.065345 1.84889E-89 Residual 138 37314903827 2.7E+08 Total 139 6.93588E+11

The next table generated by MS Excel is ANOVA. The ANOVA tests the significance of association under the following hypothesis

Null Hypothesis or H0: β≠0

Alternate Hypothesis or Ha: β=0

The test statistic in this case is F which is found to be 2427.06534 and the chance of incurring a value greater or equal to 2427.064 is less than 0 approximately. It can be concluded that coefficient is greater than zero.

 Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0% Intercept -53311.13118 3819.993082 -13.9558 3.7113E-28 -60864.41692 -45757.8 -60864.4 -45757.8 X Variable 1 88.92293777 1.804982799 49.26525 1.84889E-89 85.35393894 92.49194 85.35394 92.49194

The third table tests the significance of Y intercept and slope of the line using t statistic. The tests have been conducted and obtained p values are found to be extremely smaller than α (level of significance). The regression coefficient is found to be 88.92 which mean that one unit change in X will change Y by 88.92 times keeping other things constant.

Queston#2b

The residuals depict the difference between actual values of Y for certain X value and the calculated Y values for that certain point using the linear equation. Using the “Data Analysis” tool pack, the residuals have been calculated and the first rows are shown in the picture below.

Furthermore a scatter plot has been generated using residuals and the independent variable using Data Analysis to assess the fitness of the model.

Figure 3: Residual plot

The residual plot illustrates a reasonably non random pattern as initial and the last values are positive while the middle values are negative. This u-shaped graph depicts that the simple linear model utilized is providing an unreasonable fit to the given data.

Autocorrelation refers to statistical illustration of the level of resemblance among a specified time series and a delayed account of itself over succeeding intervals of time (Golberg & Cho, 2004). Since the data employed is time series, there are much chances of autocorrelation to exist, therefore, Durbin Watson test has been conducted using MS Excel. The results obtained show positive correlation to exist amongst the residuals. The Durbin Watson test statistic is found to be equal to 0.9354.

 Durbin-Watson Calculations Sum of Squared Difference of Errors 130.0191 Sum of Squared Errors 139.0000 Durbin-Watson Statistic 0.9354

4. Hypothesis Testing

In order to ascertain the significance of association between the variables under consideration, hypothesis testing is performed using the sample correlation coefficient denoted by “r”. This will help in concluding if the sample coefficient is a significant predictor of population coefficient of correlation denoted by “ρ”.

Hypothesis

The following null and alternated hypothesis are drawn for the purpose

Null hypothesis: H0: ρ = 0

Alternate hypothesis: Ha: ρ ≠ 0

Significance Level

In order to test our quoted hypothesis, 0.05 level of significance or α has been selected.

Decision Rule

In order to reject the hypothesis, we need to have a p-value that is fairly greater than the level of significance chosen.

Test Statistic

In order to check the significance t test has been conducted using the given formula.

In the formula N shows the sample size and r is the sample’s correlation coefficient

The t statistic is found to be equal to 49.2669. Now in order to conclude we need to get the p value associated with the calculated t statistic. The p value has been found to be equal to 0.00001 fairly lesser than our significance level.

Conclusion

Since the obtained p value is lesser than the selected significance level, we cannot reject our H0 and thus it can be concluded that that the final consumption expenditure is positively correlated with the per capita retail turnover of the whole population.

Using the regression outputs we will assess the fitness of the model.

R Square 0.946200201