# Numerical Descriptive Techniques Sample Assignment

MULTIPLE CHOICE QUESTIONS

1. Which of the following statements about the arithmetic mean is not always correct?

a. The sum of the deviations from the mean is zero

b. Half of the observations are on either side of the mean

c. The mean is a measure of the middle (center) of a distribution

d. The value of the mean times the number of observations equals the sum of all of the observations

2. In a histogram, the proportion of the total area which must be to the left of the median is:

a. exactly 0.50

b. less than 0.50 if the distribution is skewed to the left

c. more than 0.50 if the distribution is skewed to the right

d. between 0.25 and 0.60 if the distribution is symmetric and unimodal

3. Which of the following statements is true?

a. The sum of the deviations from the arithmetic mean is always zero

b. The sum of the squared deviations from the arithmetic mean is always zero

c. The arithmetic mean is always less than the geometric mean

d. The standard deviation is always smaller than the variance

e. The distance between the first and third quartiles is twice the distance between the first quartile and the median.

4. If two data sets have the same range:

a. the distances from the smallest to largest observations in both sets will be the same

b. the smallest and largest observations are the same in both sets

c. both sets will have the same mean

d. both sets will have the same interquartile range

5. Which of the following statements is true?

a. When the distribution is skewed to the left, mean > median > mode

b. When the distribution is skewed to the right, mean < median < mode

c. When the distribution is symmetric and unimodal, mean = median = mode

d. When the distribution is symmetric and bimodal, mean = median = mode

6. In a histogram, the proportion of the total area which must be to the right of the mean is

a. less than 0.50 if the distribution is skewed to the left

b. exactly 0.50

c. more than 0.50 if the distribution is skewed to the right

d. exactly 0.50 if the distribution is symmetric and unimodal

e. exactly 1.0 if the distribution is symmetric and bimodal

7. A sample of 20 observations has a standard deviation of 3. The sum of the squared deviations from the sample mean is:

a. 20

b. 23

c. 29

d. 60

e. 171

8. Which measure of central location is meaningful when the data are nominal?

a. The arithmetic mean

b. The geometric mean

c. The median

d. The mode

9. Which of the following are measures of the linear relationship between two variables?

a. The covariance

b. The coefficient of correlation

c. The variance

d. Both a and b

10. A perfect straight line sloping downward would produce a correlation coefficient equal to

a. +1

b. –1

c. +2

d. –2

11. Generally speaking, if two variables are unrelated (as one increases, the other shows no pattern), the covariance will be

a. a large positive number

b. a large negative number

c. a positive or negative number close to zero

d. none of the above

12. Which measure of central location is appropriate whenever we wish to find the average growth rate, or rate of change, in a variable over time?

a. The arithmetic mean

b. The geometric mean

c. The median

d. The mode

13. Which measure of central location is appropriate whenever we wish to estimate the expected mean return, or growth rate, for a single year in the future?

a. The arithmetic mean

b. The geometric mean

c. The median

d. The mode

14. Chebyshev’s Theorem states that the percentage of measurements in a data set that fall within three standard deviations of their mean is:

a. 75%

b. at least 75%

c. 89%

d. at least 89%

15. The Empirical Rule states that the approximate percentage of measurements in a data set (providing that the data set has a bell-shaped distribution) that fall within two standard deviations of their mean is approximately:

a. 68%

b. 75%

c. 95%

d. 99%

16. Since the population is always larger than the sample, the population mean:

a. is always larger than the sample mean

b. is always smaller than the sample mean

c. is always larger than or equal to the sample mean

d. is always smaller than or equal to the sample mean

e. can be smaller than, or larger than, or equal to the sample mean

17. Which of the following summary measures is affected most by outliers?

a. The median

b. The geometric mean

c. The range

d. The interquartile range

e. All of the above

18. Which of the following summary measures cannot be easily approximated from a box-and-whisker plot?

a. The range

b. The interquartile range

c. The second quartile

d. The standard deviation

e. All of the above

19. The average score for a class of 30 students was 75. The 20 male students in the class averaged 70. The 10 female students in the class averaged:

a. 75

b. 85

c. 65

d. 70

e. 80

20. Which of the following is not a measure of variability?

a. The range

b. The variance

c. The arithmetic mean

d. The standard deviation

e. The interquartile range

21. The length of the box in the box-and-whisker plot portrays the:

a. median

b. interquartile range

c. range

d. third quartile

22. Expressed in percentiles, the interquartile range is the difference between the

a. 10% and 60% values

b. 45% and 95% values

c. 25% and 75% values

d. 15% and 65% values

23. Which of the following statements is true for the following data values: 7, 5, 6, 4, 7, 8, and 12?

a. The mean, median and mode are all equal

b. Only the mean and median are equal

c. Only the mean and mode are equal

d. Only the median and mode are equal

TRUE/FALSE QUESTIONS

24. Two classifications of statistical descriptions are measures of central location and measures of variability.

25. The mean is one of the most frequently used measures of variability.

26. Measures of variability describe typical values in the data.

27. Lily has been keeping track of what she spends to eat out. The last week's expenditures for meals eaten out were \$5.69, \$5.95, \$6.19, \$10.91, \$7.49, \$14.53, and \$7.66. The mean amount Lily spends on meals is \$8.35.

28. A data sample has a mean of 107, a median of 122, and a mode of 134. The distribution of the data is positively skewed.

29. A student scores 87, 73, 92, and 86 on four exams during the semester and 95 on the final exam. If the final is weighted double and the four others weighted equally, the student's final average would be 90.

30. In a bell-shaped distribution, there is no difference in the values of the mean, median, and mode.

31. The mean is a measure of the deviation in a data set.

32. In a positively skewed distribution, the mean is larger than the median and the median is larger than the mode.

33. In a negatively skewed distribution, the mean is smaller than the median and the median is smaller than the mode.

34. The difference between the largest and smallest values in an ordered array is called the range.

35. Quartiles divide the values in a data set into four parts of equal size.

36. The range is considered the weakest measure of variability.

37. The coefficient of variation allows us to compare two sets of data based on different measurement units.

38. The standard deviation will always exceed the variance.

39. Chebyshev's theorem applies only to data sets that have a mound-shaped distribution.

40. The interquartile range is found by taking the difference between the 1st and 3rd quartiles and dividing that value by 2.

41. The standard deviation is expressed in terms of the original units of measurement but the variance is not.

42. The value of the standard deviation may be either positive or negative, while the value of the variance will always be positive.

43. While Chebyshev’s theorem applies to any distribution, regardless of shape, the empirical rule applies only to distributions that are bell-shaped and symmetrical.

44. The mean of fifty sales receipts is \$65.75 and the standard deviation is \$10.55. Using Chebyshev's theorem, 75% of the sales receipts were between \$44.65 and \$86.85.

45. The median of a set of data would be more representative than the mean of that data when the average of the data values is larger than most of the values.

46. If the coefficient of correlation $r = 1$, then the best-fit linear equation will actually include all of the data points.

47. The standard deviation is the positive square root of the variance.

48. According to Chebyshev’s theorem, at least 93.75% of observations should fall within 4 standard deviations of the mean.

49. The coefficient of correlation r is a number that indicates the direction and the strength of the relationship between the dependent variable y and the independent variable x.

50. If the coefficient of correlation $r = 0$, then there is no linear relationship whatsoever between the dependent variable y and the independent variable x.

51. Chebyshev’s Theorem states that the percentage of observations in a data set that should fall within five standard deviations of their mean is at least 96%.

52. The Empirical Rule states that the percentage of observations in a data set (providing that the data set has a bell-shaped and symmetric distribution) that fall within one standard deviation of their mean is approximately 75%.

53. Since the sample is always smaller than the population, the sample mean is always smaller than the population mean.

54. The length of the box in the box-and-whisker plot portrays the interquartile range.

55. Expressed in percentiles, the interquartile range is the difference between the 25th and 75th percentiles.

56. A sample of 15 observations has a standard deviation of 4. The sum of the squared deviations from the sample mean is 60.

57. A perfect straight line sloping upward would produce a correlation coefficient value of one.

58. When the standard deviation is expressed as a percentage of the mean, the result is the coefficient of correlation.

59. The value of the mean times the number of observations equals the sum of all of the observations.

60. In a histogram, the proportion of the total area which must be to the left of the median is less than 0.50 if the distribution is skewed to the left.

61. In a histogram, the proportion of the total area which must be to the left of the median is more than 0.50 if the distribution is skewed to the right.

62. If two data sets have the same range, the distances from the smallest to largest observations in both sets will be the same.

63. In a histogram, the proportion of the total area which must be to the right of the mean is exactly 0.50 if the distribution is symmetric and unimodal.

64. The variance is a measure of the linear relationship between two variables.

65. Generally speaking, if two variables are unrelated, the covariance will be a positive or negative number close to zero.

66. The value of the mean times the number of observations equals the sum of all of the observations.

67. The sum of the deviations from the arithmetic mean is always zero.

68. The sample mean is a measure of spread (dispersion).

69. Percentiles can be converted into quintiles and deciles, where quintiles divide the data into fifths and deciles divide the data into tenths.

70. Expressed in quintiles, the interquartile range is the difference between the 1st and 3rd quintiles.

TEST QUESTIONS

71. Monthly rent data in dollars for a sample of 10 one-bedroom apartments in a small town in Iowa are given below:

220 216 220 205 210 240 195 235 204 250

a. Compute the sample monthly average rent

b. Compute the sample median

c. What is the mode?

a. \$219.50

b. \$218

c. \$220

72. A sample of 25 families were asked how many pets they owned. Their responses are summarized in the following table.

| Number of Pets | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| Number of Families | 3 | 10 | 5 | 4 | 2 | 1 |

a. Determine the mean, the median, and the mode of the number of pets owned per family.

b. Describe briefly what each statistic in part (a) tells you about the data.

a. Mean = 1.80 pets, median = 1 pet, mode = 1 pet.

b. The “average” number of pets owned was 1.80 pets. Half the families own at most one pet, and the other half own at least one pet. The most frequent number of pets owned was one pet.
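The statistics in Question 72 can be checked by expanding the frequency table into individual observations. The sketch below is only an illustration; the names `pet_counts` and `observations` are made up for it and use Python's standard `statistics` module.

```python
from statistics import mean, median, mode

# Frequency table from Question 72: number of pets -> number of families.
pet_counts = {0: 3, 1: 10, 2: 5, 3: 4, 4: 2, 5: 1}

# Expand the table into one observation per family (25 values in total).
observations = [pets for pets, families in pet_counts.items() for _ in range(families)]

pets_mean = mean(observations)      # frequency-weighted average of the pet counts
pets_median = median(observations)  # 13th value of the 25 ordered observations
pets_mode = mode(observations)      # most frequent pet count

print(pets_mean, pets_median, pets_mode)
```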

73. Suppose that a firm’s sales were \$2,500,000 four years ago, and sales have grown annually by 25%, 15%, -5%, and 10% since that time. What was the geometric mean growth rate in sales over the past four years?

If $R_g$ is the geometric mean, then

$(1+R_g)^4 = (1+0.25)(1+0.15)(1-0.05)(1+0.10) = 1.5022$, so $R_g = \sqrt[4]{1.5022} - 1 = 0.1071$, or 10.71%.

74. What are the relative magnitudes of the mean, median, and mode for a unimodal distribution that is

a. symmetrical?

b. skewed to the left?

c. skewed to the right?

a. mean = median = mode

b. mean < median < mode

c. mode < median < mean

75. Suppose that a firm’s sales were \$3,750,000 five years ago and are \$5,250,000 today. What was the geometric mean growth rate in sales over the past five years?

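If $R_g$ is the geometric mean, then

$3{,}750{,}000(1+R_g)^5 = 5{,}250{,}000$, so $R_g = \sqrt[5]{5{,}250{,}000/3{,}750{,}000} - 1 = \sqrt[5]{1.4} - 1 = 0.0696$, or 6.96%.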
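The two geometric mean calculations (Question 73 from a list of annual rates, Question 75 from a starting and an ending value) can be sketched in Python as follows. The function names `geometric_mean_rate` and `rate_from_endpoints` are hypothetical helpers introduced only for this illustration.

```python
import math

def geometric_mean_rate(rates):
    """Geometric mean growth rate for a list of per-period rates (as decimals)."""
    growth = math.prod(1 + r for r in rates)   # total growth factor over all periods
    return growth ** (1 / len(rates)) - 1

def rate_from_endpoints(start, end, periods):
    """Geometric mean growth rate implied by a start value and an end value."""
    return (end / start) ** (1 / periods) - 1

q73 = geometric_mean_rate([0.25, 0.15, -0.05, 0.10])
q75 = rate_from_endpoints(3_750_000, 5_250_000, 5)
print(f"{q73:.4f} {q75:.4f}")
```

Note that the starting sales figure in Question 73 is not needed: only the growth factors enter the calculation.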

76. A basketball player has the following points for seven games: 20, 25, 32, 18, 19, 22, 30. Compute the following measures of central location and variability:

a. mean

b. median

c. standard deviation

d. coefficient of variation

a. $\bar{x} = 23.714$

b. median = 22.0

c. $s = 5.499$

d. cv = 0.232
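A minimal sketch for checking the four measures in Question 76 with Python's standard `statistics` module. It assumes the sample (n – 1) form of the standard deviation, which is what `stdev` computes.

```python
from statistics import mean, median, stdev

points = [20, 25, 32, 18, 19, 22, 30]

x_bar = mean(points)   # arithmetic mean
med = median(points)   # middle value of the ordered data
s = stdev(points)      # sample standard deviation (divides by n - 1)
cv = s / x_bar         # coefficient of variation

print(f"mean={x_bar:.3f} median={med} s={s:.3f} cv={cv:.3f}")
```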

QUESTIONS 77 THROUGH 79 ARE BASED ON THE FOLLOWING INFORMATION:

The following data represent the number of children in a sample of 10 families from a certain community:

4 2 1 1 5 3 0 1 0 2

77. a. Compute the mean

b. Compute the median

a. 1.90

b. 1.5

78. a. Compute the range

b. Compute the variance

c. Compute the standard deviation

a. 5

b. $s^2$ = 2.77

c. $s$ = 1.66

79. Compute the coefficient of variation

cv = $s/\bar{x}$ = 1.66/1.90 = 0.87

QUESTIONS 80 THROUGH 93 ARE BASED ON THE FOLLOWING INFORMATION:

The following data represent the weights in pounds of a sample of 25 workers:

145 168 163 162 174 152 156 168 154 151

174 146 134 140 171

80. Construct a stem and leaf display for the weights.

| Stem | Leaf |
|---|---|
| 13 | 47 |
| 14 | 0568 |
| 15 | 124667 |
| 16 | 2345889 |
| 17 | 123447 |

81. Find the median weight.

Median = 162 pounds

82. Determine the location and value of the lower quartile of the weights.

$L_{25} = (25+1)\frac{25}{100} = 6.5$,

Value of $Q_1$ = 148 + 0.50(151 – 148) = 149.5

83. Determine the location and value of the upper quartile of the weights.

$L_{75} = (25+1)\frac{75}{100} = 19.5$,

Value of $Q_3$ = 169 + 0.50(171 – 169) = 170

84. Determine the location and value of the 60th percentile of the weights.

$L_{60} = (25+1)\frac{60}{100} = 15.6$,

Value of the 60th percentile = 164 + 0.60(165 – 164) = 164.6
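The location-and-interpolation method used in Questions 82 through 84 can be sketched as a small function. This is an illustration under stated assumptions: `percentile` is a hypothetical helper, the location formula is assumed to be (n + 1)P/100, and the 25 weights are read off the stem and leaf display in Question 80.

```python
def percentile(data, p):
    """Percentile via the location (n + 1) * p / 100 with linear interpolation."""
    values = sorted(data)
    n = len(values)
    location = (n + 1) * p / 100
    # Clamp so that very small or very large p still return an observed value.
    location = min(max(location, 1), n)
    lower = int(location)          # whole part: position of the lower neighbour
    fraction = location - lower    # fractional part: how far to move toward the next value
    if lower == n:
        return values[-1]
    return values[lower - 1] + fraction * (values[lower] - values[lower - 1])

# Weights as read from the stem and leaf display (Question 80).
weights = [134, 137, 140, 145, 146, 148, 151, 152, 154, 156, 156, 157, 162,
           163, 164, 165, 168, 168, 169, 171, 172, 173, 174, 174, 177]

q1 = percentile(weights, 25)
q3 = percentile(weights, 75)
p60 = percentile(weights, 60)
print(q1, q3, round(p60, 1))
```

Other software may use a different interpolation convention, so library percentile functions will not necessarily reproduce these values.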

85. Compute the sample mean weight.

$\bar{x}$ = 159.04

86. Compute the sample variance and sample standard deviation.

$s^2$ = 156.12, and $s$ = 12.49

87. Compute the range and interquartile range of the data.

Range = 43,

IQR = $Q_3 - Q_1$ = 170 – 149.5 = 20.5

88. Construct a frequency distribution for the data, using five class intervals and the value 130 as the lower limit of the first class.

| Class Limits | Frequency |
|---|---|
| 130 up to 140 | 2 |
| 140 up to 150 | 4 |
| 150 up to 160 | 6 |
| 160 up to 170 | 7 |
| 170 up to 180 | 6 |
| Total | 25 |

89. Construct a relative frequency histogram for the data, using five class intervals and the value 130 as the lower limit of the first class.

90. Construct a box plot for the weights.

91. Are there any outliers?

IQR = 20.5; there are no outliers.
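The outlier check can be made explicit with fences. The sketch below assumes the common 1.5 × IQR rule (the answer above does not state which rule it uses) and takes the quartiles and extreme values from the answers to Questions 80 through 87.

```python
# Quartiles and extremes taken from the answers to Questions 80 through 87.
q1, q3 = 149.5, 170.0
smallest, largest = 134, 177

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr   # observations below this are flagged as outliers
upper_fence = q3 + 1.5 * iqr   # observations above this are flagged as outliers

# If even the extreme observations lie inside the fences, nothing is flagged.
has_outliers = smallest < lower_fence or largest > upper_fence
print(lower_fence, upper_fence, has_outliers)
```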

92. Compare the information regarding skewness conveyed by your box plot in Question 90 with that of the histogram constructed in Question 89.

The box plot and the histogram both indicate negative skewness.

93. Calculate the 3rd and 7th deciles of the data.

Location of 3rd decile = $(25+1)\frac{30}{100}$ = 7.8,

Value of 3rd decile = 151 + 0.80(152 – 151) = 151.80

Location of 7th decile = $(25+1)\frac{70}{100}$ = 18.2,

Value of 7th decile = 168 + 0.20(169 – 168) = 168.20

94. Is it possible for the standard deviation of a data set to be larger than its variance? Explain.

Yes. A standard deviation is larger than its corresponding variance when the variance is between 0 and 1 (exclusive).

QUESTIONS 95 THROUGH 97 ARE BASED ON THE FOLLOWING INFORMATION:

Data from three samples are shown below:

Sample A: 17 22 20 18 23

Sample B: 30 28 35 40 25

Sample C: 44 39 54 21 52

95. Examine the three samples. Without performing any calculations, indicate which sample has the largest amount of variability and which sample has the least amount of variability.

Sample C has the largest variability, with values ranging from 21 to 54. Sample A has the least variability, with all values close to 20.

$\sum (x_i - \bar{x})$ equals zero for each of the three samples. $\sum (x_i - \bar{x}) = 0$ is always true.

97. Calculate the variance and the range for the three samples.

$s^2$ = 6.50, 35.3, and 174.5 for samples A, B, and C, respectively.

Range = 6, 15, and 33 for samples A, B, and C, respectively.
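The deviation sums, variances, and ranges for the three samples can be recomputed directly. This is a minimal sketch assuming sample variances (division by n – 1); the dictionary names are illustrative only.

```python
from statistics import variance

samples = {
    "A": [17, 22, 20, 18, 23],
    "B": [30, 28, 35, 40, 25],
    "C": [44, 39, 54, 21, 52],
}

results = {}
for name, values in samples.items():
    x_bar = sum(values) / len(values)
    results[name] = {
        # Sum of deviations from the mean: zero up to floating-point rounding.
        "deviation_sum": sum(x - x_bar for x in values),
        "variance": variance(values),          # sample variance, divides by n - 1
        "range": max(values) - min(values),    # largest minus smallest observation
    }
    print(name, results[name])
```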

98. Suppose that the following data provide the hours of television viewing per week for a sample of 15 high school students in Grand Rapids, Michigan:

5 11 25 19 18 20 27 13

8 10 15 19 18 9 12

a. Determine the location and value of the first, second, and third quartiles.

b. Calculate the interquartile range.

c. Interpret the value of the interquartile range.

a. $L_{25}$ = 4, value of $Q_1$ = 10; $L_{50}$ = 8, value of $Q_2$ = 15; $L_{75}$ = 12, value of $Q_3$ = 19

b. IQR = $Q_3 - Q_1$ = 19 – 10 = 9

c. The middle 50% of television viewing hours are between 10 and 19 hours.

99. The number of hours a college student spent studying during the final exam week was recorded as follows:

7 6 4 9 8 5 10

Compute the range, $\bar{x}$, $s^2$, and $s$ for these data. Express each number in appropriate units.

Range = 6 hours, $\bar{x}$ = 7 hours, $s^2$ = 4.667 hours$^2$, $s$ = 2.16 hours

100. The annual percentage rates of return over the past 10 years for two mutual funds are as follows:

Fund A: 7.1 -7.4 19.7 -3.9 32.4 41.7 23.2 4.0 1.9 29.3

Fund B: 10.8 -4.1 5.1 10.9 26.5 24.0 16.9 9.4 -2.6 10.1

Which fund would you classify as having the higher level of risk?

Variance of returns will be used as the measure of risk of an investment. Since $s_A^2 = 280.34 > s_B^2 = 99.37$, fund A has the higher level of risk.
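The comparison of the two funds can be reproduced with a few lines of Python. The sketch assumes sample variances (division by n – 1) as the risk measure, in line with the answer above.

```python
from statistics import variance

fund_a = [7.1, -7.4, 19.7, -3.9, 32.4, 41.7, 23.2, 4.0, 1.9, 29.3]
fund_b = [10.8, -4.1, 5.1, 10.9, 26.5, 24.0, 16.9, 9.4, -2.6, 10.1]

var_a = variance(fund_a)   # sample variance of fund A's returns
var_b = variance(fund_b)   # sample variance of fund B's returns

# The fund with the larger variance of returns is treated as the riskier one.
riskier = "A" if var_a > var_b else "B"
print(f"{var_a:.2f} {var_b:.2f} {riskier}")
```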

QUESTIONS 101 THROUGH 116 ARE BASED ON THE FOLLOWING INFORMATION:

The following data represent the ages in years of a sample of 25 employees from a government department:

31 43 56 23 49 42 33 61 44 28

48 38 44 35 40 64 52 42 47 39

53 27 36 35 20

101. Construct a stem and leaf display for the ages.

| Stem | Leaf |
|---|---|
| 2 | 0378 |
| 3 | 1355689 |
| 4 | 022344789 |
| 5 | 236 |
| 6 | 14 |

102. Find the median age.

Median = 42 years

103. Find the lower quartile of the ages.

$L_{25} = (25+1)\frac{25}{100} = 6.5$,

Value of $Q_1$ = 33 + 0.50(35 – 33) = 34 years

104. Find the upper quartile of the ages.

$L_{75} = (25+1)\frac{75}{100} = 19.5$,

Value of $Q_3$ = 48 + 0.50(49 – 48) = 48.50 years

105. Find the 60th percentile of the ages.

$L_{60} = (25+1)\frac{60}{100} = 15.6$,

Value of the 60th percentile = 43 + 0.60(44 – 43) = 43.6 years

106. Compute the range and interquartile range of the data.

Range = 44 years

IQR = $Q_3 - Q_1$ = 48.5 – 34 = 14.5 years

107. Compute the sample mean age.

$\bar{x}$ = 41.2 years

108. Compute the sample variance and sample standard deviation.

$s^2$ = 124.83, and $s$ = 11.17 years

109. Calculate the 4th decile of the data.

Location of 4th decile = $(25+1)\frac{40}{100}$ = 10.4,

Value of 4th decile = 38 + 0.40(39 – 38) = 38.40

110. Compute the 8th decile of the data.

Location of 8th decile = $(25+1)\frac{80}{100}$ = 20.8,

Value of 8th decile = 49 + 0.80(52 – 49) = 51.4

111. Calculate the 1st quintile.

Location of 1st quintile = $(25+1)\frac{20}{100}$ = 5.2,

Value of 1st quintile = 31 + 0.20(33 – 31) = 31.40

112. Calculate the 2nd quintile.

Location of 2nd quintile = $(25+1)\frac{40}{100}$ = 10.4,

Value of 2nd quintile = 38 + 0.40(39 – 38) = 38.40

113. Construct a box plot for the ages and identify any outliers.

There are no outliers.

114. Construct a relative frequency distribution for the data, using five class intervals and the value 20 as the lower limit of the first class.

| Class Limits | Relative Frequency |
|---|---|
| 20 up to 30 | 0.16 |
| 30 up to 40 | 0.28 |
| 40 up to 50 | 0.36 |
| 50 up to 60 | 0.12 |
| 60 up to 70 | 0.08 |
| Total | 1.00 |
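The relative frequency distribution in Question 114 can be built by counting how many ages fall in each class. This is a minimal sketch; it assumes each class includes its lower limit and excludes its upper limit ("20 up to 30" covers 20 through 29), and the variable names are illustrative.

```python
ages = [31, 43, 56, 23, 49, 42, 33, 61, 44, 28,
        48, 38, 44, 35, 40, 64, 52, 42, 47, 39,
        53, 27, 36, 35, 20]

lower_limit, width, classes = 20, 10, 5

counts = [0] * classes
for age in ages:
    # Integer division maps an age to its class index (20-29 -> 0, 30-39 -> 1, ...).
    index = (age - lower_limit) // width
    counts[index] += 1

relative = [count / len(ages) for count in counts]
for i, rel in enumerate(relative):
    low = lower_limit + i * width
    print(f"{low} up to {low + width}: {rel:.2f}")
```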

115. Construct a relative frequency histogram for the data, using the relative frequency distribution constructed in Question 114.

116. Compare the information regarding skewness conveyed by your box plot constructed in Question 113 with that of the histogram constructed in Question 115.

The box plot indicates symmetry, while the histogram incorrectly indicates positive skewness. A histogram using a class width of 9 would indicate symmetry.

QUESTIONS 117 THROUGH 124 ARE BASED ON THE FOLLOWING INFORMATION:

The following data represent the salaries (in thousands of dollars) of a sample of 13 employees of a firm:

26.5 23.5 29.7 24.8 21.1 24.3 20.4

22.7 27.2 23.7 24.1 24.8 28.2

117. Compute the mean salary.

$\bar{x}$ = 24.692

118. Compute the median salary.

Median = 24.3

119. Compute the variance and standard deviation of the salaries.

$s^2$ = 7.097, and $s$ = 2.664

120. Compute the coefficient of variation.

cv = 0.108

121. Compute the range.

Range = 9.3

122. Compute the lower quartile.

$L_{25} = (13+1)\frac{25}{100} = 3.5$,

Value of $Q_1$ = 22.7 + 0.50(23.5 – 22.7) = 23.1

123. Compute the upper quartile.

$L_{75} = (13+1)\frac{75}{100} = 10.5$,

Value of $Q_3$ = 26.5 + 0.50(27.2 – 26.5) = 26.85

124. Compute the 90th percentile.

$L_{90} = (13+1)\frac{90}{100} = 12.6$,

Value of the 90th percentile = 28.2 + 0.60(29.7 – 28.2) = 29.1

125. Consider the following population of measurements:

162 152 177 157 184 176 165 181 170 163

a. Compute the mean.

b. Compute the median.

c. Compute the variance and the standard deviation.

a. $\mu$ = 168.7

b. Median = 167.5

c. $\sigma^2$ = 101.61 and $\sigma$ = 10.08

126. A supermarket has determined that daily demand for egg cartons has an approximate mound-shaped distribution, with a mean of 55 cartons and a standard deviation of six cartons.

a. For what percentage of days can we expect the number of cartons of eggs sold to be between 49 and 61?

b. For what percentage of days can we expect the number of cartons of eggs sold to be more than 2 standard deviations from the mean?

c. If the supermarket begins each morning with a stock of 67 cartons of eggs, for what percentage of days will there be an insufficient number of cartons to meet the demand?

a. Approximately 68%

b. Approximately 5%

c. Approximately 2.5%

127. A sample of 12 measurements has a mean of 25 and a standard deviation of 4. Suppose that the sample is enlarged to 14 measurements, by including two additional measurements having common value of 25 each.

a. Find the mean of the sample of 14 measurements.

b. Find the standard deviation of the sample of 14 measurements.

a. 25

b. 3.679
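The reasoning behind Question 127 is that the summary statistics determine the total and the sum of squared deviations of the original sample. A hedged sketch of that calculation follows; the variable names are illustrative, and the shortcut for the new sum of squares relies on the added values being equal to the old mean.

```python
import math

n_old, mean_old, s_old = 12, 25.0, 4.0
added = [25.0, 25.0]

# Recover the totals that the summary statistics imply for the original sample.
total_old = n_old * mean_old
ss_old = (n_old - 1) * s_old ** 2   # sum of squared deviations from the mean

n_new = n_old + len(added)
mean_new = (total_old + sum(added)) / n_new

# Valid only because the mean is unchanged (every added value equals the old mean):
# the old deviations keep their values, and the new ones are measured from the same mean.
ss_new = ss_old + sum((x - mean_new) ** 2 for x in added)
s_new = math.sqrt(ss_new / (n_new - 1))

print(mean_new, round(s_new, 3))
```

The sum of squared deviations stays the same while the divisor grows from 11 to 13, which is why the standard deviation shrinks.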

128. The price-earnings ratios of a sample of stocks have a mean value of 13.5 and a standard deviation of 2. If the ratios have a mound-shaped distribution, what can we say about the proportion of ratios that fall between

a. 11.5 and 15.5?

b. 9.5 and 17.5?

c. 7.5 and 19.5?

a. The interval contains approximately 68% of the ratios.

b. The interval contains approximately 95% of the ratios.

c. The interval contains approximately 99.7% of the ratios.

129. The mean of a sample of 15 measurements is 35.6. Suppose that the sample is enlarged to 16 measurements, by including one additional measurement having a value of 42. Find the mean of the sample of 16 measurements.

36

130. Suppose that an analysis of a set of data reveals that $Q_1 = 45$, $Q_2 = 85$, and $Q_3 = 105$.

a. What do these statistics tell you about the shape of the distribution?

b. What can you say about the relative position of each of the observations 34, 84, and 104?

c. Calculate the interquartile range.

d. What does the interquartile range tell you about the data?

a. The fact that $Q_2 - Q_1 = 40$ is greater than $Q_3 - Q_2 = 20$ indicates that the distribution is skewed to the left.

b. Since 34 is less than $Q_1 = 45$, the observation 34 is among the lowest 25% of the values. The value 84 is a bit smaller than the middle value, $Q_2 = 85$. Since $Q_3 = 105$, the value 104 is larger than about 75% of the values.

c. IQR = $Q_3 - Q_1$ = 105 – 45 = 60

d. The middle 50% of the values are between 45 and 105.

131. Given the following sample data

| x | 420 | 610 | 625 | 500 | 400 | 450 | 550 | 650 | 480 | 565 |
|---|---|---|---|---|---|---|---|---|---|---|
| y | 2.8 | 3.6 | 3.75 | 3 | 2.5 | 2.7 | 3.5 | 3.9 | 2.95 | 3.3 |

a. Calculate the covariance and the correlation coefficient.

b. Comment on the relationship between x and y.

c. Determine the least squares line.

d. Draw the scatter diagram and plot the least squares line.

a. $s_{xy} = 41.25$, $r = 0.9788$

b. There is a very strong positive linear relationship between x and y.

c. The least squares line is $\hat{y} = 0.4156 + 0.0053x$.

d.
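The covariance, coefficient of correlation, and least squares coefficients for Question 131 can be computed from their definitions. The sketch below assumes the sample (n – 1) versions of the covariance and variances; the names `sxy`, `b0`, and `b1` are notation chosen for this illustration.

```python
import math

x = [420, 610, 625, 500, 400, 450, 550, 650, 480, 565]
y = [2.8, 3.6, 3.75, 3.0, 2.5, 2.7, 3.5, 3.9, 2.95, 3.3]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sample covariance and sample variances (all divide by n - 1).
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
sx2 = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
sy2 = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)

r = sxy / math.sqrt(sx2 * sy2)   # coefficient of correlation
b1 = sxy / sx2                   # slope of the least squares line
b0 = y_bar - b1 * x_bar          # intercept of the least squares line

print(f"sxy={sxy:.2f} r={r:.4f} y_hat = {b0:.4f} + {b1:.4f}x")
```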

QUESTIONS 132 THROUGH 134 ARE BASED ON THE FOLLOWING INFORMATION:

A sample of eight observations of variables x and y is shown below:

| x | 5 | 3 | 7 | 9 | 2 | 4 | 6 | 8 |
|---|---|---|---|---|---|---|---|---|
| y | 20 | 23 | 15 | 11 | 27 | 21 | 17 | 14 |

132. Calculate the covariance and the coefficient of correlation, and comment on the relationship between x and y.

$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n-1} = \frac{-89}{7} = -12.714$, $r = -0.9911$

There is a very strong (almost perfect) negative linear relationship between x and y.

133. Determine the least squares line, and use it to estimate the value of y for x = 6.

$\hat{y} = 30.155 - 2.119x$.

When x = 6, $\hat{y} = 30.155 - 2.119(6) = 17.44$

134. Draw the scatter diagram and plot the least squares line.

135. How is the value of the correlation coefficient r affected in each of the following cases?

a. each x value is multiplied by 4.

b. each x value is switched with the corresponding y value.

c. each x value is increased by 2.

In parts (a), (b), and (c), the value of the correlation coefficient does not change.
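The invariance claimed in Question 135 can be checked numerically. The sketch below reuses the x and y data from Questions 132 through 134 purely as an example; `correlation` is a hypothetical helper written for this illustration.

```python
import math

def correlation(x, y):
    """Coefficient of correlation computed from sums of squares and cross products."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [5, 3, 7, 9, 2, 4, 6, 8]
y = [20, 23, 15, 11, 27, 21, 17, 14]

r = correlation(x, y)
r_scaled = correlation([4 * a for a in x], y)    # (a) each x multiplied by 4
r_swapped = correlation(y, x)                    # (b) x and y switched
r_shifted = correlation([a + 2 for a in x], y)   # (c) each x increased by 2

print(round(r, 4), round(r_scaled, 4), round(r_swapped, 4), round(r_shifted, 4))
```

Multiplying by a negative constant is the one simple transformation of this kind that does change r: it flips the sign while leaving the magnitude unchanged.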

136. a. Calculate the covariance and the coefficient of correlation for the following sample.

| x | 9 | 6 | 7 | 5 | 8 |
|---|---|---|---|---|---|
| y | 19 | 14 | 16 | 12 | 15 |

b. What do these statistics tell you about the relationship between x and y?