# STAT213 Assignment 6 | Complete Solution

The number of attempts available for each question is noted beside the question. If you are having trouble figuring out your error,
you should consult the textbook, or ask a fellow student, one of the TAs, or your professor for help.
There are also other resources at your disposal, such as the Engineering Drop in Centre and the Mathematics Continuous Tutorials.
Don’t spend a lot of time guessing – it’s not very efficient or effective.
Make sure to give lots of significant digits for (floating point) numerical answers. For most problems, when entering numerical
answers you can enter elementary expressions, such as e^(ln(2)) instead of 2, (2+tan(3))*(4-sin(5))^6-7/8 instead of 27620.3413, etc.
1. (1 pt) Match the following sample correlation coefficients
with the explanation of what that correlation coefficient means.
Type the correct letter in each box.
1. r = -1
2. r = 0
3. r = 0.1
4. r = 0.92
A. a perfect negative relationship between x and y
B. a weak positive relationship between x and y
C. no relationship between x and y
D. a strong positive relationship between x and y

2. (1 pt) Match the correlation coefficients with their scatterplots.
Select the letter of the scatterplot below which corresponds
to the correlation coefficient.
? 1. r = 0.76
? 2. r = -0.97
? 3. r = -0.49
? 4. r = 0.22
[Scatterplots A, B, C, and D are not reproduced here.]

3. (1 pt) Use a scatterplot and the linear correlation coefficient
r to determine whether there is a correlation between the
two variables. (Note: Use software, and don’t forget to look at
the scatterplot!)
x 0 1.4 2.7 3.3 4.1 5.7 6.7 7.6 8.9 9.4 10.7 11.9 12
y 0 1.9 1.6 3.2 4.7 7.3 8.5 9.5 10.7 8.4 11.4 10.5 10
(a) r =
(b) There is
A. a perfect negative correlation between x and y
B. a positive correlation between x and y
C. a perfect positive correlation between x and y
D. a nonlinear correlation between x and y
E. a negative correlation between x and y
F. no correlation between x and y
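Part (a) asks for r from software. A minimal sketch in plain Python, assuming the 13 (x, y) pairs are read from the table in the order shown; the helper name `pearson_r` is illustrative and not part of the assignment:

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x = [0, 1.4, 2.7, 3.3, 4.1, 5.7, 6.7, 7.6, 8.9, 9.4, 10.7, 11.9, 12]
y = [0, 1.9, 1.6, 3.2, 4.7, 7.3, 8.5, 9.5, 10.7, 8.4, 11.4, 10.5, 10]

r = pearson_r(x, y)
print(f"r = {r:.4f}")
```

The value of r alone does not settle part (b); the scatterplot still has to be inspected for curvature, as the note in the question says.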

4. (1 pt)
Keeping water supplies clean requires regular measurement of
levels of pollutants. The measurements are indirect: a typical
analysis involves forming a dye by a chemical reaction with the
dissolved pollutant, then passing light through the solution and
measuring its "absorbance." To calibrate such measurements,
the laboratory measures known standard solutions and uses regression
to relate absorbance and pollutant concentration. This
is usually done every day. Here is one series of data on the absorbance
for different levels of nitrates. Nitrates are measured
in milligrams per liter of water.
Nitrates 100 100 150 150 250 600 800 1200 1500
Absorbance 5.1 7.2 12.6 20.7 46.2 94.6 140.2 195.7 209.2
Chemical theory says that these data should lie on a straight
line. If the correlation is not at least 0.997, something went
wrong and the calibration procedure is repeated.
(a) Find the correlation r.
r =
(b) Must the calibration be done again? (Answer YES or
NO).
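The calibration rule can be checked mechanically. The sketch below assumes the nine nitrate and absorbance values pair up in the order listed; the names `THRESHOLD` and `must_repeat` are illustrative only:

```python
from math import sqrt

nitrates = [100, 100, 150, 150, 250, 600, 800, 1200, 1500]            # mg per liter
absorbance = [5.1, 7.2, 12.6, 20.7, 46.2, 94.6, 140.2, 195.7, 209.2]
THRESHOLD = 0.997  # minimum acceptable correlation stated in the question

n = len(nitrates)
mean_x = sum(nitrates) / n
mean_y = sum(absorbance) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(nitrates, absorbance))
sxx = sum((x - mean_x) ** 2 for x in nitrates)
syy = sum((y - mean_y) ** 2 for y in absorbance)

r = sxy / sqrt(sxx * syy)
must_repeat = r < THRESHOLD  # repeat the calibration if r falls short
print(f"r = {r:.4f}; repeat calibration: {'YES' if must_repeat else 'NO'}")
```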

5. (1 pt) For each problem, select the best response.
(a) You have data for many years on the average price of a
barrel of oil and the average retail price of a gallon of unleaded
regular gasoline. When you make a scatterplot, the explanatory
variable on the x-axis
A. is the price of oil.
B. can be either oil price or gasoline price.
C. is the price of gasoline.
D. None of the above.
(b) What are all the values that a correlation r can possibly
take?
A. -1 ≤ r ≤ 1
B. 0 ≤ r ≤ 1
C. r ≥ 0
D. None of the above.
(c) In a scatterplot of the average price of a barrel of oil and
the average retail price of a gallon of gasoline, you expect to see
A. a positive association.
B. very little association.
C. a negative association.
D. None of the above.

6. (1 pt) For each problem, select the best response.
(a) A researcher wishes to determine whether the rate of
water flow (in liters per second) over an experimental soil bed
can be used to predict the amount of soil washed away (in kilograms).
In this study, the explanatory variable is the
A. depth of the soil bed.
B. amount of eroded soil.
C. size of the soil bed.
D. rate of water flow.
E. None of the above.
(b) The Columbus Zoo conducts a study to determine
whether a household’s income can be used to predict the amount
of money the household will give to the zoo’s annual fund drive.
The response variable in this study is
A. the amount of money a household gives to the zoo’s
annual fund drive.
B. the Columbus Zoo.
C. a household’s income.
D. all households in Columbus.
E. None of the above.
(c) A researcher measures the correlation between two variables.
This correlation tells us
A. whether there is a relation between two variables.
B. the strength of a straight line relation between two
variables.
C. whether a cause-and-effect relation exists between
two variables.
D. whether or not a scatterplot shows an interesting pattern.
E. None of the above.

7. (1 pt) For each problem, select the best response.
(a) Smokers don’t live as long (on the average) as nonsmokers,
and heavy smokers don’t live as long as light smokers. You
regress the age at death of a group of male smokers on the number
of packs per day they smoked. The slope of your regression
line
A. must be between -1 and 1.
B. will be less than zero.
C. will be greater than zero.
D. can’t tell without seeing the data.
(b) The points on a scatterplot lie close to the line whose
equation is y = 4x - 5. The slope of the line is
A. -4
B. 5
C. 9
D. 4
E. None of the above.
(c) Measurements on young children in Mumbai, India,
found this least-squares line for predicting height y from
armspan x:
ŷ = 6.4 + 0.93x
All measurements are in centimeters (cm). How much on
the average does height increase for each additional centimeter
of armspan?
A. 6.4 cm
B. 0.93 cm
C. 7.33 cm
D. 0.64 cm
E. None of the above.

8. (1 pt) A study of king penguins looked for a relationship
between how deep the penguins dive to seek food and how long
they stay underwater. For all but the shallowest dives, there is
a linear relationship that is different for different penguins. The
study report gives a scatterplot for one penguin titled “The relation
of dive duration (DD) to depth (D).” Duration DD is measured
in minutes and depth D is in meters. The report then says,
“The regression equation for this bird is: DD = 2.48 + 0.0035
D.”
(a) What is the slope of the regression line?
(b) According to the regression line, how long does a typical
dive to a depth of 400 meters last?
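Part (b) is a direct substitution into the reported line. A small sketch, where the function name `dive_duration` is an illustrative choice:

```python
INTERCEPT = 2.48   # minutes, from the reported equation
SLOPE = 0.0035     # minutes of dive time per meter of depth

def dive_duration(depth_m):
    """Predicted dive duration (minutes) at a given depth (meters)."""
    return INTERCEPT + SLOPE * depth_m

print(f"{dive_duration(400):.2f} minutes")
```

The line was fitted to all but the shallowest dives, so it should not be trusted for depths near zero.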

9. (1 pt) We have data on the lean body mass and resting
metabolic rate for 12 women who are subjects in a study of dieting.
Lean body mass, given in kilograms, is a person’s weight
leaving out all fat. Metabolic rate, in calories burned per 24
hours, is the rate at which the body consumes energy.
Mass 39.3 36.1 37.7 37.4 44.2 41.9 46 38.2 45.3 46.4 45.3 53.3
Rate 1290 980 1150 900 1230 1050 940 1470 1330 1300 1410 1010
Find the least-squares regression line for predicting metabolic
rate from body mass.
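One way to obtain the line is from the usual closed-form least-squares formulas, slope = Sxy / Sxx and intercept = ȳ - slope · x̄. The sketch below assumes the mass and rate values pair up column by column; `least_squares` is a hypothetical helper name:

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

mass = [39.3, 36.1, 37.7, 37.4, 44.2, 41.9, 46, 38.2, 45.3, 46.4, 45.3, 53.3]   # kg
rate = [1290, 980, 1150, 900, 1230, 1050, 940, 1470, 1330, 1300, 1410, 1010]   # cal / 24 h

a, b = least_squares(mass, rate)
print(f"rate_hat = {a:.2f} + {b:.2f} * mass")
```

A quick sanity check on any least-squares fit is that the residuals sum to zero and the line passes through (x̄, ȳ).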

10. (1 pt) Heights (in centimeters) and weights (in kilograms)
of 7 supermodels are given below. Find the regression equation,
letting the first variable be the independent (x) variable, and predict
the weight of a supermodel who is 167 cm tall.
Height 178 176 166 174 172 168 176
Weight 57 55 47 54 53 50 56
The regression equation is ŷ = ____ + ____ x.
The best predicted weight of a supermodel who is 167 cm
tall is ____.
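A possible way to fill in both blanks, assuming height is x and weight is y as the question states; the variable names `b0` and `b1` are illustrative:

```python
heights = [178, 176, 166, 174, 172, 168, 176]  # cm (x)
weights = [57, 55, 47, 54, 53, 50, 56]         # kg (y)

n = len(heights)
mean_h = sum(heights) / n
mean_w = sum(weights) / n

# Least-squares slope and intercept for weight = b0 + b1 * height
sxy = sum((h - mean_h) * (w - mean_w) for h, w in zip(heights, weights))
sxx = sum((h - mean_h) ** 2 for h in heights)
b1 = sxy / sxx
b0 = mean_w - b1 * mean_h

predicted_weight = b0 + b1 * 167
print(f"y_hat = {b0:.3f} + {b1:.4f} x")
print(f"predicted weight at 167 cm: {predicted_weight:.2f} kg")
```

Since 167 cm lies inside the observed height range (166 to 178 cm), the prediction is an interpolation rather than an extrapolation.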

11. (1 pt) Empathy means being able to understand what others
feel. To see how the brain expresses empathy, researchers
recruited 16 couples in their mid-twenties who were married or
had been dating for at least two years. They zapped the man’s
hand with an electrode while the woman watched, and measured
the activity in several parts of the woman’s brain that would respond
to her own pain. Brain activity was recorded as a fraction
of the activity observed when the woman herself was zapped
with the electrode. The women also completed a psychological
test that measures empathy.
Subject 1 2 3 4 5 6 7
Empathy Score 42 48 39 55 63 66 66
Brain Activity -0.113 0.383 0.006 0.366 0.013 0.4 0.104
Given that the equation for the regression line is ŷ = 0.00539x + 0.04637,
what is the residual for subject 2?
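A residual is the observed value minus the value predicted by the line. A short sketch using the equation given in the question, with empathy score as x and brain activity as y; the function name `residual` is illustrative:

```python
SLOPE = 0.00539
INTERCEPT = 0.04637

empathy = [42, 48, 39, 55, 63, 66, 66]                        # x
brain = [-0.113, 0.383, 0.006, 0.366, 0.013, 0.4, 0.104]      # y

def residual(subject):
    """Observed minus predicted brain activity for a 1-based subject number."""
    i = subject - 1
    predicted = SLOPE * empathy[i] + INTERCEPT
    return brain[i] - predicted

print(f"residual for subject 2: {residual(2):.5f}")
```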

12. (1 pt) A study was conducted to determine whether the
final grade of a student in an introductory psychology course is
linearly related to his or her performance on the verbal ability
test administered before college entrance. The verbal scores and
final grades for 10 students are shown in the table below.
Student Verbal Score x Final Grade y
1 74 100
2 71 75
3 33 63
4 80 79
5 42 86
6 36 92
7 48 85
8 47 68
9 72 93
10 28 87
Find the following:
(a) The correlation coefficient: r =
(b) The least squares line: ŷ =
(c) Calculate the residual for the fourth student:
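All three parts share the same three sums (Sxx, Syy, Sxy), so they can be computed together. A sketch, assuming verbal score is x and final grade is y as labelled in the table:

```python
from math import sqrt

verbal = [74, 71, 33, 80, 42, 36, 48, 47, 72, 28]    # x
grade = [100, 75, 63, 79, 86, 92, 85, 68, 93, 87]    # y

n = len(verbal)
mean_x = sum(verbal) / n
mean_y = sum(grade) / n
sxx = sum((x - mean_x) ** 2 for x in verbal)
syy = sum((y - mean_y) ** 2 for y in grade)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(verbal, grade))

r = sxy / sqrt(sxx * syy)            # (a) correlation coefficient
slope = sxy / sxx                    # (b) least-squares slope
intercept = mean_y - slope * mean_x  # (b) least-squares intercept

# (c) residual = observed - predicted, for student 4 (list index 3)
residual_4 = grade[3] - (intercept + slope * verbal[3])

print(f"r = {r:.4f}")
print(f"y_hat = {intercept:.3f} + {slope:.4f} x")
print(f"residual for student 4 = {residual_4:.3f}")
```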

13. (1 pt) The amounts of 6 restaurant bills and the corresponding
amounts of the tips are given below. Assume
that bill amount is the explanatory variable and tip amount the
response variable.
Bill 64.30 49.72 70.29 106.27 43.58 32.98
Tip 7.70 5.28 10.00 16.00 5.50 4.50
(a) Find the correlation: r =
(b) Does there appear to be a significant correlation?
A. No
B. Yes
(c) The regression equation is ŷ = ____.
(d) If the amount of the bill is \$95, the best prediction for the
amount of the tip is \$____.
(e) According to the regression equation, for every \$10 increase
in the bill, the tip should ____ (enter INCREASE
or DECREASE) by \$____.
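Parts (d) and (e) both follow from the fitted slope: the prediction evaluates the line at 95, and the change per 10-dollar increase is 10 times the slope. A sketch with illustrative variable names:

```python
bills = [64.30, 49.72, 70.29, 106.27, 43.58, 32.98]   # explanatory variable
tips = [7.70, 5.28, 10.00, 16.00, 5.50, 4.50]         # response variable

n = len(bills)
mean_b = sum(bills) / n
mean_t = sum(tips) / n
sxy = sum((b - mean_b) * (t - mean_t) for b, t in zip(bills, tips))
sxx = sum((b - mean_b) ** 2 for b in bills)

slope = sxy / sxx
intercept = mean_t - slope * mean_b

tip_at_95 = intercept + slope * 95        # part (d)
change_per_10 = slope * 10                # part (e): change in tip per 10-dollar increase
direction = "INCREASE" if change_per_10 > 0 else "DECREASE"

print(f"y_hat = {intercept:.4f} + {slope:.4f} x")
print(f"predicted tip on a 95 dollar bill: {tip_at_95:.2f}")
print(f"{direction} by {abs(change_per_10):.2f} per 10 dollars")
```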

14. (1 pt) Education and crime ratings for randomly selected
Canadian cities are given in the following table. Education is
a composite rating including pupil/teacher ratio, academic options
in higher education, etc. The higher the education rating,
the better the education system. Crime is expressed in crimes
committed per 100 people.
City Education Rating (%) Crime Rating (%)
Calgary 35 12
Toronto 35 10
Winnipeg 31 16
Vancouver 32 20
Halifax 30 25
Ottawa 36 13
Montreal 33 21
(a) State the slope term and the Y-intercept term of
the line which attempts to predict the crime rating of a Canadian
city based on its linear association with its education rating.
(b) Find the correlation.
(c) As the education rating of a Canadian city decreases by 1 ... (make sure
you include the negative sign if warranted) percentage?
(d) What percentage of the variation in the variable Crime Rating
is not explained by its linear relationship to the variable Education
Rating? Use at least one place after the decimal.
(e) Using your answer in (a), predict the mean crime rate of a
Canadian city having an education rating of 34.
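For part (d), the share of variation not explained by the line is 1 - r², since r² is the explained share in simple linear regression. A sketch covering parts (a), (b), (d), and (e), with education rating as x and crime rating as y; variable names are illustrative:

```python
from math import sqrt

education = [35, 35, 31, 32, 30, 36, 33]   # x, education rating
crime = [12, 10, 16, 20, 25, 13, 21]       # y, crime rating

n = len(education)
mean_x = sum(education) / n
mean_y = sum(crime) / n
sxx = sum((x - mean_x) ** 2 for x in education)
syy = sum((y - mean_y) ** 2 for y in crime)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(education, crime))

slope = sxy / sxx                      # (a)
intercept = mean_y - slope * mean_x    # (a)
r = sxy / sqrt(sxx * syy)              # (b)
unexplained_pct = (1 - r ** 2) * 100   # (d)
crime_at_34 = intercept + slope * 34   # (e)

print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
print(f"r = {r:.4f}, unexplained = {unexplained_pct:.1f}%")
print(f"predicted crime rating at education 34: {crime_at_34:.2f}")
```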

15. (1 pt) The following data, taken from 8 towns in Alberta,
are the percentage of residents who are university graduates and
the median household incomes (in \$1000s) for all households
in each town.
Graduates (%) Median Income (\$1000s)
61.7 47.6
50.9 34.1
57.1 31.5
56.4 41.3
42.8 34.5
42.1 28.1
33.2 23.1
19.2 20.4
(a) State the slope term and the Y-intercept term
of the least squares regression line which attempts to predict
the median income of a town in Alberta based on its linear relationship
with the percentage of residents who are university graduates.
(b) Find the correlation coefficient.
(c) As the percentage of university graduates increases by 10 ...
(d) What percentage of the variation in the variable Median Income
is not explained by its linear relationship to the variable Percentage
of University Graduates? Use at least one place after
the decimal.
(e) ... of an Alberta town with 24.0 ...
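The unexplained percentage in part (d) can also be computed directly from the residuals as SSE / SST, without going through r. The sketch below takes that route, assuming graduates (%) is x and median income is y; the names `sse` and `sst` are the conventional abbreviations, not taken from the assignment:

```python
graduates = [61.7, 50.9, 57.1, 56.4, 42.8, 42.1, 33.2, 19.2]   # x, percent
income = [47.6, 34.1, 31.5, 41.3, 34.5, 28.1, 23.1, 20.4]      # y, in 1000s of dollars

n = len(graduates)
mean_x = sum(graduates) / n
mean_y = sum(income) / n
sxx = sum((x - mean_x) ** 2 for x in graduates)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(graduates, income))

slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual (unexplained) and total variation in income
fitted = [intercept + slope * x for x in graduates]
sse = sum((y - f) ** 2 for y, f in zip(income, fitted))
sst = sum((y - mean_y) ** 2 for y in income)
unexplained_pct = 100 * sse / sst

print(f"y_hat = {intercept:.4f} + {slope:.4f} x")
print(f"unexplained variation: {unexplained_pct:.1f}%")
```

For a least-squares line, SSE / SST equals 1 - r², so this agrees with the correlation-based calculation.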

