

Math 1530 Capstone Project Part I | Complete Solution


Math 1530 Capstone Project Part I    Fall 2012        Solution
Directions:
1.    DO YOUR OWN WORK!  It is academic misconduct to copy or seek assistance from other people, or to share your work with other students.  Any academic misconduct on this project results in a grade of 0.
2.    Capstone Project counts for 200 points of the total grade. 
3.    The project is due by __________ on _______, _______, 2012. No late projects will be accepted.
4.    Start each problem on a new page.  
5.    Insert any graphs in the appropriate places (not attached as an addendum at the back or even at the end of the problem.)  
6.    Only insert the relevant portions of a Minitab display used to answer a question, not everything Minitab gives you in the hope that the right information is somewhere in what you copied into the document. 

Here are the questions that were asked on the survey:
1.    GENDER:  Are you male or female? (Male, Female)
2.    What are your birth month and year? (MONTH_BIRTH: Month; YEAR_BIRTH: Year)
3.    ELECTION_VOTE:  If you vote in the US Presidential Election this fall, which political party do you prefer? (Democrat (Barack Obama), Republican (Mitt Romney), Other Party)
4.    ELECTION_WINNER:  Who do you think will win the 2012 US Presidential Election? (Democrat (Barack Obama), Republican (Mitt Romney), Other)
5.    WORK_HOURS: On average, how many hours per week will you be working at a paid job this semester?
6.    FRIENDS_FB: How many Facebook friends do you have? 
7.    SHOE_SIZE: What is your shoe size?
8.    HEIGHT:  What is your height?
9.    AGE_INSTRUCTOR:  Guess the age (in years) of your Math 1530 instructor. 
10.    SALARY_EXPECTED:  What is your expected salary (in dollars) for a secure job in Johnson City if you have finished your intended highest degree? 

The following questions were included in the Research Assignment for MATH1530 students at the beginning of the semester. Imagine that you have finished your intended highest degree and have a secure job in Johnson City. You plan to buy a house in Johnson City area this year.  Pick a home that you want to buy and input the information in the following questions.

11.    TYPE_HOUSE: Which of the following best describes the type of the house? (Single family, Condominium, Town house, Apartment, Multifamily house, Other)
12.    SCHOOL: Which elementary school does this house belong to? (Cherokee, Fairmont, Lake Ridge,
  Mountain View, North Side, South Side, Towne Acres, Woodland, Unknown)
13.    YEAR_HOUSE: Which year was the house built? 
14.    PRICE_HOUSE: What is the listing price (in dollars) of this house? 
15.    NUMBER_BRS: How many bedrooms are in this house? 
16.    SF_FINISHED: What is the total finished square feet of this house? 
17.    How much in dollars are the property taxes of this house? (TAX_COUNTY:  County tax; TAX_CITY: City tax)

A total of 791 students responded to the MATH1530 class survey.  The data for 788 students were recorded.  The Minitab worksheet MATH1530Fall12Survey.mtw includes the responses to some of the questions.  Note that there are some missing values, denoted by an asterisk (*), in the data set.
The Minitab worksheet is set up as follows:
C1: ID
C2: GENDER

C6: ELECTION_VOTE
C7: ELECTION_WINNER
C8: WORK_HOURS
C9: AGE_INSTRUCTOR
C10: SALARY_EXPECTED ($)
C11: PRICE_HOUSE
C12: SF_FINISHED
C13: LN_PRICE (Natural Log of PRICE_HOUSE)
C14: LN_SF (Natural Log of SF_FINISHED)
C15: TYPE_HOUSE (Code the type into two categories: single family and other) 
 
1.     Variable type. Which of these questions from the class survey produced variables that are categorical and which are quantitative?  Circle your answer.

a.    ELECTION_VOTE         Categorical              Quantitative            Neither
b.    AGE_INSTRUCTOR         Categorical              Quantitative            Neither
c.    TYPE_HOUSE        Categorical              Quantitative            Neither
d.    PRICE_HOUSE        Categorical              Quantitative            Neither
e.    TAX_COUNTY        Categorical              Quantitative            Neither

Note: A categorical variable places an individual into one of several groups or categories. A quantitative variable takes numerical values for which arithmetic operations such as adding and averaging make sense.   


2.    Age of MATH1530 instructors: Question 9 from the survey asked students to guess the age (in years) of their Math 1530 instructors.

a.    Create a histogram for AGE_INSTRUCTOR and insert it here.

 
b.    Which of the following best describes the shape of the distribution? Circle your answer.
Unimodal        Bimodal        Multimodal 

c.    Why do you think we have observed this shape? 

The responses from MATH1530 students to this question are for different instructors.  Therefore, the data should show several peaks which reflect the different ages of MATH1530 instructors.
 
3.    Students’ expected salary:  Question 10 from the survey asked “What is your expected salary (in dollars) for a secure job in Johnson City if you have finished your intended highest degree?”

a.    Create an appropriate display for students’ expected salary and insert it here.
 
b.    Which of the following best describes the shape of the distribution? Circle your answer.
Skewed left        Symmetric        Skewed right 


c.    Are there any outliers in this data? Justify your answer. 
Yes, the boxplot of the variable shows that there are many outliers (*).  
 

To verify this, use the measures obtained in the next question.   We have
IQR = Q3 – Q1 = 95,000 – 50,000 = 45,000.    1.5 * IQR = 1.5 * 45,000 = 67,500.
Lower fence = Q1 – 1.5 * IQR = 50,000 – 67,500 = -17,500.
Upper fence = Q3 + 1.5 * IQR = 95,000 + 67,500 = 162,500.
Therefore, any expected salary below -17,500 or above 162,500 would be considered an outlier. Since the maximum (850,000) is above the upper fence, the data contain outliers.
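
The fence arithmetic can also be checked outside Minitab. The following is a minimal Python sketch, assuming only the quartiles and maximum reported in the Minitab output for SALARY_EXPECTED; the function name tukey_fences is illustrative and not part of the assignment.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences for the 1.5 * IQR outlier rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles and maximum taken from the Minitab output for SALARY_EXPECTED.
lower, upper = tukey_fences(50_000, 95_000)
maximum = 850_000

print(lower, upper)        # -17500.0 162500.0
print(maximum > upper)     # True: the maximum is a high outlier
```

Because the lower fence is negative and salaries cannot be, only high outliers are possible for this variable.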
d.    Use numerical measures appropriate for the shape to describe the center and spread. 
Descriptive Statistics: SALARY_EXPECTED  ($) 

Variable                N  N*   Mean  SE Mean  StDev  Minimum     Q1  Median
SALARY_EXPECTED  ($)  704  84  87919     2985  79191    10000  50000   70000

Variable                 Q3  Maximum
SALARY_EXPECTED  ($)  95000   850000

Since there are outliers, the five-number summary should be used to describe the distribution: 
Min = 10,000, Q1 =50,000, Median = 70,000, Q3 = 95,000, Max = 850,000

Note that the mean is larger than the median.  This will typically be the case when the distribution is right skewed. 

e.    Create a side-by-side boxplot to compare the distributions of the expected salary for males and females.  Insert the graph below. Comment based on the graph.
Graph>Boxplot>With Groups
 
Descriptive Statistics: SALARY_EXPECTED  ($) 

Variable              GENDER    N  N*   Mean  SE Mean  StDev  Minimum     Q1
SALARY_EXPECTED  ($)  Female  436  56  83404     3311  69137    15000  50000
                      Male    268  28  95265     5677  92934    10000  50000

Variable              GENDER  Median      Q3  Maximum
SALARY_EXPECTED  ($)  Female   66920   90000   700000
                      Male     75000  100000   850000

Both boxplots appear skewed to the right with many high outliers.  The boxplot for females shows slightly more right skewness than the one for males. The median, Q3, and maximum are larger for males than for females.  There are more outliers in the female group.
 
4.    House listing price.  The listing price of a house depends on many variables and one of them is the finished square footage. MATH1530 class survey asked students to select a home for sale that they want to buy in the Johnson City area this year assuming that they have finished their intended highest degree and have a secure job in Johnson City.  Questions 14 and 16 asked students to input the listing price (in dollars) (PRICE_HOUSE) and the total finished square feet (SF_FINISHED) of the house.  Assume the houses selected by the MATH1530 students are an SRS of all houses for sale in Johnson City this year.   We are interested in studying the relationship between the listing price and the total finished square feet of a house and whether knowing a house’s finished square footage would explain the listing price.

In the dataset, there are a few houses with very large listing prices and finished square footage. In regression, sometimes, we also consider a natural logarithm transformation of both explanatory variable and response. Check http://en.wikipedia.org/wiki/Natural_logarithm for more details of natural logarithm transformation.  Columns C13 (LN_PRICE) and C14 (LN_SF) are the natural logarithm of PRICE_HOUSE and SF_FINISHED, respectively.

a.    Create appropriate plots to display the relationships between listing price and finished square footage and between the logarithm transformations of these two variables.  Insert the plots here.

  

b.    Does each of the plots show a positive association, a negative association, or no association between the two variables? 

Both plots show a positive association between the two variables.

c.    Which pair of variables is more appropriate to be fitted by a linear regression model, PRICE_HOUSE and SF_FINISHED or LN_PRICE and LN_SF?  Explain.

The scatterplots show a clearer linear relationship between LN_PRICE and LN_SF.  In the untransformed plot, the spread of PRICE_HOUSE grows as SF_FINISHED increases, so the scatter about a straight line is not constant.
Stat>Basic Statistics>Correlation
d.    What is the correlation coefficient between the two variables you selected in Part (c)?  0.667_

Note: if you selected PRICE_HOUSE and SF_FINISHED in Part (c), then the correlation is 0.649.

e.    Obtain the least squares regression equation for the two variables you selected in Part (c).  Insert it here.  Stat>Regression>Fitted Line Plot
The regression equation is: LN_PRICE = 5.39 + 0.872 LN_SF

Note: if you selected PRICE_HOUSE and SF_FINISHED in Part (c), then the regression equation is PRICE_HOUSE = 38369 + 75.44 SF_FINISHED

f.    (Bonus) Interpret the slope of the regression equation in part (e) in the context of the question.
For PRICE_HOUSE and SF_FINISHED: the listing price increases by $75.44 on average for each additional square foot of finished area.
For LN_PRICE and LN_SF: the natural logarithm of the listing price increases by 0.872 units on average for each one-unit increase in the natural logarithm of the finished square footage.
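
A slope on the log-log scale is often easier to read as a percentage effect: multiplying the square footage by some factor multiplies the predicted price by that factor raised to the slope. The Python sketch below assumes the slope 0.872 reported in part (e); the function name pct_change_in_price is hypothetical and only illustrates the arithmetic.

```python
SLOPE = 0.872  # coefficient of LN_SF reported in part (e)

def pct_change_in_price(pct_change_in_sf, slope=SLOPE):
    """Percent change in predicted price implied by a log-log fit."""
    ratio = (1 + pct_change_in_sf / 100) ** slope
    return (ratio - 1) * 100

one_pct = pct_change_in_price(1)    # effect of 1% more finished area
ten_pct = pct_change_in_price(10)   # effect of 10% more finished area
print(round(one_pct, 2), round(ten_pct, 2))
```

Under this reading, a 1% increase in finished square footage goes with an increase of a little under 0.9% in predicted listing price.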
g.    How well does the regression equation fit the data? Explain. Justify your answer with appropriate plot(s) and summary statistics.

 
The fitted line plot shows that the regression model fits the data fairly well, although there appear to be a couple of outliers.  R2 (R-squared) describes the strength of the linear association between X and Y.  Minitab displays this measure in the fitted line plot: R-Sq = 44.5%. Therefore, 44.5% of the variation in LN_PRICE can be explained by the least-squares regression on LN_SF. 
 

Note: if you selected PRICE_HOUSE and SF_FINISHED in Part (c), the fitted line plot shows that the regression model fits the data fairly well, although there are outliers present.  The R-Sq value of 42.1% says that 42.1% of the variation in listing price can be explained by the finished square footage of the house.
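
For a regression with one explanatory variable, R-Sq is the square of the correlation coefficient, so the two R-Sq values can be cross-checked against the correlations from part (d). A short Python sketch, using only the numbers reported in this solution (the variable names are illustrative):

```python
r_log = 0.667   # correlation of LN_PRICE and LN_SF, part (d)
r_raw = 0.649   # correlation of PRICE_HOUSE and SF_FINISHED, note in part (d)

# Square each correlation and express it as a percentage.
r_sq_log = round(r_log ** 2 * 100, 1)
r_sq_raw = round(r_raw ** 2 * 100, 1)

print(r_sq_log, r_sq_raw)  # 44.5 42.1
```

Both results agree with the R-Sq values Minitab reports for the two fitted line plots.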

Note: Another scatterplot that is helpful to see whether the model makes sense is the residual plot. This helps in determining the appropriateness of the regression model.  Recall that Residual = Observed value – Predicted value. The residual plot shouldn’t have any interesting features, like direction or shape. It should stretch horizontally with about the same amount of scatter about the horizontal line at 0. There should be no bends and no outliers. We see that the plot below looks fairly good. In Minitab go to Stat>Regression>Regression>Graphs>“residuals versus fits”.
 
 
h.    Assume that there is a house with total finished area of 13,300 square feet and listing price of $26,900 because of poor condition.  The natural logarithm is 9.5 for the area and 10.2 for the listing price.  If this observation is added to the analysis, 
    will it be an outlier?       YES

    will it be an influential point?  YES
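
One way to see why both answers are YES is to compute this house's residual under the fitted equation from part (e). The Python sketch below assumes the equation LN_PRICE = 5.39 + 0.872 LN_SF and the logarithms given in part (h); it does not use the survey data themselves.

```python
intercept, slope = 5.39, 0.872   # LN_PRICE = 5.39 + 0.872 LN_SF, part (e)
ln_sf, ln_price = 9.5, 10.2      # the house described in part (h)

predicted = intercept + slope * ln_sf
residual = ln_price - predicted  # observed minus predicted

print(round(predicted, 3), round(residual, 3))  # 13.674 -3.474
```

The observed log price sits about 3.5 units below the line, which is a very large residual on the log scale. In addition, LN_SF = 9.5 is far above the 7.824 that corresponds to a 2,500-square-foot house, so the point would have a lot of leverage on the fitted line.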
 
i.    (Bonus) Assume the finished square footage is 2500 for a house in Johnson City.  Use the regression equation to predict the listing price of this house if it is on the market this year.  

The logarithm of 2500 is 7.824.  
LN_PRICE  = 5.39 + 0.872 LN(2500) = 5.39 + (0.872)(7.824) = 12.21
Thus the estimated listing price of this house is exp(12.21) ≈ $200,800
 or from the other model:
PRICE_HOUSE = 38369 + 75.44 SF_FINISHED
Price of House = 38369 + 75.44(2500)    =   $226,969
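
The prediction from the untransformed model is a direct substitution into the equation. A minimal Python sketch, assuming the coefficients 38369 and 75.44 from the note in part (e); the function name predict_price_linear is hypothetical.

```python
def predict_price_linear(sf_finished, intercept=38369, slope=75.44):
    """Predicted listing price from PRICE_HOUSE = 38369 + 75.44 SF_FINISHED."""
    return intercept + slope * sf_finished

price = predict_price_linear(2500)
print(round(price))  # 226969
```

The same function can be reused for other square footages, but predictions far outside the range of the observed houses should be treated with caution.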
 
j.    Provide a scatter plot of the two variables you selected in Part (c) and add the categorical variable TYPE_HOUSE.  Display the regression lines for the two groups.
 

k.    What do the associations of the two variables you selected in Part (c) by Type of house look like?     

    Single Family:            positive      negative       no association    

    Other:              positive      negative       no association


l.    To predict the listing price of a house using the finished square footage, would you rather include the type of the house in the model?  
Yes.  The two regression lines are quite different. 
