STAT 1124 Project 2

Question   1   2   3   4   5   6   7   8   9  10  11  12   Total
Marks      3   4   2   1   3   4   3   6   3   2   3   3   /37

Instructions:
• Hand in your project on paper on or before the due date. All computer output must be copied and pasted if you type your project, or printed if you handwrite your project.
• You may do this project in pairs. If you do so, hand in only one paper with both your names on it.

The data for this project come from a sample of 70 outstanding loans from a particular bank.

1. What do people get loans for? Describe the distribution of TYPE. Include appropriate graphs and/or statistics. [3 marks]
2. Use the data to construct the 90% CI for the proportion of loans that are mortgages, and interpret the interval in the context of the problem. [4 marks]
3. The 70 loans that make up these data come from one particular bank. Does this affect the validity of the above CI? [2 marks]
4. Do regular customers and new customers have different loan status? Explore the association. Look at the observed and expected frequency tables for CUSTOMER vs. STATUS (you don’t need to include them). One of the expected frequencies is less than 5. Which one? [1 mark]
5. Redefine the variable STATUS as STATUS2 by combining the categories “Questionable” and “Bad” into one, called “Uncertain.” Re-run the chi-square test, report the smallest expected frequency, and interpret what the number means in the context of the problem. (You don’t need to paste any output.) [3 marks]
6. Describe the association between CUSTOMER and STATUS2 in the sample. Include the appropriate graph. [4 marks]
7. Is the sample association statistically significant? Justify your answer by using the appropriate statistics and decision point. [3 marks]

Now let’s look at how large a loan customers get. The variable LC records the loan-to-collateral ratio. So if someone gets an $800,000 mortgage on a $1,000,000 house, their LC is 80%.

8. Provide descriptive statistics of LC.
Include the histogram, as well as the mean, SD, and the five-number summary. Describe the centre, spread, and shape of the distribution. [6 marks]
9. Estimate the average LC ratio of all loans. Use an interval that will be correct 95% of the time. Clearly show the parameter of interest, the point estimate, the SE, and the CI. (Use s as a substitute for σ.) [3 marks]
10. The sample indicates that LC is not Normally distributed. Does this mean the above CI is not valid? Explain your answer. [2 marks]
11. Do different types of loans have different loan-to-collateral ratios? Produce side-by-side box plots of LC by loan type, and comment on what the graph tells you. [3 marks]
12. Now, let’s see whether the loan-to-collateral ratio depends on how many years the customer has been at their current job. Can we conclude that the two variables are related? Justify your answer with the appropriate statistical test and graph. [3 marks]
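
For reference, the LC definition and the interval forms that questions 2 and 9 refer to can be sketched in Python. This is only a minimal sketch under stated assumptions: the function names are hypothetical, the intervals are the usual large-sample z-intervals (with s standing in for σ, as question 9 allows), and every numeric input except the 80% LC example is made up for illustration and is not taken from the project data.

```python
from math import sqrt
from statistics import NormalDist


def lc_ratio(loan_amount, collateral_value):
    """Loan-to-collateral ratio, expressed as a percentage."""
    return 100 * loan_amount / collateral_value


def z_critical(confidence):
    """Two-sided standard Normal critical value (about 1.96 for 0.95)."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def proportion_ci(successes, n, confidence):
    """Large-sample z-interval for a proportion: p_hat +/- z * SE."""
    p_hat = successes / n
    se = sqrt(p_hat * (1 - p_hat) / n)
    margin = z_critical(confidence) * se
    return p_hat - margin, p_hat + margin


def mean_ci(x_bar, s, n, confidence):
    """z-interval for a mean, using the sample SD s in place of sigma."""
    se = s / sqrt(n)
    margin = z_critical(confidence) * se
    return x_bar - margin, x_bar + margin


# A loan of $800,000 against $1,000,000 of collateral gives LC = 80%.
print(lc_ratio(800_000, 1_000_000))

# Hypothetical inputs only (NOT the project data):
print(proportion_ci(20, 50, 0.90))     # 20 "successes" out of 50
print(mean_ci(75.0, 10.0, 100, 0.95))  # x_bar = 75, s = 10, n = 100
```

The helpers take summary numbers rather than raw data, so the same calls work with whatever counts, means, and SDs your statistical software reports.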