MARKETING ANALYTICS

March 23, 2023

MARKETING ANALYTICS MCS 3500
Assignment 2, Winter 2023 (100)
DUE ON FEBRUARY 27TH BY MIDNIGHT
Data: boston housing.csv – use only a random sample of 300 of the roughly 500 house prices.
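Drawing the 300-row subsample reproducibly could look like the sketch below. It is only an illustration in plain Python (the assignment itself is to be done in Rmarkdown). The row count of 506 and the seed value are assumptions, not taken from the assignment.

```python
import random

N_ROWS = 506        # assumed number of rows in the CSV ("about 500")
SAMPLE_SIZE = 300   # sample size required by the assignment
SEED = 2023         # hypothetical seed so the same sample can be redrawn

def draw_sample_indices(n_rows, sample_size, seed):
    """Return sorted row indices sampled without replacement."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_rows), sample_size))

indices = draw_sample_indices(N_ROWS, SAMPLE_SIZE, SEED)
print(len(indices), "rows selected")
```

Fixing the seed means that rerunning the analysis uses the same 300 houses, so the reported results can be reproduced.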
For all three parts, use log(medv) as the outcome variable (see Data Description).
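Because the outcome is log(medv) rather than medv, each slope is read as a relative change in price. As a reminder of the standard log-linear algebra (the symbols \beta_j and x_j are generic notation, not names from the assignment):

```latex
\begin{aligned}
\log(\text{medv}) &= \beta_0 + \sum_{j} \beta_j x_j + \varepsilon \\
\frac{\widehat{\text{medv}}(x_j + 1)}{\widehat{\text{medv}}(x_j)} &= e^{\beta_j}
  \quad \text{(other predictors held fixed)} \\
\%\Delta\,\widehat{\text{medv}} &= 100\,(e^{\beta_j} - 1) \approx 100\,\beta_j
  \quad \text{for small } |\beta_j|
\end{aligned}
```

For example, a coefficient of 0.05 would correspond to a median value about 5.1% higher per one-unit increase in that predictor, holding the others fixed.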
Part 1: Multiple Linear Regression (50)
A. Identify the most important variables for predicting house prices using the best subset algorithm with three model selection criteria (RSS = residual sum of squares; adjr^2 = adjusted R^2; BIC = Bayesian information criterion). 20
B. Provide descriptive statistics for the important variables identified in Part 1 A. 10
C. Run a multiple linear regression model with the variables identified in Part 1 A. Write the regression equation and interpret any five regression coefficients. 20
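To make the three criteria in Part 1 A concrete, here is a minimal sketch of best subset selection in plain Python (the assignment itself is to be done in Rmarkdown). The toy data, variable names x1 to x3, and helper functions are all invented for illustration. The adjusted R^2 and BIC formulas are the common textbook forms for Gaussian linear models, and statistical software may report BIC on a different additive scale.

```python
import math
from itertools import combinations

def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def rss(columns, y):
    """Fit OLS with an intercept on the given columns; return the residual sum of squares."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    xtx = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in X]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

def best_subsets(data, y):
    """For each model size p, keep the lowest-RSS subset and score it."""
    n = len(y)
    y_mean = sum(y) / n
    tss = sum((yi - y_mean) ** 2 for yi in y)
    names = sorted(data)
    table = []
    for p in range(1, len(names) + 1):
        scored = [(rss([data[v] for v in s], y), s) for s in combinations(names, p)]
        best_rss, best_vars = min(scored)
        adj_r2 = 1 - (best_rss / (n - p - 1)) / (tss / (n - 1))
        bic = n * math.log(best_rss / n) + (p + 1) * math.log(n)
        table.append({"size": p, "vars": best_vars, "rss": best_rss,
                      "adjr2": adj_r2, "bic": bic})
    return table

# Made-up toy data: y depends on x1 and x2 only; x3 is a distractor.
n_obs = 30
toy = {
    "x1": [float(i) for i in range(n_obs)],
    "x2": [float((7 * i) % 11) for i in range(n_obs)],
    "x3": [float((i * i) % 13) for i in range(n_obs)],
}
y_toy = [2 + 0.5 * toy["x1"][i] - 0.3 * toy["x2"][i] + 0.1 * math.sin(i)
         for i in range(n_obs)]

results = best_subsets(toy, y_toy)
for row in results:
    print(row["size"], row["vars"], round(row["rss"], 3),
          round(row["adjr2"], 4), round(row["bic"], 2))
```

RSS can only fall as variables are added, so it ranks models of the same size; adjusted R^2 and BIC penalize model size and can therefore be compared across sizes.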
Part 2: Multiple Linear Regression with Quadratic Effects (30)
D. Run a multiple regression analysis with the variables identified in Part 1, adding a quadratic effect of the average number of rooms (rm), and provide average marginal effects (ame) with comments. 10
E. Plot and interpret the predictions and average marginal effects of the average number of rooms (rm). 20
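With a quadratic term in rm, the marginal effect is no longer a single coefficient but changes with the number of rooms. In generic notation (\beta_1, \beta_2 and the elided other terms are illustrative symbols, not results):

```latex
\begin{aligned}
\log(\text{medv}_i) &= \beta_0 + \beta_1\,\text{rm}_i + \beta_2\,\text{rm}_i^2 + \dots + \varepsilon_i \\
\text{ME}_i &= \frac{\partial \log(\text{medv}_i)}{\partial\,\text{rm}_i} = \beta_1 + 2\beta_2\,\text{rm}_i \\
\text{AME} &= \frac{1}{n}\sum_{i=1}^{n} \text{ME}_i = \beta_1 + 2\beta_2\,\overline{\text{rm}}
\end{aligned}
```

The marginal effect is a straight line in rm with slope 2\beta_2, so the sign of \beta_2 determines whether the effect of an extra room grows or shrinks as houses get larger.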
Part 3: Multivariate Adaptive Regression Splines (MARS) (20)
F. Run MARS using all variables in the housing data to investigate non-linear response patterns. Comment on any findings that show non-linear patterns (if they exist). 15
G. Are the variables identified using MARS the same as or different from those in Part 1 A? Please comment. 5
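MARS captures non-linear patterns by building the fit from hinge functions of the form max(0, x - c) and max(0, c - x), where c is a knot. The sketch below, in plain Python rather than the Rmarkdown the assignment calls for, shows how a pair of hinges on rm yields a piecewise-linear response. The knot and all coefficients are invented for illustration and are not results from the housing data.

```python
def hinge(x, knot, direction=+1):
    """MARS-style hinge: max(0, x - knot) for direction +1, max(0, knot - x) for -1."""
    return max(0.0, direction * (x - knot))

# Hypothetical fitted model with a single knot on rm; all numbers are invented.
KNOT = 6.5
INTERCEPT = 3.0
SLOPE_ABOVE = 0.30    # change in log(medv) per room above the knot
SLOPE_BELOW = -0.05   # coefficient on the mirrored hinge below the knot

def predict_log_medv(rm):
    """Piecewise-linear prediction built from the two mirrored hinges."""
    return (INTERCEPT
            + SLOPE_ABOVE * hinge(rm, KNOT, +1)
            + SLOPE_BELOW * hinge(rm, KNOT, -1))

for rm in (5.5, 6.5, 7.5):
    print(rm, round(predict_log_medv(rm), 3))
```

A variable whose fitted terms include a knot with different slopes on either side is showing a non-linear pattern; a variable that enters with a single hinge and no effective bend behaves like an ordinary linear term over the observed range.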
Format: Generate the program output as a Word document (using the knit function in Rmarkdown) and add your additional responses to the same document. Your assignment should include the code, output, and plots that are relevant to the assignment questions. Drop non-relevant output as required.

Data Description
The Boston Housing Dataset
The Boston Housing Dataset is derived from information collected by the U.S. Census Service concerning housing in the area of Boston, MA. The following describes the dataset columns:
CRIM – per capita crime rate by town
ZN – proportion of residential land zoned for lots over 25,000 sq.ft.
INDUS – proportion of non-retail business acres per town.
CHAS – Charles River dummy variable (1 if tract bounds river; 0 otherwise)
NOX – nitric oxides concentration (parts per 10 million)
RM – average number of rooms per dwelling
AGE – proportion of owner-occupied units built prior to 1940
DIS – weighted distances to five Boston employment centres
RAD – index of accessibility to radial highways
TAX – full-value property-tax rate per $10,000
PTRATIO – pupil-teacher ratio by town
B – 1000(Bk – 0.63)^2 where Bk is the proportion of blacks by town
LSTAT – % lower status of the population
MEDV – Median value of owner-occupied homes in $1000’s