BUS5PA


BUS5PA Assignment 3

BUS5PA Predictive Analytics – Semester 2, 2023

Assignment 3: Customer Segmentation, Association Rule Mining, and Market Basket Analysis (MBA) Case Studies

Release Date: 16th October 2023
Due Date: Sunday 5th November 2023 at 11:55pm
Weight: 30%
Format of Submission: A report (PDF file in electronic form) + SAS files in spk format (electronic)
Submission: Submit the project on the LMS site. Unless you submit both files, a significant number of marks will be deducted.
Number of attempts: only one attempt is allowed

Part A – Cluster Analysis (40%)

A car insurance company wants to understand the different purchasing behaviours of its customers. As a first step, it plans to identify different segments of customers in order to improve the current target marketing campaign. The CUSTOMER_DATA dataset contains the basic details about customers obtained via customer IDs. In this dataset each row represents an individual customer. There are seven variables in the dataset. They are shown below with the appropriate roles and levels.

Name                   Model Role   Measurement Level   Description
CustomerID             ID           Interval            Identification number of the customer
Customer_Value_Score   Input        Interval            Customer value score represents a customer’s value to a company based on past performance
Education              Input        Nominal             Highest education
Gender                 Input        Nominal             Gender of the customer
Income                 Input        Interval            Annual income of the customer ($)
Location_Code          Input        Nominal             Residential locality
Marital_Status         Input        Nominal             Marital status of the customer

You, as the data analyst, are required to conduct a cluster analysis of the data set and provide an insightful report on the different customer segments to the manager of the insurance company.

a. Create a new diagram in your project. Name the diagram Profiling.
b. Define the data set CUSTOMER_DATA as a data source and set appropriate roles and levels.
c. Add an Input Data Source node to the diagram workspace and select the CUSTOMER_DATA data table as the data source.
d. Determine whether the model roles and measurement levels assigned to the variables are appropriate. Examine the distribution of the variables.
   • Are there any skewed variables? Are there missing values that should be replaced?
   • If yes, use the Transform Variables node to transform the skewed variables. (Hint: Use the log transformation, LOG(variable_name).)
e. Add a Cluster node to the diagram workspace and set the number of clusters to four.
f. Set the appropriate properties for the Cluster node. Leave Internal Standardization at its default setting, Standardization. What would happen if inputs were not standardized? Explain using knowledge from discussions in the class.
g. Run the diagram from the Cluster node and examine the results. Does the number of clusters created seem reasonable? Discuss using knowledge from class discussions (for example, what a cluster is and what the ideal number of clusters is).
h. Increase the number of clusters to a maximum of six and re-run the Cluster node. How do the number and quality of clusters compare to those obtained in question e? Do you think it is better to further increase the number of clusters? (You can answer this question by trying out a higher number of clusters, or discuss based on the previous clustering outcomes.)
i. Use the Segment Profile node to summarize the nature of the clusters, using the better number of clusters from question h. Describe the profiles based on different customer behaviours. (Hints: The distribution of each interval variable in the segments can be interpreted in the same way as in the tutorial. The distribution of each nominal variable in the segments is visualized as a pie chart: the inner ring represents the distribution for the total population, while the outer ring represents the distribution for the given segment.)
j. The insurance company manager would like to develop a target marketing strategy based on this cluster analysis. Discuss how the clustering you have carried out could be used in such a strategy.

Part B – Market Basket Analysis and Association Rules (30%)

In order to plan innovative promotions to move items that are often purchased together, a supermarket chain is interested in market basket analysis of groceries purchased. You are a member of the analytics team assigned to the task. The supermarket chose to conduct a market basket analysis of specific items purchased online. The TRANSACTIONS data set contains information about more than 38,000 transactions made over the past three months, covering 167 different items, including:

whole milk, soda, tropical fruit, citrus fruit, shopping bags, bottled beer, other vegetables, yogurt, bottled water, pip fruit, canned beer, newspapers, rolls/buns, root vegetables, sausage, pastry, whipped/sour cream, frankfurter

You have access to SAS Enterprise Miner data analytics tools and have decided to carry out a market basket and association rule-based analysis of the data. The following instructions will help you to set up the SAS diagram for the analysis. There are three variables in the data set: Date, MemberId and Item.

a. Create a new diagram. Name the diagram Retail.
b. Create a new data source using the data set TRANSACTIONS.
c. Assign the variable Date the model role Rejected. This variable is not used in this analysis. Assign the ID model role to the variable MemberId and the Target model role to the variable Item. Change the data source role to Transaction.
d. Add the TRANSACTIONS data set and an Association node to the diagram.
e. Change the setting for the Export Rule by ID property to Yes.
f. Leave the remaining default settings for the Association node and run the analysis.
g. Examine the results of the association analysis.

Your team leader has indicated that the answers to the following questions will be useful to the management. You have to answer the questions and prepare a report giving evidence to support your answers (e.g. screenshots, numeric values, etc.).

1. What is the significance of the lift value of a rule? What is lift and why is it important to calculate it?
2. What is the highest lift value for the resulting rules, and which rules have this value? What does the highest lift value signify?
3. Based on the association rules, briefly describe 3 example product bundles and promotions that you might suggest.
4. You are required to provide a detailed report of the outcomes of the analysis to your manager. Prepare a brief report (max. 1000 words) presenting: (a) the problem, (b) your solution/approach, (c) outcomes, (d) analysis results and interpretation.
5. You should explain the approach and outcomes such as support, confidence and lift, and how the product bundles you suggested could be used (practical value) by the departments.

Part C – Open Discussion – Analytics Case Study (30%)

This question is based on the week 11 workshop materials. It is very important that you attend the guest lecture to be able to answer this question. Read the provided article: ‘Prediction the future of cx’. You may also reference additional material you can find to gain background knowledge of the area.

You are expected to summarize the content of the guest lecture and discuss how you relate the guest lecture to the article ‘Prediction the future of cx’.

• How would you relate the knowledge and understanding you gained from the guest lecture to topics discussed in the article?
• Do you agree that using customer questionnaires to collect data to understand the customers will limit the possible insights that can be generated? Using information from the guest lecture, discuss your thoughts on this topic.

You are expected to write a report (between 700 and 1000 words) discussing the above points (you may check the answer guidelines for more detailed information on the requirements for answer preparation).
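
Note on Part B: the three rule measures named in question 5 (support, confidence and lift) can be illustrated with a few lines of Python. This is only a minimal sketch under stated assumptions: the baskets below are hypothetical toy data, not rows from the TRANSACTIONS data set, and the function names `support` and `rule_metrics` are invented for this illustration. It is not how SAS Enterprise Miner computes its output, and the Association node may display these values on a different scale (for example as percentages).

```python
from fractions import Fraction

# Hypothetical toy baskets (NOT taken from the TRANSACTIONS data set).
baskets = [
    {"whole milk", "rolls/buns"},
    {"whole milk", "yogurt"},
    {"whole milk", "rolls/buns", "soda"},
    {"soda"},
    {"rolls/buns"},
]


def support(items, baskets):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    hits = sum(1 for basket in baskets if items <= basket)
    return Fraction(hits, len(baskets))


def rule_metrics(lhs, rhs, baskets):
    """Return (support, confidence, lift) for the rule lhs => rhs.

    Raises ZeroDivisionError if lhs or rhs never occurs in the baskets.
    """
    s_lhs = support(lhs, baskets)
    s_rhs = support(rhs, baskets)
    s_both = support(set(lhs) | set(rhs), baskets)
    confidence = s_both / s_lhs   # share of lhs baskets that also contain rhs
    lift = confidence / s_rhs     # confidence relative to the baseline rate of rhs
    return s_both, confidence, lift


rule_support, rule_confidence, rule_lift = rule_metrics(
    {"whole milk"}, {"rolls/buns"}, baskets
)
print(rule_support, rule_confidence, rule_lift)  # 2/5 2/3 10/9
```

In this toy data the rule whole milk => rolls/buns has a lift of 10/9, slightly above 1, meaning the two items appear together a little more often than would be expected if they were purchased independently. A lift below 1 would indicate the opposite.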

