Consumer sales trends

79 views 10:04 am 0 Comments February 27, 2023

DTSC660: Data and Database Management with SQL

Module 7

Assignment 5

Purpose

For this assignment you have been asked to look into some consumer sales trends and to perform some analysis. For this assignment you will rely on the skills obtained from Modules 5 and 7 to successfully create the banking database and query it to retrieve the appropriate responses. Note that many, but not necessarily all, of the tools you learned in Modules 5 and 7 will be applied in this assignment.To complete this assignment, download and import the dataset and then create queries that respond to each prompt. Please make sure that you only use postgres language conventions.

Each question is all or nothing. Graders will not attempt to correct or interpret malformed SQL queries.

You will be responsible for testing your code on the provided data set before submission. The question will be graded based on whether or not it generates the correct output. Extraneous columns will not count against you.

Submission

You will submit a total of

1

sql files. Each file must use the postgres standards taught in the course. Use of other flavors of SQL such as T-SQL will result in an automatic 0 for the assignment. Do not submit files as archives (ZIP) files.

File 1

: You must submit a SQL document called _Assignment5. This document must include ALL queries requested in the instructions below.

You will submit the file to the Assignment 5 folder.

 


Instructions

As in the practice assignments you will be querying a large dataset to gather insights about that data. You have been provided a SQL file with all the necessary data. Make sure you take a moment to familiarize yourself with the columns, their data types, and any other pertinent information. I would recommend printing out a query with the first few records using the following command to help you get started.

PART 1 Creating the Table and Importing the Data

In the assignment 5 folder, download the

customer_spending.csv

file

Place this file in a public folder on your computer

Take note of the path to this file (copy the path)

Utilize the COPY command you learned in Module 2 (revisit this module if necessary) to import the data.

Run a basic select statement that verifies the data is present and matches what is in the csv file.

Part 2 (Queries)

This is the part you will be graded on. To complete this part, download the assignment_5_template.sql file from the assignment 3 folder. Rename this file using the naming convention: _Assignment5. Complete each query in the identified space in this document. Once you are done, submit the document to the Assignment 3 folder.

Please note that some questions are intentionally vague. Remember that you will not be penalized for extraneous columns, but you will be for extraneous rows (data). There may be multiple correct ways to solve some queries. Note that a request for a list of values always expects no duplicate values unless specified otherwise.

Write a query that returns the top selling (revenue) categories for the year 2016. Include the category name and total revenue and organized alphabetically by category name.

Write a query that returns a list of sub categories and their average unit price, average unit cost, and the difference between these values called margin for the year 2015. Organize the results alphabetically by sub category.

Write a query that returns the total number of female buyers who purchased clothes.

Write a query that provides the average quantity and average cost of products purchased by each age and by sub category. Organize the columns with age, sub category, average quantity, and average cost. Make the data organized by age oldest to youngest and then by sub category alphabetically.

Write a query that returns a list of countries that had more than 30 transactions to customers between the ages of 18-25 in the entire data set.

Write a query that returns a list of sub categories, the average quantity and average cost both rounded to 2 decimal places. Only include sub categories that have at least 10 records in the data set. Organize the data by sub category alphabetically.