Scripting for Data Science Assignment

April 18, 2023

SET11123 Scripting for Data Science Assignment

1. Module number: SET11123
2. Module title: Scripting for Data Science
3. Module leader: Md Zia Ullah
4. Tutor with responsibility for this assessment (student’s first point of contact): Md Zia Ullah, Kehinde Babaagba
5. Assessment: Practical coursework
6. Weighting: 50% of module assessment
7. Size and/or time limits for assessment: You should be able to complete this assessment within approximately 20 hours (if you have kept up with lecture and practical materials).
8. Deadline of submission: Friday 28th of April at 3 pm. Your attention is drawn to the penalties for late submissions. Only your module leader can authorise extensions.
9. Arrangements for submission: Via Moodle – see the coursework document.
10. Assessment regulations: All assessments are subject to University Regulations. Plagiarised work will be dealt with according to the university’s guidelines (please read, especially if this is your first time at a UK university): Academic Integrity – Edinburgh Napier Students’ Association
11. The requirements for the assessment: See the coursework document.
12. Special instructions: If you use pieces of code that are not your own work (e.g., copied from an example you found online), you must specify the source (e.g., URL). If you fail to do so, your coursework may be deemed plagiarised.
13. Return of work and feedback: You should keep a copy of your submitted work. Written feedback with indicative marks will be provided via Moodle within 3 (working) weeks from the date of submission (see above). Individual oral feedback will be provided in the form of a one-to-one meeting (upon request from the student). Please note that all marks are subject to internal moderation and verification at the assessment board. If you have any doubts about your feedback and/or would like to discuss it, please contact the module leader.
14. Assessment criteria: See the coursework document.

Overview

The aim of this coursework is to consolidate your knowledge of all the fundamentals of Python. You should be able to solve all the tasks detailed below with all you have learned in the module (data types, conditions, loops, functions, classes, regex, data I/O, numpy arrays, data preprocessing, exceptions).

For this coursework, you will be analysing average temperature data of different cities across the world.

Dataset

You can download the dataset in CSV format (Average Temperature of Cities.csv) from Moodle (downloaded from https://www.kaggle.com/swapnilbhange/average-temperature-of-cities on 17th January 2022). The dataset contains the following columns: Country, City, the average temperature for each month from Jan to Dec, the yearly average, and Continent. The temperature values are expressed in Celsius.
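As a starting point, reading the file could look like the minimal sketch below. The header names (Country, City, Jan to Dec, Year, Continent) and the helper load_rows are assumptions made for illustration, so check them against the actual CSV. The sample row holds made-up placeholder values, not data from the real file.

```python
import csv
import io

# Assumed header names; verify them against the real CSV before relying on this.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def load_rows(file_obj):
    """Parse an open CSV file into a list of dicts, converting temperatures to float."""
    rows = []
    for raw in csv.DictReader(file_obj):
        rows.append({
            "country": raw["Country"].strip(),
            "city": raw["City"].strip(),
            "continent": raw["Continent"].strip(),
            "yearly": float(raw["Year"]),
            "monthly": [float(raw[m]) for m in MONTHS],  # Jan..Dec, in Celsius
        })
    return rows


# Made-up placeholder row, only to exercise the parser.
SAMPLE = (
    "Country,City," + ",".join(MONTHS) + ",Year,Continent\n"
    "Examplia,Alpha,1,2,3,4,5,6,7,8,9,10,11,12,6.5,Europe\n"
)
rows = load_rows(io.StringIO(SAMPLE))
```

With the real file, the same function would be called as load_rows(open(path, newline="", encoding="utf-8")), ideally inside a with block and a try/except that handles a missing file or a non-numeric cell.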

Task

Please read your tasks CAREFULLY.

Desired features

You should create the necessary functions and/or classes to provide the following functionality:

- Read the dataset.
- Given a string as a parameter indicating the country (case insensitive), you should return the cities where temperature information was recorded. For example, United Kingdom includes data for London and Edinburgh. Hint: a city could be represented as an object including temperature data.
- Given a string as a parameter indicating a continent, you should return all the countries included in the dataset that are associated with the specified continent.
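One possible shape for the two lookup features, following the hint that a city can be an object, is sketched here. The City class, its field names, and both function names are hypothetical. Matching the continent case-insensitively is a design choice that the brief does not require. The temperatures are placeholders, not real values.

```python
from dataclasses import dataclass


@dataclass
class City:
    name: str
    country: str
    continent: str
    monthly: list   # 12 monthly averages in Celsius, Jan..Dec
    yearly: float   # yearly average as reported in the dataset


def cities_in_country(cities, country):
    """Return the City objects recorded for a country (case-insensitive match)."""
    wanted = country.strip().casefold()
    return [c for c in cities if c.country.casefold() == wanted]


def countries_in_continent(cities, continent):
    """Return the sorted, de-duplicated country names for a continent."""
    wanted = continent.strip().casefold()
    return sorted({c.country for c in cities if c.continent.casefold() == wanted})


# Placeholder temperatures (all zeros); only the names matter for this demo.
cities = [
    City("London", "United Kingdom", "Europe", [0.0] * 12, 0.0),
    City("Edinburgh", "United Kingdom", "Europe", [0.0] * 12, 0.0),
    City("Alpha", "Examplia", "Europe", [0.0] * 12, 0.0),
    City("Gamma", "Samplestan", "Asia", [0.0] * 12, 0.0),
]

uk_names = [c.name for c in cities_in_country(cities, "united kingdom")]
european = countries_in_continent(cities, "Europe")
```

Using casefold() rather than lower() is a small robustness choice for non-ASCII country names. A set comprehension removes the duplicates that arise when a country has several cities.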

DESIRED OUTPUTS

- Determine the city with the lowest average temperature recorded over a single year. Do the same with the highest temperature.
- Determine the city and the month where the lowest temperature was recorded. Do the same with the highest.
- For each continent, show the top 5 hottest cities (considering the average yearly temperature reported in the dataset).
- For each country, show the city (and the month) where the coldest temperature was recorded.
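The four outputs above reduce to min/max selections and grouped sorts. The sketch below shows one way to express them, not a reference solution. All names are hypothetical and the three cities carry made-up values. Ties go to the first city or month encountered, which is simply how Python's min and max behave.

```python
from collections import defaultdict, namedtuple

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
City = namedtuple("City", "name country continent monthly yearly")


def extreme_by_year(cities, pick=min):
    """City with the lowest (pick=min) or highest (pick=max) yearly average."""
    return pick(cities, key=lambda c: c.yearly)


def extreme_month(cities, pick=min):
    """Return (city, month_name, temperature) for the extreme monthly value."""
    candidates = ((c, MONTHS[i], t) for c in cities for i, t in enumerate(c.monthly))
    return pick(candidates, key=lambda item: item[2])


def group_by(cities, attr):
    """Group cities into a dict keyed by the given attribute name."""
    groups = defaultdict(list)
    for c in cities:
        groups[getattr(c, attr)].append(c)
    return groups


def top_hottest_by_continent(cities, n=5):
    """Map each continent to its n hottest cities by reported yearly average."""
    return {cont: sorted(group, key=lambda c: c.yearly, reverse=True)[:n]
            for cont, group in group_by(cities, "continent").items()}


def coldest_month_by_country(cities):
    """Map each country to the (city, month, temperature) of its coldest record."""
    return {country: extreme_month(group, min)
            for country, group in group_by(cities, "country").items()}


# Made-up values for illustration only.
cities = [
    City("Alpha", "Examplia", "Europe",
         [-5, -3, 2, 8, 13, 17, 19, 18, 14, 9, 3, -2], 7.75),
    City("Beta", "Examplia", "Europe",
         [1, 2, 5, 9, 14, 18, 21, 20, 16, 11, 6, 2], 10.4),
    City("Gamma", "Samplestan", "Asia",
         [25, 26, 28, 30, 31, 30, 29, 29, 29, 28, 27, 25], 28.1),
]

coldest_city, coldest_month, coldest_temp = extreme_month(cities, min)
hottest_city, hottest_month, hottest_temp = extreme_month(cities, max)
```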

SUBMISSION

You must submit a .zip file, containing:

- A .ipynb file with your code. Please clear all cells before you submit. A .py file is also fine for this submission.
- [Optional] A README file, if you want to describe how your code should be used.

IMPORTANT: A zip file means a zip file. Other formats (e.g., RAR, 7z, GZ) will not be accepted, and your grade will be 0 (zero). Before submitting your solution, it is your responsibility to check the integrity of your file. A corrupted zip file will also lead to a grade of 0 (zero).

Your code file must be named SET11123_YOURMATR_CW2.ipynb. For instance, if your matriculation number is 40014374, then the Python file must be SET11123_40014374_CW2.ipynb. Similarly, the zip file containing your code must be named SET11123_YOURMATR_CW2.zip.
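If you would rather build and check the archive from Python than by hand, something like the sketch below might help. The function names are hypothetical and the demo writes a stand-in file to a temporary directory. The check relies on ZipFile.testzip(), which returns None when no archive member fails its CRC check.

```python
import tempfile
import zipfile
from pathlib import Path


def submission_names(matric):
    """Return the (notebook name, zip name) pair required by the naming rule."""
    stem = f"SET11123_{matric}_CW2"
    return f"{stem}.ipynb", f"{stem}.zip"


def build_zip(code_path, zip_path):
    """Zip a single code file, then re-open the archive to check its integrity."""
    code_path, zip_path = Path(code_path), Path(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(code_path, arcname=code_path.name)  # store without parent folders
    with zipfile.ZipFile(zip_path) as zf:
        return zf.testzip() is None  # None means no corrupt member was found


# Demo using the example matriculation number from the brief.
with tempfile.TemporaryDirectory() as tmp:
    code_name, zip_name = submission_names("40014374")
    code_file = Path(tmp) / code_name
    code_file.write_text("{}")  # stand-in for a real notebook
    ok = build_zip(code_file, Path(tmp) / zip_name)
    with zipfile.ZipFile(Path(tmp) / zip_name) as zf:
        members = zf.namelist()
```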

Your zip file must be uploaded via Moodle.

Deadline: [Week 13] 28th April 2023 – 15:00.

Additional Information

- You are allowed to use any external library for your code, including those we have not seen during the module.
- If you want to use PyCharm instead of a Jupyter Notebook, that is totally fine. In that case, you should submit a .py file within your zip.
- Make sure you add comments to your code to help me understand your thought process. Moreover, at the top of your script file, please also add your name and matriculation number as a comment.
- If you are using Jupyter Notebook, you are also encouraged to create Markdown cells to describe your code. Markdown cells are considered comments in your code and, as such, they will contribute towards the code quality assessment criterion.
- Make sure that your code works on any computer. You can easily check this by uploading and running your code on Google Colab.
- Feel free to use all the material on Moodle at your convenience. You are also welcome to search online and get inspiration from someone else’s work. However, in this latter case, you must state in your code (e.g., with a comment) where you took the inspiration from.
- You are welcome to ask for clarification regarding this coursework if certain aspects of it are unclear or ambiguous.