Assessment 3 Description
This assessment concerns a standard machine learning task: image classification (recognition). We have done image recognition in tutorials, so you should be familiar with the general problem. For this assessment, students need to choose a real-world problem, which must be confirmed by the Unit Assessor. After the problem is confirmed, students need to develop and evaluate various neural-network-based models to solve it and communicate the results. Students will use images.zip as the dataset for their assignment.
Please carefully read through this document to understand the tasks and expectations.
Section I: Defining the Problem
Context
You are still part of the same organization as in Assessments 1 and 2. Now you are working in a team with one other machine learning expert.
An Example Problem
Your social scientist colleagues have taken thousands of photos of children suffering from diseases caused by drinking polluted water. Each image is 32×32 pixels. In these images, your colleagues expect to see 10 different objects, along with noise. Your colleagues want your team to build a system that automatically recognizes the objects in each image and labels the image.
As preparation for this, you take a labelled dataset of 32×32 RGB images of digits (0–9). Each image contains just one object of interest and there are a total of 10 objects (10 digits). The details of the dataset are given later. You will first build a system for these images as a trial run before applying it to the images supplied by your colleagues.
Task
Your task is to design and train a neural network that will accurately identify the objects in the images in the given dataset. The architecture and optimization of the neural network are completely your decision. However, in this example, the neural network should take a 32×32 image in the format identified in the dataset section as input and assign it to one of 10 classes as output.
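The input/output contract described in the task can be illustrated with a minimal NumPy sketch. This is not a suggested architecture: the single random linear layer below is a hypothetical stand-in for a trained network, and the names `one_hot`, `softmax`, `NUM_CLASSES` and `IMAGE_SHAPE` are introduced here for illustration only. It only shows the shapes involved: a batch of 32×32 RGB images goes in, and one probability per class (10 in total) comes out.

```python
import numpy as np

NUM_CLASSES = 10            # number of classes stated in the task
IMAGE_SHAPE = (32, 32, 3)   # 32x32 RGB input

def one_hot(labels, num_classes=NUM_CLASSES):
    """Turn integer labels (0..num_classes-1) into one-hot rows."""
    labels = np.asarray(labels, dtype=np.int64)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Hypothetical stand-in for a trained model: one random linear layer.
rng = np.random.default_rng(0)
weights = rng.normal(scale=0.01, size=(int(np.prod(IMAGE_SHAPE)), NUM_CLASSES))

# Four fake images with pixel values already scaled to [0, 1).
batch = rng.random((4, *IMAGE_SHAPE))

# Flatten each image, apply the layer, and convert scores to probabilities.
probs = softmax(batch.reshape(len(batch), -1) @ weights)
predicted = probs.argmax(axis=1)   # one class index per image
```

Whatever architecture you choose, its final layer should produce an array shaped like `probs` here: one row per image and 10 columns, one per class.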
Dataset Sample
The dataset is contained in the images.zip folder. The data is split among 5 batch files, with each batch file containing 10000 images. There is also a metafile, which contains labelling information, i.e. which label refers to which object. You can import the files in Python using the following code:
import pickle
import numpy as np
from matplotlib import pyplot as plt

# Load one batch file; change the filename to read the other batches.
with open('data_batch_1', 'rb') as fo:
    batch = pickle.load(fo, encoding='bytes')  # dictionary with byte-string keys

images = batch[b'data']  # pixel data, one row per image
You can read each file by using the appropriate filename. Running this code will return a dictionary containing the data. The dictionary will have four keys, two of which refer to the input pixels and the output labels. Each RGB image is stored as a row vector of 3072 pixel values (32 × 32 × 3). The rest of the data exploration is left up to you.
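As a starting point for that exploration, the row-vector storage described above can be unpacked into image arrays. The sketch below rests on two assumptions that you should verify against the actual files: that the labels are stored under the key `b'labels'`, and that each 3072-value row holds the full red plane first, then green, then blue (the layout used by CIFAR-style batch files). The function names `rows_to_images` and `load_batch` are illustrative, not part of the dataset.

```python
import pickle
import numpy as np

def rows_to_images(rows):
    """Reshape (N, 3072) rows into (N, 32, 32, 3) images.

    Assumes a channel-planar layout: 1024 red values, then 1024 green,
    then 1024 blue, each plane stored row by row.
    """
    return rows.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)

def load_batch(path):
    """Load one batch file and return (images, labels)."""
    with open(path, "rb") as fo:
        batch = pickle.load(fo, encoding="bytes")
    rows = np.asarray(batch[b"data"], dtype=np.uint8)       # shape (N, 3072)
    labels = np.asarray(batch[b"labels"], dtype=np.int64)   # assumed key name
    return rows_to_images(rows), labels
```

Displaying a few results with `plt.imshow(images[0])` is a quick way to check the layout assumption: if the colours or shapes look scrambled, the channel order differs from the one assumed here.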
Section II: Assignment Submission
Required Files
You are required to submit three files. The three files must conform exactly to the requirements below. If they do not, we will not be able to mark them and you will be awarded an F. The required files are:
A neural network model file (.h5), which contains your neural network model. The model file must be named ‘ourmodel.h5’ (quotation marks not included).
A Python script (.py file), which contains all your code and shows the process through which you developed your solution. The code should be broken down into sections with appropriate comments, making it easy to follow. You will lose marks if your code is not easy to follow. The source code file should be named ‘source_code.py’.
A report summarizing your design and detailing how you arrived at your solution, what difficulties you faced and how you tried to tackle those difficulties. Your report should follow the template provided.
All files should be submitted on or before the Monday of week 7 of the term at 11:59 PM.
Remember: You will receive a Fail grade by default if you do not submit all three files listed above. You will also receive a Fail grade if the files are not submitted according to the specifications listed below. It is your responsibility to make sure that the files are submitted correctly. Make sure you submit the files well before the deadline.
File Submission
Your files should be submitted exactly as described below; otherwise, you will receive a Fail:
The report should be submitted through the Turnitin link. It should be in pdf format. The pdf file should be named ‘studentfirstname_studentlastname.pdf’ (quotation marks not included).
The source code file and your model file should be submitted together in a zip folder through the Assignment Submission link. The folder name should be the same as the report name (of course, the file type will be different).
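Because a naming mistake leads to a Fail, it can help to build the zip folder with a small script instead of by hand. The following is a minimal sketch using only the Python standard library. The function name `build_submission` is illustrative, and the archive name `firstname_lastname.zip` is our reading of the naming rule above, so confirm it with the Unit Assessor if in doubt.

```python
import zipfile
from pathlib import Path

# File names taken from the Required Files section.
REQUIRED_FILES = ("source_code.py", "ourmodel.h5")

def build_submission(work_dir, first_name, last_name):
    """Zip the required files from work_dir and return the archive path."""
    work_dir = Path(work_dir)

    # Refuse to build an incomplete submission.
    missing = [name for name in REQUIRED_FILES if not (work_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"Missing required file(s): {', '.join(missing)}")

    archive = work_dir / f"{first_name}_{last_name}.zip"
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name in REQUIRED_FILES:
            # arcname keeps the files at the top level of the archive.
            zf.write(work_dir / name, arcname=name)
    return archive
```

For example, `build_submission(".", "jane", "doe")` would create `jane_doe.zip` in the current directory, provided both required files are present there.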
Section III: Our Expectations
There are many image recognition solutions available on the web. You are welcome and encouraged to explore them. However, we expect you to make your own solution and spend considerable time designing it to increase its accuracy. We will be able to tell from your report and code how much effort you put into the solution.
This is a challenging assessment, so please start on it as soon as possible. Do not leave it until the last week. Ideally, you should be spending at least 2 days per week to be able to design and develop a reasonable model.
Section IV: Marking Criteria
A rubric has been provided to help students understand how the assessment is marked. A description of the rubric follows.
An interview of 10-15 minutes will be conducted (on Wednesday of week 7, during tutorial time) for each student to answer some questions about the model and explain their code. If you cannot answer questions about your solution, we will take this as evidence that you did not solve the problem yourself and we will report you for an academic integrity breach.
If you meet the minimal requirements for submission above, you can achieve the following grades:
Pass: To achieve a pass mark you must show a basic understanding of the process and tasks involved in solving the given problem and have a basic working solution, which gives good accuracy and has a decent run time. Furthermore, your source code should be well commented and broken down into sections. Lastly, your report should be well-written.
Credit: To achieve a credit you must show a good understanding of the process and the tasks involved, and have a working solution with superior accuracy and a good run time. Moreover, you can partially identify the challenges of the problem, the implementation issues, and some of the difficulties you faced and how you tackled them.
Distinction: To achieve a distinction you must show an excellent understanding of the process and the tasks involved, and have a well-implemented working solution with excellent accuracy and excellent run time. Furthermore, you can identify the majority of the interesting features of the problem, the implementation issues the requirements pose and how you solved all these issues.
High Distinction: Everything in distinction but at an outstanding level and even going beyond!