Data Analysis Assignment

July 4, 2023

Assignment overview:

This assignment involves analysing a dataset, interpreting results, and drawing conclusions based on the analyses. The dataset can be found in the file “practical_assignment_2021.xls” which is on Canvas under the Assignments folder.

It is a group task worth 50% of your final mark for this unit of study. Please form groups of 3-4 students on Canvas (People > Groups). Any student who has not formed a group by 12th May will be randomly assigned to one.

Please submit a report and accompanying statistical output on Canvas by midnight (11:59 pm) on Friday 4th June 2021. Should you require any clarification, please ask on the VETS6103 Canvas discussion board.

Description of the study:

A large prospective study was undertaken in nine regions of Australia to determine the factors influencing milk production in dairy cows, with data collected by Dr John Morton from the University of Queensland. Specifically, the objective of this study was to investigate the effect of parity and reproductive events on milk production in the first 120 days of lactation. Variables in the dataset are described in Table 1.

Note: The dataset for this assignment is a subset of the original data and has been substantially modified for pedagogical purposes. Therefore, your results will not be the same as those obtained by the author. After completing this task, you should be able to apply these statistical skills to analyse other datasets, including those from your own research.

Table 1. Variables in the dataset “practical_assignment_2021.xlsx” based on individual cow reproduction and lactation data collected from various farms in Australia.

Variable          Description                                  Categories/units
Herd              Farm identification code                     1 – 3
Cow ID            Cow identification code                      Unique for each cow
Parity            Lactation number                             1 – 7
Milk Yield 120    Milk yield in first 120 days of lactation    Litres
CFS Interval      Calving to first service interval            Days
Conception at FS  Conception at first service                  Yes/No
Dystocia          Dystocia at calving                          Yes/No
Vag discharge     Vaginal discharge observed                   Yes/No
Season            Season of calving                            Spring, Summer, Autumn, Winter
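
As a rough illustration of how one row of Table 1 could be represented and checked before analysis, here is a minimal Python sketch. It is not part of the assignment, which is carried out in R-commander; the class name, field names and the non-negativity checks are assumptions made for the example, and the sample row is invented rather than taken from the dataset.

```python
from dataclasses import dataclass

SEASONS = {"Spring", "Summer", "Autumn", "Winter"}
YES_NO = {"Yes", "No"}


@dataclass(frozen=True)
class CowRecord:
    """One row of the dataset; field names are hypothetical renderings of Table 1."""
    herd: int               # Herd: farm identification code, 1-3
    cow_id: str             # Cow ID: unique for each cow
    parity: int             # Parity: lactation number, 1-7
    milk_yield_120: float   # Milk Yield 120: litres in first 120 days of lactation
    cfs_interval: float     # CFS Interval: calving to first service, in days
    conception_at_fs: str   # Conception at FS: Yes/No
    dystocia: str           # Dystocia: Yes/No
    vag_discharge: str      # Vag discharge: Yes/No
    season: str             # Season: Spring, Summer, Autumn or Winter


def validation_errors(rec):
    """Return a list of messages for values outside the ranges given in Table 1."""
    errors = []
    if not 1 <= rec.herd <= 3:
        errors.append("Herd must be 1-3")
    if not 1 <= rec.parity <= 7:
        errors.append("Parity must be 1-7")
    # Assumption: a yield in litres or an interval in days cannot be negative.
    if rec.milk_yield_120 < 0:
        errors.append("Milk Yield 120 must not be negative")
    if rec.cfs_interval < 0:
        errors.append("CFS Interval must not be negative")
    for name in ("conception_at_fs", "dystocia", "vag_discharge"):
        if getattr(rec, name) not in YES_NO:
            errors.append(f"{name} must be Yes or No")
    if rec.season not in SEASONS:
        errors.append("Season must be Spring, Summer, Autumn or Winter")
    return errors


# Invented example row, not taken from the real dataset.
example = CowRecord(
    herd=2, cow_id="C001", parity=3, milk_yield_120=3000.0, cfs_interval=70.0,
    conception_at_fs="Yes", dystocia="No", vag_discharge="No", season="Spring",
)
print(validation_errors(example))
```

The sketch checks one row at a time, so the uniqueness of Cow ID across rows would need a separate check over the whole dataset.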

Learning outcomes:

After successfully completing the assignment, you should be able to:

  • identify variable types;
  • conduct descriptive analyses for various variable types;
  • evaluate associations between variables;
  • conduct inferential analyses including general linear modelling;
  • interpret the results of statistical modelling;
  • present the results in a technical report or a journal article; and
  • write an abstract based on the study.

Assignment tasks:

Your job is to analyse the given dataset to achieve the study objective, i.e., to investigate the effect of parity and reproductive events on milk production in the first 120 days of lactation. Please follow the instructions given below.

  1. Background to the study: Identify the explanatory and outcome variable/s in the dataset. Discuss why the explanatory variables might influence the outcome, considering the biology and welfare of dairy cows. Use references from scientific journals to support your claims.
  2. Descriptive analyses: Conduct appropriate descriptive analyses for variables in the dataset to (a) understand the distribution of each variable and (b) make a preliminary assessment of the associations of explanatory variables with the outcome. Present the results in appropriately labelled figures and tables. Do not include any raw R-commander output in this section. All raw R-commander output should be included in an appendix.
  3. Univariable analyses: Conduct appropriate inferential analyses (t-test, ANOVA, regression) to objectively evaluate the association of each explanatory variable with the outcome. Present the results in a table (or tables) suitable for publication in a scientific journal. Briefly describe these results in a short paragraph, including which variables are associated with the outcome.
  4. Multivariable analyses:
     • Build a multivariable linear regression model containing all variables with a p-value < 0.25 in the univariable analyses (Full model).
     • Drop non-significant variables (p-value > 0.05) from the full model one at a time, starting with the variable with the highest p-value, until all variables remaining in the model are significant (Final model).
     • Present the results of the final model in a table suitable for publication in a scientific journal. Briefly describe these results, including which variables were included in the final model and their impact on the outcome.
  5. Statistical methods: Describe the statistical approaches you have used to analyse the data in Steps 2 – 4 above as you would in the statistical methods section of a peer-reviewed journal article. Mention the statistical program used (e.g., R-commander).
  6. Abstract: Prepare an abstract, suitable for publication in a scientific journal, based on the analyses conducted in this assignment.
  7. Supplementary materials/appendices: This section is for any extra graphs, tables and/or annotated raw R-commander output not included in the main text. It can be referred to in the main body of the assignment, but the reader should be able to understand your assignment without referring to the supplementary material.
  8. Referencing: Provide in-text references and a list of references at the end of the text using the APA referencing style.
  9. Contribution statement: Write a statement of the contributions made by each member of your group towards the assignment. Briefly mention how often you communicated with each other while preparing the assignment.
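
The selection rules in the multivariable analyses task (enter variables with a univariable p-value < 0.25, then drop the least significant variable one at a time while its p-value is > 0.05) can be sketched as a short loop. This is a minimal Python illustration of the logic only, not the R-commander workflow itself. The function names are hypothetical, `fit_pvalues` is an assumed stand-in for refitting the linear model and reading one p-value per variable (for a multi-level factor such as Season, that would be an overall test for the variable), and the p-values and the placeholder names A to D in the usage lines are invented.

```python
SCREEN_P = 0.25  # univariable threshold for entering the full model
KEEP_P = 0.05    # significance threshold in the multivariable model


def screen(univariable_p):
    """Return the variables whose univariable p-value is below 0.25."""
    return [var for var, p in univariable_p.items() if p < SCREEN_P]


def backward_eliminate(variables, fit_pvalues):
    """Drop the variable with the highest p-value until none exceeds 0.05.

    fit_pvalues is a hypothetical callable: given the list of variables
    currently in the model, it refits the model and returns a dict mapping
    each of those variables to its p-value.
    """
    current = list(variables)
    while current:
        pvals = fit_pvalues(current)  # refit after every removal
        worst = max(current, key=lambda var: pvals[var])
        if pvals[worst] <= KEEP_P:
            break                     # everything left is significant
        current.remove(worst)
    return current


# Invented p-values for placeholder variables A-D, for illustration only.
full_model = screen({"A": 0.01, "B": 0.20, "C": 0.60, "D": 0.10})


def toy_fit(current):
    # D is made significant only once B has been removed, to show why the
    # model is refitted after each step instead of dropping several at once.
    table = {"A": 0.01, "B": 0.30, "D": 0.08 if "B" in current else 0.04}
    return {var: table[var] for var in current}


final_model = backward_eliminate(full_model, toy_fit)
print(full_model, final_model)
```

Refitting inside the loop matters because p-values change when a variable leaves the model, which is why the brief asks for variables to be dropped one at a time.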