LMP2004H: Introduction to Biostatistics

Who can attend

A maximum of 20 students can be enrolled in this course.

Ten of these will be from the MHSc in Laboratory Medicine program, while for the remaining 10 spots priority will be given to students from the research streams at the Department of LMP.

Course description

This course introduces the fundamental concepts of Biostatistics, providing an understanding of the basic theoretical underpinnings and practical applications of statistics.

You will learn essential statistical techniques and analyses relevant to your academic studies or professional work in general medicine and in the fields of pathology and clinical embryology.

Course highlights

Basic theoretical underpinnings: An exploration of core statistical theories including probability, distribution, and inference, creating a strong foundation for further study and application.

Practical applications: Instruction on how to apply statistical concepts to real-world problems in general medicine, pathology, and clinical embryology, offering essential skills needed for basic analysis in these fields.

Basics of AI-assisted coding: Introduction to the principles and techniques of AI-assisted coding, integrating artificial intelligence with statistical methods for enhanced data analysis.

Capstone project preparation: The course provides the necessary statistical foundations for students to successfully undertake and complete the program's capstone projects, integrating theory with practical application.

Overall, this course is designed to equip you with the knowledge and skills to navigate the basic concepts of statistics and to apply them in your academic studies or professional work.

Learning outcomes

After completing this course, you will be able to:

  • understand foundational statistical concepts, including the mean-variance framework, hypothesis testing, and uncertainty principles, as well as the fundamental equations needed for basic statistical analysis.
  • understand study design and statistical applications in medical research.
  • produce and appraise a methods section in scientific literature, specifically evaluating the appropriateness of the chosen methods and expressing these methods clearly and correctly.
  • produce and appraise a results section in scientific literature, specifically evaluating the accuracy, formatting, and structure of reporting, and whether the results accurately report what is described in the methods section.
  • evaluate the quality of a statistical model, prediction, or test, and explain the results to a scientific or non-scientific audience.
  • understand statistical coding in R and follow guidance on AI-assisted code generation, editing, and debugging.


You will be taught in twelve three-hour sessions:

  • 1.5 hours: Lectures will cover the foundational theoretical underpinnings of Biostatistics, including core concepts in probability, distribution, inference, practical applications in pathology and clinical embryology, and basics of AI-assisted coding.
  • 1.5 hours: Tutorials will offer interactive exercises and problem-solving sessions, focusing on the practical application of the concepts learned in the lectures.
  • Participation will be measured by in-class “quick write” assignments: short concept checks that must be handed in at the end of the tutorial.
  • Weekly assignments will be given at the end of each lecture and will be due at the beginning of the next lecture. They concentrate on the core concept presented in each lecture, enabling you to reinforce your understanding and practice applying theoretical principles. Each assignment uses provided open-source datasets to develop and report an analysis.

Late submissions will receive a grade of 0.

Course coordinator

Dr. Mark Tatangelo


Teaching Assistant: 

Julia Gallucci

For administrative queries, contact lmp.grad@utoronto.ca.

Timings and location

Wednesdays 9 am – 12 pm

Location: MY 440

Office hours (Tatangelo): Wednesdays 8 – 9 am, MY 440

Office hours (Gallucci): Wednesdays 2 pm, CAMH (250 College St)

Evaluation methods

Participation: 25% (2.5% for each in-class “quick write” activity)

Weekly Assignments: 25% (2.5% each, best 10 out of 11 for weeks 2-12)

Midterm Exam: 25%

Final Exam: 25% 
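
As an illustration of how the weights above combine, here is a minimal sketch in R. All scores are hypothetical, and the sketch assumes ten quick writes (25% at 2.5% each) and that every item is marked out of 100; the course may record or round marks differently. A late assignment is entered as 0, in line with the late-submission policy.

```r
# Hypothetical marks out of 100; none of these numbers come from the course.
quick_writes <- c(100, 100, 0, 100, 100, 100, 100, 100, 100, 100)  # assumed 10 quick writes
assignments  <- c(80, 90, 70, 0, 85, 95, 60, 75, 88, 92, 78)       # 11 weekly assignments (weeks 2-12); 0 = late
midterm      <- 72
final_exam   <- 81

# Keep the best 10 of the 11 weekly assignments.
best_ten <- head(sort(assignments, decreasing = TRUE), 10)

# Each quick write and each counted assignment is worth 2.5 points of the
# course grade; the midterm and the final exam are worth 25 points each.
participation_pts <- sum(quick_writes / 100 * 2.5)
assignment_pts    <- sum(best_ten / 100 * 2.5)
exam_pts          <- (midterm + final_exam) / 100 * 25

course_grade <- participation_pts + assignment_pts + exam_pts
print(course_grade)
```

In this made-up example the late assignment is the one dropped by the best-10-of-11 rule, so it does not lower the assignment component, whereas the missed quick write costs the full 2.5 points.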


Course schedule


September 13, 2023

Basic statistics, measures of central tendency, data types

September 20, 2023

Population statistics, standard deviation, variance, z-distribution

September 27, 2023

Sample statistics, standard error, t-distribution

October 4, 2023

Hypothesis Testing

October 11, 2023

Analysis of variance

October 18, 2023

Power, sample size, and introduction to AI-assisted coding in R

October 25, 2023

Pre-post studies and test accuracy

November 1, 2023

Linear regression

November 8, 2023

Logistic regression

November 15, 2023

Survival analysis

November 22, 2023

Matched analysis

November 29, 2023

Repeated measurements

December 9, 2023

Final exam

Required materials

R version 4.2.3

R Core Team (2022). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.

RStudio Team (2020). RStudio: Integrated Development Environment for R. RStudio, PBC, Boston, MA 

OpenAI (2020). GPT-3: Generative Pre-trained Transformer 3. arXiv:2005.14165.

Recommended materials 

OpenAI (2023). GPT-4: Generative Pre-trained Transformer 4. OpenAI.

Course textbook

Hoffman et al. Basic biostatistics for medical and biomedical practitioners. 2019 

Rosenbaum, P. R. (2010). Design of Observational Studies. Springer Series in Statistics. Springer, New York, NY.

Phillips, N. D. (Year). A Pirate's Guide to R. Publisher.

Course Datasets


Statement on the use of AI

ChatGPT integration: ChatGPT, an AI model by OpenAI, will be used as a supplementary tool to aid in writing code for R.

Mandatory assignments: There will be specific assignments and modules in which you are required to use ChatGPT. You can access ChatGPT using an email address or a Google, Apple, or Microsoft account.

Optional use: Outside of mandatory assignments, you may choose to use ChatGPT for other work in LMP2004 as you see fit.

Ethical considerations: Any information or data derived from ChatGPT that is used in student work must be appropriately cited.

Critical analysis: We do not recommend that you rely solely on ChatGPT without personal input or analysis. You should assess and validate the information from ChatGPT within the context of your work.

You are expected to follow these guidelines while using ChatGPT for the duration of the course.