Experimental design and data analysis

Experiments and what to do with them

This website serves as an open-access “video textbook” for experimental design and data analysis.

We use examples drawn from biology to illustrate and apply concepts, but the principles apply more generally (e.g., to the social sciences).

As in a book, this website organizes topics by “chapter”, but it replaces most text with videos. All videos offer transcripts to facilitate access over a poor internet connection; slides are also available, where applicable, as are practice problems with answers.

If you have an example (or a request for subject matter) that you believe would improve this resource, please contact Crispin Jordan: 

View Crispin Jordan's staff profile

Aims of the website

'Experiments and what to do with them' aims to promote Research Reproducibility across biological disciplines. In particular, this website:

  • provides a firm introduction to essential aspects of experimental design
  • emphasizes the need for appropriate experimental Power
  • demonstrates power analysis for every type of analysis explored here
  • discusses Questionable Research Practices and how to avoid them
  • introduces Open Research practices that increase transparency and work towards reproducibility 
  • follows the American Statistical Association's advice to move away from the concept of 'statistical significance'. The ASA advises that the term 'statistically significant' (and all variants thereof) be dropped entirely from statistical discourse, and this website largely adopts this advice. Specifically, in line with recommendations, we (i) continue to use p-values for inference but without reference to the arbitrary threshold of 0.05 and (ii) emphasize 'effect size' to understand effects. However, given that students continue to interact with a large body of pre-existing literature that adheres to the concept of 'statistical significance', this course cannot simply ignore this historical perspective. Therefore, we occasionally mention 'statistical significance' but also demonstrate alternative approaches that improve interpretation.

Current analyses

We focus on General Linear Models (GLMs) to implement a broad range of analyses.  At the moment, this website presents:

  • An introduction to randomization tests, hypotheses, null distributions and p-values
  • T-tests
  • Data transformation
  • 1-Factor GLM (i.e. very loosely speaking, “1-factor ANOVA”)
  • Multi-factor GLM (e.g., again, very loosely speaking, “2-factor ANOVA”)
  • GLM with continuous independent variables (covariates; i.e., “regression”)
  • GLM combining factors and covariates (i.e., very loosely speaking, “ANCOVA”)
  • An introduction to mixed effects models
  • Systematic Reviews

In the future, we will also deal with:

  • Different ways to calculate p-values (e.g., type 1, 2 and 3 sums of squares, AIC, etc.)
  • Models with multiple covariates
  • Model selection
  • More on mixed effects models
  • Computational methods: bootstrapping and randomization tests
  • Generalized Linear (Mixed) Models
  • Multivariate analyses
  • And more – please make a request!

Please see the General Introduction for further perspective on this website.

The materials presented here draw from many scientists over many years; please see Acknowledgements, below. 

Chapter 1.  General Introduction

Practicing biologists need a strong foundation in experimental design and data analysis; these skills allow biologists to transform an idea or hypothesis into a conclusion.

Interleaf 1. Biologists who found careers using statistics

Biologists who found careers using statistics

Chapter 2.  An introduction to R

This chapter provides an introduction to basic skills in R.

Chapter 3. Using R to introduce basic concepts of hypothesis testing

This chapter explains the logic used to test hypotheses from a frequentist perspective (i.e., the perspective most commonly taught for data analysis). 

Chapter 4. Plotting data

In this chapter we discuss best practice for plotting data for common experimental designs in biomedical science.

Chapter 5. Variance

Variability is what makes biology (and life) so interesting.

Chapter 6. Measuring an average with uncertainty

We cannot measure anything perfectly: our measurements always include some degree of uncertainty. This chapter explains how we can describe this uncertainty when reporting and interpreting results. Specifically, we introduce the ideas of ‘standard error’ and ‘confidence intervals’.

Chapter 7. Comparing averages with two (or one) groups

This chapter explores how to compare the average of a group to something else for simple experimental designs. 

Chapter 8. Abandon statistical significance

This chapter explores the arguments to abandon the concept of statistical significance, and recommends alternative approaches to interpret results. 

Chapter 9. Experimental design

‘Experimental design’ is a huge topic, with many books devoted to it. The vast majority of experiments in the biological sciences, however, are based on a few foundational principles. We focus on these principles in this and the following chapters to provide the resources to design reliable, replicable and powerful experiments.

Chapter 10. More experimental design: independence and pseudo-replication

This chapter first describes the evidence for pseudo-replication in animal experiments. We then introduce the concepts needed to understand when pseudo-replication arises and why it matters, provide advice on avoiding it, and offer practice in spotting it in published studies.

Chapter 11. Power analysis

This chapter explains what power analysis is and why it is essential when designing a study.

Chapter 12. Questionable research practices

This chapter highlights the prevalence of Questionable Research Practices in some disciplines of Biology and Psychology, explains how they lead to non-reproducible research, and discusses solutions to begin to deal with them.

Chapter 13. Comparing averages between more than two groups: 1-factor models

This chapter deals with the design and analysis of 1-factor experiments; the analyses are termed 1-Factor General Linear Models (GLMs).

Chapter 14. Dealing with violated assumptions

This chapter focuses on data transformation.

Chapter 15. Analysing experiments with multiple factors

This chapter focuses on designing and analysing multi-factor experiments.

Chapter 16. Understanding covariates: simple regression and analyses that combine covariates and factors

This chapter introduces approaches to model continuous data as an independent variable. We refer to continuous independent variables as ‘covariates’.

Chapter 17. Practice with general linear models

This chapter provides further opportunity to practice working with general linear models.

Chapter 18. Mixed effects models

This chapter introduces another type of effect: ‘random effects’. Mixed effects models, the subject of this chapter, combine ‘fixed’ and ‘random’ effects.

Appendix

Other online courses

Acknowledgements

Crispin Jordan thanks many people who contributed to his understanding of experimental design and data analysis, challenged his thinking on the topic, and provided inspiration: