Random Assignment in Experiments | Introduction & Examples

Published on March 8, 2021 by Pritha Bhandari. Revised on June 22, 2023.

In experimental research, random assignment is a way of placing participants from your sample into different treatment groups using randomization.

With simple random assignment, every member of the sample has a known or equal chance of being placed in a control group or an experimental group. Studies that use simple random assignment are also called completely randomized designs.

Random assignment is a key part of experimental design. It helps you ensure that all groups are comparable at the start of a study: any differences between them are due to random factors, not research biases like sampling bias or selection bias.

Table of contents

  • Why does random assignment matter?
  • Random sampling vs random assignment
  • How do you use random assignment?
  • When is random assignment not used?
  • Other interesting articles
  • Frequently asked questions about random assignment

Why does random assignment matter?

Random assignment is an important part of control in experimental research, because it helps strengthen the internal validity of an experiment and avoid biases.

In experiments, researchers manipulate an independent variable to assess its effect on a dependent variable, while controlling for other variables. To do so, they often use different levels of an independent variable for different groups of participants.

This is called a between-groups or independent measures design.

For example, in a medication study, you might use three groups of participants that are each given a different level of the independent variable (the dosage):

  • a control group that’s given a placebo (no dosage, to control for a placebo effect),
  • an experimental group that’s given a low dosage,
  • a second experimental group that’s given a high dosage.

Random assignment helps you make sure that the treatment groups don’t differ in systematic ways at the start of the experiment, as such differences can seriously affect (and even invalidate) your work.

If you don’t use random assignment, you may not be able to rule out alternative explanations for your results. For example, suppose that:

  • participants recruited from cafes are placed in the control group ,
  • participants recruited from local community centers are placed in the low dosage experimental group,
  • participants recruited from gyms are placed in the high dosage group.

With this type of assignment, it’s hard to tell whether the participant characteristics are the same across all groups at the start of the study. Gym-users may tend to engage in more healthy behaviors than people who frequent cafes or community centers, and this would introduce a healthy user bias in your study.

Although random assignment helps even out baseline differences between groups, it doesn’t always make them completely equivalent. There may still be extraneous variables that differ between groups, and there will always be some group differences that arise from chance.

Most of the time, the random variation between groups is low, and, therefore, it’s acceptable for further analysis. This is especially true when you have a large sample. In general, you should always use random assignment in experiments when it is ethically possible and makes sense for your study topic.


Random sampling vs random assignment

Random sampling and random assignment are both important concepts in research, but it’s important to understand the difference between them.

Random sampling (also called probability sampling or random selection) is a way of selecting members of a population to be included in your study. In contrast, random assignment is a way of sorting the sample participants into control and experimental groups.

While random sampling is used in many types of studies, random assignment is only used in between-subjects experimental designs.

Some studies use both random sampling and random assignment, while others use only one or the other.


Random sampling enhances the external validity or generalizability of your results, because it helps ensure that your sample is unbiased and representative of the whole population. This allows you to make stronger statistical inferences .

For example, suppose you’re studying a company with 8,000 employees. You use a simple random sample to collect data: because you have access to the whole population (all employees), you can assign each of the 8,000 employees a number and use a random number generator to select 300 employees. These 300 employees are your full sample.

Random assignment enhances the internal validity of the study, because it ensures that there are no systematic differences between the participants in each group. This helps you conclude that the outcomes can be attributed to the independent variable .

You then design an experiment with two groups:

  • a control group that receives no intervention.
  • an experimental group that has a remote team-building intervention every week for a month.

You use random assignment to place participants into the control or experimental group. To do so, you take your list of participants and assign each participant a number. Again, you use a random number generator to place each participant in one of the two groups.

How do you use random assignment?

To use simple random assignment, you start by giving every member of the sample a unique number. Then, you can use computer programs or manual methods to randomly assign each participant to a group.

  • Random number generator: Use a computer program to generate random numbers from the list for each group.
  • Lottery method: Place all numbers individually in a hat or a bucket, and draw numbers at random for each group.
  • Flip a coin: When you only have two groups, for each number on the list, flip a coin to decide if they’ll be in the control or the experimental group.
  • Roll a die: When you have three groups, for each number on the list, roll a die to decide which group they will be in. For example, rolling 1 or 2 lands them in the control group; 3 or 4 in the first experimental group; and 5 or 6 in the second experimental group.

This type of random assignment is the most powerful method of placing participants in conditions, because each individual has an equal chance of being placed in any one of your treatment groups.
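If you assign participants with software, the short Python sketch below automates the lottery method described above; the function name and the example of 300 numbered participants are ours, for illustration only.

```python
import random

def simple_random_assignment(participants, n_groups=2, seed=None):
    """Lottery method in code: shuffle the numbered list, then deal
    participants into groups in round-robin order so that group sizes
    stay as equal as possible."""
    rng = random.Random(seed)
    shuffled = list(participants)      # copy, so the original list is untouched
    rng.shuffle(shuffled)
    groups = [[] for _ in range(n_groups)]
    for i, participant in enumerate(shuffled):
        groups[i % n_groups].append(participant)
    return groups

# Example: split a sample of 300 numbered participants into two groups of 150
control, experimental = simple_random_assignment(range(1, 301), seed=1)
```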

Random assignment in block designs

In more complicated experimental designs, random assignment is only used after participants are grouped into blocks based on some characteristic (e.g., test score or demographic variable). These groupings mean that you need a larger sample to achieve high statistical power .

For example, a randomized block design involves placing participants into blocks based on a shared characteristic (e.g., college students versus graduates), and then using random assignment within each block to assign participants to every treatment condition. This helps you assess whether the characteristic affects the outcomes of your treatment.

In an experimental matched design , you use blocking and then match up individual participants from each block based on specific characteristics. Within each matched pair or group, you randomly assign each participant to one of the conditions in the experiment and compare their outcomes.
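To illustrate how blocking combines with random assignment in a randomized block design, here is a minimal Python sketch with invented block names and participant labels; it shuffles each block separately and spreads its members across all treatment conditions.

```python
import random

def randomized_block_assignment(blocks, conditions, seed=None):
    """Randomized block design: within each block (e.g., college students
    vs. graduates), shuffle the members and assign them across every
    treatment condition."""
    rng = random.Random(seed)
    assignment = {}
    for block_name, members in blocks.items():
        shuffled = list(members)
        rng.shuffle(shuffled)
        for i, participant in enumerate(shuffled):
            assignment[participant] = (block_name, conditions[i % len(conditions)])
    return assignment

blocks = {"college students": ["P1", "P2", "P3", "P4"],
          "graduates": ["P5", "P6", "P7", "P8"]}
print(randomized_block_assignment(blocks, ["control", "treatment"], seed=2))
```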

When is random assignment not used?

Sometimes, it’s not relevant or ethical to use simple random assignment, so groups are assigned in a different way.

When comparing different groups

Sometimes, differences between participants are the main focus of a study, for example, when comparing men and women or people with and without health conditions. Participants are not randomly assigned to different groups, but instead assigned based on their characteristics.

In this type of study, the characteristic of interest (e.g., gender) is an independent variable, and the groups differ based on the different levels (e.g., men, women, etc.). All participants are tested the same way, and then their group-level outcomes are compared.

When it’s not ethically permissible

When studying unhealthy or dangerous behaviors, it’s not possible to use random assignment. For example, if you’re studying heavy drinkers and social drinkers, it’s unethical to randomly assign participants to one of the two groups and ask them to drink large amounts of alcohol for your experiment.

When you can’t assign participants to groups, you can also conduct a quasi-experimental study . In a quasi-experiment, you study the outcomes of pre-existing groups who receive treatments that you may not have any control over (e.g., heavy drinkers and social drinkers). These groups aren’t randomly assigned, but may be considered comparable when some other variables (e.g., age or socioeconomic status) are controlled for.

Here's why students love Scribbr's proofreading services

Discover proofreading & editing

Other interesting articles

If you want to know more about statistics, methodology, or research bias, make sure to check out some of our other articles with explanations and examples.

  • Student’s t-distribution
  • Normal distribution
  • Null and Alternative Hypotheses
  • Chi square tests
  • Confidence interval
  • Quartiles & Quantiles
  • Cluster sampling
  • Stratified sampling
  • Data cleansing
  • Reproducibility vs Replicability
  • Peer review
  • Prospective cohort study

Research bias

  • Implicit bias
  • Cognitive bias
  • Placebo effect
  • Hawthorne effect
  • Hindsight bias
  • Affect heuristic
  • Social desirability bias

Frequently asked questions about random assignment

In experimental research, random assignment is a way of placing participants from your sample into different groups using randomization. With this method, every member of the sample has a known or equal chance of being placed in a control group or an experimental group.

Random selection, or random sampling , is a way of selecting members of a population for your study’s sample.

In contrast, random assignment is a way of sorting the sample into control and experimental groups.

Random sampling enhances the external validity or generalizability of your results, while random assignment improves the internal validity of your study.

Random assignment is used in experiments with a between-groups or independent measures design. In this research design, there’s usually a control group and one or more experimental groups. Random assignment helps ensure that the groups are comparable.

In general, you should always use random assignment in this type of experimental design when it is ethically possible and makes sense for your study topic.

To implement random assignment, assign a unique number to every member of your study’s sample.

Then, you can use a random number generator or a lottery method to randomly assign each number to a control or experimental group. You can also do so manually, by flipping a coin or rolling a die to randomly assign participants to groups.



When randomisation is not good enough: Matching groups in intervention studies

Francesco Sella

1 Centre for Mathematical Cognition, Loughborough University, Loughborough, UK

2 Department of Experimental Psychology, University of Oxford, Oxford, UK

Roi Cohen Kadosh

Abstract

Randomised assignment of individuals to treatment and control groups is often considered the gold standard for drawing valid conclusions about the efficacy of an intervention. In practice, randomisation can lead to accidental differences between groups due to chance. Researchers have offered alternatives to reduce such differences, but these methods are not used frequently because they require advanced statistical expertise. Here, we recommend a simple assignment procedure based on variance minimisation (VM), which assigns incoming participants automatically to the condition that minimises differences between groups in relevant measures. As an example of its application in the research context, we simulated an intervention study whereby a researcher used the VM procedure on a covariate to assign participants to a control and an intervention group rather than controlling for the covariate at the analysis stage. Among other features of the simulated study, such as effect size and sample size, we manipulated the correlation between the matching covariate and the outcome variable and the presence of imbalance between groups in the covariate. Our results highlight the advantages of VM over the prevalent random assignment procedure in terms of reducing the Type I error rate and providing accurate estimates of the effect of the group on the outcome variable. The VM procedure is valuable in situations whereby the intervention begins before the recruitment of the entire sample is completed. We provide an Excel spreadsheet, as well as scripts in R, MATLAB, and Python, to ease and foster the implementation of the VM procedure.

Supplementary Information

The online version contains supplementary material available at 10.3758/s13423-021-01970-5.

Introduction

Randomisation in controlled trials

A common problem in intervention studies is comparing the effect of an intervention while minimising the influence of confounding factors. In the pre-treatment assessment, a researcher usually measures the characteristics that the treatment aims to modify (i.e., outcome measures) as well as other variables that can exert an influence on the treatment (i.e., covariates). Then, the researcher randomly assigns individuals to the treatment and the control condition. In the ideal scenario, the control condition matches the treatment condition except for the specific feature of the treatment that the researcher considers crucial for causing a change in the outcome measures (e.g., placebo vs the active molecule in pharmacological studies). If the treatment is effective, the treatment group should improve in the outcome measures compared to the control group.

In the case of randomisation with a large sample size, testing for a difference at baseline or in other covariates becomes irrelevant, as any significant differences that occur reflect Type I error (de Boer et al., 2015; Roberts & Torgerson, 1999), which arises more readily when several covariates are considered (Austin et al., 2010). However, large sample sizes are difficult to achieve. Many researchers, especially in the clinical sciences, rely on small naturally occurring samples composed of individuals who voluntarily join the study when they wish to. In this scenario, the sampling is suboptimal: participants are not randomly sampled from the population, but take part in the study based on convenience and opportunity. Although the assignment to different treatment conditions can be random, differences at baseline are more likely to emerge in small than in large trials (Bruhn & McKenzie, 2009; Chia, 2000; Nguyen & Collins, 2017; Saint-Mont, 2015). Unfortunately, there is no statistical way to control for these differences between groups at pre-test (Miller & Chapman, 2001; Van Breukelen, 2006). Therefore, an imbalance in the pre-treatment scores can compromise the evaluation of the treatment's efficacy and seriously harm the interpretability of the results. To correct for this, the researcher may choose to allocate individuals to a condition based on previously collected pre-treatment scores and match the groups on these scores. However, this procedure requires the researcher to complete the pre-treatment assessment of all participants before the treatment begins. The whole process may take several months, increase the attrition rate before the treatment begins, and cannot account for unwanted changes in the measures of interest. Furthermore, the immediate implementation of the treatment is frequently necessary, especially in clinical settings, where the treatment must begin in a critical phase of the patient's clinical condition.

Minimising group differences

One solution is the use of covariate-adaptive randomisation procedures (Chen & Lee, 2011; Dragalin et al., 2003; Endo et al., 2006; Scott et al., 2002), which allocate participants to the different conditions as they join the study and, at the same time, reduce the difference between groups on predefined critical variables. There are three commonly used types of covariate-adaptive randomisation methods: stratified randomisation, dynamic hierarchical randomisation, and minimisation (Lin et al., 2015). Differences at baseline can be reduced by using stratified randomisation, whereby specific (prognostic) variables are divided into strata and participants are randomly selected from each stratum. However, stratified randomisation becomes difficult to implement as the factors to control for increase (Therneau, 1993). In dynamic hierarchical randomisation, covariates are ranked in order of importance and participants are assigned to conditions via biased coin allocation when thresholds of imbalance are exceeded in selected covariates (Signorini et al., 1993). A minimisation procedure, the focus of this paper, calculates the level of imbalance in covariates that assigning a participant to each condition would cause, then allocates with high probability (to maintain a degree of randomness) the current participant to the condition that minimises the imbalance.

In this vein, the use of covariate-adaptive randomisation procedures not only matches groups on covariates, but also implicitly forces researchers to state in advance those critical covariates related to the treatment rather than controlling for their effect at a later stage, when running statistical analyses (Simmons et al., 2011 ). A covariate-adaptive randomisation procedure attempts to reduce the unwanted differences at baseline that inadvertently emerge from a random assignment. However, it is worth highlighting that the covariate-adaptive randomisation procedures aim to solve the imbalances at pre-test that might emerge from the random assignment of participants, rather than issues related to non-random selection of participants from naturally occurring samples.

Despite the variety of covariate-adaptive randomisation procedures at their disposal, researchers conducting training/treatment studies, including randomised controlled trials (RCTs), seldom implement these methods (Ciolino et al., 2019; Lin et al., 2015; Taves, 2010). The lack of popularity of these procedures might be due to multiple factors. Researchers may feel more comfortable implementing the more traditional and easier-to-understand stratified/block randomisation. Moreover, an efficient implementation of covariate-adaptive procedures would require the consultancy of an expert statistician for the entire duration of the trial; an extra cost that principal investigators may prefer to avoid (Ciolino et al., 2019). Finally, the lack of free, easy-to-use, computerised functions to automatically implement covariate-adaptive procedures may have contributed to their still limited dissemination (Treasure & Farewell, 2012; Treasure & MacRae, 1998).

Here, we provide a procedure based on variance minimisation (VM; Frane, 1998; Pocock & Simon, 1975; Scott et al., 2002; Treasure & MacRae, 1998), which assigns the next incoming participant to the condition that minimises differences between groups on the chosen measures. Our procedure brings the benefit of using multiple covariates without creating strata in advance, as done in stratified randomisation, and it is relatively easy to implement compared with the more complex dynamic hierarchical randomisation. The logic and the calculation behind the procedure are simple and easy to grasp, even for non-experts. We provide ready-to-use code to implement the procedure in different (including free) software, along with step-by-step written instructions, thereby reducing any costs associated with product licences or consultancy from expert statisticians.

Description of the VM procedure

The goal of the VM procedure is to find the best group assignment for participants prior to an intervention, such that the groups are matched on the scores that the researcher suspects might cause random differences in post-intervention outcomes. The VM procedure requires the researcher to define the number of groups to which participants can be assigned and to collect individual scores for each variable on which the groups are matched. These variables can be continuous or binary; nominal variables with more than two categories can be transformed into multiple dummy variables (as in regression analysis) before being passed to the VM procedure (see section Using VM Procedure on Non-Dichotomous Nominal Variables in the Supplementary Materials). The procedure particularly suits studies in which proper matching is essential but the assignment to groups needs to occur while recruitment is still ongoing. It works as follows.

The first participants joining the study are sequentially assigned one to each group. For example, in the case of three groups (i.e., A, B, C), the first participant is assigned to Group A, the second to Group B, and the third to Group C. Then the fourth participant is temporarily added to each group, and for each temporary assignment, the algorithm checks which group assignment would minimise the between-group variance (i.e., V in Fig. 1) of the measures of interest; the participant is assigned to that group. The next (fifth) participant undergoes the same procedure, but the algorithm will not assign this participant to the group of the previous participant, in order to ensure a balanced distribution of participants across conditions. The same procedure continues until only one group remains, which in the case of three groups happens for the sixth participant. The sixth participant is automatically assigned to the remaining group, so that each group now has two participants. Then the entire procedure starts again, with the possibility for the next participant to be assigned to any of the available groups (for a formal description of the variance minimisation procedure, see section Details of the Minimisation Procedure in the Supplementary Materials).
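To make the procedure concrete, here is a minimal Python sketch of variance-minimisation assignment. It is our own simplification, not the authors' OSF code: it assumes covariates on comparable scales (continuous scores or 0/1 dummies) and omits the optional random component (pRand) discussed below.

```python
import numpy as np

def vm_assign(scores, groups, available):
    """Assign one incoming participant to the available group that
    minimises the between-group variance V of the covariate means.

    scores    : 1-D array-like of the participant's covariate scores
    groups    : list of lists (one per condition) of assigned scores
    available : indices of the groups still open in the current block
    """
    best_group, best_v = available[0], np.inf
    for g in available:
        trial = [list(members) for members in groups]
        trial[g].append(list(scores))
        # Mean of each covariate within every non-empty group ...
        means = np.array([np.mean(m, axis=0) for m in trial if m])
        # ... and the variance of those means, summed over covariates (V)
        v = means.var(axis=0).sum()
        if v < best_v:
            best_group, best_v = g, v
    groups[best_group].append(list(scores))
    return best_group

# Sequential use: each new participant can only join the groups not yet
# filled in the current block, which keeps group sizes balanced.
n_groups = 3
groups = [[] for _ in range(n_groups)]
available = list(range(n_groups))
for participant in np.random.default_rng(0).normal(100, 15, size=(9, 4)):
    g = vm_assign(participant, groups, available)
    available.remove(g)
    if not available:              # block complete: reopen all groups
        available = list(range(n_groups))
```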

Fig. 1. Comparison of assignment to groups using (a) variance minimisation and (b) random assignment. When a new participant joins a study, variance minimisation assigns the participant to the group that minimises the between-group variance (V) on the pre-defined variables, in this case intelligence (IQ), executive functions (EFs), attentional performance (AP), and gender, while keeping the number of participants in each group balanced. Random assignment, on the other hand, assigns the participant to every group with equal probability and does not match the groups.

To avoid predictable group assignments due to this shrinking set of available groups, the user can also specify a small probability of random assignment over the VM procedure (see section Discontinuous Implementation of the VM Procedure: The Parameter pRand, in the Supplementary Materials ). This random component makes the assignment unpredictable even if the researcher has access to previous group allocations.

Simulations

We present multiple simulations to illustrate how the VM procedure can be implemented in different scenarios and the advantages it provides.

In the first simulation, we implemented the VM procedure to assign participants to three experimental groups based on three continuous variables and one dichotomous variable. We compared the matching obtained from the VM procedure with random assignment. In the second simulation, we showed that the VM procedure better detects group differences and provides better estimates of effects compared with the attempt to control for the effect of covariates. In the Supplementary Materials, we demonstrate how to incorporate a random component into the VM procedure to ensure a non-deterministic assignment of participants to conditions (section Discontinuous Implementation of the VM Procedure: The Parameter pRand) and how the VM procedure can also match participants on non-dichotomous nominal variables (section Using VM Procedure on Non-Dichotomous Nominal Variables). We briefly discuss the results of these two additional simulations in the Discussion section.

The functions to implement the VM procedure in Excel, MATLAB, Python, and R, along with tutorials, as well as the R code of the simulations, can be found on the Open Science Framework (https://osf.io/6jfvk/?view_only=8d405f7b794d4e3bbff7e345e6ef4eed).

VM procedure outperforms random assignment in matching groups on continuous and dichotomous variables

In the first fictional example, a researcher wants to evaluate whether the combination of cognitive training of executive functions and brain stimulation improves the clinical symptoms of ADHD. The study design comprises three groups: the first group receives brain stimulation and the executive functions training; the second group receives sham stimulation and the training; the third group receives neither training nor stimulation (passive control group). The researcher aims to match the three groups on intelligence, executive functions performance, attentional performance, and gender. Figure 1 illustrates how VM assigns incoming participants compared with a traditional random assignment.

We simulated 1,000 data sets whereby we randomly drew the scores for IQ, executive functions, and attentional performance from a normal distribution with a mean of 100 and a standard deviation of 15. Participants’ gender came from a binomial distribution with the same probability for a participant to be male or female. The simulated values for the matching variables were randomly generated, so there were no real differences between groups. We varied the sample size to be very small (n = 36), small (n = 66), medium (n = 159), and large (n = 969), reflecting the researcher’s intention to detect an extremely large (f = 0.55), large (f = 0.40), medium (f = 0.25), and small (f = 0.10) effect size, respectively, while keeping alpha at .05 and power at 80% (Faul et al., 2009). We assigned participants to the three groups randomly or by using the VM procedure.

We ran univariate analyses of variance (ANOVAs) with IQ, executive functions, and attentional performance as dependent variables and group as a factor, whereas differences in gender distribution across groups were analysed using χ² tests. In Fig. 2, we show the distributions of F, p, and η² values from the ANOVAs on IQ, executive functions, and attentional performance (top panel) and the distributions of χ², p, and Cramér’s V values for gender (bottom panel), separately for random assignment and the VM procedure across different sample sizes. Compared with random assignment, the VM procedure yielded smaller F, η², χ², and Cramér’s V values, and the distribution of p values was skewed toward 1 rather than uniform. The VM procedure demonstrated efficient matching between groups starting from a very small sample size while keeping the number of participants in each group balanced. Moreover, the VM procedure and random assignment violated the ANOVA assumptions of normality of residuals and homogeneity of variance between groups at a similar rate (see Supplementary Materials, Fig. S1).
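To give a flavour of this simulation, the sketch below (our simplified reconstruction, not the authors' OSF script) draws one data set of the "small" size, assigns participants to three groups at random, and runs the per-covariate ANOVAs; swapping in the vm_assign sketch above would produce the matched version.

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(42)
n = 66                              # the "small" sample size (f = 0.40)
iq, ef, ap = (rng.normal(100, 15, n) for _ in range(3))

# Random assignment: shuffle participant indices into three equal groups
groups = np.array_split(rng.permutation(n), 3)

# One ANOVA per covariate; over many simulated data sets, random assignment
# gives roughly uniform p-values, whereas VM assignment skews them toward 1
for name, scores in [("IQ", iq), ("EFs", ef), ("AP", ap)]:
    f, p = f_oneway(*(scores[g] for g in groups))
    print(f"{name}: F = {f:.2f}, p = {p:.3f}")
```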

Fig. 2. A comparison of the VM procedure and random assignment based on simulated data. Top panel: distributions of F values, p values, and η² values from ANOVAs comparing groups on intelligence (IQ), executive functions (EFs), and attentional performance (AP), separately for the VM procedure (orange boxplots) and random assignment (blue boxplots). Bottom panel: distributions of χ², p values, and Cramér’s V values comparing groups on gender, separately for the VM procedure (orange boxplots) and random assignment (blue boxplots). The boxplots represent the quartiles, whereas the whiskers represent the 95% limits of the distribution. (Colour figure online)

Matching groups on a covariate versus controlling for a covariate with imbalance

We simulated an intervention study to display the advantages that the minimisation procedure provides in terms of detecting group differences and providing better estimates of effects, compared with attempting to control for the effect of covariates in the statistical analysis after the intervention is completed. A researcher evaluates the effect of an intervention on a dependent variable Y while controlling for the possible confounding effect of a covariate A, which positively correlates with Y, and a covariate B that correlates with covariate A (i.e., pattern of correlation 1), with Y (i.e., pattern of correlation 2), or with neither of them (i.e., pattern of correlation 3). In this design, covariate A represents a variable that the researcher ought to control for, given its known relation with the dependent variable Y, whereas covariate B represents a non-matching variable that is still entered into the model because it might have a real or spurious correlation with covariate A and the dependent variable Y. We simulated a small, medium, and large effect of the intervention (i.e., Cohen’s d = 0.2, d = 0.5, d = 0.8) and, accordingly, varied the total sample size to be 788, 128, and 52 to achieve a power of 80% while keeping alpha at .05 (Faul et al., 2009). For comparison, we used the same sample sizes (788, 128, and 52) when simulating the absence of an intervention effect (i.e., Cohen’s d = 0). Crucially, we compared the scenario whereby the researcher matches participants on covariate A before implementing the intervention (i.e., VM on CovA) with the scenario whereby the researcher randomly assigns participants to the control and training groups and then attempts to control for the effect of the covariate after the intervention (i.e., Control for CovA). The subsequent inclusion of covariate A in the analysis, especially in the case of imbalance between groups on covariate A, biases the estimated effect of the group on Y when the difference between groups on covariate A is large in the direction of the intervention effect. Conversely, the minimisation procedure reduces the difference between groups on covariate A, and the inclusion of covariate A in the analysis (i.e., analysis of covariance; ANCOVA) does not bias the estimation of the effect of the group on Y.

In the case of the control-for-covariate approach, we generated the scores of covariate A from a standard normal distribution (M = 0, SD = 1) and randomly assigned participants to the control and training groups. We generated an imbalance in covariate A by calculating the standard error of the mean and multiplying it by the standard normal deviates ±1.28, ±1.64, and ±1.96, corresponding to the 20%, 10%, and 5% probabilities of the standard normal distribution, respectively. The use of the standard error allowed us to keep the imbalance proportionate to the sample size. The obtained imbalance was added to the scores of covariate A only for the training group, thereby generating a difference in covariate A that went in the same or in the opposite direction with respect to the intervention effect (i.e., larger scores on the dependent variable only for the training group; Egbewale et al., 2014). We also included the case of absent imbalance for reference. In the case of the VM procedure, we took the previously generated scores of covariate A with the imbalance and assigned participants to the control or training group using the VM procedure. Then, we generated the scores of Y, correlated with covariate A according to four correlations: 0, 0.5, 0.7, and 0.9. Finally, we added 0, 0.2, 0.5, or 0.8 to the Y scores of the training group to simulate an absent, small, medium, or large effect of the intervention.
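In code, the imbalance construction is only a few lines. The sketch below is our paraphrase of the recipe, with an arbitrary seed and the 10% deviate (1.64) chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 128                                 # total sample size (medium-effect scenario)
cov_a = rng.normal(0, 1, n)             # covariate A from a standard normal
training = rng.permutation(n) < n // 2  # random half of participants -> training

sem = 1 / np.sqrt(n)                    # standard error of the mean for SD = 1
cov_a[training] += 1.64 * sem           # add the imbalance to the training group only
```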

In both the random assignment and the VM procedure, covariate B was generated to have a correlation of 0.5 (SD = 0.1) with covariate A (i.e., Pattern 1), with Y (i.e., Pattern 2), or no correlation with either variable (i.e., Pattern 3). We randomly selected the correlation from a normal distribution with a mean of 0.5 and a standard deviation of 0.1 to add some noise to the correlation while keeping it positive and centred on 0.5.

Overall, we varied multiple experimental conditions in 504 scenarios (for a similar approach, see Egbewale et al., 2014 ):

  • seven imbalances on the covariate A: −1.96, −1.64, −1.28, 0, 1.28, 1.64, 1.96;
  • four correlations between covariates A and Y: 0, 0.5, 0.7, 0.9;
  • six treatment effects: 0 (×3, as the absence of the effect was tested with three sample sizes, namely 52, 128, and 788), 0.2, 0.5, and 0.8;
  • three patterns of correlation between the covariate B, covariate A, and Y.

We simulated each scenario 1,000 times.

As expected, the correlations between the covariate B and the other two variables varied according to the pre-specified patterns of correlations, which were practically identical in the VM and control for covariate approach (see Table S1 in the Supplementary Materials).

We ran a series of ANCOVAs with Y as the dependent variable and the covariates A and B and group [training, control] as independent variables. We used a regression approach: the variable group was converted to a dichotomous numerical variable (i.e., control = 0, training = 1) so that the regression coefficients could be used directly as estimates of the effect of each variable on Y. The VM procedure and the control-for-covariate approach violated the ANCOVA assumptions of normality of residuals and homogeneity of variance between groups at a similar rate (see Supplementary Materials, Fig. S2).
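A single run of this analysis takes only a few lines. The sketch below uses Python's statsmodels formula interface (our choice of tool; the paper's code is in R) with hypothetical values for one medium-effect, Pattern 1 scenario:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 128                                                # medium-effect sample size
cov_a = rng.normal(0, 1, n)
cov_b = 0.5 * cov_a + rng.normal(0, np.sqrt(0.75), n)  # Pattern 1: CovB ~ CovA
group = (rng.permutation(n) < n // 2).astype(int)      # 0 = control, 1 = training
y = 0.7 * cov_a + 0.5 * group + rng.normal(0, 1, n)    # added group effect of 0.5

df = pd.DataFrame({"Y": y, "CovA": cov_a, "CovB": cov_b, "Group": group})
fit = smf.ols("Y ~ CovA + CovB + Group", data=df).fit()
print(fit.params["Group"], fit.pvalues["Group"])       # estimated group effect
```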

In this fictitious scenario, the researcher would be interested in evaluating the effect of the group on Y while controlling for the covariates. Therefore, we reported the proportion of significant results (p < .05; Fig. 3) and the estimated effect (i.e., the regression coefficient; Fig. 4) for the effect of group on Y, depending on the imbalance in covariate A, the effect size of the intervention, and the degree of correlation between covariate A and Y. For simplicity, in Figs. 3 and 4 we reported only the simulation with a large sample size (i.e., n = 788) when the effect of the intervention was absent (i.e., d = 0). The pattern of results remained stable across the patterns of correlation of covariate B. Therefore, we reported the proportion of significant results and estimated effects for group, covariate A, and covariate B across the patterns of correlation of covariate B in the Supplementary Materials (Figs. S5–S22).

Fig. 3. Proportion of significant results (y-axis) for the effect of group in the ANCOVA (Y ~ CovA + CovB + Group), separately for the VM procedure (orange lines) and the control-for-CovA approach (blue lines), across imbalances of covariate A (x-axis). The sample size varied according to the effect size to be detected (rows: absent = 0, n = 788; small = 0.2, n = 788; medium = 0.5, n = 128; large = 0.8, n = 52), and the correlation between covariate A and the dependent variable Y ranged between 0 and 0.9 (columns). The black dotted line represents alpha (i.e., 0.05) and the dashed black line represents the expected power (i.e., 0.8). (Colour figure online)

Fig. 4. Median of the estimates (y-axis; regression coefficients) for the effect of group in the ANCOVA (Y ~ CovA + CovB + Group), separately for the VM procedure (orange lines) and the control-for-CovA approach (blue lines), across imbalances of covariate A (x-axis). The sample size varied according to the effect size to be detected (rows: absent = 0, n = 788; small = 0.2, n = 788; medium = 0.5, n = 128; large = 0.8, n = 52), and the correlation between covariate A and the dependent variable Y ranged between 0 and 0.9 (columns). The black dotted line represents the expected regression coefficients (i.e., 0, 0.2, 0.5, 0.8). (Colour figure online)

When the effect of the intervention was present (second to fourth rows in Fig. 3), the VM procedure showed a more stable detection of significant results, even in the presence of serious imbalances in covariate A. This stability became clearer as the correlation between covariate A and Y increased. When the effect of the intervention was absent (first row in Fig. 3), the VM procedure always kept the Type I error rate around 0.05, while the control-for-covariate approach inflated the Type I error rate in the case of strong imbalance in covariate A when it was highly correlated (i.e., 0.7, 0.9) with the outcome variable Y.

A similar pattern of results emerged when we compared the estimates of the effect of the group (i.e., regression coefficients) yielded by the VM procedure and the control-for-covariate approach. The VM procedure always provided accurate estimates of the effect of the group. Conversely, the control-for-covariate approach returned biased estimates when imbalances in covariate A were large and its correlation with the outcome variable Y was high (i.e., 0.7, 0.9; Fig. 4).

Discussion

In treatment studies, groups should be as similar as possible on all the variables of interest before the beginning of the treatment. An optimal matching can ensure that the effect of the treatment is not related to the pre-treatment characteristics of the groups and can, therefore, be extended to the general population. In contrast, random assignment can yield relevant, and even statistically significant, differences between the groups before the treatment (Treasure & MacRae, 1998).

The proposed VM procedure constitutes a quick and useful tool to match groups before treatment on both continuous and categorical covariates (Pocock & Simon, 1975; Scott et al., 2002; Treasure & MacRae, 1998). The latter, though, need to be transformed into dummy variables before being passed to the minimisation algorithm (for a minimisation procedure that directly handles nominal covariates, see Colavincenzo, 2013). We simulated an intervention study whereby a researcher used the VM procedure on a covariate to assign participants to a control and an intervention group rather than controlling for the covariate at the analysis stage. Among other features of the simulated study, we manipulated the correlation between the matching covariate and the outcome variable and the presence of imbalance between groups in the covariate. Controlling for covariates post hoc inflated the Type I error rate and yielded biased estimates of the effect of the group on the outcome variable when the imbalance between groups in the covariate increased and the correlation between the covariate and the outcome variable was high. Conversely, the use of VM on the covariate did not inflate the Type I error rate and provided accurate estimates of the effect of the group on the outcome variable.

The progressive shrinking of the available conditions when using the VM procedure ensures a perfect balance in the number of participants across conditions while still minimising covariate imbalance. However, some participants will be forcefully assigned to a given condition irrespective of their scores on the covariates. Therefore, in some instances the researcher will know in advance which condition a participant will be assigned to, and not all participants will have the chance of being assigned to each of the available conditions. This restriction might be relevant for clinical trials where one of the conditions is potentially beneficial (i.e., the treatment group). In this case, the researcher can insert a random component into the VM procedure by defining the probability of implementing a random assignment. The random component prevents the researcher from being certain about the condition some participants will be assigned to and gives all participants the possibility, in principle, of being assigned to any of the conditions. Using a small amount of randomness (e.g., pRand = 0.1) provides a good balance between matching groups on covariates and avoiding predictable allocation (see section Discontinuous Implementation of the VM Procedure: The Parameter pRand, in the Supplementary Materials).

Despite the benefits of the minimisation procedure, its limitations must be carefully considered. First, the application of the VM procedure to small sample sizes does not prevent the treatment effect from being influenced by the unequal distribution of unobserved confounding variables, whose equal distribution is most likely achieved with large sample sizes. This limitation related to small sample sizes affects both the VM procedure and random assignment. Nevertheless, the selection of matching covariates for the minimisation procedure encourages researchers to think carefully in advance about possible confounding variables and to match participants on them. Second, we showed that the VM procedure is beneficial in simple ANOVA/ANCOVA simulations. In the case of more complex models (e.g., with an interaction), the researcher should carefully consider whether the minimisation procedure constitutes an advantage for the design. We recommend running simulations tailored to specific research designs to ensure that the VM procedure adequately matches participants across conditions.

Third, the minimisation procedure considers all covariates equally important, without giving the user the possibility of allowing more imbalance in some covariates than in others (for a minimisation procedure that allows weighting, see Saghaei, 2011). It is therefore paramount that researchers carefully consider the covariates on which they wish to match the groups.

Overall, our minimisation procedure, even after considering the above-mentioned limitations, provides important advantages over the commonly used randomisation procedure. Its relative simplicity should encourage researchers to use covariate-adaptive matching procedures (Ciolino et al., 2019; Lin et al., 2015). To facilitate this shift away from simple randomisation, we provide scripts, written in popular software (i.e., R, Python, MATLAB, and Excel), which allow a fast and easy implementation of the VM procedure and its integration with other stimulus presentation and analysis scripts. In this light, the treatment can start in the same session in which the pre-treatment measures are acquired, thereby reducing the total number of sessions and, consequently, the overall costs. The immediate application of the treatment also excludes the possibility that pre-treatment measures change between the initial recruitment and the actual implementation of the treatment. We strongly recommend using the VM procedure in these studies to yield more effective and valid RCTs.


Acknowledgements

This study was supported by the European Research Council (Learning&Achievement 338065).

Open practices statement

The R code of the analyses is available at https://osf.io/6jfvk/?view_only=8d405f7b794d4e3bbff7e345e6ef4eed


References

  • Austin, P. C., Manca, A., Zwarenstein, M., Juurlink, D. N., & Stanbrook, M. B. (2010). Baseline comparisons in randomized controlled trials. Journal of Clinical Epidemiology, 63(8), 940–942. doi:10.1016/j.jclinepi.2010.03.009
  • Bruhn, M., & McKenzie, D. (2009). In pursuit of balance: Randomization in practice in development field experiments. American Economic Journal: Applied Economics, 4(1), 200–232.
  • Chen, L. H., & Lee, W. C. (2011). Two-way minimization: A novel treatment allocation method for small trials. PLOS ONE, 6(12), 1–8. doi:10.1371/journal.pone.0028604
  • Chia, K. S. (2000). Randomisation: Magical cure for bias? Annals of the Academy of Medicine, Singapore, 29(5), 563–564.
  • Ciolino, J. D., Palac, H. L., Yang, A., Vaca, M., & Belli, H. M. (2019). Ideal vs. real: A systematic review on handling covariates in randomized controlled trials. BMC Medical Research Methodology, 19(1), 136. doi:10.1186/s12874-019-0787-8
  • Colavincenzo, J. (2013). Doctoring your clinical trial with adaptive randomization: SAS® macros to perform adaptive randomization. Proceedings of the SAS® Global Forum 2013 Conference. Cary, NC: SAS Institute Inc. https://support.sas.com/resources/papers/proceedings13/181-2013.pdf
  • de Boer, M. R., Waterlander, W. E., Kuijper, L. D. J., Steenhuis, I. H. M., & Twisk, J. W. R. (2015). Testing for baseline differences in randomized controlled trials: An unhealthy research behavior that is hard to eradicate. International Journal of Behavioral Nutrition and Physical Activity, 12(1), 1–8. doi:10.1186/s12966-015-0162-z
  • Dragalin, V., Fedorov, V., Patterson, S., & Jones, B. (2003). Kullback-Leibler divergence for evaluating bioequivalence. Statistics in Medicine, 22(6), 913–930. doi:10.1002/sim.1451
  • Egbewale, B. E., Lewis, M., & Sim, J. (2014). Bias, precision and statistical power of analysis of covariance in the analysis of randomized trials with baseline imbalance: A simulation study. BMC Medical Research Methodology, 14, 49. doi:10.1186/1471-2288-14-49
  • Endo, A., Nagatani, F., Hamada, C., & Yoshimura, I. (2006). Minimization method for balancing continuous prognostic variables between treatment and control groups using Kullback-Leibler divergence. Contemporary Clinical Trials, 27(5), 420–431. doi:10.1016/j.cct.2006.05.002
  • Faul, F., Erdfelder, E., Buchner, A., & Lang, A.-G. (2009). Statistical power analyses using G*Power 3.1: Tests for correlation and regression analyses. Behavior Research Methods, 41(4), 1149–1160. doi:10.3758/BRM.41.4.1149
  • Frane, J. W. (1998). A method of biased coin randomisation, its implementation and its validation. Drug Information Journal, 32, 423–432. doi:10.1177/009286159803200213
  • Lin, Y., Zhu, M., & Su, Z. (2015). The pursuit of balance: An overview of covariate-adaptive randomization techniques in clinical trials. Contemporary Clinical Trials, 45, 21–25. doi:10.1016/j.cct.2015.07.011
  • Miller, G. A., & Chapman, J. P. (2001). Misunderstanding analysis of covariance. Journal of Abnormal Psychology, 110(1), 40–48. doi:10.1037/0021-843X.110.1.40
  • Nguyen, T., & Collins, G. S. (2017). Simple randomization did not protect against bias in smaller trials. Journal of Clinical Epidemiology, 84, 105–113. doi:10.1016/j.jclinepi.2017.02.010
  • Pocock, S. J., & Simon, R. (1975). Sequential treatment assignment with balancing for prognostic factors in the controlled clinical trial. Biometrics, 31(1), 103. doi:10.2307/2529712
  • Roberts, C., & Torgerson, D. J. (1999). Understanding controlled trials: Baseline imbalance in randomised controlled trials. BMJ, 319(7203), 185. doi:10.1136/bmj.319.7203.185
  • Saghaei, M. (2011). An overview of randomization and minimization programs for randomized clinical trials. Journal of Medical Signals and Sensors, 1(1), 55. doi:10.4103/2228-7477.83520
  • Saint-Mont, U. (2015). Randomization does not help much, comparability does. PLOS ONE, 10(7), e0132102. doi:10.1371/journal.pone.0132102
  • Scott, N. W., McPherson, G. C., Ramsay, C. R., & Campbell, M. K. (2002). The method of minimization for allocation to clinical trials: A review. Controlled Clinical Trials, 23(6), 662–674. doi:10.1016/S0197-2456(02)00242-8
  • Signorini, D. F., Leung, O., Simes, R. J., Beller, E., Gebski, V. J., & Callaghan, T. (1993). Dynamic balanced randomization for clinical trials. Statistics in Medicine, 12(24), 2343–2350. doi:10.1002/sim.4780122410
  • Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22(11), 1359–1366. doi:10.1177/0956797611417632
  • Taves, D. R. (2010). The use of minimization in clinical trials. Contemporary Clinical Trials, 31(2), 180–184. doi:10.1016/j.cct.2009.12.005
  • Therneau, T. M. (1993). How many stratification factors are “too many” to use in a randomization plan? Controlled Clinical Trials, 14(2), 98–108. doi:10.1016/0197-2456(93)90013-4
  • Treasure, T., & Farewell, V. (2012). Minimization in interventional trials: Great value but residual vulnerability. Journal of Clinical Epidemiology, 65(1), 7–9. doi:10.1016/j.jclinepi.2011.07.005
  • Treasure, T., & MacRae, K. D. (1998). Minimisation: The platinum standard for trials? BMJ, 317(7155), 362–363. doi:10.1136/bmj.317.7155.362
  • Van Breukelen, G. J. P. (2006). ANCOVA versus change from baseline had more power in randomized studies and more bias in nonrandomized studies. Journal of Clinical Epidemiology, 59(9), 920–925. doi:10.1016/j.jclinepi.2006.02.007

5.2 Experimental Design

Learning Objectives

  • Explain the difference between between-subjects and within-subjects experiments, list some of the pros and cons of each approach, and decide which approach to use to answer a particular research question.
  • Define random assignment, distinguish it from random sampling, explain its purpose in experimental research, and use some simple strategies to implement it.
  • Define several types of carryover effect, give examples of each, and explain how counterbalancing helps to deal with them.

In this section, we look at some different ways to design an experiment. The primary distinction we will make is between approaches in which each participant experiences one level of the independent variable and approaches in which each participant experiences all levels of the independent variable. The former are called between-subjects experiments and the latter are called within-subjects experiments.

Between-Subjects Experiments

In a between-subjects experiment, each participant is tested in only one condition. For example, a researcher with a sample of 100 university students might assign half of them to write about a traumatic event and the other half to write about a neutral event. Or a researcher with a sample of 60 people with severe agoraphobia (fear of open spaces) might assign 20 of them to receive each of three different treatments for that disorder. It is essential in a between-subjects experiment that the researcher assigns participants to conditions so that the different groups are, on average, highly similar to each other. Those in a trauma condition and a neutral condition, for example, should include a similar proportion of men and women, and they should have similar average intelligence quotients (IQs), similar average levels of motivation, similar average numbers of health problems, and so on. This matching is a matter of controlling these extraneous participant variables across conditions so that they do not become confounding variables.

Random Assignment

The primary way that researchers accomplish this kind of control of extraneous variables across conditions is called  random assignment , which means using a random process to decide which participants are tested in which conditions. Do not confuse random assignment with random sampling. Random sampling is a method for selecting a sample from a population, and it is rarely used in psychological research. Random assignment is a method for assigning participants in a sample to the different conditions, and it is an important element of all experimental research in psychology and other fields too.

In its strictest sense, random assignment should meet two criteria. One is that each participant has an equal chance of being assigned to each condition (e.g., a 50% chance of being assigned to each of two conditions). The second is that each participant is assigned to a condition independently of other participants. Thus one way to assign participants to two conditions would be to flip a coin for each one. If the coin lands heads, the participant is assigned to Condition A, and if it lands tails, the participant is assigned to Condition B. For three conditions, one could use a computer to generate a random integer from 1 to 3 for each participant. If the integer is 1, the participant is assigned to Condition A; if it is 2, the participant is assigned to Condition B; and if it is 3, the participant is assigned to Condition C. In practice, a full sequence of conditions—one for each participant expected to be in the experiment—is usually created ahead of time, and each new participant is assigned to the next condition in the sequence as he or she is tested. When the procedure is computerized, the computer program often handles the random assignment.
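In code, strict random assignment is one independent draw per participant. This minimal Python sketch (our illustration) implements the random-integer version for three conditions:

```python
import random

def strict_random_assignment(n_participants, conditions, seed=None):
    """Each participant is assigned independently, with an equal chance
    of every condition -- the two criteria above. Group sizes may
    therefore come out unequal, which the next paragraph addresses."""
    rng = random.Random(seed)
    return [rng.choice(conditions) for _ in range(n_participants)]

sequence = strict_random_assignment(9, ["A", "B", "C"], seed=3)
print(sequence)   # one condition label per participant, decided independently
```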

One problem with coin flipping and other strict procedures for random assignment is that they are likely to result in unequal sample sizes in the different conditions. Unequal sample sizes are generally not a serious problem, and you should never throw away data you have already collected to achieve equal sample sizes. However, for a fixed number of participants, it is statistically most efficient to divide them into equal-sized groups. It is standard practice, therefore, to use a kind of modified random assignment that keeps the number of participants in each group as similar as possible. One approach is block randomization . In block randomization, all the conditions occur once in the sequence before any of them is repeated. Then they all occur again before any of them is repeated again. Within each of these “blocks,” the conditions occur in a random order. Again, the sequence of conditions is usually generated before any participants are tested, and each new participant is assigned to the next condition in the sequence.  Table 5.2  shows such a sequence for assigning nine participants to three conditions. The Research Randomizer website ( http://www.randomizer.org ) will generate block randomization sequences for any number of participants and conditions. Again, when the procedure is computerized, the computer program often handles the block randomization.

Table 5.2 Block Randomization Sequence for Assigning Nine Participants to Three Conditions

Participant Condition
1 A
2 C
3 B
4 B
5 C
6 A
7 C
8 B
9 A
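A block randomization sequence like the one in Table 5.2 is easy to generate in code. Here is a minimal Python sketch (the function name is illustrative):

```python
import random

def block_randomization(conditions, n_blocks):
    # Each block contains every condition exactly once, in a random order,
    # so group sizes never differ by more than one during data collection.
    sequence = []
    for _ in range(n_blocks):
        block = list(conditions)
        random.shuffle(block)
        sequence.extend(block)
    return sequence

# Nine participants in three conditions: three blocks of A, B, C.
print(block_randomization(["A", "B", "C"], n_blocks=3))
# e.g., ['A', 'C', 'B', 'B', 'C', 'A', 'C', 'B', 'A']
```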

Random assignment is not guaranteed to control all extraneous variables across conditions. The process is random, so it is always possible that just by chance, the participants in one condition might turn out to be substantially older, less tired, more motivated, or less depressed on average than the participants in another condition. However, there are some reasons that this possibility is not a major concern. One is that random assignment works better than one might expect, especially for large samples. Another is that the inferential statistics that researchers use to decide whether a difference between groups reflects a difference in the population take the “fallibility” of random assignment into account. Yet another reason is that even if random assignment does result in a confounding variable and therefore produces misleading results, this confound is likely to be detected when the experiment is replicated. The upshot is that random assignment to conditions—although not infallible in terms of controlling extraneous variables—is always considered a strength of a research design.

Matched Groups

An alternative to simple random assignment of participants to conditions is the use of a matched-groups design. Using this design, participants in the various conditions are matched on the dependent variable or on some extraneous variable(s) prior to the manipulation of the independent variable. This guarantees that these variables will not be confounded across the experimental conditions. For instance, if we want to determine whether expressive writing affects people’s health, then we could start by measuring various health-related variables in our prospective research participants. We could then use that information to rank-order participants according to how healthy or unhealthy they are. Next, the two healthiest participants would be randomly assigned to complete different conditions (one would be randomly assigned to the traumatic experiences writing condition and the other to the neutral writing condition). The next two healthiest participants would then be randomly assigned to complete different conditions, and so on until the two least healthy participants. This method would ensure that participants in the traumatic experiences writing condition are matched to participants in the neutral writing condition with respect to health at the beginning of the study. If, at the end of the experiment, a difference in health was detected across the two conditions, then we could be confident that it was due to the writing manipulation and not to pre-existing differences in health.
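A matched-groups assignment might be sketched as follows; this is hypothetical Python, and the simulated baseline scores are placeholders rather than data from any study:

```python
import random

def matched_groups(baseline_scores):
    # Rank participants on the baseline measure, then randomly split each
    # adjacent pair between the two writing conditions.
    ranked = sorted(baseline_scores, key=baseline_scores.get)
    assignment = {}
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        random.shuffle(pair)
        assignment[pair[0]] = "traumatic writing"
        assignment[pair[1]] = "neutral writing"
    return assignment

# Simulated pre-study health scores for eight hypothetical participants.
scores = {f"P{i}": random.gauss(50, 10) for i in range(1, 9)}
print(matched_groups(scores))
```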

Within-Subjects Experiments

In a  within-subjects experiment , each participant is tested under all conditions. Consider an experiment on the effect of a defendant’s physical attractiveness on judgments of his guilt. Again, in a between-subjects experiment, one group of participants would be shown an attractive defendant and asked to judge his guilt, and another group of participants would be shown an unattractive defendant and asked to judge his guilt. In a within-subjects experiment, however, the same group of participants would judge the guilt of both an attractive  and  an unattractive defendant.

The primary advantage of this approach is that it provides maximum control of extraneous participant variables. Participants in all conditions have the same mean IQ, same socioeconomic status, same number of siblings, and so on—because they are the very same people. Within-subjects experiments also make it possible to use statistical procedures that remove the effect of these extraneous participant variables on the dependent variable and therefore make the data less “noisy” and the effect of the independent variable easier to detect. We will look more closely at this idea later in the book. However, not all experiments can use a within-subjects design nor would it be desirable to do so.

One disadvantage of within-subjects experiments is that they make it easier for participants to guess the hypothesis. For example, a participant who is asked to judge the guilt of an attractive defendant and then is asked to judge the guilt of an unattractive defendant is likely to guess that the hypothesis is that defendant attractiveness affects judgments of guilt. This knowledge could lead the participant to judge the unattractive defendant more harshly because he thinks this is what he is expected to do. Or it could make participants judge the two defendants similarly in an effort to be “fair.”

Carryover Effects and Counterbalancing

The primary disadvantage of within-subjects designs is that they can result in order effects. An order effect occurs when participants’ responses in the various conditions are affected by the order of conditions to which they were exposed. One type of order effect is a carryover effect. A carryover effect is an effect of being tested in one condition on participants’ behavior in later conditions. One type of carryover effect is a practice effect, where participants perform a task better in later conditions because they have had a chance to practice it. Another type is a fatigue effect, where participants perform a task worse in later conditions because they become tired or bored. Being tested in one condition can also change how participants perceive stimuli or interpret their task in later conditions. This type of effect is called a context effect (or contrast effect). For example, an average-looking defendant might be judged more harshly when participants have just judged an attractive defendant than when they have just judged an unattractive defendant.

Carryover effects can be interesting in their own right. (Does the attractiveness of one person depend on the attractiveness of other people that we have seen recently?) But when they are not the focus of the research, carryover effects can be problematic. Imagine, for example, that participants judge the guilt of an attractive defendant and then judge the guilt of an unattractive defendant. If they judge the unattractive defendant more harshly, this might be because of his unattractiveness. But it could be instead that they judge him more harshly because they are becoming bored or tired. In other words, the order of the conditions is a confounding variable. The attractive condition is always the first condition and the unattractive condition the second. Thus any difference between the conditions in terms of the dependent variable could be caused by the order of the conditions and not the independent variable itself.

There is a solution to the problem of order effects, however, that can be used in many situations. It is counterbalancing, which means testing different participants in different orders. The best method of counterbalancing is complete counterbalancing, in which an equal number of participants complete each possible order of conditions. For example, half of the participants would be tested in the attractive defendant condition followed by the unattractive defendant condition, and the other half would be tested in the unattractive condition followed by the attractive condition. With three conditions, there would be six different orders (ABC, ACB, BAC, BCA, CAB, and CBA), so some participants would be tested in each of the six orders. With four conditions, there would be 24 different orders; with five conditions, there would be 120 possible orders. With counterbalancing, participants are assigned to orders randomly, using the techniques we have already discussed. Thus, random assignment plays an important role in within-subjects designs just as in between-subjects designs. Here, instead of being randomly assigned to conditions, participants are randomly assigned to different orders of conditions. In fact, it can safely be said that if a study does not involve random assignment in one form or another, it is not an experiment.
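Complete counterbalancing can be implemented by enumerating every permutation of the conditions and dealing participants onto the orders. A minimal Python sketch, assuming 12 hypothetical participants and three conditions:

```python
import itertools
import random

conditions = ["A", "B", "C"]

# Enumerate all 3! = 6 possible orders, then assign an equal number of
# randomly chosen participants to each order.
orders = list(itertools.permutations(conditions))
participants = [f"P{i}" for i in range(1, 13)]  # 12 participants, 2 per order
random.shuffle(participants)                    # random assignment to orders
for i, p in enumerate(participants):
    print(p, "->", " ".join(orders[i % len(orders)]))
```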

A more efficient way of counterbalancing is a Latin square design, which arranges the conditions in a square with as many rows (orders) as there are conditions. For example, if you have four treatments, you need only four orders. Like a Sudoku puzzle, no treatment can repeat within a row or column. For four treatments, one such Latin square looks like this:

A B C D
B C D A
C D A B
D A B C

You can see in the diagram above that the square has been constructed so that each condition appears at each ordinal position exactly once (A appears first once, second once, third once, and fourth once). Note, however, that in this simple cyclic square each condition is always immediately preceded and followed by the same conditions (B always follows A, for example); a balanced Latin square is constructed instead so that, with an even number of conditions, each condition also immediately precedes and follows every other condition exactly once. A Latin square for an experiment with 6 conditions would be 6 × 6 in dimension, one for an experiment with 8 conditions would be 8 × 8 in dimension, and so on. So while complete counterbalancing of 6 conditions would require 720 orders, a Latin square requires only 6 orders.
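Both kinds of square can be generated programmatically. The sketch below shows the simple cyclic square from the text and one standard construction of a balanced square (a Williams design); the function names are illustrative:

```python
def cyclic_latin_square(conditions):
    # The simple rotation shown above: each row shifts the previous one left.
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]

def balanced_row(n, i):
    # Row i of a Williams (balanced) Latin square, as condition indices.
    # With an even n, each condition appears once at every ordinal position
    # and immediately precedes/follows every other condition exactly once.
    row, up, down = [], 0, 0
    for pos in range(n):
        if pos < 2 or pos % 2 == 1:
            row.append((up + i) % n)
            up += 1
        else:
            down += 1
            row.append((n - down + i) % n)
    return row

conds = ["A", "B", "C", "D"]
for r in cyclic_latin_square(conds):
    print(" ".join(r))  # A B C D / B C D A / C D A B / D A B C
for i in range(4):
    print(" ".join(conds[k] for k in balanced_row(4, i)))  # A B D C / B C A D / ...
```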

Finally, when the number of conditions is large, experiments can use random counterbalancing, in which the order of the conditions is randomly determined for each participant, for example by shuffling the condition list independently for each person. This is not as powerful a technique as complete counterbalancing or partial counterbalancing using a Latin square design, and it will result in more random error, but if order effects are likely to be small and the number of conditions is large, it is an option available to researchers.
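In code, random counterbalancing amounts to shuffling the condition list independently for each participant. A brief Python sketch with seven hypothetical conditions:

```python
import random

conditions = ["A", "B", "C", "D", "E", "F", "G"]

# Shuffle the order independently for each participant rather than trying
# to cover all 7! = 5,040 possible orders.
for p in range(1, 4):
    order = random.sample(conditions, k=len(conditions))
    print(f"P{p}:", " ".join(order))
```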

There are two ways to think about what counterbalancing accomplishes. One is that it controls the order of conditions so that it is no longer a confounding variable. Instead of the attractive condition always being first and the unattractive condition always being second, the attractive condition comes first for some participants and second for others. Likewise, the unattractive condition comes first for some participants and second for others. Thus any overall difference in the dependent variable between the two conditions cannot have been caused by the order of conditions. A second way to think about what counterbalancing accomplishes is that if there are carryover effects, it makes it possible to detect them. One can analyze the data separately for each order to see whether it had an effect.

When 9 Is “Larger” Than 221

Researcher Michael Birnbaum has argued that the lack of context provided by between-subjects designs is often a bigger problem than the context effects created by within-subjects designs. To demonstrate this problem, he asked one group of participants to rate how large the number 9 was on a 1-to-10 scale where 1 was “very very small” and 10 was “very very large,” and asked another group to rate the number 221 on the same scale (Birnbaum, 1999) [1]. Participants in this between-subjects design gave the number 9 a mean rating of 5.13 and the number 221 a mean rating of 3.10. In other words, they rated 9 as larger than 221! According to Birnbaum, this difference is because participants spontaneously compared 9 with other one-digit numbers (in which case it is relatively large) and compared 221 with other three-digit numbers (in which case it is relatively small).

Simultaneous Within-Subjects Designs

So far, we have discussed an approach to within-subjects designs in which participants are tested in one condition at a time. There is another approach, however, that is often used when participants make multiple responses in each condition. Imagine, for example, that participants judge the guilt of 10 attractive defendants and 10 unattractive defendants. Instead of having people make judgments about all 10 defendants of one type followed by all 10 defendants of the other type, the researcher could present all 20 defendants in a sequence that mixed the two types. The researcher could then compute each participant’s mean rating for each type of defendant. Or imagine an experiment designed to see whether people with social anxiety disorder remember negative adjectives (e.g., “stupid,” “incompetent”) better than positive ones (e.g., “happy,” “productive”). The researcher could have participants study a single list that includes both kinds of words and then have them try to recall as many words as possible. The researcher could then count the number of each type of word that was recalled. There are many ways to determine the order in which the stimuli are presented, but one common way is to generate a different random order for each participant.
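The bookkeeping for such a mixed presentation is straightforward; here is a hypothetical Python sketch in which the guilt ratings are randomly generated stand-ins:

```python
import random
from statistics import mean

# Twenty trials (10 attractive, 10 unattractive defendants) presented in one
# randomly mixed sequence, with stand-in guilt ratings on a 1-7 scale.
trials = [("attractive", i) for i in range(10)] + \
         [("unattractive", i) for i in range(10)]
random.shuffle(trials)  # a different random order for each participant

ratings = [(kind, random.randint(1, 7)) for kind, _ in trials]
for kind in ("attractive", "unattractive"):
    scores = [r for k, r in ratings if k == kind]
    print(kind, "mean guilt rating:", round(mean(scores), 2))
```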

Between-Subjects or Within-Subjects?

Almost every experiment can be conducted using either a between-subjects design or a within-subjects design. This possibility means that researchers must choose between the two approaches based on their relative merits for the particular situation.

Between-subjects experiments have the advantage of being conceptually simpler and requiring less testing time per participant. They also avoid carryover effects without the need for counterbalancing. Within-subjects experiments have the advantage of controlling extraneous participant variables, which generally reduces noise in the data and makes it easier to detect a relationship between the independent and dependent variables.

A good rule of thumb, then, is that if it is possible to conduct a within-subjects experiment (with proper counterbalancing) in the time that is available per participant—and you have no serious concerns about carryover effects—this design is probably the best option. If a within-subjects design would be difficult or impossible to carry out, then you should consider a between-subjects design instead. For example, if you were testing participants in a doctor’s waiting room or shoppers in line at a grocery store, you might not have enough time to test each participant in all conditions and therefore would opt for a between-subjects design. Or imagine you were trying to reduce people’s level of prejudice by having them interact with someone of another race. A within-subjects design with counterbalancing would require testing some participants in the treatment condition first and then in a control condition. But if the treatment works and reduces people’s level of prejudice, then they would no longer be suitable for testing in the control condition. The same is true for many designs that involve a treatment meant to produce long-term change in participants’ behavior (e.g., studies testing the effectiveness of psychotherapy). Clearly, a between-subjects design would be necessary here.

Remember also that using one type of design does not preclude using the other type in a different study. There is no reason that a researcher could not use both a between-subjects design and a within-subjects design to answer the same research question. In fact, professional researchers often do exactly this.

Key Takeaways

  • Experiments can be conducted using either between-subjects or within-subjects designs. Deciding which to use in a particular situation requires careful consideration of the pros and cons of each approach.
  • Random assignment to conditions in between-subjects experiments or counterbalancing of orders of conditions in within-subjects experiments is a fundamental element of experimental research. The purpose of these techniques is to control extraneous variables so that they do not become confounding variables.
Discussion: For each of the following topics, list the pros and cons of a between-subjects and within-subjects design and decide which would be better.

  • You want to test the relative effectiveness of two training programs for running a marathon.
  • Using photographs of people as stimuli, you want to see if smiling people are perceived as more intelligent than people who are not smiling.
  • In a field experiment, you want to see if the way a panhandler is dressed (neatly vs. sloppily) affects whether or not passersby give him any money.
  • You want to see if concrete nouns (e.g.,  dog ) are recalled better than abstract nouns (e.g.,  truth).
  • Birnbaum, M. H. (1999). How to show that 9 > 221: Collect judgments in a between-subjects design. Psychological Methods, 4(3), 243–249.


Random Assignment in Psychology: Definition & Examples

Julia Simkus

Editor at Simply Psychology

BA (Hons) Psychology, Princeton University

Julia Simkus is a graduate of Princeton University with a Bachelor of Arts in Psychology. She is currently studying for a Master’s Degree in Counseling for Mental Health and Wellness, which she began in September 2023. Julia’s research has been published in peer-reviewed journals.


Saul McLeod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul McLeod, PhD, is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.

Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.

In psychology, random assignment refers to the practice of allocating participants to different experimental groups in a study in a completely unbiased way, ensuring each participant has an equal chance of being assigned to any group.

In experimental research, random assignment, or random placement, organizes participants from your sample into different groups using randomization. 

Random assignment uses chance procedures to ensure that each participant has an equal opportunity of being assigned to either a control or experimental group.

The control group does not receive the treatment in question, whereas the experimental group does receive the treatment.

When using random assignment, neither the researcher nor the participant can choose the group to which the participant is assigned. This ensures that any differences between and within the groups are not systematic at the onset of the study. 

In a study to test the success of a weight-loss program, investigators randomly assigned a pool of participants to one of two groups.

Group A participants participated in the weight-loss program for 10 weeks and took a class where they learned about the benefits of healthy eating and exercise.

Group B participants read a 200-page book that explains the benefits of weight loss.

The researchers found that those who participated in the program and took the class were more likely to lose weight than those in the other group that received only the book.

Importance 

Random assignment helps ensure that the groups in an experiment are comparable before the independent variable is applied.

In experiments , researchers will manipulate an independent variable to assess its effect on a dependent variable, while controlling for other variables. Random assignment increases the likelihood that the treatment groups are the same at the onset of a study.

Thus, any changes that result from the independent variable can be assumed to be a result of the treatment of interest. This is particularly important for eliminating sources of bias and strengthening the internal validity of an experiment.

Random assignment is the best method for inferring a causal relationship between a treatment and an outcome.

Random Selection vs. Random Assignment 

Random selection (also called probability sampling or random sampling) is a way of randomly selecting members of a population to be included in your study.

On the other hand, random assignment is a way of sorting the sample participants into control and treatment groups. 

Random selection ensures that everyone in the population has an equal chance of being selected for the study. Once the pool of participants has been chosen, experimenters use random assignment to assign participants into groups. 

Random assignment is used only in experimental designs, while random selection can be used in a variety of study designs.

Random Assignment vs Random Sampling

Random sampling refers to selecting participants from a population so that each individual has an equal chance of being chosen. This method enhances the representativeness of the sample.

Random assignment, on the other hand, is used in experimental designs once participants are selected. It involves allocating these participants to different experimental groups or conditions randomly.

This helps ensure that any differences in results across groups are due to manipulating the independent variable, not preexisting differences among participants.

When to Use Random Assignment

Random assignment is used in experiments with a between-groups or independent measures design.

In these research designs, researchers will manipulate an independent variable to assess its effect on a dependent variable, while controlling for other variables.

There is usually a control group and one or more experimental groups. Random assignment helps ensure that the groups are comparable at the onset of the study.

How to Use Random Assignment

There are a variety of ways to assign participants into study groups randomly. Here are a handful of popular methods: 

  • Random Number Generator : Give each member of the sample a unique number; use a computer program to randomly generate a number from the list for each group.
  • Lottery : Give each member of the sample a unique number. Place all numbers in a hat or bucket and draw numbers at random for each group.
  • Flipping a Coin : Flip a coin for each participant to decide if they will be in the control group or the experimental group (this method can only be used when you have just two groups).
  • Roll a Die : For each member of the sample, roll a die to decide which group they will be in. For example, rolling a 1, 2, or 3 places them in the control group, and rolling a 4, 5, or 6 places them in the experimental group. The lottery method is sketched in code after this list.
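Here is a minimal Python sketch of the lottery method (the participant IDs are placeholders):

```python
import random

def lottery_assignment(participant_ids):
    # Shuffle all IDs in a virtual hat, then deal them alternately into
    # the control and experimental groups.
    ids = list(participant_ids)
    random.shuffle(ids)
    return ids[::2], ids[1::2]

control, experimental = lottery_assignment(range(1, 21))
print("Control:", control)
print("Experimental:", experimental)
```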

When is Random Assignment not used?

  • When it is not ethically permissible: Randomization is only ethical if the researcher has no evidence that one treatment is superior to the other or that one treatment might have harmful side effects. 
  • When answering non-causal questions : If the researcher is just interested in predicting the probability of an event, the causal relationship between the variables is not important and observational designs would be more suitable than random assignment. 
  • When studying the effect of variables that cannot be manipulated: Some risk factors cannot be manipulated and so it would not make any sense to study them in a randomized trial. For example, we cannot randomly assign participants into categories based on age, gender, or genetic factors.

Drawbacks of Random Assignment

While randomization assures an unbiased assignment of participants to groups, it does not guarantee the equality of these groups. There could still be extraneous variables that differ between groups, or group differences that arise from chance. Additionally, there is still an element of luck with random assignment.

Thus, researchers cannot produce perfectly equal groups for each specific study. Differences between the treatment group and control group might still exist, and the results of a randomized trial may sometimes be wrong, but this is an accepted part of the process.

Scientific evidence is a long and continuous process, and the groups will tend to be equal in the long run when data is aggregated in a meta-analysis.

Additionally, external validity (i.e., the extent to which the researcher can use the results of the study to generalize to the larger population) can be compromised with random assignment.

Random assignment is challenging to implement outside of controlled laboratory conditions and might not represent what would happen in the real world at the population level. 

Random assignment can also be more costly than simple observational studies, where an investigator is just observing events without intervening with the population.

Randomization also can be time-consuming and challenging, especially when participants refuse to receive the assigned treatment or do not adhere to recommendations. 

What is the difference between random sampling and random assignment?

Random sampling refers to randomly selecting a sample of participants from a population. Random assignment refers to randomly assigning participants to treatment groups from the selected sample.

Does random assignment increase internal validity?

Yes, random assignment ensures that there are no systematic differences between the participants in each group, enhancing the study’s internal validity .

Does random assignment reduce sampling error?

Not directly. With random assignment, participants have an equal chance of being assigned to either a control group or an experimental group, which evens out participant characteristics across the groups within the sample.

Random assignment does not eliminate sampling error, which arises because a sample only approximates the population from which it is drawn. Random sampling, not random assignment, is the way to minimize sampling error.

When is random assignment not possible?

Random assignment is not possible when the experimenters cannot control the treatment or independent variable.

For example, if you want to compare how men and women perform on a test, you cannot randomly assign subjects to these groups.

Participants are not randomly assigned to different groups in this study, but instead assigned based on their characteristics.

Does random assignment eliminate confounding variables?

Random assignment greatly reduces the influence of confounding variables by distributing them at random among the study groups, which breaks any systematic relationship between a confounding variable and the treatment. It does not guarantee perfect balance in any single study, but any remaining imbalance is due to chance rather than bias.

Why is random assignment of participants to treatment conditions in an experiment used?

Random assignment is used to ensure that all groups are comparable at the start of a study. This allows researchers to conclude that the outcomes of the study can be attributed to the intervention at hand and to rule out alternative explanations for study results.

Further Reading

  • Bogomolnaia, A., & Moulin, H. (2001). A new solution to the random assignment problem. Journal of Economic Theory, 100(2), 295-328.
  • Krause, M. S., & Howard, K. I. (2003). What random assignment does and does not do. Journal of Clinical Psychology, 59(7), 751-766.


Chapter 6: Experimental Research

6.2 Experimental Design

Learning Objectives

  • Explain the difference between between-subjects and within-subjects experiments, list some of the pros and cons of each approach, and decide which approach to use to answer a particular research question.
  • Define random assignment, distinguish it from random sampling, explain its purpose in experimental research, and use some simple strategies to implement it.
  • Define what a control condition is, explain its purpose in research on treatment effectiveness, and describe some alternative types of control conditions.
  • Define several types of carryover effect, give examples of each, and explain how counterbalancing helps to deal with them.

In this section, we look at some different ways to design an experiment. The primary distinction we will make is between approaches in which each participant experiences one level of the independent variable and approaches in which each participant experiences all levels of the independent variable. The former are called between-subjects experiments and the latter are called within-subjects experiments.


Treatment and Control Conditions

Between-subjects experiments are often used to determine whether a treatment works. In psychological research, a treatment is any intervention meant to change people’s behavior for the better. This includes psychotherapies and medical treatments for psychological disorders but also interventions designed to improve learning, promote conservation, reduce prejudice, and so on. To determine whether a treatment works, participants are randomly assigned to either a treatment condition , in which they receive the treatment, or a control condition , in which they do not receive the treatment. If participants in the treatment condition end up better off than participants in the control condition—for example, they are less depressed, learn faster, conserve more, express less prejudice—then the researcher can conclude that the treatment works. In research on the effectiveness of psychotherapies and medical treatments, this type of experiment is often called a randomized clinical trial .

There are different types of control conditions. In a no-treatment control condition , participants receive no treatment whatsoever. One problem with this approach, however, is the existence of placebo effects. A placebo is a simulated treatment that lacks any active ingredient or element that should make it effective, and a placebo effect is a positive effect of such a treatment. Many folk remedies that seem to work—such as eating chicken soup for a cold or placing soap under the bedsheets to stop nighttime leg cramps—are probably nothing more than placebos. Although placebo effects are not well understood, they are probably driven primarily by people’s expectations that they will improve. Having the expectation to improve can result in reduced stress, anxiety, and depression, which can alter perceptions and even improve immune system functioning (Price, Finniss, & Benedetti, 2008).

Placebo effects are interesting in their own right (see “The Powerful Placebo” below), but they also pose a serious problem for researchers who want to determine whether a treatment works. Figure 6.2 shows some hypothetical results in which participants in a treatment condition improved more on average than participants in a no-treatment control condition. If these conditions (the two leftmost bars in Figure 6.2) were the only conditions in this experiment, however, one could not conclude that the treatment worked. It could be instead that participants in the treatment group improved more because they expected to improve, while those in the no-treatment control condition did not.

Figure 6.2 Hypothetical Results From a Study Including Treatment, No-Treatment, and Placebo Conditions

Fortunately, there are several solutions to this problem. One is to include a placebo control condition, in which participants receive a placebo that looks much like the treatment but lacks the active ingredient or element thought to be responsible for the treatment’s effectiveness. When participants in a treatment condition take a pill, for example, then those in a placebo control condition would take an identical-looking pill that lacks the active ingredient in the treatment (a “sugar pill”). In research on psychotherapy effectiveness, the placebo might involve going to a psychotherapist and talking in an unstructured way about one’s problems. The idea is that if participants in both the treatment and the placebo control groups expect to improve, then any improvement in the treatment group over and above that in the placebo control group must have been caused by the treatment and not by participants’ expectations. This is what is shown by a comparison of the two outer bars in Figure 6.2.

Of course, the principle of informed consent requires that participants be told that they will be assigned to either a treatment or a placebo control condition—even though they cannot be told which until the experiment ends. In many cases the participants who had been in the control condition are then offered an opportunity to have the real treatment. An alternative approach is to use a waitlist control condition , in which participants are told that they will receive the treatment but must wait until the participants in the treatment condition have already received it. This allows researchers to compare participants who have received the treatment with participants who are not currently receiving it but who still expect to improve (eventually). A final solution to the problem of placebo effects is to leave out the control condition completely and compare any new treatment with the best available alternative treatment. For example, a new treatment for simple phobia could be compared with standard exposure therapy. Because participants in both conditions receive a treatment, their expectations about improvement should be similar. This approach also makes sense because once there is an effective treatment, the interesting question about a new treatment is not simply “Does it work?” but “Does it work better than what is already available?”

The Powerful Placebo

Many people are not surprised that placebos can have a positive effect on disorders that seem fundamentally psychological, including depression, anxiety, and insomnia. However, placebos can also have a positive effect on disorders that most people think of as fundamentally physiological. These include asthma, ulcers, and warts (Shapiro & Shapiro, 1999). There is even evidence that placebo surgery—also called “sham surgery”—can be as effective as actual surgery.

Medical researcher J. Bruce Moseley and his colleagues conducted a study on the effectiveness of two arthroscopic surgery procedures for osteoarthritis of the knee (Moseley et al., 2002). The control participants in this study were prepped for surgery, received a tranquilizer, and even received three small incisions in their knees. But they did not receive the actual arthroscopic surgical procedure. The surprising result was that all participants improved in terms of both knee pain and function, and the sham surgery group improved just as much as the treatment groups. According to the researchers, “This study provides strong evidence that arthroscopic lavage with or without débridement [the surgical procedures used] is not better than and appears to be equivalent to a placebo procedure in improving knee pain and self-reported function” (p. 85).

Research has shown that patients with osteoarthritis of the knee who receive a “sham surgery” experience reductions in pain and improvement in knee function similar to those of patients who receive a real surgery.


Key Takeaways

  • Experimental research on the effectiveness of a treatment requires both a treatment condition and a control condition, which can be a no-treatment control condition, a placebo control condition, or a waitlist control condition. Experimental treatments can also be compared with the best available alternative.

  • Discussion: Imagine that an experiment shows that participants who receive psychodynamic therapy for a dog phobia improve more than participants in a no-treatment control group. Explain a fundamental problem with this research design and at least two ways that it might be corrected.


Moseley, J. B., O’Malley, K., Petersen, N. J., Menke, T. J., Brody, B. A., Kuykendall, D. H., … Wray, N. P. (2002). A controlled trial of arthroscopic surgery for osteoarthritis of the knee. The New England Journal of Medicine, 347, 81–88.

Price, D. D., Finniss, D. G., & Benedetti, F. (2008). A comprehensive review of the placebo effect: Recent advances and current thought. Annual Review of Psychology, 59, 565–590.

Shapiro, A. K., & Shapiro, E. (1999). The powerful placebo: From ancient priest to modern physician. Baltimore, MD: Johns Hopkins University Press.

  • Research Methods in Psychology. Provided by : University of Minnesota Libraries Publishing. Located at : http://open.lib.umn.edu/psychologyresearchmethods . License : CC BY-NC-SA: Attribution-NonCommercial-ShareAlike


Random Assignment in Experiments

By Jim Frost

Random assignment uses chance to assign subjects to the control and treatment groups in an experiment. This process helps ensure that the groups are equivalent at the beginning of the study, which makes it safer to assume the treatments caused any differences between groups that the experimenters observe at the end of the study.


That cautious phrasing, “safer to assume,” might be a big surprise. At this point, you might be wondering about all of those studies that use statistics to assess the effects of different treatments. There’s a critical separation between significance and causality:

  • Statistical procedures determine whether an effect is significant.
  • Experimental designs determine how confidently you can assume that a treatment causes the effect.

In this post, learn how using random assignment in experiments can help you identify causal relationships.

Correlation, Causation, and Confounding Variables

Random assignment helps you separate causation from correlation and rule out confounding variables. As a critical component of the scientific method , experiments typically set up contrasts between a control group and one or more treatment groups. The idea is to determine whether the effect, which is the difference between a treatment group and the control group, is statistically significant. If the effect is significant, group assignment correlates with different outcomes.

However, as you have no doubt heard, correlation does not necessarily imply causation. In other words, the experimental groups can have different mean outcomes, but the treatment might not be causing those differences even though the differences are statistically significant.

The difficulty in definitively stating that a treatment caused the difference is due to potential confounding variables or confounders. Confounders are alternative explanations for differences between the experimental groups. Confounding variables correlate with both the experimental groups and the outcome variable. In this situation, confounding variables can be the actual cause for the outcome differences rather than the treatments themselves. As you’ll see, if an experiment does not account for confounding variables, they can bias the results and make them untrustworthy.

Related posts : Understanding Correlation in Statistics , Causation versus Correlation , and Hill’s Criteria for Causation .

Example of Confounding in an Experiment

Suppose we conduct an experiment to determine whether vitamin supplements improve health, with the following two groups:

  • Control group: Does not consume vitamin supplements.
  • Treatment group: Regularly consumes vitamin supplements.

Imagine we measure a specific health outcome. After the experiment is complete, we perform a 2-sample t-test to determine whether the mean outcomes for these two groups are different. Assume the test results indicate that the mean health outcome in the treatment group is significantly better than the control group.
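For concreteness, here is what that comparison might look like in Python using SciPy; the scores are simulated, and the group means, spread, and sample sizes are arbitrary choices made purely for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulated health-outcome scores for the two groups.
control = rng.normal(loc=70, scale=8, size=50)
treatment = rng.normal(loc=74, scale=8, size=50)

t_stat, p_value = stats.ttest_ind(treatment, control)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```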

Why can’t we assume that the vitamins improved the health outcomes? After all, only the treatment group took the vitamins.

Related post : Confounding Variables in Regression Analysis

Alternative Explanations for Differences in Outcomes

The answer to that question depends on how we assigned the subjects to the experimental groups. If we let the subjects decide which group to join based on their existing vitamin habits, it opens the door to confounding variables. It’s reasonable to assume that people who take vitamins regularly also tend to have other healthy habits. These habits are confounders because they correlate with both vitamin consumption (experimental group) and the health outcome measure.

In fact, studies have found that supplement users are more physically active, have healthier diets, have lower blood pressure, and so on compared to those who don't take supplements. If subjects who already take vitamins regularly join the treatment group voluntarily, they bring these healthy habits disproportionately with them. Consequently, these habits will be much more prevalent in the treatment group than in the control group. Random assignment prevents this self-sorting of participants and reduces the likelihood that the groups start with systematic differences.

The healthy habits are the confounding variables—the potential alternative explanations for the difference in our study’s health outcome. It’s entirely possible that these systematic differences between groups at the start of the study might cause the difference in the health outcome at the end of the study—and not the vitamin consumption itself!

If our experiment doesn’t account for these confounding variables, we can’t trust the results. While we obtained statistically significant results with the 2-sample t-test for health outcomes, we don’t know for sure whether the vitamins, the systematic difference in habits, or some combination of the two caused the improvements.

Learn why many randomized clinical experiments use a placebo to control for the Placebo Effect.

Experiments Must Account for Confounding Variables

Your experimental design must account for confounding variables to avoid the problems they cause. Scientific studies commonly use the following methods to handle confounders:

  • Use control variables to keep them constant throughout an experiment.
  • Statistically control for them in an observational study.
  • Use random assignment to reduce the likelihood that systematic differences exist between experimental groups when the study begins.

Let’s take a look at how random assignment works in an experimental design.

Random Assignment Can Reduce the Impact of Confounding Variables

Note that random assignment is different from random sampling. Random sampling is a process for obtaining a sample that accurately represents a population.


Random assignment uses a chance process to assign subjects to experimental groups. Using random assignment requires that the experimenters can control the group assignment for all study subjects. For our study, we must be able to assign our participants to either the control group or the supplement group. Clearly, if we don’t have the ability to assign subjects to the groups, we can’t use random assignment!

Additionally, the process must have an equal probability of assigning a subject to any of the groups. For example, in our vitamin supplement study, we can use a coin toss to assign each subject to either the control group or supplement group. For more complex experimental designs, we can use a random number generator or even draw names out of a hat.
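As a sketch, the coin-toss version of this assignment might look like the following in Python (the participant IDs are hypothetical; `random.choice` plays the role of the coin):

```python
import random

random.seed(1)  # fixed seed only so the illustration is reproducible
participants = ["P01", "P02", "P03", "P04", "P05", "P06"]

# Each subject has an equal chance of landing in either group,
# independent of their characteristics -- the "coin toss".
assignment = {p: random.choice(["control", "supplement"]) for p in participants}
print(assignment)
```

Note that a pure coin toss doesn't guarantee equal group sizes. A common alternative is to shuffle the full participant list and split it in half, which keeps the assignment random while balancing the counts.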

Random Assignment Distributes Confounders Equally

The random assignment process tends to distribute confounding properties evenly across your experimental groups. In other words, randomness helps eliminate systematic differences between groups. For our study, flipping the coin tends to equalize the distribution of subjects with healthier habits between the control and treatment group. Consequently, these two groups should start roughly equal for all confounding variables, including healthy habits!

Random assignment is a simple, elegant solution to a complex problem. For any given study area, there can be a long list of confounding variables that you could worry about. However, using random assignment, you don’t need to know what they are, how to detect them, or even measure them. Instead, use random assignment to equalize them across your experimental groups so they’re not a problem.

Because random assignment helps ensure that the groups are comparable when the experiment begins, you can be more confident that the treatments caused the post-study differences. Random assignment helps increase the internal validity of your study.

Comparing the Vitamin Study With and Without Random Assignment

Let’s compare two scenarios involving our hypothetical vitamin study. We’ll assume that the study obtains statistically significant results in both cases.

Scenario 1: We don’t use random assignment and, unbeknownst to us, subjects with healthier habits disproportionately end up in the supplement treatment group. The experimental groups differ by both healthy habits and vitamin consumption. Consequently, we can’t determine whether it was the habits or vitamins that improved the outcomes.

Scenario 2: We use random assignment and, consequently, the treatment and control groups start with roughly equal levels of healthy habits. The intentional introduction of vitamin supplements in the treatment group is the primary difference between the groups. Consequently, we can more confidently assert that the supplements caused an improvement in health outcomes.

For both scenarios, the statistical results could be identical. However, the methodology behind the second scenario makes a stronger case for a causal relationship between vitamin supplement consumption and health outcomes.

How important is it to use the correct methodology? Well, if the relationship between vitamins and health outcomes is not causal, then consuming vitamins won’t cause your health outcomes to improve regardless of what the study indicates. Instead, it’s probably all the other healthy habits!

Learn more about Randomized Controlled Trials (RCTs), which are the gold standard for identifying causal relationships because they use random assignment.

Drawbacks of Random Assignment

Random assignment helps reduce the chances of systematic differences between the groups at the start of an experiment and, thereby, mitigates the threats of confounding variables and alternative explanations. However, the process does not always equalize all of the confounding variables. Its random nature tends to eliminate systematic differences, but it doesn’t always succeed.

Sometimes random assignment is impossible because the experimenters cannot control the treatment or independent variable. For example, if you want to determine how individuals with and without depression perform on a test, you cannot randomly assign subjects to these groups. The same difficulty occurs when you’re studying differences between genders.

In other cases, there might be ethical issues. For example, in a randomized experiment, the researchers would need to withhold treatment from the control group. However, if the treatments are vaccinations, it might be unethical to withhold them.

Other times, random assignment might be possible but very challenging. For example, with vitamin consumption, it's generally thought that if vitamin supplements cause health improvements, it's only after very long-term use. It's hard to enforce random assignment with a strict regimen of usage in one group and non-usage in the other over the long run. Or imagine a study about smoking. The researchers would find it difficult to assign subjects to the smoking and non-smoking groups randomly!

Fortunately, if you can’t use random assignment to help reduce the problem of confounding variables, there are different methods available. The other primary approach is to perform an observational study and incorporate the confounders into the statistical model itself. For more information, read my post Observational Studies Explained .

Read About Real Experiments that Used Random Assignment

I’ve written several blog posts about studies that have used random assignment to make causal inferences. Read studies about the following:

  • Flu Vaccinations
  • COVID-19 Vaccinations



Reader Interactions


November 13, 2019 at 4:59 am

Hi Jim, I have a question about randomly assigning participants to one of two conditions when it is an ongoing study and you are not sure how many participants there will be. I am using this random assignment tool for factorial experiments: http://methodologymedia.psu.edu/most/rannumgenerator It asks you for the total number of participants, but at this point I am not sure how many there will be. Thanks for any advice you can give me, Floyd


May 28, 2019 at 11:34 am

Jim, can you comment on the validity of using the following approach when we can't use random assignment? I'm in education, and we have an ACT prep course that we offer. We can't force students to take it, and we can't keep them from taking it either. But we want to know if it's working. Let's say that by senior year all students who are going to take the ACT have taken it. Let's also say that I'm only including students who have taken it twice (so I can show growth between the first and second attempt). What I've done to address confounders is to go back to, say, 8th or 9th grade (prior to anyone taking the ACT or the ACT prep course) and run an analysis showing the two groups are not significantly different to start with. Is this valid? If the ACT prep students were higher achievers in 8th or 9th grade, I could not assume my prep course is effecting greater growth, but if they were not significantly different in 8th or 9th grade, I can assume the significant difference in ACT growth (from first to second testing) is due to the prep course. Yes or no?


May 26, 2019 at 5:37 pm

Nice post! I think the key to understanding scientific research is to understand randomization. And most people don’t get it.


May 27, 2019 at 9:48 pm

Thank you, Anoop!

I think randomness in an experiment is a funny thing. The issue of confounding factors is a serious problem. You might not even know what they are! But, use random assignment and, voila, the problem usually goes away! If you can’t use random assignment, suddenly you have a whole host of issues to worry about, which I’ll be writing about in more detail in my upcoming post about observational experiments!


Matching and Randomization in Experiments

Thoughts on a classic paper on causality.

Jeremy Salfen

I recently read Donald Rubin’s classic paper Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies ( PDF ) as part of the Kickstarter Data team’s reading group.

Two arguments in this paper jumped out at me, the first about the value of matching and the second about the costs and benefits of conducting a randomized versus observational study.

Matching

Rubin proposes a hypothetical experiment on 2N units in which the experimental treatment E is assigned to N units while the control treatment C is assigned to a different set of N units. If for every unit receiving E there is a matched unit receiving C such that we expect the pair to react identically to the same treatment, then there are N identically matched pairs. Rubin observes that

if one had N identically matched pairs, a “thoughtless” random assignment could be worse than a nonrandom assignment of E to one member of the pair and C to the other. By “thoughtless” we mean some random assignment that does not assure that the members of each matched pair get different treatments. (692)

In this case, “thoughtless” randomization “could be worse” in the sense that the statistical power of the experiment will suffer, and you will be less likely to detect an effect if there really is one.

Geography is a good example of this. If you're running an experiment on users across the United States, matching similar geographic regions might be more effective than a completely randomized trial, because the variation between regions might be higher than the variation within regions, diluting the effect.
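As a hedged sketch of that idea, the following Python snippet groups units by region and then randomizes within each matched set, so every region contributes equally to E and C (the units and regions are made up):

```python
import random
from collections import defaultdict

random.seed(7)

# Hypothetical units, each with a region covariate.
units = [("u1", "west"), ("u2", "west"), ("u3", "south"), ("u4", "south"),
         ("u5", "midwest"), ("u6", "midwest")]

# Group units by region, then randomize within each matched set so that
# every region contributes the same number of E and C units.
by_region = defaultdict(list)
for unit, region in units:
    by_region[region].append(unit)

assignment = {}
for region, members in by_region.items():
    random.shuffle(members)
    half = len(members) // 2
    for u in members[:half]:
        assignment[u] = "E"
    for u in members[half:]:
        assignment[u] = "C"
print(assignment)
```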

Of course, a multilevel model that takes region into account is one solution. For an insightful description of how Google has approached this problem, see Estimating causal effects using geo experiments.

Randomized vs. Observational Studies

Another one of Rubin’s claims that stood out to me is a comparison of the costs and benefits of typical randomized and observational studies.

One major advantage of randomized studies is that, with a large enough sample size, you often don’t have to worry about controlling for confounding factors that might bias your results.

There can be downsides to randomized studies though. They can have nontrivial setup costs, and running a randomized study over a long window (e.g. years) is often not feasible. Moreover, because randomized trials are often conducted in a controlled environment, Rubin claims that they tend to be less natural than an observational study — that is, the units of analysis are often constrained to a particular setting or selected to be a subset of the population of interest.

Granted, this is more the case for experiments in fields like psychology than in online experiments on the web, but it does suggest that generalizability is a factor we should consider when interpreting the results of a randomized trial.

Comparing these costs and benefits, Rubin argues that

the first issue, the effect of variables not explicitly controlled, is usually more serious in nonrandomized than in randomized studies, while the second, the applicability of the results to a population of interest, is often more serious in randomized than in nonrandomized studies. (698)

I take this as a reminder to think carefully about the generalizability of the results of an experiment. When we run experiments on specific parts of a website or on particular subsets of users, often our goal is to generalize these results to the entire website or to all users. Rubin reminds us that observational studies, when analyzed properly, may in fact be better suited to those kinds of claims, particularly when matching can be used.

When randomisation is not good enough: Matching groups in intervention studies

  • Brief Report
  • Open access
  • Published: 09 July 2021
  • Volume 28, pages 2085–2093 (2021)


Francesco Sella 1, Gal Raz 2 & Roi Cohen Kadosh 2


Abstract

Randomised assignment of individuals to treatment and control groups is often considered the gold standard for drawing valid conclusions about the efficacy of an intervention. In practice, randomisation can lead to accidental differences due to chance. Researchers have offered alternatives to reduce such differences, but these methods are not used frequently because they require advanced statistical methods. Here, we recommend a simple assignment procedure based on variance minimisation (VM), which automatically assigns incoming participants to the condition that minimises differences between groups on relevant measures. As an example of its application in the research context, we simulated an intervention study whereby a researcher used the VM procedure on a covariate to assign participants to a control and an intervention group, rather than controlling for the covariate at the analysis stage. Among other features of the simulated study, such as effect size and sample size, we manipulated the correlation between the matching covariate and the outcome variable and the presence of imbalance between groups in the covariate. Our results highlighted the advantages of VM over the prevalent random assignment procedure in terms of reducing the Type I error rate and providing accurate estimates of the effect of the group on the outcome variable. The VM procedure is valuable in situations whereby the intervention for an individual begins before the recruitment of the entire sample is completed. We provide an Excel spreadsheet, as well as scripts in R, MATLAB, and Python, to ease and foster the implementation of the VM procedure.


Introduction

Randomisation in controlled trials

A common problem in intervention studies is comparing the effect of intervention while minimising the influence of confounding factors. In the pre-treatment assessment, a researcher usually measures the characteristics that the treatment aims to modify (i.e., outcome measures) as well as other variables that can exert an influence on the treatment (i.e., covariates). Then, the researcher will randomly assign individuals to the treatment and the control condition. In the ideal scenario, the control condition matches the treatment condition except for that specific feature of the treatment that the researcher considers to be crucial for causing a change in the outcome measures (e.g., placebo vs the active molecule in pharmacological studies). If the treatment is effective, the treatment group should improve in the outcome measures compared to the control group.

In the case of randomisation with a large sample size, testing for differences at baseline or in other covariates becomes irrelevant, as any significant differences that do occur reflect Type I error (de Boer et al., 2015; Roberts & Torgerson, 1999), which arises more often when several covariates are considered (Austin et al., 2010). However, large sample sizes are difficult to achieve. Many researchers, especially in the clinical sciences, rely on small naturally occurring samples composed of individuals who voluntarily join the study when they wish to. In this scenario, the sampling is suboptimal: participants are not randomly sampled from the population but take part in the study based on convenience and opportunity. Although the assignment to different treatment conditions can be random, differences at baseline are more likely to emerge in small than in large trials (Bruhn & Mckenzie, 2009; Chia, 2000; Nguyen & Collins, 2017; Saint-mont, 2015). Unfortunately, there is no statistical way to control for these differences between groups at pre-test (Miller & Chapman, 2001; Van Breukelen, 2006). Therefore, an imbalance in the pre-treatment scores can compromise the evaluation of the treatment's efficacy and seriously harm the interpretability of the results. To correct for this, the researcher may choose to allocate individuals to a condition based on previously collected pre-treatment scores and match the groups on these scores. However, this procedure requires the researcher to complete the pre-treatment assessment of all participants before the treatment begins. The whole process may take several months, increase the attrition rate before the treatment begins, and cannot account for unwanted changes in the measures of interest. Furthermore, the immediate implementation of the treatment is frequently necessary, especially in clinical settings, where the treatment must begin in a critical phase of the patient's clinical condition.

Minimising group differences

One solution is the use of covariate-adaptive randomisation procedures (Chen & Lee, 2011 ; Dragalin et al., 2003 ; Endo et al., 2006 ; Scott et al., 2002 ), which allocate participants to the different conditions as they join the study and, at the same time, reduce the difference between groups on predefined critical variables. There are three commonly used types of covariate-adaptive randomisation methods: stratified randomisation, dynamic hierarchical randomisation, and minimisation (Lin et al., 2015 ). Differences at baseline can be reduced by using stratified randomisation, whereby specific (prognostic) variables are divided into strata and participants are randomly selected from each stratum. However, stratified randomisation becomes difficult to implement as the factors to control for increase (Therneau, 1993 ). In dynamic hierarchical randomisation, covariates are ranked in order of importance and participants are assigned to conditions via biased coin allocation when thresholds of imbalance are exceeded in selected covariates (Signorini et al., 1993 ). A minimisation procedure, the focus of this paper, calculates the level of imbalance in covariates that assigning a participant to each condition would cause, then allocates with high probability (to maintain a degree of randomness) the current participant to the condition that minimises the imbalance.

In this vein, the use of covariate-adaptive randomisation procedures not only matches groups on covariates, but also implicitly forces researchers to state in advance those critical covariates related to the treatment rather than controlling for their effect at a later stage, when running statistical analyses (Simmons et al., 2011 ). A covariate-adaptive randomisation procedure attempts to reduce the unwanted differences at baseline that inadvertently emerge from a random assignment. However, it is worth highlighting that the covariate-adaptive randomisation procedures aim to solve the imbalances at pre-test that might emerge from the random assignment of participants, rather than issues related to non-random selection of participants from naturally occurring samples.

Despite the variety of covariate-adaptive randomisation procedures at their disposal, researchers conducting training/treatment studies, including randomised controlled trials (RCTs), seldom implement these methods (Ciolino et al., 2019; Lin et al., 2015; Taves, 2010). The lack of popularity of these procedures might be due to multiple factors. Researchers may feel more comfortable implementing the more traditional and easier-to-understand stratified/block randomisation. Moreover, an efficient implementation of covariate-adaptive procedures would require the consultancy of an expert statistician for the entire duration of the trial; an extra cost that principal investigators may prefer to avoid (Ciolino et al., 2019). Finally, the lack of free, easy-to-use, computerised functions to automatically implement covariate-adaptive procedures may have contributed to their still limited dissemination (Treasure & Farewell, 2012; Treasure & MacRae, 1998).

Here, we provide a procedure based on variance minimisation (VM; Frane, 1998; Pocock & Simon, 1975; Scott et al., 2002; Treasure & MacRae, 1998), which assigns the next incoming participant to the condition that minimises differences between groups on the chosen measures. Our procedure brings the benefit of using multiple covariates without creating strata in advance, as done in stratified randomisation, and it is relatively easy to implement compared with the more complex dynamic hierarchical randomisation. The logic and calculations behind the procedure are simple and easy to grasp, even for an audience of non-experts. We provide ready-to-use code to implement the procedure in several (including free) software environments, along with step-by-step written instructions, thereby reducing any costs associated with product licences or consultancy from expert statisticians.

Description of the VM procedure

The goal of the VM procedure is to find the best group assignment for participants prior to an intervention, such that the groups are matched on the scores that the researcher suspects might cause random differences in post-intervention outcomes. The VM procedure requires the researcher to define the number of groups to which participants can be assigned and to collect individual scores for each variable on which groups are matched. These variables can be continuous or binary; nominal variables with more than two categories can be transformed into multiple dummy variables (as in regression analysis) before being passed to the VM procedure (see section Using VM Procedure on Non-Dichotomous Nominal Variables in the Supplementary Materials). The procedure particularly suits studies in which proper matching is essential but the assignment to groups needs to occur while recruitment is still ongoing. It works as follows.

The first participants joining the study are sequentially assigned one to each group. For example, in case of three different groups (i.e., A, B, C), the first participant is assigned to Group A, the second participant to Group B, and the third participant to Group C. Then the fourth participant is added temporarily to each group, and for each temporary group assignment, the algorithm checks which group assignment for this participant would minimize the between-group variance (i.e., V in Fig. 1 ) of the measures of interest and assigns the participant to that group. The next (fifth) participant undergoes the same procedure, but the algorithm will not assign the present participant to the group of the previous participant in order to ensure a balanced distribution of participants in each condition. The same procedure goes on until there is only one group remaining, which in the case of three groups would be for the sixth participant. The sixth participant would be automatically assigned to the remaining group, such that each group would now have two participants assigned to them. Then, the entire procedure starts again with the possibility for the next participant to be assigned to all available groups (for a formal description of the variance minimisation procedure, see section Details of the Minimisation Procedure, in the Supplementary Materials ).

Fig. 1 Comparison of assignment to groups using (a) variance minimisation and (b) random assignment. When a new participant joins a study, variance minimisation assigns the participant to the group that minimises the between-group variance (i.e., V) on the pre-defined variables, in this case intelligence (IQ), executive functions (EFs), attentional performance (AP), and gender, while keeping the number of participants in each group balanced. Random assignment, on the other hand, assigns the participant to every group with equal probability and does not match the groups.

To avoid predictable group assignments due to this shrinking set of available groups, the user can also specify a small probability of random assignment over the VM procedure (see section Discontinuous Implementation of the VM Procedure: The Parameter pRand, in the Supplementary Materials ). This random component makes the assignment unpredictable even if the researcher has access to previous group allocations.
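To make the logic concrete, here is a minimal Python sketch of the assignment step described above, including the optional random override. This is an illustrative reconstruction, not the authors' published code (their R, MATLAB, Python, and Excel implementations are available from the OSF link below); it assumes the matching variables are on comparable scales or have been standardized beforehand, and it breaks ties in favour of the first eligible group:

```python
import numpy as np

def vm_assign(new_scores, group_scores, p_rand=0.0, rng=None):
    """Assign one incoming participant to a group by variance minimisation.

    new_scores   : the participant's scores on the matching variables.
    group_scores : list of lists, one per group, holding the score vectors
                   of participants already assigned to that group.
    p_rand       : probability of a random assignment overriding VM.
    """
    rng = rng or np.random.default_rng()
    sizes = [len(g) for g in group_scores]
    # Only groups at the current minimum size are eligible: this is the
    # "shrinking set" that keeps the group sizes perfectly balanced.
    available = [i for i, s in enumerate(sizes) if s == min(sizes)]
    # Optional random override (pRand) so allocations stay unpredictable.
    if rng.random() < p_rand:
        return int(rng.choice(available))
    best_group, best_v = available[0], np.inf
    for g in available:
        # Temporarily add the participant to group g ...
        trial = [list(s) for s in group_scores]
        trial[g].append(list(new_scores))
        occupied = [t for t in trial if len(t) > 0]
        # ... and compute V: the variance of the group means,
        # summed over the matching variables.
        means = np.array([np.mean(t, axis=0) for t in occupied])
        v = float(np.sum(np.var(means, axis=0)))
        if v < best_v:
            best_group, best_v = g, v
    return best_group

# Example: stream 12 participants into three conditions, matching on
# two continuous scores (e.g., IQ and EFs) -- the values are made up.
rng = np.random.default_rng(0)
groups = [[], [], []]
for _ in range(12):
    scores = [rng.normal(100, 15), rng.normal(100, 15)]
    g = vm_assign(scores, groups, p_rand=0.1, rng=rng)
    groups[g].append(scores)
print([len(g) for g in groups])  # balanced counts, e.g. [4, 4, 4]
```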

Simulations

We present multiple simulations to illustrate how the VM procedure can be implemented in different scenarios and the advantages it provides.

In the first simulation, we implemented the VM procedure to assign participants to three experimental groups based on three continuous variables and one dichotomous variable. We compared the matching obtained from the VM procedure with random assignment. In the second simulation, we showed that the VM procedure better detects group differences and provides better estimates of effects compared with attempting to control for the effect of covariates. In the Supplementary Materials, we demonstrate how to incorporate a random component into the VM procedure to ensure a non-deterministic assignment of participants to conditions (section Discontinuous Implementation of the VM Procedure: The Parameter pRand) and how the VM procedure can also match participants on non-dichotomous nominal variables (section Using VM Procedure on Non-Dichotomous Nominal Variables). We briefly discuss the results of these two additional simulations in the Discussion section.

The functions to implement the VM procedure in Excel, MATLAB, Python, and R along with tutorials, as well as the R code of the simulation, can be found at the Open Science Framework ( https://osf.io/6jfvk/?view_only=8d405f7b794d4e3bbff7e345e6ef4eed ).

VM procedure outperforms random assignment in matching groups on continuous and dichotomous variables

In the first fictional example, a researcher wants to evaluate whether the combination of cognitive training of executive functions and brain stimulation improves the clinical symptoms of ADHD. The study design comprises three groups: the first group receives brain stimulation and the executive functions training; the second group receives sham stimulation and the training; the third group receives neither training nor stimulation (passive control group). The researcher aims to match the three groups on intelligence, executive functions performance, attentional performance, and gender. Figure 1 illustrates how VM assigns incoming participants compared with a traditional random assignment.

We simulated 1,000 data sets whereby we randomly drew the scores for IQ, executive functions, and attentional performance from a normal distribution, with a mean of 100 and a standard deviation of 15. Participants’ gender came from a binomial distribution with the same probability for a participant to be male or female. The simulated values for the matching variables were randomly generated, therefore there were no real differences between groups. We varied the sample size to be very small ( n = 36), small ( n = 66), medium ( n = 159), and large ( n = 969), reflecting the researcher’s intention to evaluate the possible presence of an extremely large ( f = 0.55), large ( f = 0.40), medium ( f = 0.25), and small ( f = 0.10) effect size, respectively, while keeping the alpha at .05 and power at 80% (Faul et al., 2009 ). We assigned participants to the three groups randomly or by using the VM procedure.

We ran univariate analyses of variance (ANOVAs) with IQ, executive functions, and attentional performance as dependent variables and group as factor, whereas differences in gender distribution across groups were analysed using χ² tests. In Fig. 2, we show the distributions of F, p, and η² values from ANOVAs on IQ, executive functions, and attentional performance (top panel), and, for gender, the distributions of χ², p, and Cramer's V values (bottom panel), separately for random assignment and the VM procedure across different sample sizes. Compared with random assignment, the VM procedure yielded smaller F, η², χ², and Cramer's V values, and the distribution of p-values was skewed toward 1 rather than uniform. The VM procedure demonstrated efficient matching between groups starting from a very small sample size while keeping the number of participants in each group balanced. Moreover, both the VM procedure and random assignment violated ANOVA assumptions on the normality of residuals and homogeneity of variance between groups at a similar rate (see Supplementary Materials, Fig. S1).
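As a rough sketch of one run of this simulation under random assignment (the VM arm would simply replace the assignment line with the vm_assign function sketched earlier), using scipy's f_oneway and chi2_contingency as stand-ins for the ANOVAs and χ² tests:

```python
import numpy as np
from scipy.stats import f_oneway, chi2_contingency

rng = np.random.default_rng(3)
n = 36  # the "very small" sample size from the simulation

# Covariates with no true group differences.
iq = rng.normal(100, 15, n)
gender = rng.integers(0, 2, n)                           # 0/1, equal probability
group = rng.permutation(np.repeat([0, 1, 2], n // 3))    # random assignment

# ANOVA on IQ across the three groups.
f_val, p_val = f_oneway(*(iq[group == g] for g in range(3)))

# Chi-squared test on the gender distribution across groups.
table = np.array([[np.sum((group == g) & (gender == s)) for s in (0, 1)]
                  for g in range(3)])
chi2, p_gender, dof, _ = chi2_contingency(table)
print(f"IQ: F = {f_val:.2f}, p = {p_val:.3f};  gender: p = {p_gender:.3f}")
```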

Fig. 2 A comparison of the VM procedure and random assignment based on simulated data. Top panel: distributions of F values, p values, and η² values from ANOVAs comparing groups on intelligence (IQ), executive functions (EFs), and attentional performance (AP), separately for the VM procedure (orange boxplots) and random assignment (blue boxplots). Bottom panel: distributions of χ², p values, and Cramer's V values comparing groups on gender, separately for the VM procedure (orange boxplots) and random assignment (blue boxplots). The boxplots represent the quartiles; the whiskers represent the 95% limits of the distribution. (Colour figure online)

Matching groups on a covariate versus controlling for a covariate with imbalance

We simulated an intervention study to display the advantages that the minimisation procedure provides in terms of detecting group differences and yielding better estimates of effects, compared with attempting to control for the effect of covariates in the statistical analysis after the intervention is completed. A researcher evaluates the effect of an intervention on a dependent variable Y while controlling for the possible confounding effect of a covariate A, which positively correlates with Y, and a covariate B that correlates with covariate A (i.e., pattern correlation 1), with Y (i.e., pattern correlation 2), or with neither of them (i.e., pattern correlation 3). In this vein, covariate A represents a variable that the researchers ought to control for, given its known relation with the dependent variable Y, whereas covariate B represents a non-matching variable that is still inserted into the model, as it might have a real or spurious correlation with covariate A and the dependent variable Y. We simulated a small, medium, and large effect of the intervention (i.e., Cohen's d = 0.2, d = 0.5, d = 0.8) and, accordingly, varied the total sample size to be 788, 128, and 52 to achieve a power of 80% while keeping alpha at .05 (Faul et al., 2009). For comparison, we used the same sample sizes, 788, 128, and 52, when simulating the absence of an intervention effect (i.e., Cohen's d = 0). Crucially, we compared the scenario whereby the researcher matches participants on covariate A before implementing the intervention (i.e., VM on CovA) with the scenario whereby the researcher randomly assigns participants to the control and training groups and then attempts to control for the effect of the covariate after the intervention (i.e., Control for CovA). The subsequent inclusion of covariate A in the analysis, especially when the groups are imbalanced on covariate A, biases the estimated effect of the group on Y when the between-group difference in covariate A runs in the same direction as the intervention effect. Conversely, the minimisation procedure reduces the difference between groups on covariate A, so including covariate A in the analysis (i.e., analysis of covariance; ANCOVA) does not bias the estimate of the effect of the group on Y.

In the control-for-covariate approach, we generated the scores of covariate A by drawing them from a standard normal distribution (M = 0, SD = 1) and randomly assigned participants to the control and training groups. We generated an imbalance in covariate A by calculating the standard error of the mean and multiplying it by the standard normal deviates ±1.28, ±1.64, and ±1.96, corresponding to the 20%, 10%, and 5% probabilities of the standard normal distribution, respectively. The use of the standard error allowed us to keep the imbalance proportionate to the sample size. The obtained imbalance was added to the covariate A scores of the training group only, thereby generating a difference in covariate A that went in the same or in the opposite direction with respect to the intervention effect (i.e., larger scores on the dependent variable only for the training group; Egbewale et al., 2014). We also included the case of no imbalance for reference. For the VM procedure, we took the previously generated covariate A scores with the imbalance and assigned participants to the control or training group using the VM procedure. Then, we generated the scores of Y to be correlated with covariate A at one of four levels: 0, 0.5, 0.7, and 0.9. Finally, we added 0, 0.2, 0.5, or 0.8 to the Y scores of the training group to simulate an absent, small, medium, or large effect of the intervention.

In both the random assignment and the VM procedure, covariate B was generated to have a correlation of 0.5 (SD = 0.1) with either covariate A (i.e., Pattern 1) or Y (i.e., Pattern 2), or no correlation with these two variables (i.e., Pattern 3). We randomly drew the correlation from a normal distribution with a mean of 0.5 and a standard deviation of 0.1 to add some noise while keeping the correlation positive and centred on 0.5.
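The following sketch reproduces this data-generating process for a single illustrative scenario. The constants come from the description above, while the code itself is our reading of it: Y is given correlation r with covariate A in the standard way, Y = rA + sqrt(1 − r²)·noise, before the treatment effect is added.

```python
import numpy as np

rng = np.random.default_rng(11)
n_per_group, r, effect = 64, 0.7, 0.5   # one illustrative scenario (n = 128)

# Covariate A, standard normal; an imbalance proportional to the standard
# error of the mean (here +1.64 * SEM) is added to the training group only.
sem = 1 / np.sqrt(n_per_group)
cov_a = np.concatenate([rng.normal(0, 1, n_per_group),                 # control
                        rng.normal(0, 1, n_per_group) + 1.64 * sem])   # training
group = np.repeat([0, 1], n_per_group)

# Outcome Y correlated r with covariate A, plus the treatment effect.
noise = rng.normal(0, 1, 2 * n_per_group)
y = r * cov_a + np.sqrt(1 - r**2) * noise + effect * group

# Quick check: observed imbalance in A and raw group difference in Y.
print(cov_a[group == 1].mean() - cov_a[group == 0].mean(),
      y[group == 1].mean() - y[group == 0].mean())
```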

Overall, we varied multiple experimental conditions in 504 scenarios (for a similar approach, see Egbewale et al., 2014 ):

seven imbalances on the covariate A: −1.96, −1.64, −1.28, 0, 1.28, 1.64, 1.96;

four correlations between covariates A and Y: 0, 0.5, 0.7, 0.9;

six treatment effects: 0 (counted three times, as the absence of an effect was tested at all three sample sizes: 52, 128, and 788), 0.2, 0.5, and 0.8;

three patterns of correlation between the covariate B, covariate A, and Y.

We simulated each scenario 1,000 times.

As expected, the correlations between the covariate B and the other two variables varied according to the pre-specified patterns of correlations, which were practically identical in the VM and control for covariate approach (see Table S1 in the Supplementary Materials).

We ran a series of ANCOVAs with Y as the dependent variable and covariates A and B and group [Training, Control] as independent variables. We used a regression approach: the group variable was converted to a dichotomous numerical variable (i.e., control = 0, training = 1) so that the regression coefficients could be used directly as estimates of the effect of each variable on Y. The VM procedure and the control-for-covariate approach violated the ANCOVA assumptions of normality of residuals and homogeneity of variance between groups at a similar rate (see Supplementary Materials, Fig. S2).
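As a sketch of this regression-style ANCOVA on simulated data (the variable names are ours; the coefficient on group is the estimate of interest), using the statsmodels formula API:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
n = 128
cov_a = rng.normal(0, 1, n)
cov_b = rng.normal(0, 1, n)
group = rng.permutation(np.repeat([0, 1], n // 2))   # control = 0, training = 1
y = 0.7 * cov_a + 0.5 * group + rng.normal(0, np.sqrt(1 - 0.7**2), n)

df = pd.DataFrame({"y": y, "cov_a": cov_a, "cov_b": cov_b, "group": group})
# ANCOVA as a regression: the coefficient on `group` estimates the
# intervention effect adjusted for the two covariates.
model = smf.ols("y ~ cov_a + cov_b + group", data=df).fit()
print(model.params["group"], model.pvalues["group"])
```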

In this fictitious scenario, the researcher would be interested in evaluating the effect of the group on Y while controlling for covariates. Therefore, we report the proportion of significant results (p < .05; Fig. 3) and the estimated effect (i.e., the regression coefficient; Fig. 4) for the effect of group on Y depending on the imbalance in covariate A, the effect size of the intervention, and the degree of correlation between covariate A and Y. For simplicity, in Figs. 3 and 4, we report only the simulation with a large sample size (i.e., n = 788) when the effect of the intervention was absent (i.e., d = 0). The pattern of results remained stable across the correlation patterns of covariate B. Therefore, we report the proportion of significant results and estimated effects for the group, covariate A, and covariate B across the correlation patterns of covariate B in the Supplementary Materials (Figs. S5–S22).

Fig. 3 Proportion of significant results (y-axis) for the effect of group in the ANCOVA (Y ~ CovA + CovB + Group), separately for the VM procedure (orange lines) and the control-for-CovA approach (blue lines), across imbalances of covariate A (x-axis), when the sample size varied according to the effect size to be detected (rows; absent = 0, n = 788; small = 0.2, n = 788; medium = 0.5, n = 128; large = 0.8, n = 52) and the correlation between covariate A and the dependent variable Y ranged between 0 and 0.9 (columns). The black dotted line represents alpha (i.e., 0.05) and the dashed black line represents the expected power (i.e., 0.8). (Colour figure online)

Fig. 4 Median of estimates (y-axis; regression coefficients) for the effect of group in the ANCOVA (Y ~ CovA + CovB + Group), separately for the VM procedure (orange lines) and the control-for-CovA approach (blue lines), across imbalances of covariate A (x-axis), when the sample size varied according to the effect size to be detected (rows; absent = 0, n = 788; small = 0.2, n = 788; medium = 0.5, n = 128; large = 0.8, n = 52) and the correlation between covariate A and the dependent variable Y ranged between 0 and 0.9 (columns). The black dotted line represents the expected regression coefficients (i.e., 0, 0.2, 0.5, 0.8). (Colour figure online)

When the effect of the intervention was present (second to fourth rows in Fig. 3), the VM procedure showed a more stable detection of significant results, even in the presence of serious imbalances in covariate A. This stability became clearer as the correlation between covariate A and Y increased. When the effect of the intervention was absent (first row in Fig. 3), the VM procedure always kept the Type I error rate around 0.05, while the control-for-covariate approach inflated the Type I error rate in the case of strong imbalance in covariate A when it was highly correlated (i.e., 0.7, 0.9) with the outcome variable Y.

A similar pattern of results emerged when we compared the estimates of the effect of the group (i.e., regression coefficients) yielded by the VM procedure and the control-for-covariate approach. The VM procedure always provided accurate estimates of the effect of the group. Conversely, the control-for-covariate approach returned biased estimates when imbalances in covariate A were large and its correlation with the outcome variable Y was high (i.e., 0.7, 0.9; Fig. 4).

Discussion

In treatment studies, groups should be as similar as possible on all the variables of interest before the treatment begins. Optimal matching can ensure that the effect of the treatment is not related to the pre-treatment characteristics of the groups and can, therefore, be extended to the general population. In contrast, random assignment can yield relevant, and even statistically significant, differences between the groups before the treatment (Treasure & MacRae, 1998).

The proposed VM procedure constitutes a quick and useful tool for matching groups before treatment on both continuous and categorical covariates (Pocock & Simon, 1975; Scott et al., 2002; Treasure & MacRae, 1998). The latter, though, need to be transformed into dummy variables before being passed to the minimisation algorithm (for a minimisation procedure that directly handles nominal covariates, see Colavincenzo, 2013). We simulated an intervention study whereby a researcher used the VM procedure on a covariate to assign participants to a control and an intervention group rather than controlling for the covariate at the analysis stage. Among other features of the simulated study, we manipulated the correlation between the matching covariate and the outcome variable and the presence of imbalance between groups in the covariate. Controlling for covariates post hoc inflated the Type I error rate and yielded biased estimates of the effect of the group on the outcome variable when the imbalance between groups in the covariate increased and the correlation between the covariate and the outcome variable was high. Conversely, the use of VM on the covariate did not inflate the Type I error rate and provided accurate estimates of the effect of the group on the outcome variable.

The progressive shrinking of available conditions when using the VM procedure ensures a perfect balance in the number of participants across conditions while still minimising covariate imbalance. However, some participants will be forcefully assigned to a given condition irrespective of their scores in the covariates. Therefore, in some instances, the researcher will know in advance the condition the participants will be assigned to and not all participants will have the chance to be assigned to each of the available conditions. This restriction might be relevant for clinical trials where one of the conditions is potentially beneficial (i.e., the treatment group). In this case, the researcher can insert a random component into the VM procedure by defining the probability to implement a random assignment. The random component prevents the researcher from being sure about the condition some participants will be assigned to and gives all participants the possibility, in principle, to be assigned to one of the conditions. Using a small amount of randomness (e.g., pRand = 0.1) provides a good balance between matching groups on covariates while avoiding predictable allocation (see section Discontinuous Implementation of the VM Procedure: The Parameter pRand, in the Supplementary Materials ).

Despite the benefits of the minimisation procedure, its limitations must be carefully considered. First, applying the VM procedure to small samples does not prevent the treatment effect from being influenced by an unequal distribution of unobserved confounding variables, whose equal distribution is most likely achieved with large sample sizes. This small-sample limitation affects both the VM procedure and random assignment. Nevertheless, selecting matching covariates for the minimisation procedure encourages researchers to think carefully in advance about possible confounding variables and to match participants on them. Second, we showed that the VM procedure is beneficial in simple ANOVA/ANCOVA simulations. For more complex models (e.g., with an interaction), the researcher should carefully consider whether the minimisation procedure constitutes an advantage for the design. We recommend running simulations tailored to the specific research design to ensure that the VM procedure adequately matches participants across conditions.

Third, the minimisation procedure considers all covariates equally important, without giving the user the option to allow more imbalance in some covariates than in others (for a minimisation procedure that allows weighting, see Saghaei, 2011). It is therefore paramount that researchers carefully consider the covariates on which they wish to match the groups.

Overall, our minimisation procedure, even after considering the above-mentioned limitations, provides important advantages over the commonly used randomisation procedure. Its relative simplicity should encourage researchers to use covariate-adaptive matching procedures (Ciolino et al., 2019; Lin et al., 2015). To facilitate this shift away from simple randomisation, we provide scripts, written in popular software (i.e., R, Python, MATLAB, and Excel), which allow a fast and easy implementation of the VM procedure and integration with other stimulus-presentation and analysis scripts. In this light, the treatment can start in the same session in which pre-treatment measures are acquired, thereby reducing the total number of sessions and, consequently, the overall costs. The immediate application of the treatment also excludes the possibility that pre-treatment measures change between the initial recruitment and the actual implementation of the treatment. We strongly recommend using the VM procedure in these studies to yield more effective and valid RCTs.

Austin, P. C., Manca, A., Zwarenstein, M., Juurlink, D. N., & Stanbrook, M. B. (2010). Baseline comparisons in randomized controlled trials. Journal of Clinical Epidemiology , 63 (8), 940–942. https://doi.org/10.1016/j.jclinepi.2010.03.009


Bruhn, M., & Mckenzie, D. (2009). In pursuit of balance: Randomization in practice in development field experiments. American Economic Journal: Applied Economics, 4 (1), 200–232. https://www.jstor.org/stable/25760187


Chen, L. H., & Lee, W. C. (2011). Two-way minimization: A novel treatment allocation method for small trials. PLOS ONE , 6 (12), 1–8. https://doi.org/10.1371/journal.pone.0028604

Chia, K. S. (2000). Randomisation: Magical cure for bias? Annals of the Academy of Medicine, Singapore, 29 (5), 563–564.

Ciolino, J. D., Palac, H. L., Yang, A., Vaca, M., & Belli, H. M. (2019). Ideal vs. real: A systematic review on handling covariates in randomized controlled trials. BMC Medical Research Methodology, 19 (1), 136. https://doi.org/10.1186/s12874-019-0787-8

Colavincenzo, J. (2013). Doctoring your clinical trial with adaptive randomization: SAS® Macros to perform adaptive randomization. Proceedings of the SAS® Global Forum 2013 Conference [Internet]. Cary (NC): SAS Institute Inc. https://support.sas.com/resources/papers/proceedings13/181-2013.pdf

de Boer, M. R., Waterlander, W. E., Kuijper, L. D. J., Steenhuis, I. H. M., & Twisk, J. W. R. (2015). Testing for baseline differences in randomized controlled trials: An unhealthy research behavior that is hard to eradicate. International Journal of Behavioral Nutrition and Physical Activity , 12 (1), 1–8. https://doi.org/10.1186/s12966-015-0162-z

Dragalin, V., Fedorov, V., Patterson, S., & Jones, B. (2003). Kullback-Leibler divergence for evaluating bioequivalence. Statistics in Medicine , 22 (6), 913–930. https://doi.org/10.1002/sim.1451


Egbewale, B. E., Lewis, M. & Sim, J. (2014). Bias, precision and statistical power of analysis of covariance in the analysis of randomized trials with baseline imbalance: a simulation study. BMC Med Res Methodol, 14 , 49. https://doi.org/10.1186/1471-2288-14-49

Endo, A., Nagatani, F., Hamada, C., & Yoshimura, I. (2006). Minimization method for balancing continuous prognostic variables between treatment and control groups using Kullback-Leibler divergence. Contemporary Clinical Trials , 27 (5), 420–431. https://doi.org/10.1016/j.cct.2006.05.002

Faul, F., Erdfelder, E., Buchner, A., & Lang, A.-G. (2009). Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses. Behavior Research Methods , 41 (4), 1149–1160. https://doi.org/10.3758/BRM.41.4.1149

Frane, J. W. (1998). A method of biased coin randomisation, its implementation and its validation. Drug Information Journal , 32 , 423–432.

Lin, Y., Zhu, M., & Su, Z. (2015). The pursuit of balance: An overview of covariate-adaptive randomization techniques in clinical trials. Contemporary Clinical Trials, 45, 21–25. https://doi.org/10.1016/j.cct.2015.07.011

Miller, G. A., & Chapman, J. P. (2001). Misunderstanding analysis of covariance. Journal of Abnormal Psychology , 110 (1), 40–48.

Nguyen, T., & Collins, G. S. (2017). Simple randomization did not protect against bias in smaller trials. Journal of Clinical Epidemiology, 84, 105–113. https://doi.org/10.1016/j.jclinepi.2017.02.010

Pocock, S. J., & Simon, R. (1975). Sequential treatment assignment with balancing for prognostic factors in the controlled clinical trial. Biometrics , 31 (1), 103. https://doi.org/10.2307/2529712

Roberts, C., & Torgerson, D. J. (1999). Understanding controlled trials: Baseline imbalance in randomised controlled trials. BMJ , 319 (7203), 185–185. https://doi.org/10.1136/bmj.319.7203.185


Saghaei, M. (2011). An overview of randomization and minimization programs for randomized clinical trials. Journal of Medical Signals and Sensors, 1 (1), 55.

Saint-mont, U. (2015). Randomization does not help much, comparability does. PLOS ONE, 10 (7), Article e0132102. https://doi.org/10.1371/journal.pone.0132102

Scott, N. W., McPherson, G. C., Ramsay, C. R., & Campbell, M. K. (2002). The method of minimization for allocation to clinical trials: A review. Controlled Clinical Trials , 23 (6), 662–674. https://doi.org/10.1016/S0197-2456(02)00242-8

Signorini, D. F., Leung, O., Simes, R. J., Beller, E., Gebski, V. J., & Callaghan, T. (1993). Dynamic balanced randomization for clinical trials. Statistics in Medicine, 12 (24), 2343–2350.

Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science , 22 (11), 1359–1366. https://doi.org/10.1177/0956797611417632

Taves, D. R. (2010). The use of minimization in clinical trials. Contemporary Clinical Trials , 31 (2), 180–184. https://doi.org/10.1016/j.cct.2009.12.005

Therneau, T. M. (1993). How many stratification factors are “too many” to use in a randomization plan? Controlled Clinical Trials , 14(2), 98–108. https://doi.org/10.1016/0197-2456(93)90013-4

Treasure, T., & Farewell, V. (2012). Minimization in interventional trials : great value but residual vulnerability. Journal of Clinical Epidemiology , 65 (1), 7–9. https://doi.org/10.1016/j.jclinepi.2011.07.005

Treasure, T., & MacRae, K. D. (1998). Minimisation: The platinum standard for trials? BMJ (Clinical Research Ed.) , 317 (7155), 362–363. https://doi.org/10.1136/bmj.317.7155.362

Van Breukelen, G. J. P. (2006). ANCOVA versus change from baseline had more power in randomized studies and more bias in nonrandomized studies. Journal of Clinical Epidemiology , 59 (9), 920–925. https://doi.org/10.1016/j.jclinepi.2006.02.007


Acknowledgements

This study was supported by the European Research Council (Learning&Achievement 338065).

Author information

Authors and affiliations

Centre for Mathematical Cognition, Loughborough University, Loughborough, UK

Francesco Sella

Department of Experimental Psychology, University of Oxford, Oxford, UK

Gal Raz & Roi Cohen Kadosh


Corresponding author

Correspondence to Francesco Sella .

Additional information

Open practices statement

The R code of the analyses is available at https://osf.io/6jfvk/?view_only=8d405f7b794d4e3bbff7e345e6ef4eed

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

(DOCX 2855 kb)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Sella, F., Raz, G. & Cohen Kadosh, R. When randomisation is not good enough: Matching groups in intervention studies. Psychon Bull Rev 28 , 2085–2093 (2021). https://doi.org/10.3758/s13423-021-01970-5


Accepted : 03 June 2021

Published : 09 July 2021

Issue Date : December 2021

DOI : https://doi.org/10.3758/s13423-021-01970-5


Keywords

  • Variance minimisation
  • Randomised controlled trial
  • Research design
  • Clinical trials
  • Allocation methods
