The next section of the Sources of Statistical Biases Series will provide simulated explorations of the key assumptions of causal analysis. It is important, however, to understand that two frameworks are primarily used to evaluate causal associations: Causal Inference Through Experimental Design and Causal Inference Through Directed Acyclic Graphs (DAGs). Contrary to popular belief, these frameworks are not mutually exclusive and can be complementary if integrated properly. That discussion, however, is for another series. Importantly, both frameworks define causality in exactly the same way: a causal association must be logical, temporally ordered, and non-spurious. Because the current section focuses on violations of assumptions when examining causal associations, we must discuss the assumptions within both causal frameworks. These include the statistical assumptions of (1) randomized controlled trials (Rubin Causal Models) and (2) structural causal models/DAGs. Before going into the deep end of the causal pool, let’s briefly review the causal frameworks.
In the vast majority of sciences, we teach that causal inference can only be achieved through experimental design. This framework was initially proposed by Ronald Fisher (1936) and popularized by Paul Rosenbaum and Donald Rubin. In the simplest terms, a randomized experiment, or randomized controlled trial, allows one to estimate the causal association between two variables by randomly assigning scores on the independent variable. Random assignment, if conducted properly, makes scores on the independent variable unpredictable from any other characteristic of the participants, which in turn allows you to satisfy the three requirements of causality. Quasi-experimental designs, although related, will be discussed in a subsequent section.
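To make the logic of random assignment concrete, here is a minimal simulated sketch. Everything in it is an assumption made for illustration: the function name `simulate`, the true effect of 2.0, the unobserved factor `u`, and the sample size are hypothetical and are not taken from any later section of the series. The sketch compares the treated-versus-control difference in means when assignment is a coin flip with the same difference when units select into treatment based on `u`.

```python
import random
import statistics


def simulate(n, randomized, true_effect=2.0, seed=1):
    """Return the treated-minus-control difference in mean outcomes.

    All values here are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        # Unobserved background factor that also influences the outcome.
        u = rng.gauss(0, 1)
        if randomized:
            # Coin flip: assignment cannot be predicted from u.
            x = rng.random() < 0.5
        else:
            # Self-selection: units with higher u are more likely to be treated.
            x = (u + rng.gauss(0, 1)) > 0
        # Outcome depends on the treatment, on u, and on random noise.
        y = true_effect * x + 3.0 * u + rng.gauss(0, 1)
        (treated if x else control).append(y)
    return statistics.mean(treated) - statistics.mean(control)


rct_estimate = simulate(20000, randomized=True)
observational_estimate = simulate(20000, randomized=False)
print(f"randomized assignment: {rct_estimate:.2f}")
print(f"self-selected assignment: {observational_estimate:.2f}")
```

Under these assumptions, the randomized estimate should land close to the true effect of 2.0, while the self-selected estimate should be noticeably larger because `u` drives both treatment and outcome.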
In addition to experimental designs, directed acyclic graphs (DAGs) can support inferences about causal associations. DAGs are graphical displays with nodes and edges, where the nodes represent variables and the edges represent the relationships between the nodes. As argued by Judea Pearl, the estimation process of structural models, as well as the theoretically guided selection of edges directly and indirectly connecting nodes, permits DAGs to be used to develop causal inferences. Nevertheless, it is important to note that causal inference can only be achieved when theory, logic, and a transparent decision-making process are used to guide the development of a DAG. Moreover, unlike randomized controlled trials, DAGs only permit inferences about causal associations when potential confounders are included in the model or justifiably ruled out.
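The role of confounders can be illustrated with a small simulated sketch of a three-node DAG in which Z causes both X and Y, and X causes Y. The coefficients, variable names, and the linear form are assumptions made purely for illustration; a DAG by itself does not require linear relationships. The adjustment below is done by residualizing X and Y on Z and regressing the residuals on each other, which for linear models gives the same slope as including Z in the regression. That is a convenience chosen for this sketch, not a method prescribed by the framework.

```python
import random


def slope(x, y):
    """Ordinary least squares slope of y regressed on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """Part of y left over after removing its linear association with x."""
    b = slope(x, y)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]


rng = random.Random(7)
n = 20000
true_effect = 0.5  # hypothetical causal effect of X on Y

z = [rng.gauss(0, 1) for _ in range(n)]        # confounder
x = [1.0 * zi + rng.gauss(0, 1) for zi in z]   # Z -> X
y = [true_effect * xi + 1.0 * zi + rng.gauss(0, 1)  # X -> Y and Z -> Y
     for xi, zi in zip(x, z)]

# Omitting Z leaves the backdoor path X <- Z -> Y open.
unadjusted = slope(x, y)
# Removing Z from both X and Y closes that path.
adjusted = slope(residuals(z, x), residuals(z, y))

print(f"unadjusted slope: {unadjusted:.2f}")
print(f"adjusted slope: {adjusted:.2f}")
```

With these made-up coefficients, the unadjusted slope should be roughly double the true effect, while the adjusted slope should sit close to 0.5.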
Now that we have introduced the two causal frameworks, let’s begin the current section by talking about randomized controlled trials. I hope you enjoy!