Introduction
A modifier is a value we identify, either systematically or randomly, that is used to specify the relationship between two variables. Modifiers are as essential as the distributional commands: we cannot conduct a comprehensive data simulation without them. To specify a modifier, we can save the value under a name or use the value directly without saving it. For instance, we can set M (for modifier) to 12.
M <- 12
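As a minimal sketch of the two options just described, the modifier can be saved under a name and reused, or written directly into the expression. The vector x and the names y_named and y_direct are hypothetical and are introduced only for illustration.

```r
# Hypothetical independent variable, used only for illustration
x <- c(1, 2, 3, 4, 5)

# Option 1: save the modifier under a name and reuse it
M <- 12
y_named <- M * x

# Option 2: use the value directly, without saving it
y_direct <- 12 * x

# Check whether both approaches produce the same values
identical(y_named, y_direct)
```

Saving the modifier under a name tends to be more convenient when the same value is reused or changed later in a simulation.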
Modifiers Within the Context of Data Simulations
Within the context of a data simulation, a modifier will commonly represent the true slope coefficient (b; demonstrated in the next entry) of the association between a new variable and the variable or distribution used to generate it. When no modifier is written out, the modifier is implicitly 1: a 1-point increase in the independent variable corresponds to a 1-point increase in the dependent variable. This is a condition in which our modifier will equal the true slope coefficient. The modifier will not equal the true slope coefficient when (1) we cross over levels of measurement, (2) any interaction term is included in the model, or (3) we prespecify modifiers and then alter the terms throughout the simulation process. We will, however, commonly want to change the modifier from this default (i.e., b = 1). To do so, we can systematically select modifying values or draw a random modifier from one of the distributions discussed in the previous entry. I have commonly drawn modifiers from the uniform distribution because all values between the two thresholds are equally likely to be selected.
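The three ways of setting a modifier described above (leaving it at the implicit value of 1, selecting it systematically, and drawing it at random) can be sketched as follows. The seed, the sample size, the standard normal x, and the uniform thresholds of 1 and 10 are all assumptions chosen for illustration, not values taken from this series.

```r
set.seed(1234)  # assumed seed, only so the sketch is reproducible

n <- 1000
x <- rnorm(n, mean = 0, sd = 1)  # hypothetical independent variable

# No modifier written out: equivalent to a modifier of 1
y_default <- x

# Systematically selected modifier
y_fixed <- 12 * x

# Randomly drawn modifier: a single value from a uniform distribution
# (thresholds of 1 and 10 are arbitrary choices for this sketch)
b <- runif(1, min = 1, max = 10)
y_random <- b * x

# With one continuous predictor and no interaction term,
# the estimated slope recovers the modifier in each case
slope_default <- unname(coef(lm(y_default ~ x))["x"])
slope_fixed <- unname(coef(lm(y_fixed ~ x))["x"])
slope_random <- unname(coef(lm(y_random ~ x))["x"])

c(slope_default, slope_fixed, slope_random, b)
```

Drawing b with runif() means any value between the two thresholds is equally likely, which matches the reasoning given above for preferring the uniform distribution.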
Important Information
The primary reason I wanted to briefly discuss modifiers before actually simulating linear associations is that the modifier does not represent the amount of variation in the new variable predicted by the variable used to generate it. If only one variable is used to generate a new variable, the independent variable will perfectly predict variation in the dependent variable no matter what nonzero modifier is used. For example, if we simulate a dependent variable equal to 12 (the modifier) times an independent variable, the slope of the association will be 12, while the R² will be 1 and the predicted y-values will equal the observed y-values. This is because no other variation was specified to exist in the dependent variable. This also means, importantly, that each variable used to generate a new variable contributes variation to the new variable according to the size of its modifier (for uncorrelated variables, the variance contributed is the squared modifier times the variable's own variance). For example, if we simulate a dependent variable equal to 12 times one independent variable plus 12 times a second independent variable, each independent variable should predict approximately 50 percent of the variation in the dependent variable, provided the two independent variables are uncorrelated and have similar variances. That is, if all of the modifiers are equal, all of the vectors used to generate the new vector should predict an equal amount of variation in the new vector.
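A minimal sketch of both examples, assuming two independent standard normal variables (x1 and x2), a sample size of 10,000, and an arbitrary seed, none of which come from the original entry. Because each model below has a single predictor, R² is computed as the squared Pearson correlation.

```r
set.seed(5678)  # assumed seed, only so the sketch is reproducible

n <- 10000
x1 <- rnorm(n)  # hypothetical independent variable 1
x2 <- rnorm(n)  # hypothetical independent variable 2, drawn independently

# One generating variable: y is exactly 12 times x1
y_one <- 12 * x1
fit_one <- lm(y_one ~ x1)
slope_one <- unname(coef(fit_one)["x1"])  # estimated slope
r2_one <- cor(y_one, x1)^2                # R-squared for a single predictor
pred_one <- unname(fitted(fit_one))       # predicted y-values

# Two generating variables with equal modifiers
y_two <- 12 * x1 + 12 * x2
r2_x1 <- cor(y_two, x1)^2  # share of variation predicted by x1 alone
r2_x2 <- cor(y_two, x2)^2  # share of variation predicted by x2 alone

c(slope = slope_one, r2 = r2_one, r2_x1 = r2_x1, r2_x2 = r2_x2)
```

The two shares will not be exactly 0.5 in any given sample, because the simulated x1 and x2 will have slightly different sample variances and a small nonzero sample correlation.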
I know this might be confusing at this stage, but clarity will be provided over the next few entries, especially when we discuss simulating multivariate relationships.
License: Creative Commons Attribution 4.0 International (CC BY 4.0)