Psychological research, irrespective of its subfield and goal, requires assessment demonstrating that its measures are valid and its findings reliable, which is why validity has always been an essential concept in psychology as a science. Hence, there is a need to develop and use effective, efficient instruments for measuring validity that would be applicable to diverse psychological studies researching and evaluating different psychological constructs. Conventional validity measurement is based on correlational analysis, and it has been challenged by Bornstein (2011), who introduced a process-focused model and suggested combining it with conventional analysis in order to improve assessment procedures in psychology. Overall, a critical overview of the process-focused model shows that this instrument genuinely contributes to the comprehension and measurement of validity and can be effectively combined with correlational analysis so that both the outcome validity and the process validity of tests studying psychological constructs can be improved.
Bornstein (2011) describes this approach to validity measurement as a process-focused model (hereinafter referred to as the PF model) and compares it to traditional models of validity in the article entitled “Toward a process-focused model of test score validity: Improving psychological assessment in science and practice.” Before discussing the PF model in detail, the author defines validity, describes traditional models of validity, and points out what he sees as their shortcomings. He notes that traditionally “validity has … been operationalized as a statistic, the validity coefficient (usually expressed as r), which reflects the magnitude of the relationship between a predictor (the test score) and some criterion (an outcome measure)” (Bornstein, 2011, p. 533). Under this definition, validity coefficients index either criterion validity or construct validity, depending on whether the study concerns observable criteria or unobservable constructs (Bornstein, 2011). Criterion validity is further subdivided into concurrent and predictive validity, while construct validity is subdivided into convergent and discriminant validity (Bornstein, 2011). Regardless of the subtype, however, validity is traditionally measured through some form of statistical analysis, most often correlational, and represented as a coefficient. Therefore, all of these indices are correlations of some kind, may be viewed as measures of the strength of association, and are studied primarily in an observational manner (Bornstein, 2011). Bornstein (2011) argues that the traditional model focuses only on outcomes and completely ignores the testing process, and therefore fails to measure validity accurately.
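To make the traditional, outcome-focused notion concrete, the validity coefficient described above is simply a Pearson correlation between test scores and a criterion. The following sketch is illustrative only, with invented data rather than any figures from Bornstein (2011) or Cook (1993):

```python
# Illustrative only: computing a validity coefficient (Pearson r)
# between hypothetical test scores (predictor) and a criterion measure.
from math import sqrt

def validity_coefficient(test_scores, criterion):
    """Pearson r between a predictor (test score) and a criterion (outcome)."""
    n = len(test_scores)
    mean_x = sum(test_scores) / n
    mean_y = sum(criterion) / n
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(test_scores, criterion))
    var_x = sum((x - mean_x) ** 2 for x in test_scores)
    var_y = sum((y - mean_y) ** 2 for y in criterion)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: five respondents' test scores and outcome ratings.
scores = [10, 12, 15, 18, 20]
outcomes = [3, 4, 4, 5, 6]
r = validity_coefficient(scores, outcomes)
print(round(r, 2))  # -> 0.96
```

Under the traditional model, this single number would be taken as the evidence of validity; the PF model's objection, discussed below, is that it says nothing about what respondents actually did while producing the scores.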
The model of validity described above is the traditional one, in use since the mid-20th century. Recently, however, it has been influenced by confirmatory factor analysis (CFA) and structural equation modeling (SEM), which has resulted in the emergence of three revised models based on the traditional one (Bornstein, 2011). These three “substantive conceptual refinements of the traditional view of validity” are construct representation, attribute variation, and consequential validity (Bornstein, 2011, p. 534). Nevertheless, all of them remain focused on the outcome and fail to assess the validity of the study process, which is why the author of the article under consideration developed his own model of validity.
The PF model suggested by Bornstein (2011) differs drastically from the traditional models described above in a number of ways. The main difference lies in the conceptualization of validity, which in the new model is “conceptualized as the degree to which respondents can be shown to engage in a predictable set of psychological processes during testing” (Bornstein, 2011, p. 536). Accordingly, researchers use experimental manipulations to alter those processes and observe their impact on test scores. Thus, contrary to the traditional models, the PF model focuses on the process rather than the outcome. Moreover, it treats extraneous variables not as problematic, as the traditional models do, but as helpful, since they can be manipulated in analyses aimed at determining the validity of a test. Bornstein (2011) emphasizes that the PF model takes into consideration both instrument-based processes and context-based influences on testing, which provides a comprehensive assessment of validity. Instrument-based processes support a classification of tests based on the nature of the instruments used: self-attribution tests, stimulus-attribution tests, performance-based tests, constructive tests, observational tests, and informant reports (Bornstein, 2011). Such a classification is absent from the traditional models, which is another peculiarity of the PF model. In fact, it is very helpful to psychologists, as it aids not only in assessing validity but also in understanding different types of tests, their goals, and their processes of data collection and analysis. Context-based influences, in turn, are situational factors that may affect participants’ responses in some way (Bornstein, 2011).
According to Bornstein (2011), the PF model envisions four key steps, each further divided into substeps. First, the assessment instrument is deconstructed, with a particular focus on the underlying psychological processes and the impact of context variables. Second, process-outcome links are determined and assessed so that variables with the potential to affect the process can be used as manipulated factors in an experimental design. Third, outcomes are interpreted and process validity is assessed, with an emphasis on limiting conditions. Finally, generalizability is assessed with a focus on context, target population, and the setting of the testing.
In light of the above, the PF model differs from the traditional models in several respects. First, the PF model assesses the degree to which test scores can be altered through various manipulations, while the traditional models assess the degree to which test scores correlate with related variables. Second, the research methods differ completely: the traditional models rely on correlational analysis, while the PF model relies on experimental manipulation. Third, the two approaches differ in the generalizability of the validity coefficient. Finally, the goals of testing are viewed from different perspectives: the traditional models aim at the highest possible correlation between test score and outcome, while the PF model shows how the testing process alters test scores. Bornstein (2011) asserts that the PF model is necessary because it can establish the validity of the testing process while also helping psychologists understand psychological constructs better and develop measures that assess target constructs and generalize across genders, ages, and other markers of population groups. In this way, test bias, misuse, and misinterpretation of test scores can be reduced.
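The experimental logic of the PF model can be sketched in miniature. The example below is purely illustrative and not Bornstein's (2011) procedure: a hypothetical context variable (say, the instruction set given to respondents) is manipulated across two groups, and a permutation test checks whether test scores shift predictably, which is the kind of evidence the PF model counts toward process validity:

```python
# Illustrative only: PF-style reasoning in miniature. Instead of correlating
# scores with a criterion, we manipulate a hypothetical context variable and
# check whether scores move in the predicted way.
import random

def mean(xs):
    return sum(xs) / len(xs)

def permutation_p_value(control, manipulated, n_perm=10_000, seed=0):
    """Two-sided permutation test on the difference of group means."""
    rng = random.Random(seed)
    observed = abs(mean(manipulated) - mean(control))
    pooled = list(control) + list(manipulated)
    n = len(control)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment to the two groups
        diff = abs(mean(pooled[n:]) - mean(pooled[:n]))
        if diff >= observed:
            hits += 1
    return hits / n_perm

# Hypothetical scores under neutral vs. manipulated instructions.
control = [12, 14, 13, 15, 14, 13]
manipulated = [17, 18, 16, 19, 17, 18]
p = permutation_p_value(control, manipulated)
print(p < 0.05)  # a reliable, predicted shift supports process validity
```

The contrast with the previous correlational sketch is the point: here validity evidence comes from an experimental design with manipulated conditions, not from an observed association.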
The PF model of validity is also applicable to the Internalized Shame Scale (hereinafter referred to as the ISS) developed by David R. Cook. First presented by the author in 1993, the scale was reviewed in The Sixteenth Mental Measurements Yearbook series in 2005. The ISS is a self-report questionnaire with 30 items rated on a Likert scale, including 24 negative items and 6 positive items (Cook, 1993). The two psychological constructs measured by the ISS are shame, the primary construct, and self-esteem, a secondary construct interconnected with the former. According to Cook (2001), the validity of the test has been supported by data collected in a wide range of research aimed at studying and measuring shame. Initially, Cook (1993, p. 13) stated that, “The test-retest correlation for the shame items was .84 and the self-esteem items was .69. Taken together these results substantiate that the ISS is a highly reliable measure of internalized shame.” Thus, it is evident that traditional, correlation-based methods have been used to assess the ISS.
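The two-construct structure of the ISS can be illustrated with a minimal scoring sketch. The item assignments and the 0-4 response range below are assumptions made for demonstration, not Cook's (1993) published scoring key:

```python
# Illustrative sketch only: scoring a two-construct self-report scale
# such as the ISS. The item groupings and 0-4 Likert range are assumed
# for demonstration and are not Cook's (1993) actual key.
def score_scale(responses, shame_items, esteem_items):
    """Sum item responses into separate shame and self-esteem subscores."""
    shame = sum(responses[i] for i in shame_items)
    esteem = sum(responses[i] for i in esteem_items)
    return {"shame": shame, "self_esteem": esteem}

# 30 hypothetical responses, each on an assumed 0-4 Likert scale.
responses = [2] * 30
shame_items = range(0, 24)    # assumed: 24 negative items tap shame
esteem_items = range(24, 30)  # assumed: 6 positive items tap self-esteem
print(score_scale(responses, shame_items, esteem_items))
# -> {'shame': 48, 'self_esteem': 12}
```

Note that such scoring, like the test-retest correlations quoted above, concerns only the outcome; it reveals nothing about the psychological processes respondents engage in while answering, which is precisely the gap the PF model targets.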
In line with the process-based classification of tests provided by Bornstein (2011), the ISS can be identified as a self-attribution test, since its scores show the degree to which people attribute various shame-related attitudes, feelings, and experiences to themselves. Applying the four steps of the PF model outlined above, the ISS can be said to involve such underlying psychological processes as shaming, internalization of shame and guilt, and self-esteem. Context variables that may alter these processes include the presence or absence of mental disorders, family environment, history of abuse, substance abuse history, and socio-economic standing. Cook (1993) took some of these variables into consideration; for instance, he administered the ISS to different groups, including mental hospital patients, students, and prisoners. It can also be assumed that outcomes will be affected by the testing process itself and that people in different settings will provide different answers. For example, students who take the test in their free time are likely to give more honest answers than patients in a mental hospital who may regard the test as an important part of the evaluation required for their release. As to outcomes, the correlational analysis conducted by Cook (1993) supports the validity of the ISS, but its process validity remains in question: limiting conditions related to context-based influences would have to be examined in order to establish it. However, the ISS does appear generalizable, which satisfies the fourth step outlined by Bornstein (2011). Thus, it can be concluded that, in its current form, the ISS can be assessed under the PF model as possessing “adequate outcome but not process validity” (Bornstein, 2011, p. 540).
In other words, the ISS measures shame and self-esteem as intended, but the testing process needs improvement, and it remains to be shown how respondents’ reactions and context-based influences during testing affect the test scores.
In conclusion, the PF model discussed above offers an effective and efficient method of assessing the validity of tests measuring various constructs. It provides valuable insight into the testing process, allowing a deeper understanding of that process and its process-based outcomes. Combining the PF model with the traditional models of validity seems to be the best way of assessing validity, as it evaluates both process-focused and outcome-focused validity. Moreover, the applicability of the PF model has been demonstrated by applying it to the ISS, which, under Bornstein’s (2011) framework, is valid in terms of outcome but not yet fully valid in terms of process, although the latter can be remedied by improving the test.