Justified Conclusions & Sound Evaluation Design

Among the program evaluation standards are two accuracy standards: A1, Justified Conclusions and Decisions, and A6, Sound Designs and Analyses. A1 is defined by Yarbrough, Shulha, Hopson, and Caruthers (2011) as follows: “Evaluation conclusions and decisions should be explicitly justified in the cultures and contexts where they have consequences” (p. 165). The hazards associated with this standard are many; they include assumptions of accuracy among evaluation teams, the ignoring of cultural cues and perspectives, assumptions of transferability, and several hazards concerning an emphasis on technical accuracy at the expense of cultural inclusivity and the immediate environmental context (Yarbrough et al., 2011, p. 167). This standard is most consequential with respect to the sociological factors inherent in any assessment. Of such factors, Ennis (2010) notes, “It is the liking part – the emotional, aesthetic, or subjective decision to actively cooperate with an institution’s assessment regime – that suggests the difficulties inherent in coupling the success of an assessment program to the establishment of an assessment culture” (p. 2). Thus, an assessment culture cannot be established through rigor or a display of technical prowess alone.

An institution’s absorptive capacity is not directly correlated with its rate of acceptance of new knowledge. Rather, hazards such as disregarding the extant culture and subcultures, disregarding the needs of the immediate environment, and disregarding transferability all pose a direct threat to both acceptance and adoption of whatever findings an assessment produces. The recommendations for correcting this therefore include (1) clarifying which stakeholders will form conclusions and permitting the integration of those stakeholders’ knowledge frameworks; (2) clarifying the roles and responsibilities of evaluation team members; (3) ensuring findings reflect the theoretical terminology as defined by those who will draw conclusions; (4) identifying the many definitions of accuracy held by assessment users; and (5) making effective choices regarding depth, breadth, and representation of the program (Yarbrough et al., 2011, p. 166).

A6, the standard of sound designs and analyses, is defined by Yarbrough et al. (2011) as follows: “Evaluations should employ technically adequate designs and analyses that are appropriate for the evaluation purposes” (p. 201). The hazards associated with this standard concern responsiveness to the features, factors, and purpose(s) of a given program. They include choosing a design based on status or reputation rather than its ability to provide high-quality conclusions, a lack of preparation for potentially disappointing evaluation findings, a lack of consideration for the many feasibility, propriety, and utility standards, a lack of customization of the design to the current environment, and a lack of broad-based consultation with stakeholders at multiple levels (Yarbrough et al., 2011, p. 204). The effects of a lacking, misguided, or inappropriate design can be devastating to the overall efficacy of a given assessment. Of the need for sound design, Booth, Colomb, and Williams (2008) comment, “In a research report, you must switch the roles of student and teacher. When you do research, you learn something that others don’t know. So when you report it, you must think of your reader as someone who doesn’t know it but needs to and yourself as someone who will give her reason to want to know it” (p. 19).

Performing an assessment based solely on the popularity of the design employed misses the point of assessing the program at hand, which is to formulate a strategy for better understanding the unique program under study and to relate the gathered data in a way that is both digestible and actionable for those who hold a stake. One can employ a procedure that reliably gathers data, yet if that data is unrelated or unnecessary, the design lacks both utility and, in this instance, accuracy. So how can accuracy be increased and applicability restored? Yarbrough et al. (2011) suggest instead selecting designs based on the evaluation’s purpose, securing adequate expertise, closely evaluating any designs in contention, choosing frameworks that provide justifiable conclusions, allowing for compromise and uncertainty, and considering ongoing, iterative modifications to the design over protracted periods to ensure currency (p. 204). Doing so not only ensures that your audience receives actionable results; that same audience can also hold the design, and the collective opinion of the assessment’s efficacy, in greater confidence, as their understanding of it is equally elevated.

Booth, W. C., Colomb, G. G., & Williams, J. M. (2008). The craft of research (3rd ed.). Chicago, IL: The University of Chicago Press.

Ennis, D. (2010). Contra assessment culture. Assessment Update, 22(2), 1–16.

Yarbrough, D. B., Shulha, L. M., Hopson, R. K., & Caruthers, F. A. (2011). The program evaluation standards (3rd ed.). Thousand Oaks, CA: Sage Publications, Inc.
