This research uses fuzzy set/Qualitative Comparative Analysis (fsQCA) to explore the role of perceived fairness in promoting objective behavior in peer assessment. Drawing on three key antecedents - outcome fairness, anonymity, and explanation - the paper presents multiple combinations of antecedents that lead to high levels of perceived fairness among students. Contrary to the proposition that all fairness factors must be high, the findings reveal that perceived fairness can be achieved when any two of the three antecedents are present at high levels. Specifically, three effective combinations emerge: (1) high outcome fairness with high anonymity and low explanation, (2) high outcome fairness with low anonymity and high explanation, and (3) low outcome fairness with high anonymity and high explanation. These results underscore the compensatory nature of fairness perceptions and offer practical implications for educators and administrators designing peer assessment systems. By ensuring at least two fairness dimensions are adequately addressed, institutions can foster more reliable and ethical peer evaluations. The paper contributes to the literature on peer assessment by highlighting the configurational logic of fairness and its practical utility in educational contexts.
This research examines how perceived fairness can be ensured in peer assessment in a higher-education environment. Higher education continues to evolve in many aspects, including teaching, learning, and assessment. Empirical evidence supports the positive impact of peer assessment on performance regardless of disparities in peer assessment designs (Double et al., 2020; Li et al., 2020). Peer assessment and its corresponding review studies cover a wide variety of topics, such as the design and implementation of peer assessment practices, reliability and validity of peer assessment, quality criteria for peer assessment practices, impact of social and interpersonal processes, peer assessment of collaborative learning, and how instructional conditions relate to peer assessment outcomes (Alqassab et al., 2023).
Though peer assessment brings many benefits for instructors and students, such as efficiency, its suitability is still widely debated. A strand of the literature focuses on how instructional conditions relate to peer assessment outcomes (Ashenaf, 2017; Double et al., 2020; Hoogeveen & Van Gelderen, 2013; Huisman et al., 2019; Li et al., 2020; Panadero et al., 2018; Sanchez et al., 2017; Topping, 2003, 2013, 2021a, b; Van Popta et al., 2017). One issue is that perceived fairness greatly affects instructors’ attitudes toward peer assessment.
Peer assessment has become a critical and popular topic, particularly during the COVID-19 pandemic. Vander Schee and Birrittella (2021) conducted a two-year study to compare peer assessment in pre-COVID-19 hybrid and post-COVID-19 online courses. The peer group grading process and grades were analyzed against the instructor grades. The results demonstrated no significant difference between peer group grades and instructor grades for either course format. Student survey results also showed that students perceived peer group grading as fair in both course formats (hybrid and online). Therefore, instructors can feel confident about using peer group grading as a fair assessment tool.
When it comes to the reliability, grading, and objectivity of student self and peer assessment, observed agreement varies widely. Only 29% of teachers think that student self and peer assessments are as reliable as grading done by themselves. Indeed, most (85%) teachers are skeptical about using the grades obtained from student self and peer assessment as official grades (Gurbanov, 2016).
The literature has widely covered peer assessment using traditional variable-oriented methods (e.g., regression analysis and structural equation modeling (SEM)), which focus on the net effect of independent variables on outcomes. Most quantitative studies surveyed by Alqassab et al. (2023) applied variable-oriented methods. However, linear models assume that causal relationships are symmetrical and additive, which may oversimplify the complexity of fairness and objectivity in peer assessment (Ragin, 2009; Schneider & Wagemann, 2012). Hence, exploring the combinations of causes that achieve perceived fairness is rather important.
Perceived fairness has been studied at length since Greenberg (1986), but most papers target the working environment. This present research borrows the concept of perceived fairness from those studies as the outcome. It then goes further by using fuzzy set/Qualitative Comparative Analysis (fsQCA) as the analytic tool to explore the combinations of antecedents that lead to perceived fairness.
Relationships among factors are usually complex (Urry, 2005). One interesting phenomenon in causal complexity is causal asymmetry, where the causes leading to the presence of an outcome may differ from those leading to its absence (Ragin, 2009). fsQCA provides important benefits over regression-based methods (Woodside, 2013). It focuses on the complex and asymmetric relations between the outcome and its antecedents, whereas regression-based methods compute the net effect of each factor in a model (Pappas and Woodside, 2021). This study explores the combinations of antecedents for peer assessment to achieve perceived fairness, taking fsQCA as the research method.
The rest of the paper runs as follows. Section 2 reviews relevant literature. Section 3 introduces the data and research methods. Section 4 provides the empirical results. Section 5 discusses them and their implications. Section 6 concludes.
Literature review

Perceived fairness

Greenberg (1986) stressed the potential importance of fair evaluations in determining workers’ acceptance of appraisal systems (Dipboye & de Pontbriand, 1981; Lawler, 1967) and identified two approaches to the answer: distributive justice and procedural justice. Distributive justice represents the fairness of the evaluations received relative to the work performed, corresponding to the perceived fairness of received outcomes/resources (Adams, 1965). It is assumed that people perceive justice given a balance between effort and benefits (Peiró et al., 2014).
Procedural justice assesses the fairness of the evaluation procedure. Hence, perceived fairness also includes the procedures by which the resources are allocated. Landy et al. (1980) found that the perceived fairness of performance evaluation relates to process components. Specifically, procedural justice (Thibaut & Walker, 1975) refers to the fairness of the procedures developed to achieve outcomes or resources. Therefore, while distributive justice is more outcome-oriented, procedural justice is more relationship-oriented (Peiró et al., 2014).
There is growing empirical evidence that judgments are also influenced by the enactment of the procedure. Interactional justice is distinguished from procedural justice (Bies & Shapiro, 1987). While the latter refers to the more structural facet of procedures, the former focuses on the more interpersonal aspects. It concentrates on the relevance of interpersonal treatment when procedures are implemented (Bies & Moag, 1986).
Peer assessment and perceived fairness

Peer assessment is an arrangement for students to specify the level of performance of other equal-status students (Topping, 2009). Schools at different levels have applied peer assessment. For example, peer assessment is a tool to enhance primary bilingual teachers’ training (Huertas-Abril et al., 2021). At the college level, peer assessment can foster proving skills in mathematics classes (Knop et al., 2022).
Perceived fairness refers to the fair representation of effort and contribution (Greenberg, 1986; Rasooli et al., 2025). Students consider fairness a critical issue in classroom assessment approaches (Sambell et al., 1997). Heidari and Saghafi (2025) evaluated 29 architecture students in peer assessment and identified fairness challenges, such as concerns regarding collusion, power dynamics within friend groups, limitations of participatory culture, and overwhelming responsibility. The paper suggests that a multistage peer assessment process is effective at addressing fairness challenges.
Perceived fairness is taken as an important factor in online peer assessment (Lin, 2018). Vander Schee and Birrittella (2021) measured fairness in peer assessment and demonstrated no significant difference between peer group grades and instructor grades for both course formats.
Unfortunately, rating bias can occur in peer assessment. In Stonewall et al. (2024), both instructors and students reported bias in their classrooms and assessments, and evidence of bias showed up in peer assessment scores. Kaufman and Schunn (2011) found that a significant drop in perceived fairness occurs after online peer assessment has been implemented. These studies all highlight the importance of perceived fairness in peer assessment.
Antecedents

To measure perceived fairness, this study uses outcome fairness to represent distributive justice, anonymity to represent procedural justice, and explanation to represent interactional justice. Outcome fairness is rather straightforward: it measures whether students consider the evaluation results fair. Hence, it serves to reflect the characteristic of distributive justice.
In various contexts, anonymity is considered important in procedural justice (Harris et al., 2013; Hough et al., 2016; Kaur & Carreras, 2021; Panadero & Alqassab, 2019). Rater identity is concealed so as to reduce interpersonal bias and social pressure (Double et al., 2020). Lin (2018) investigated online peer assessment within a Facebook-based learning application with a focus on the effects of anonymity. The results indicated that the anonymous group provided significantly more cognitive feedback, while the identifiable group offered more affective feedback and more meta-cognitive feedback. Members of the anonymous group also perceived that they had learned more from peer assessment and had more positive attitudes toward the system, but they also perceived peer comments as being less fair than the identifiable group did. The findings provide important evidence for the cognitive and pedagogical benefits of anonymity in online peer assessment among pre-service teachers. In particular, anonymity affects perceived fairness and hence represents procedural justice.
The literature has identified a number of features associated with interactional justice, such as the provision of an explanation (Bies & Moag, 1986), honesty (Clemmer, 1993), empathy and assurance (Parasuraman et al., 1985), directness and concern (Ulrich, 1984), effort (Mohr, 1991), acceptance of blame (Goodwin & Ross, 1989), and the offering of an apology (Goodwin & Ross, 1992; Bies & Shapiro, 1987; Folkes, 1984). To measure perceived fairness in peer assessment, this present study employs explanation to denote interactional justice.
Explanation refers to explaining the grading criteria clearly and having them applied consistently (Falchikov & Goldfinch, 2000). Quantitative peer assessment studies have compared peer and teacher marks. Findings showed that peer assessments resemble teacher assessments more closely when the grading criteria are explained very well (Falchikov & Goldfinch, 2000).
To explore perceived fairness in peer assessment, outcome fairness (representing distributive justice), anonymity (representing procedural justice), and explanation (representing interactional justice) serve as the antecedents, and perceived fairness is the outcome. Based on the literature, Fig. 1 depicts the research framework.
fsQCA

The set-theoretic approach of fsQCA uses Boolean algebra to determine which combinations of antecedents contribute to the outcome of interest (Boswell & Brown, 1999; Ragin, 1987, 2009). The combinations of antecedents provide various alternative causal relationships to help understand an outcome’s construct (Kraus et al., 2018). As a result, fsQCA, grounded in complexity theory, can be applied in a multitude of disciplines, including business and management (Fiss, 2007; Rihoux et al., 2013).
FsQCA can solve many complex problems. For example, Huang et al. (2023) employed it to explore the influencing paths of college students’ entrepreneurial willingness in China. Cabrilo et al. (2024) used fsQCA to examine the contingency and complex relations between multidimensional intellectual capital, technology-based knowledge management, and innovation outcomes in the rapidly changing business environment via survey data collected from 102 publicly-listed firms in Taiwan. Chen and Chen (2024) investigated the configurational relationships among e-government online services’ technology, institutional frameworks, content provision, e-participation, service provision, and innovation, in order to enhance national governance capacity and offer governance support through fsQCA.
FsQCA may face some limitations in its application. It relies heavily on prior theory to select antecedents, set thresholds for calibration, and interpret results. Without a strong theory, the analysis may appear arbitrary or post hoc. Moreover, multiple solutions may arise, and determining which one is more valid can be subjective. Furthermore, when multiple solutions lead to an outcome, interpreting the meaning and implications of all configurations can be challenging.
Data and methodology

Peer assessment set-up

In the teaching course, the students worked in groups to deliver presentations. Each group consisted of 4 to 6 students. The presentation topics revolved around current popular applications of artificial intelligence (AI), including but not limited to the following: Advantages of Tesla’s autonomous driving technology; the Cambridge Analytica case: A key factor in Donald Trump’s victory in the 2016 U.S. presidential election; applications of image recognition systems; and applications of speech recognition systems.
The topics and order of presentations were determined by drawing lots. Each group had 15 minutes for their presentation, and the slides and oral report were delivered in both Chinese and English to enhance bilingual communication skills. To improve overall quality of the presentations and fluency of spoken English, students were encouraged to use Google Translate for collaborative translation and generative AI tools (such as ChatGPT) to help organize content, highlight key points, and make the expressions clearer and more natural.
The grading criteria were announced at the beginning of the course, and grading followed an anonymous peer review system. Students were encouraged to learn from the strengths and areas for improvement observed in other groups, continuously refining their own performance to enhance learning outcomes.
Performance assessment and assessment tools

This study adopts a multiple assessment approach to conduct a comprehensive evaluation of students’ learning performance. The assessment is divided into different categories, including in-class AI-related English vocabulary instant Q&A and observational assessment, formative assessment through peer evaluation of AI project reports, and periodic assessments such as midterm and final exams. This approach aims to achieve a well-rounded evaluation of students’ professional learning performance.
Based on the assessment methods, various evaluation tools are designed, including paper-based tests, in-class AI-related English vocabulary instant Q&A, group presentations on key AI topics, and peer evaluation scores. These tools serve as the basis for assessing students’ learning outcomes.
Data and measurements

The subjects of this study include 60 students from the Department of Industrial Design and 54 students from the Department of Industrial Engineering and Management in the Fall semester of 2023 (from September 2023 to January 2024), as well as 53 students from the Department of Business Administration and 48 students from the Department of Insurance and Financial Management in the Spring semester of 2024 (from February 2024 to July 2024). In total, there are 215 participants (N=215).
The questions were designed to find out the conceptions and attitudes of the participants towards peer assessment. Each question applies a 5-point Likert scale, and there are three antecedents. Six questions measure the antecedent Outcome Fairness, one question measures Anonymity, three questions measure Explanation, and two questions measure the outcome, Perceived Fairness. For each antecedent and the outcome, we take the average of the values of its questions as its value.
fsQCA

There are three major steps in fsQCA (Pappas & Woodside, 2021). First, fsQCA calibrates the data and computes the degree to which a case belongs to a set (Ragin, 2000; Rihoux & Ragin, 2009). In short, it transforms data into fuzzy values between 0.0 and 1.0.
Second, based on the fuzzy values, fsQCA generates multiple solutions for the researchers to evaluate. A commonly-used evaluation metric is consistency, which is analogous to correlation (Woodside, 2013). Solutions with consistency above 0.80 are considered useful and can inform theory advancement (Woodside, 2017):
Consistency(Xi ≤ Yi) = Σ min(Xi, Yi) / Σ Xi, where Xi denotes the membership score of case i in the antecedent configuration and Yi its membership score in the outcome.
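As a minimal illustrative sketch (not the study's actual computation), the consistency formula can be applied directly to fuzzy membership scores; the scores below are hypothetical.

```python
# Sketch: fuzzy-set consistency of an antecedent configuration X for an outcome Y.
# The membership scores are hypothetical, for illustration only.

def consistency(x, y):
    """Consistency(Xi <= Yi) = sum(min(Xi, Yi)) / sum(Xi)."""
    return sum(min(xi, yi) for xi, yi in zip(x, y)) / sum(x)

x = [0.9, 0.7, 0.2, 0.8]   # membership in the configuration
y = [0.95, 0.6, 0.4, 0.9]  # membership in the outcome
print(round(consistency(x, y), 3))  # 2.5 / 2.6 ≈ 0.962, above the 0.80 benchmark
```

A score of 1.0 would mean every case's configuration membership is fully contained in its outcome membership; values above 0.80 are read as evidence of sufficiency.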
Finally, the analysis results are interpreted.
Empirical analysis

First, this study calculates the average for each antecedent and the outcome. Next, it runs correlation analysis between all antecedents. The correlation coefficient is a statistical measure that quantifies the relationship between two antecedents. Table 1 lists the results. The correlation coefficients between any two of the antecedents show positive relationships between them.
The first step in fsQCA is calibration. There are various methods to conduct data calibration. Absolute and relative methods are two common ones. The absolute method determines thresholds based on fixed ranges, such as survey scales. Here, the thresholds are set without considering data distribution. The reason is that the upper and lower bounds of the survey data are known and fixed. For example, Ordanini et al. (2014) proposed calibrating 7-point Likert scale data with thresholds of 6 (full membership), 4 (intermediate membership), and 2 (full non-membership). Pappas et al. (2016, 2020) adopted this method in their studies.
The relative method determines thresholds based on data percentiles, making it adaptable to varying data ranges. Fiss (2011), Ragin (2009), and Rihoux and Ragin (2009) suggested using the 95th percentile for full membership (1.0), the 5th percentile for full non-membership (0.0), and the 50th percentile for intermediate membership (0.5). In this method, when the ranges of data vary, the thresholds change. This method accounts for diverse data scopes and distributions, which is why it is widely adopted in fsQCA studies (Huarng & Yu, 2024; Yu & Huarng, 2023, 2024). Because this present research collects data through a survey with known and fixed bounds, it fits the absolute method.
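To illustrate how absolute thresholds turn raw scores into fuzzy values, the sketch below applies the widely used direct (log-odds) calibration. The 5-point thresholds of 4 (full membership), 3 (crossover), and 2 (full non-membership) are assumptions for illustration only, since exact thresholds are a researcher's choice.

```python
import math

# Sketch of direct calibration (log-odds method).
# Assumed thresholds for a 5-point Likert scale, for illustration:
# full membership at 4, crossover at 3, full non-membership at 2.

def calibrate(value, full=4.0, cross=3.0, non=2.0):
    """Map a raw score to a fuzzy membership value in [0, 1]."""
    if value >= cross:
        log_odds = 3.0 * (value - cross) / (full - cross)
    else:
        log_odds = 3.0 * (value - cross) / (cross - non)
    return 1.0 / (1.0 + math.exp(-log_odds))

for v in (2.0, 3.0, 4.0, 5.0):
    print(v, round(calibrate(v), 3))  # ≈ 0.047, 0.5, 0.953, 0.998
```

The crossover threshold maps to exactly 0.5, while scores at the full-membership and full-non-membership thresholds map close to (but not exactly) 1.0 and 0.0.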
After the calibration, this study conducts an analysis of the necessary conditions. The consistency of all the antecedents is greater than 0.9, showing that all three are necessary conditions for the outcome. Table 2 lists the details of the analysis.
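The necessity test mirrors the sufficiency formula but places the outcome's memberships in the denominator: Consistency(Yi ≤ Xi) = Σ min(Xi, Yi) / Σ Yi. A hypothetical sketch (illustrative scores, not the study's data):

```python
# Sketch: consistency of an antecedent X as a necessary condition for outcome Y.
# Scores are hypothetical; values above 0.9 are commonly read as necessity.

def necessity_consistency(x, y):
    """Consistency(Yi <= Xi) = sum(min(Xi, Yi)) / sum(Yi)."""
    return sum(min(xi, yi) for xi, yi in zip(x, y)) / sum(y)

x = [0.9, 0.8, 0.7, 0.9]   # membership in the antecedent
y = [0.95, 0.7, 0.6, 0.9]  # membership in the outcome
print(round(necessity_consistency(x, y), 3))  # 3.1 / 3.15 ≈ 0.984 > 0.9
```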
Table 3 lists the results that fsQCA generates. There are three combinations of antecedents. Each combination has consistency over 0.8, implying consistent results. Solution consistency is also over 0.8, demonstrating the overall analysis is consistent.
All three antecedents appear in each combination. Each of the three combinations consists of two High antecedents and one Low antecedent.
Discussion and Implications

Discussion

In contrast to the proposition, which suggests that all three antecedents must be High to achieve a High level of Perceived Fairness, the results reveal a more nuanced pattern. Interestingly, each effective configuration contains one antecedent at a Low level, challenging the assumption that all conditions must simultaneously be High. This suggests that different combinations of conditions compensate for the absence of a single factor, reflecting the principle of equifinality in configurational research.
The first configuration, c_Outcome * c_Anonymity * ∼c_Explanation, indicates that High Outcome Fairness, when combined with High Anonymity and Low Explanation, leads to High Perceived Fairness. This implies, from the perspective of students, that a strong sense of anonymity in the peer assessment process compensates for the lack of explanations. In other words, when students feel confident that their identities are protected, they are more willing to grade peers objectively - even if the assessment process itself is not entirely open or clearly explained.
The second configuration, c_Outcome * ∼c_Anonymity * c_Explanation, highlights a different pathway to the same outcome. Here, High Outcome Fairness and High Explanation, despite Low Anonymity, still result in High Perceived Fairness. This suggests that when the grading process is explained well and perceived as fair, anonymity becomes less critical. The explanation may instill trust and accountability, reducing the perceived need for concealment of identity. From the students’ standpoint, knowing how and why grades are assigned can encourage fairness, even when peer evaluators are identifiable.
The third configuration, ∼c_Outcome * c_Anonymity * c_Explanation, presents a particularly intriguing case. In situations where Outcome Fairness is perceived to be Low, both Anonymity and Explanation must be High to maintain High Perceived Fairness. This finding suggests that when students are dissatisfied with the fairness of the assessment results, the system must compensate by ensuring both clarity in procedure and protection of evaluator identity. It indicates that a higher threshold of structural support is necessary to encourage objective behavior in the face of perceived unfairness.
These findings together demonstrate that no single antecedent is sufficient on its own. Students’ willingness to grade objectively instead depends on specific combinations of factors, each addressing different psychological needs - fairness, safety, and clarity.
Implications

The results of this study provide several important contributions to the peer assessment literature, particularly in educational settings that aim to foster objective student evaluation. They reveal that High Perceived Fairness can emerge through multiple, distinct pathways, each one shaped by the interplay between perceptions of fairness, anonymity, and explanation.
First and foremost, educators at all levels, ranging from primary to tertiary education, can benefit from these insights. Careful attention should be given to how assessment procedures are designed and communicated. Educators should recognize that students do not require all ideal conditions to achieve perceived fairness; instead, well-balanced design elements can compensate for the absence of others. For instance, in contexts where an explanation is difficult to achieve (such as blind reviews), enhancing anonymity and clearly demonstrating fair grading outcomes can still maintain High Perceived Fairness.
Second, the findings have practical value for educational administrators. Administrators tasked with implementing peer assessment systems should note the importance of student perceptions. Ensuring structural fairness is important, but so is fostering perceived fairness. Systems that incorporate mechanisms for anonymous feedback, well-explained rubrics, and post-assessment reviews are likely to gain higher acceptance among students and thus lower bias in peer evaluations.
Third, beyond educational settings, the findings herein can shape practices in organizational and corporate training environments. Human resource managers or training supervisors who employ peer evaluations, such as 360-degree feedback or collaborative project reviews, can apply the same principles. Recognizing the value of anonymity and explanation improves the credibility and acceptance of peer review systems within teams and departments.
This study challenges simplistic assumptions about what drives Perceived Fairness in peer assessments. It also highlights the importance of designing flexible systems that adapt to varying student expectations and psychological dynamics.
Conclusion

This research investigates one of the most critical challenges in peer assessment: students’ perceived fairness. By adopting a configurational approach, we explore how specific combinations of fairness-related factors influence perceived fairness. On the basis of the perceived fairness literature, we establish the research framework, consisting of distributive, procedural, and interactional justice. In particular, we measure the three core dimensions of perceived fairness via outcome fairness, anonymity, and explanation.
Our findings reveal three distinct and meaningful configurations that lead to High Perceived Fairness as follows.
High Outcome Fairness, High Anonymity, and Low Explanation
High Outcome Fairness, Low Anonymity, and High Explanation
Low Outcome Fairness, High Anonymity, and High Explanation
These results offer several important insights. First, they demonstrate that High Perceived Fairness is achieved through multiple pathways and not just one ideal scenario. While we initially hypothesize that all three fairness antecedents need to be at a High level to produce High Perceived Fairness, the data show otherwise. In practice, students only require two of the three antecedents to be High in order to perceive a High level of fairness. This flexibility suggests a form of compensatory effect - when one fairness factor is lacking, the presence of the other two helps maintain a sense of balance and trust in the evaluation process.
Second, the study provides practical guidance for educators and administrators who aim to incorporate peer assessment into their teaching and evaluation practices. Rather than striving to maximize all fairness dimensions at once, which may not always be feasible due to resource or contextual constraints, educators can focus on designing assessment environments that fulfill at least two of the three key fairness conditions. Doing so likely yields comparable outcomes in terms of student objectivity and reliability in peer evaluations.
In summary, this research highlights that perceived fairness is not a rigid, all-or-nothing framework. Instead, it is a dynamic system where different elements interact to support students’ ethical and responsible grading behavior. By understanding these interactions, educational stakeholders can better design peer assessment processes that not only improve academic integrity, but also foster a stronger sense of fairness and engagement among students. Future studies may build on this foundation by examining cultural, disciplinary, or institutional differences in fairness perception and peer evaluation practices. In addition, research could focus on how individual differences further shape these preferences, potentially leading to even more tailored and effective assessment practices.
FsQCA is suitable for exploring problems with complex causal relationships and generates multiple solutions (in other words, equifinality). However, it faces some limitations. Equifinality raises the issue that determining which solution is more valid often depends on researcher judgment or theory. Multiple solutions also demand further effort to provide proper interpretation.
This study conducts a configurational analysis of perceived fairness in peer assessment. Future studies can measure how perceived fairness may affect grade objectivity. In addition, this research takes college students as the survey subjects. Though the findings herein can be applied to various contexts, their adaptation may be constrained by different environments.
CRediT authorship contribution statement

Duen-Huang Huang: Writing – review & editing, Writing – original draft, Validation, Supervision, Project administration, Funding acquisition, Formal analysis, Data curation, Conceptualization.
The author would like to thank the Ministry of Education, Taiwan for its partial financial support to this study, under Project Number PGE1121015.





