Recent studies have focused on machine learning (ML) algorithms for predicting employee churn (ECn) to avert probable economic loss, technology leakage, and the transfer of customers and knowledge. However, can human resource professionals rely on algorithms for prediction? Can they act on predictions whose underlying process is unknown? Owing to their lack of interpretability, the opaque nature and growing intricacy of ML models make these multifaceted black boxes challenging for field experts to comprehend. To address the concerns of interpretability, trust and transparency in black-box predictions, this study explores the application of explainable artificial intelligence (XAI) in identifying the factors that escalate ECn, analysing its negative impact on productivity, employee morale and financial stability. We propose a predictive model that compares the two top-performing algorithms based on performance metrics. We then apply an explainable artificial intelligence approach based on Shapley values, Shapley Additive exPlanations (SHAP), to identify and compare the feature importance of the top-performing algorithms, logistic regression and random forest, on our dataset. The interpretability of the predictive outcome unboxes the predictions, enhancing trust and facilitating retention strategies.
The biggest challenge in today's dynamic and competitive business environment is the effective retention of human capital. Amid globalisation, organisations struggle to retain employees while confronting two key challenges: the loss of skilled and knowledgeable staff and keeping pace with trends in the skilled labour market. Rising employee churn has affected businesses across many sectors and economies. It gathered speed throughout the global pandemic and increased markedly during the phenomenon known as the 'Great Resignation', which began in the United States in 2021.
Inducting and training a new employee incurs high costs; organisational stability and productivity are key challenges that organisations encounter (Siong et al., 2006). According to the EY 2021 Work Reimagined Employee Survey, around 54 % of surveyed employees would consider leaving their current job post-COVID-19. The Society for Human Resource Management (SHRM) suggests that replacing an employee costs 90 % to 200 % of their annual salary (Alsheref et al., 2022). Deloitte reported that replacement costs run up to 1.5 to 2 times the yearly salary (Punnoose & Ajit, 2016).
Employee churn can be voluntary or involuntary (Gelbard et al., 2018). Voluntary churn occurs when employees leave for personal reasons, while involuntary churn refers to the wilful release of employees by the organisation (Chalutz Ben-Gal, 2019). The cost of voluntary churn can reach up to five times the average employee's annual salary, depending on the criticality of the job and the difficulty of adequately replacing the employee (Liborius & Kiewitz, 2022). Organisations typically focus on voluntary churn, which arises when an employee resigns seeking enhanced financial compensation, perks, or an improved working environment (Anh et al., 2020). By contrast, unfavourable reasons for departure include disagreements with supervisors, lack of promotion opportunities, the employees themselves, organisational culture, or a repetitive, unchallenging job. HR professionals, managers, and leaders therefore need to understand the key factors contributing to employee churn.
Academicians and practitioners have endeavoured to predict employee churn, which has led to data-driven predictions using machine learning (ML) algorithms (Fallucchi et al., 2020). Various data mining methods have been used to build ML models, and existing studies have predicted employee churn with high accuracy (Sisodia et al., 2017; Chaudhary et al., 2022; AbdElminaam et al., 2023; Bandyopadhyay & Jadhav, 2021; Mhatre et al., 2020; Thompson et al., 2022). However, ML predictions alone do not convey an understanding of the underlying complex data structure, making it difficult to extract valuable insights and connections among employee features (El-Rayes et al., 2020).
Though studies reveal that ML predicts employee churn using various algorithms, minimal research has focused on explaining these predictions, leaving ML algorithms as opaque black boxes. The rapidly developing discipline of artificial intelligence (AI) offers an intriguing answer to the current problem of high ECn. By utilising sophisticated AI algorithms and adopting an explainable AI (XAI) approach, it is possible to discover complex patterns and indicators that point to an employee's likelihood of leaving an organisation (Marín Díaz et al., 2022). XAI not only reveals the decision-making process of the predictive model but also identifies the key determinants with a particularly significant effect on the predicted outcome (Linardatos et al., 2020).
In addition, businesses equipped with this information can modify their recruitment procedures, proactively searching for candidates who meet specific criteria so as to increase retention and the likelihood that staff will stay with the firm for an extended period. The inclusion of AI, with its emphasis on the ability to provide explanations, presents an unprecedented method of combating employee turnover. By analysing the complex network of factors that lead to employee churn, firms may strategically develop plans to retain current employees and attract new individuals likely to form long-lasting connections with the organisation. This shift in perspective can significantly influence an organisation's stability, productivity, and overall development.
This requires extracting features and establishing connections to bolster the efficacy of predictive models using explainable AI (XAI). Shapley Additive exPlanations (SHAP), proposed by Lundberg and Lee in 2017, provides a solution to this limitation. SHAP helps balance the precision and interpretability of results, which is otherwise a challenging endeavour. The study, therefore, clarifies crucial aspects of employee churn and the importance of using artificial intelligence in churn analysis. Overall, our research contributions are:
- Experiment: comparing ML algorithms based on performance metrics for employee churn analysis.
- Development of a framework integrating ML and explainable AI for employee churn to address the limitations of algorithms on the scale of explainability, transparency and interpretability.
- Assessment of the applicability and implications of the framework in the organisation for further evaluation of employee churn.
Fig. 1 describes the research gap identified by the authors. Researchers focus on churn predictions using data-driven methods but have ignored a fundamental aspect of the predicted results: not all decision-makers can understand these complex outputs. Explainability of the model is essential when output-based decisions are intertwined with ethical, security, confidentiality, and lawful concerns for the applicability of opaque models (Crawford, 2013). The novelty of this study lies in developing a systematic approach to interpretable employee churn prediction that can be easily understood. The study identifies influential factors affecting ECn that may be utilised to make well-informed decisions regarding employee retention and recruitment tactics.
By embracing the approach suggested in this study, businesses can examine and explain the various factors contributing to ECn. Properly comprehending the model is crucial for obtaining practical insights and understanding the reasoning behind predictions. This information is vital for individuals in positions of authority, since it enables businesses to tailor their tactics and interventions precisely. This article encourages a fundamental change in how corporations see and handle ECn. By leveraging an interpretable approach, firms can strategically apply methods that improve employee retention and augment recruitment. The conclusions of this research provide valuable guidance for achieving a stable and motivated workforce, promoting sustained organisational growth. The remainder of the study is organised as follows. Section 2 summarises the literature on employee churn regarding ML and XAI. Section 3 elaborates on the proposed framework, the preprocessing of the data and the research methods (logistic regression, random forest and SHAP) involved in the prediction and interpretability of the model. The analysis of the results is presented in Section 4, followed by a discussion of managerial and theoretical implications in Section 5. Section 6 presents the conclusion, followed by limitations and directions for future research in the last section.
Literature review

Employee churn

In recent decades, the academic literature on human resource management has grown, reflecting the belief that an organisation's human resources are a valued resource for potentially achieving competitive advantage. Owing to technological advancement and globalisation, there has been an upsurge in retaining talent through operative human resource practices (Du Plessis, 2006). Fig. 2 illustrates a primary search of publications related to ECn in the Scopus database. The authors identified 8394 documents using the combination of keywords (“Attrition” OR “Turnover” OR “Churn” AND “Employee”). The publications by year are shown graphically in Fig. 2, highlighting the growth of academic interest in this area in recent years.
Employee churn encompasses direct and indirect impacts ranging from replacement cost to low employee morale (Boushey & Glynn, 2012). Table 1 illustrates that existing research has identified many factors that lead to employee churn, and analysis of these factors has contributed to the development of human resource analytics (Chalutz Ben-Gal et al., 2021). Demographic factors, namely age, marital status, and department, are determinants of employee churn (Chowdhury, 2015). Millennials prefer meaningful employment over well-paying professions (DeVaney, 2015). Chen (2020) developed a structural equation model that included confirmatory factor analysis and path analysis to assess the impact of leadership on job satisfaction and employee churn; the results confirm that working conditions mediate the relationship between leadership and job satisfaction. Table 1 compiles the studies focused on the factors/features that lead to employee churn.
Table 1. Literature review of factors contributing to employee churn.
Human resource predictive analytics has accelerated with technological advancement and promises substantially more accurate human resource decision-making. Over the last decade, as analytics implementation has increased, it has entered human resources, but it has yet to gain significant momentum. The current market situation necessitates that the HR function move beyond reporting to accurate forecasting. Data mining is vital for scrutinising the relationships between data variables (Jayanthi et al., 2008). Gartner notes that the adoption and integration of artificial intelligence (AI) increased by 270 % between 2015 and 2019, and around 37 % of organisations used AI in the workplace in 2019 (Costello, 2019). Cognitive technologies such as AI, robotics, and ML are projected to replace 16 % of jobs by 2025. During the research process, it is vital to comprehend the number of publications addressing ML methods in forecasting ECn, as well as the strategies designed to reduce attrition.
Fig. 3 graphically illustrates that introducing this additional keyword reduces the number of published research articles to 90. Gao et al. (2019) used an improved random forest algorithm to predict employee turnover. Nagadevara et al. (2008) implemented artificial neural networks (ANN), classification trees (C5.0), classification and regression trees (CART) and discriminant analysis to study the association between employee demographics, absenteeism, tenure and employee churn. Saradhi and Palshikar (2011) compared Naïve Bayes (NB), Support Vector Machines (SVM), LR, Decision Trees (DT) and RF algorithms for employee churn prediction.
Table 2 summarises employee churn predictions based on various ML algorithms; however, these studies did not examine the explainability of the models using artificial intelligence.
Table 2. Previous studies on employee churn and machine learning.
| Authors | Algorithms used | Explainable AI |
|---|---|---|
| Sisodia et al. (2017) | SVM, C5.0 Decision Tree Classifier, RF, KNN and NB | No |
| Alaskar et al. (2016) | LR, DT, NB, SVM, AdaBoost | No |
| Chaudhary et al. (2022) | CatBoost, SVM, DT, RF, XGBoost | No |
| Mhatre et al. (2020) | LR, DT, KNN, SVM, XGBoost, NB | No |
| Anh et al. (2020) | SVM, LR, RF | No |
| Bandyopadhyay and Jadhav (2021) | SVM, RF and NB | No |
In recent years, ML research has begun to concentrate on interpretability. Previously, the main emphasis was on algorithmic accuracy, which often resulted in a compromise between accuracy and interpretability: increased precision is often associated with decreased interpretability (Stiglic et al., 2020). The very existence of this trade-off has contributed to the increasing importance of XAI as an essential tool. XAI strives to achieve improved prediction accuracy using opaque algorithms while tackling the crucial requirement of interpreting AI decisions. When assessing the interpretability of ML algorithms, researchers have classified them into two main categories (Gilpin et al., 2018). White-box models strive to establish a direct relationship between input factors and resulting outputs. By contrast, 'black-box models' lack decision rules that are simply comprehensible. It is pertinent to mention that interpretability is still debated even among white-box models, where concerns about their interpretability have been highlighted (Loyola-Gonzalez, 2019). The increased appreciation for XAI highlights the trend towards understandable AI systems, particularly when decisions have significant implications for individuals or society. Establishing the optimal balance between prediction accuracy and interpretability remains a constant challenge in business; striking this balance is crucial for cultivating confidence and ensuring the adoption of AI systems in practical domains such as healthcare and banking. Fig. 3 depicts the temporal evolution of research quantity and algorithms' interpretability. Despite the widespread use of AI in HRM, understanding of its legitimacy and workers' views is presently in the initial phase of development (Lu, 2021). Table 3 summarises the prominent research on XAI in human resource management.
Table 3. Publications related to XAI in HRM.
| Authors | Source | Cited by | Method used |
|---|---|---|---|
| Harl, Weinzierl, Stierle and Matzner (2020) | Journal of Decision Systems | 51 | Gated graph neural networks (GGNN) |
| Yuan et al. (2022) | Science Robotics | 39 | Bidirectional human-robot value alignment framework |
| Langer and König (2023) | Human Resource Management Review | 31 | Comprehensive analysis |
| Berman, de Fine Licht and Carlsson (2024). | Technology in Society | 4 | Case-study research design |
| Abonamah, La Torre, Poulin and Repetto (2022) | IEEE Xplore | 2 | Model-agnostic technique |
| Bhattacharya, Zuhair, Roy, Prasad and Savaliya (2022) | IEEE Xplore | 2 | SHAP |
The six papers shown in Fig. 4, identified using a combination of keywords targeting explainable AI in human resource management, are relevant to this study.
Recent studies have focused on multivariate ML algorithms for predicting employee churn, but very few have focused on explaining these predictions. Explainability of the model is essential when output-based decisions are intertwined with ethical, security, confidentiality, and lawful concerns for the applicability of opaque models (Crawford, 2013). However, the definitions of explainability and interpretability are domain-dependent. There is currently a dominant trend towards interpretability tools that are not exclusive to any one model. Automating interpretability becomes significantly simpler when we decouple the interpretation technique from the underlying model. By employing model-agnostic methodologies, we can substitute both the learning model and the interpretation mechanism, resulting in significant scalability potential.
XAI provides explanations to end-users who depend on recommendations and decisions given by the AI system; these end-users may not be developers but policymakers, operators or managers (Arrieta, 2020). Complex ML models emphasise the importance of input variables for prediction. These variable attributions can be global, assessing variable relevance over the entire dataset, or local, measuring importance at the level of individual observations (Kazemitabar et al., 2017). Local approaches decompose individual predictions into variable contributions (Ribeiro et al., 2018) and offer the main advantage of presenting the functional form of the correlation between a variable and the output determined by the classifier. Independent of predictive accuracy, Shapley values show which variables impact a predicted value. As Fig. 4 and Table 4 make apparent, studies on explainable AI in human resource management are limited and at an early stage of exploration. The authors have identified the research gap regarding the application of XAI in interpreting ECn predictions; the novel contribution of this research lies in filling that gap.
Table 4. Studies on interpretable models.
| Authors | Model | Description |
|---|---|---|
| Kovalev, Utkin and Kasimov (2020) | Modified LIME | Solve an unconstrained convex optimisation problem |
| Zhang, Cho and Vasarhelyi (2022) | LIME and SHAP | Auditing the likelihood of significant errors in financial statements. |
| Tsoka et al. (2022) | LIME and SHAP | Classification of building energy performance certificates (EPC) |
| De Lange, Melsom, Vennerød and Westgaard (2022). | SHAP | Predicting credit default |
| Chen, Ke, Han, Gupta and Sivarajah (2024) | SHAP | The effect of product descriptions on sales forecasting |
| Wasilefsky, Caballero, Johnstone, Gaw and Jenkins (2024) | SHAP | Interpretable model for pilot candidate selection |
The methodology used in this study aligns with the knowledge discovery in databases (KDD) approach and the established cross-industry standard process for data mining (CRISP-DM) in the framework (Fig. 5). The IBM HR dataset is sourced from Kaggle. After selecting the data source, we systematically proceeded through the steps that constitute the model.
The first step of CRISP-DM is to understand the domain, here referring to churn analysis: the HR processes, policies, terminologies, and challenges related to employee churn. Based on this understanding, the next step is to clarify the objectives. This involves identifying the critical drivers of employee churn and establishing success criteria such as prediction accuracy. We utilised the IBM HR dataset for churn analysis. To increase the effectiveness of the ML algorithms, the data are pre-processed to improve data quality, including handling missing values, constant-feature processing and label encoding. Next, data transformation converts the raw data into a structure that is easier to analyse. The dataset is randomly split into training and testing sets in an 80:20 ratio. The training data are used to build the prediction model, while the unseen test data are used to evaluate the model's performance. The training dataset is subjected to 10-fold cross-validation. ML algorithms, namely LR, SVM, KNN, RF, decision tree classifier and Gaussian NB, are applied using Python. LR and RF, which performed with the highest accuracy, are used for further evaluation. In the next phase, SHAP is integrated to derive feature values, and the SHAP results of both algorithms are compared. The proposed framework used in this study is represented in Fig. 5.
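As a minimal illustration of the preprocessing and splitting steps, a Python sketch might look as follows; the file name, the 'Attrition' target column and the named constant columns are assumptions based on the public Kaggle release of the IBM HR dataset, not details reported in the paper.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Load the public IBM HR dataset (file name assumed from the Kaggle release).
df = pd.read_csv("WA_Fn-UseC_-HR-Employee-Attrition.csv")

# Constant-feature processing: drop columns with a single unique value
# (e.g. 'EmployeeCount' and 'StandardHours' in the public dataset).
df = df.loc[:, df.nunique() > 1]

# Label coding: encode every categorical column as integers.
for col in df.select_dtypes(include="object").columns:
    df[col] = LabelEncoder().fit_transform(df[col])

# 'Attrition' is the churn target in the public dataset.
X = df.drop(columns=["Attrition"])
y = df["Attrition"]

# Random 80:20 train/test split, as described in the framework.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
```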
Data demographics

To validate the study's objectives, a dataset from Kaggle is used for analysis. The dataset contains details of 1470 employees and 31 features, of which 14 are continuous and 17 are categorical. It includes 882 males and 588 females; the numbers of employees who are married, single, and divorced are 673, 470, and 327, respectively. The data consist of thirteen categorical, thirteen continuous, and two flag variables, with 'churn' as the target variable.
Research methods

At the start of the process, a series of methodologies is used to carry out a preliminary exploratory data analysis. Subsequently, a set of ML algorithms is used to create a predictive model with the highest potential accuracy for identifying ECn. Python is used to implement the methods, including logistic regression (LR), support vector machine (SVM), K-nearest neighbours (KNN), random forest (RF), decision tree classifier, and Gaussian naïve Bayes (GNB). The LR and RF models, having the highest accuracy, are selected for further evaluation. In the next phase, the SHAP model is incorporated to calculate feature values and compare the SHAP outcomes for both algorithms. The structure utilised in this investigation is depicted in Fig. 5.
Logistic regression (LR) and random forest (RF)

LR models the relationship between discrete variables and gives improved results on numerical data values (Gladence et al., 2015). LR defines the probability of realising the dependent variable Y (Yan & Su, 2009) based on observations of the input variable X and is most useful when the dependent variable is categorical (Yedida et al., 2018). The RF algorithm, proposed by Breiman in 2001, is very popular as a classification and regression model. Random forest combines tree classifiers and offers a correction for over-fitting to training datasets (Chaudhary et al., 2022). It is a non-parametric method that uses an ensemble of tree topologies. Besides being simple, it is recognised for its accuracy and capability to deal with a wide range of features, and it can estimate complex patterns that cannot be characterised by traditional smoothness or standard sparseness assumptions (Biau & Scornet, 2016). Code for the top-performing RF and LR algorithms on our dataset is sketched below.
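The sketch below continues from the preparation step above; the hyperparameter values are illustrative assumptions, not the authors' reported settings.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Illustrative configurations; the paper does not report exact settings.
log_reg = LogisticRegression(max_iter=1000)
rf = RandomForestClassifier(n_estimators=200, random_state=42)

# Fit both models on the 80 % training partition from the sketch above.
log_reg.fit(X_train, y_train)
rf.fit(X_train, y_train)

# Evaluate on the held-out 20 % test partition.
for name, model in [("LR", log_reg), ("RF", rf)]:
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{name} test accuracy: {acc:.4f}")
```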
SHAP

SHAP determines each player's contribution to success based on game theory. SHAP can explain multiple supervised learning models and assigns importance values to each input variable for each prediction. Furthermore, recent research demonstrated that SHAP is the unique solution within a class of additive feature importance measures that satisfies desirable properties. The Shapley value captures the average marginal contribution of a feature value, with local interpretations consistent with the global interpretation, which effectively overcomes a limitation of the LIME method (Lundberg et al., 2020). The input variables are ranked in order of relevance; the highest mean SHAP value identifies the most important feature.
SHAP values measure each feature's contribution in order to explain the black-box model. SHAP has an advantage over LIME in that it can be used with any tree-based model, and each feature has its own set of SHAP values. Explainable models provide explanations of machine learning predictions through feature-based or rule-based approaches. Feature-based models present the top features that elucidate the prediction together with their associated weights (Ribeiro et al., 2016), whereas rule-based models operate on if-then-else rules (Oxborough et al., 2018). SHAP satisfies three properties: local accuracy, which requires the sum of the feature attributions to match the model output for the simplified input; missingness, which assures that no importance is attributed to missing features; and consistency, which ensures that when a model changes so that a feature's impact increases, the attribution assigned to that feature does not decrease.
SHAP values aim to decompose the result of a function $f$ into the sum of the individual impacts $\phi_i$ of each independently entered feature. For nonlinear functions, the sequence in which the attributes are introduced is significant, so the SHAP values are determined by averaging over all attainable orderings, with $\sum_{i=0}^{M}\phi_i = f(x)$. Additive feature attribution methods utilise an interpretation model $g$, represented by a linear function of binary variables:

$$g(z') = \phi_0 + \sum_{i=1}^{M}\phi_i z_i'$$

where $z_i' \in \{0,1\}^M$, $M$ is the number of input features, and $\phi_i \in \mathbb{R}$. Local accuracy refers to the sum of the feature attributions being equivalent to the output of the function we are attempting to explicate. Missingness implies that features absent from the simplified input ($z_i' = 0$) are assigned no significance. Consistency refers to the principle that modifying a model to boost the influence of a particular feature will not result in a fall in the importance attributed to that feature.
Compared to LIME, SHAP is more computation-intensive but theoretically sound, robust and globally consistent for complex models and large datasets.
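A minimal sketch of this workflow, continuing from the fitted models above, is given below. The explainer choices follow the shap library's conventions (TreeExplainer for tree ensembles, LinearExplainer for linear models); the class-selection logic is an assumption, since the shape of the returned values for classifiers varies across shap versions.

```python
import shap

# Exact, fast Shapley values for the tree ensemble.
tree_explainer = shap.TreeExplainer(rf)
rf_shap = tree_explainer.shap_values(X_test)

# LinearExplainer needs background data to estimate feature expectations.
linear_explainer = shap.LinearExplainer(log_reg, X_train)
lr_shap = linear_explainer.shap_values(X_test)

# shap may return a list (one array per class) or a 3-D array depending
# on version; select the churn-class attributions accordingly.
rf_churn = rf_shap[1] if isinstance(rf_shap, list) else rf_shap[..., 1]

# Summary plots rank features by mean |SHAP value|, as in Figs. 7 and 8.
shap.summary_plot(rf_churn, X_test)
shap.summary_plot(lr_shap, X_test)
```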
Analysis

The study aims to strengthen the existing literature by investigating the interpretability mechanisms of the machine learning models used to generate predictions (Sekaran & Shanmugam, 2022). This analysis enables us to identify the critical variables in the churn process, enhancing the business's internal processes to manage employee retention effectively. Based on the content presented in Section 3, the authors examine the complete HR process by utilising the KDD and CRISP-DM methodologies (Shafique & Qaiser, 2014), as seen in Fig. 5. This examination involves the identification of the following subprocesses:
The training dataset is subjected to 10-fold cross-validation, meaning that the dataset is divided into ten subsets, with training and model evaluation performed on a different held-out subset each time. Cross-validation aims to enhance performance and guard against poor generalisation, i.e., overfitting and underfitting. The authors also tuned the model parameters for the given dataset. LR, SVM, KNN, RF, decision tree classifier, and Gaussian NB are evaluated on accuracy. In our dataset analysis, logistic regression outperforms random forest when the number of explanatory factors is less than or equal to the number of noisy variables. LR and RF attained the highest accuracy, 87.96 % and 87.29 %, respectively.
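A sketch of this 10-fold comparison, continuing from the prepared training data, is shown below; default configurations stand in for the authors' tuned parameters, which are not reported.

```python
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB

# The six candidate algorithms named in the text.
models = {
    "LR": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "KNN": KNeighborsClassifier(),
    "RF": RandomForestClassifier(random_state=42),
    "DT": DecisionTreeClassifier(random_state=42),
    "GNB": GaussianNB(),
}

# 10-fold cross-validated accuracy on the training partition.
for name, model in models.items():
    scores = cross_val_score(model, X_train, y_train, cv=10, scoring="accuracy")
    print(f"{name}: mean CV accuracy = {scores.mean():.4f}")
```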
Table 5 demonstrates the results of the ML algorithms based on accuracy and accumulated AUC. As the number of explanatory factors in a dataset grows, the random forest's true and false positive rates increase. In binary classification, instances are categorised into positive and negative classes. A confusion matrix combines four possible outcomes: true positive (TP), where a positive value is correctly predicted as positive; true negative (TN), where a negative value is correctly predicted as negative; false positive (FP), where a negative value is predicted as positive; and false negative (FN), where a positive value is predicted as negative. The confusion matrix measures the model's accuracy by comparing predicted values with the actual ones. The performance of the RF and LR algorithms is measured using four metrics: accuracy (ACC), precision (PREC), F1 score, and area under the curve (AUC). These metrics are generated for all ten folds as follows:
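These metrics follow the standard confusion-matrix definitions (recall, REC, enters through the F1 score), computed on each of the ten folds:

$$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \mathrm{PREC} = \frac{TP}{TP + FP},$$

$$\mathrm{REC} = \frac{TP}{TP + FN}, \qquad F_1 = \frac{2 \cdot \mathrm{PREC} \cdot \mathrm{REC}}{\mathrm{PREC} + \mathrm{REC}}.$$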
AUC measures the total area under the receiver operating characteristic (ROC) curve, one of the criteria used to assess model performance. The AUC value varies from 0 to 1, with values closer to 1 indicating a more accurate model: a model whose predictions are 100 % wrong has an AUC of 0, while one whose predictions are 100 % correct has an AUC of 1.
We used the logistic regression and random forest algorithms for further analysis, since their calculated accuracies are 87.96 % and 87.29 %, respectively (Table 6). The AUC-ROC curve measures the performance of classification models at several threshold settings. The ROC is a monotonically increasing curve of probabilities, while the AUC quantifies the degree of separability, indicating how well the model differentiates between classes. When the area under the ROC curve is high, the true-negative and true-positive distributions barely intersect, reflecting that the classes have been appropriately separated.
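A brief sketch producing AUC scores and overlaid ROC curves for the two fitted models follows (continuing from the models above; the resulting values depend on the split and settings):

```python
import matplotlib.pyplot as plt
from sklearn.metrics import RocCurveDisplay, roc_auc_score

# AUC from the predicted churn-class probabilities.
for name, model in [("LR", log_reg), ("RF", rf)]:
    proba = model.predict_proba(X_test)[:, 1]
    print(f"{name} AUC: {roc_auc_score(y_test, proba):.4f}")

# Overlay both ROC curves on one set of axes.
ax = plt.gca()
RocCurveDisplay.from_estimator(log_reg, X_test, y_test, ax=ax, name="LR")
RocCurveDisplay.from_estimator(rf, X_test, y_test, ax=ax, name="RF")
plt.show()
```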
As a result, LR shows improved performance compared to the RF algorithm: the fine-tuned LR model has a better AUC score than the RF classifier. The variables most positively correlated with churn are performance rating, monthly rate, number of companies worked for, and distance from home. In contrast, the variables most negatively correlated with churn are total working years, job level, years in the current role, monthly income, and age. Our study used a heatmap to demonstrate correlations between variables/features; see Fig. 6.
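A minimal sketch of such a heatmap, assuming the encoded dataframe from the preparation step and seaborn's diverging 'coolwarm' palette (the figure's exact styling is unknown):

```python
import matplotlib.pyplot as plt
import seaborn as sns

# Pairwise Pearson correlations of the (already numeric) features.
corr = df.corr(numeric_only=True)

# Diverging palette centred at zero, spanning -1 to 1.
plt.figure(figsize=(14, 10))
sns.heatmap(corr, cmap="coolwarm", center=0, vmin=-1, vmax=1)
plt.title("Correlation heatmap of employee features")
plt.tight_layout()
plt.show()
```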
A heatmap depicts a data matrix with a colour gradient indicating numerical disparities (Fig. 6). A diverging palette fixes colours at the lower and upper ends of the data and in the middle; it is ideal for data that runs in both negative and positive directions and provides an excellent picture of the matrix's largest and smallest values. Heatmaps describe relationships between variables with colours instead of numbers; in simpler terms, a heatmap represents a correlation matrix showing the correlations between multiple variables. Values closer to zero indicate that two variables have no linear relationship, while values close to 1 indicate a strong positive correlation. SHAP balances the accuracy and comprehensibility of black-box models generated by machine learning. SHAP values show the distribution of each variable's impact: features are represented on the y-axis, while the Shapley values appear on the x-axis. SHAP summary plots depict the mean effect of each variable on the overall result, displaying the information using bars of appropriate length.
SHAP presents the values and features more understandably. As shown in Fig. 7, the overtime feature is the strongest predictor of employee churn, followed by monthly income, stock options, job level and so on, while the job role of sales executive has the most negligible association with churn. In the SHAP results for logistic regression, the overtime feature is likewise the strongest predictor and contributor to employee churn, followed by employees whose marital status is single; the result is shown in Fig. 8. The SHAP plots differ between models; however, the general trend of the features remains consistent in both.
SHAP presents substantial logical analysis by discovering the most influential features that enabled the model to effectively forecast whether or not an employee will leave the organisation.
Discussion

With the introduction of advanced technology, human resource management is going through a paradigm change (McCartney & Fu, 2022). The decision-making process in human resources (HR), specifically in employee retention and hiring procedures, is an essential foundation of every company's business model. Existing literature has primarily focused on mechanism-driven predictions and identifying factors contributing to employee churn (Tao et al., 2021); however, churn prediction is complex due to its diverse parameters. Assessing and forecasting employee turnover are therefore crucial in HR procedures.
The study aims to establish a thorough methodology for implementing interpretability in a context such as human resource management. This creates a baseline for formulating guidelines, derived from the study, for conducting recruitment procedures that align with the drivers of attrition and complement retention operations. The proposed method facilitates comprehension and prioritisation of the necessity of using AI in business applications, particularly in areas concerning the management of employees, who are considered the organisation's most essential and significant asset. The interpretability of the acquired results supports the findings derived from the preliminary data analysis. Interpreting the forecasts allows us to identify the variables that influence turnover risk and take proactive steps to mitigate them. The novel contributions of this study are outlined below:
- Inclusive methodological framework: this paper proposes an integrated and comprehensive methodological approach to incorporating interpretability into the ML pipeline, particularly emphasising the HR field. The framework includes data preprocessing, model selection, hyperparameter tuning, and a model-agnostic interpretability tool (SHAP). It provides an organised approach to improving the transparency and applicability of predictive models.
- Improved decision-making in HR: the research aims to enhance HR professionals' capabilities in making well-informed and impactful decisions about ECn. Analysing the significant elements contributing to attrition will inform HR initiatives to help retain talented employees and create a more efficient and productive workplace.
- Enhancing trust through model-agnostic interpretability: adopting the interpretability model for algorithms like LR and RF enhances trust and transparency in prediction. Consistently interpreting and illustrating model predictions to stakeholders increases comprehension and acceptance of ML results, particularly in HR decision-making.
- Exploratory data analysis (EDA): EDA enables an initial comprehension of how variables interact with respect to the outcome variable, churn. Before initiating the ML process, it is imperative to proactively identify any potential biases present in the dataset. Moreover, the approach used ought to be flexible and adaptive, continually learning from employee data to proactively prevent the introduction of bias.
These novel aspects strengthen comprehensive ECn prediction. Implementing them in HRM will facilitate informed managerial and administrative decisions and adequate staff retention strategies.
Theoretical contribution

This paper has illustrated the importance of employee churn prediction and the explainability of results using XAI. For validation, we relied on an existing dataset with variables proven in the literature to contribute to employee churn (Table 1). We identified the gap in the existing ML models used for employee churn and focused on filling that gap in the literature (Table 2).
Regardless of the theoretical account of human decision-making, people, including experts, do not always make the best choices. To overcome these limitations, predictive analytics scrutinises hidden patterns in the HR database to derive predictions, decreasing the associated business risk through churn analysis. However, decision-makers cannot understand the algorithms used in these predictions owing to a lack of interpretability, making the models difficult to trust and leading to ML being considered a black box. Explainable AI techniques answer the questions of transparency, interpretability, and trust, focusing on the why of prediction results. The constant increase in business pressures and the success stories of AI have paved the way for the inclusion of AI in different aspects of business, proving to be a competitive advantage.
Transparency and explainability have accelerated the adoption of AI, ensuring ethics and compliance, providing strategic insights into business (Oxborough et al., 2018) and subsequently improving decision-making and strategic planning. Explainable AI discovers hidden information related to different facets of business, collecting information from employee databases, social media, and transactions. It works on a cause-and-effect model to identify patterns; such correlations are surfaced by explainable algorithms, showing the main drivers behind them. Transparency supports assessing the precision of output forecasting, identifying the potential risks associated with model utilisation, and anticipating situations in which the model might not perform as expected. It can also discourage adversarial attacks by educating business users on how model inputs can be manipulated to impact outputs. Individuals in charge of a model can anticipate instances of failure and act accordingly by obtaining an intuitive awareness of its behaviour.
Industrial Revolution 4.0 is centred on advanced technologies that convert information into competitiveness, growth, efficiency, and effectiveness (Park et al., 2019; Neumeyer et al., 2020). This era counts on cutting-edge technology like AI and equipment for increased production and wealth creation. Businesses use robotics and AI to automate repetitive and simple operations while deploying algorithms to aid sophisticated decision-making. Providing explanations for human decisions is required to improve human predictions and decisions. Decisions are often made based on biases and heuristics, and sometimes on the rationale of the decision-maker. In contrast to heuristics and biases, bounded rationality, in which human decision-making complies with cognitive limitations, is an alternative theory.
Customers, regulators, and industry confederations pressure businesses to guarantee that their AI systems adhere to ethical standards and function within publicly reasonable bounds. Protecting vulnerable consumers, ensuring data privacy, fostering ethical behaviour, the performance of accounting firms (Oberoi et al., 2022), environmental concerns (Gaur et al., 2021) and preventing bias are all regulatory concerns. Using explainable models is one technique to check for bias and to make decisions that do not breach business ethics or harm a company's reputation. Considering the factors mentioned above, the proposed framework ensures the explainability and interpretability of the prediction model. It lets decision-makers identify the factors that may lead to employee churn and develop individual preventive plans.
Managerial implications

Though employee churn resembles customer churn, in some organisations the cost of ECn is comparatively higher and requires increased attention from researchers. A high employee churn rate has unfavourable consequences for the company, affecting ongoing initiatives and causing customer discontent. Replacing individuals with specialised skill sets can be challenging. Considering the increasing employee churn rate, identifying the factors leading to churn has become paramount. Existing literature has examined the factors mentioned in Table 1. However, merely identifying and representing facts is insufficient in a dynamic business environment: the inability to understand why churn occurs paralyses a company's ability to prevent it. Decision-makers must identify the factors and understand the predictions. Therefore, HR professionals need to keep abreast of the industry to keep their organisations updated about change.
Organisation leaders can better understand person-organisation fit to create or improve internal process design, increasing job satisfaction and employee retention and thus minimising employee churn. Experienced employees can influence client satisfaction favourably, but newly hired employees may not. This demonstrates how high employee turnover impacts a company's efficiency, since client needs may not be appropriately handled.
Industrial Revolution 4.0 has marked a transition from 'employee engagement' to 'employee experience', where employees want to contribute. Design thinking can improve the employee experience by integrating three practices: experience mapping, which identifies actual experiences and needs; touchpoint simplification, which improves the emotional points of contact between employees and the organisation; and rapid prototyping. Recreational programmes can be adopted to break the monotony of work, and employees must be rewarded and appreciated for their contribution and hard work. To reduce the repercussions of employee churn, efforts must be made to reduce it. A positive employee workplace facilitates firm profitability and potentially increases employee satisfaction (Storey, 2016).
Summarising the above points, it is crucial to retain employees and take proactive steps to predict employee churn, which can be a competitive advantage. Thus, ML is being integrated, but we must remember that not all decision-makers and HR professionals are comfortable interpreting 'black-box' models.
Explainability ensures that AI-based decisions are correctly grounded; the transparency of model predictions makes the results more interpretable. XAI may take on substantial social roles in the future, including learning from and explaining to individuals and connecting knowledge across disciplines for further application. Ozcan et al. (2020) summarised that HR mining can be used for academic, industrial, and governmental needs.
Conclusions

Employee churn is among the most prominent issues that organisational decision-makers have to deal with, as its adverse effects range from low morale to disruption of productivity and long-term growth strategies. This paves the way for predicting the risk factors that prompt employee churn using ML algorithms. Though analytics has contributed to business in many ways, it also faces criticism regarding the 'black-box' problem and a lack of actionable insight (Gelbard et al., 2018). A critical component of our research is the interpretability of predictions, providing an improved awareness of the factors that contribute to employee churn. Besides the accuracy of the prediction model, which must achieve high AUC values, users' trust directly impacts the usability of the model (Gelbard et al., 2018). Data mining techniques are an inherent part of Industrial Revolution 4.0, and owing to the diverse applicability of ML, it is not limited to specific industries, spanning FMCG (Tarallo et al., 2019), hospitality (Gaur et al., 2021), healthcare (Gaur et al., 2021; Mahbub et al., 2022; Saeed et al., 2022), and many others.
Our study focuses on the features that contribute to employee churn and ranks them based on their importance and correlation with churn. Implementing machine learning algorithms can ease decision-making tasks, but adding SHAP to the system can act as a catalyst for non-technical decision-makers to understand and interpret the results. Based on the SHAP summary plots of logistic regression and random forest, the features that contribute positively and negatively to employee churn are approximately the same, although their order and weight differ between the two models. Businesses can modify institutional policies to address the grievances that contribute to employee churn, and resources such as money and time can be invested in more profitable ventures. Assessing the influence of each variable on attrition permits management to determine the underlying reasons for the issue and concentrate their efforts on addressing these problems proactively before they worsen. In addition, it assists firms in prioritising concerns and addressing the most critical ones when they cannot handle all the issues simultaneously.
Limitations and direction for future research

We have applied the proposed framework (Fig. 5) to a dataset available on Kaggle, but it can be used on real-time data in further research to increase the efficiency and effectiveness of business. Hyperparameter tuning can be executed on real-time datasets for higher accuracy. Academics have focused on the theoretical aspects of explainable AI; however, more focus can be placed on integrating XAI in different possible ways. Our study addressed employee churn; future research can focus on problem statements related to HRM and cross-functional issues in business. One limitation of XAI implementation is the requirement of technical expertise to derive results for the problem statement before it can help decision-makers. XAI may take on substantial social roles, including learning interpretability and coordinating with other agents to develop cross-disciplinary understanding based on previous knowledge, enhancing the future of XAI.
Funding

This work was supported by the Open Access Publishing Fund provided by the Free University of Bozen-Bolzano.
CRediT authorship contribution statement

Meenu Chaudhary: Writing – review & editing, Writing – original draft, Visualization, Validation, Supervision, Software, Resources, Project administration, Methodology, Investigation, Formal analysis, Data curation, Conceptualization. Loveleen Gaur: Writing – review & editing, Writing – original draft, Visualization, Validation, Supervision, Software, Resources, Project administration, Methodology, Investigation, Formal analysis, Data curation, Conceptualization. Amlan Chakrabarti: Writing – review & editing, Writing – original draft, Visualization, Validation, Supervision, Software, Resources, Project administration, Methodology, Investigation, Formal analysis, Data curation, Conceptualization. Gurmeet Singh: Writing – review & editing, Writing – original draft, Visualization, Validation, Supervision, Software, Resources, Project administration, Methodology, Investigation, Formal analysis, Data curation, Conceptualization. Paul Jones: Writing – review & editing, Visualization, Validation, Supervision, Resources, Project administration, Investigation, Formal analysis, Data curation, Conceptualization. Sascha Kraus: Validation, Writing – original draft, Writing – review & editing.
The authors have no pertinent financial or non-financial conflicts of interest to disclose.