Competence-based human resource management (HRM) emphasises the identification, development, and utilization of employee competence to boost organizational performance, particularly in high-tech sectors that demand continuous competence advancement. Advanced artificial intelligence (AI)-based solutions, such as large language models (LLMs), are transforming competence-based HRM by streamlining job position selection, predicting emerging competencies, and designing targeted training plans, thereby enhancing knowledge sharing and transfer. However, there is a significant gap in the literature regarding comprehensive LLM-based solutions that automate the association of competence with professional roles and the semantic enrichment of corporate competence taxonomies. In this study, we present two innovative solutions: the automated semantic taxonomy enrichment methodology (ASTEM) and the role-competence embedding-based (RCE) framework. In particular, we demonstrate the effectiveness of LLMs in bridging informational gaps by generating coherent competence descriptions and creating accurate role-competence associations through a qualitative case study involving a large company operating in the aerospace, defence, and security industry. The proposed solutions aim to reduce manual effort, improve the precision of role-competence matches, and support data-driven decision-making. This enables companies to efficiently identify suitable candidates, develop focused training programs, and maintain a competitive edge by rapidly adapting to changes in the market and technology.
Given the rapidly evolving business environment, marked by fast-paced technological innovation and transformative changes, companies must enhance the competence of their workforce to ensure a competitive advantage (Brunello & Wruuck, 2021). ‘Competence’ refers to a comprehensive set of attributes that includes knowledge, skill, motivation, and behaviour (Di Luozzo et al., 2021; Gugnani et al., 2018). These attributes must co-exist to improve the performance of employees in specific professional roles (defined as a recognizable pattern of behaviour and attitude corresponding to an identity (Turner, 2001)), which enables the effective execution of tasks and responsibilities (Boyatzis, 1982; Corallo et al., 2010). Furthermore, assigning employees to the right roles presents significant professional development opportunities, while helping companies reduce their turnover (Mathis et al., 2017) and align with the professional standards required by the labour market (Brunello & Wruuck, 2021).
Competence-based human resource management (HRM), also known as ‘talent management’ (Goonawardene et al., 2010; Kurek et al., 2024), focuses on identifying, developing, and leveraging the competence of employees to enhance organizational performance, particularly in high-tech sectors that require continuous competence development to ensure a competitive advantage. This approach enables companies to better achieve their business goals (Wallo et al., 2020), enhance market competitiveness (Mathis et al., 2017), foster innovation (Ubeda et al., 2017), and implement targeted training programs (Corallo et al., 2010).
In the aerospace sector, integrating human factors with advanced technologies, such as AI-assisted systems, is essential for meeting industry demands and driving continuous innovation (Morandini et al., 2024; Reagan, 2021). Aerospace is a particularly challenging sector, characterised by high costs and risks, a vast and multifaceted supply chain, and a multidisciplinary knowledge base that requires firms to function as system integrators, thereby integrating design and production (Acha et al., 2007). The production of aerospace systems, including aircraft, engines, guided missiles, and space vehicles, involves highly dynamic interactions between physical components and information exchange. It requires efficient coordination across multiple organizational levels (Brunton et al., 2021). Consequently, aerospace organizations focus extensively on the continuous assessment of design and manufacturing competencies and the strategic allocation of human resources to optimise the engineering processes (Morandini et al., 2024). Managing technical competencies in aerospace engineering is particularly challenging due to the highly advanced technologies, innovative materials, and knowledge-intensive processes involved. Interactions between individual system elements influence one another, creating an intricate network of dependencies that must be comprehensively coordinated (Cilliers & Spurrett, 1999). Aerospace products comprise multiple interdependent components; therefore, the production processes are distributed among several companies at different supply chain levels to effectively manage the costs, risks, and technological challenges (Esposito & Raffa, 2007). Consequently, aligning competencies with evolving industry demands is essential for maintaining a competitive edge and operational resilience.
Structured taxonomies are hierarchical tree structures in which concepts are organised according to their semantic meaning (Caratozzolo et al., 2023). Organizations can leverage structured taxonomies to systematically manage competence-related information (e.g., hard and soft skills), thereby ensuring alignment with evolving competence demands and facilitating accurate role assignments (Kimble et al., 2016). This structured approach is crucial in industries where technological advancements and regulatory requirements necessitate continuous workforce adaptation (Konrad et al., 2022; Morandini et al., 2024). The aerospace sector presents a unique case for analysing the role of competencies owing to its high level of complexity and dependence on cutting-edge technology. Limited research has been conducted on the systematic management of competencies despite its importance. Bridging this gap through structured competence management frameworks enhances operational efficiency, promotes innovation, and ensures that organizations remain at the forefront of disruptive advancements.
Advanced artificial intelligence (AI)-based models, such as large language models (LLMs), are drastically transforming competence-based HRM by enhancing various aspects, including recruitment, training, and resource allocation. They serve as powerful tools for accurate analyses of competence and personnel information, while improving knowledge sharing and transfer (Kurek et al., 2024; Malik et al., 2023). The adoption of LLMs in HRM, leveraging their considerable natural language processing (NLP) capabilities, supports a wide range of applications, from streamlining the selection of the ideal job position based on competence (Martínez & Fernández, 2019) and predicting emerging job competencies (Sheriff & Sevukan, 2023) to developing targeted training plans (Santana & Díaz-Fernández, 2023). This helps in establishing a more genuine and engaging work environment (Shivanjali et al., 2019).
Despite the significant technological advances in the field of competence-based HRM, limited research has been conducted on the development of automated competence-role association solutions and guidelines for the taxonomy semantic enrichment process, which is essential for maintaining relevance in rapidly evolving industries. In particular, studies conducted on text generation using LLMs (e.g., August et al., 2022; Nguyen et al., 2021; Velásquez-Henao et al., 2023) introduced various approaches to effectively employ prompt engineering methods. However, none of these works focused on the semantic enrichment of taxonomies, which is defined as the process of integrating new and contextually relevant concepts, ensuring that their placement is neither overly general nor excessively specific to maintain the structural relevance of the taxonomy (Arslan & Cruz, 2022).
Consequently, practical methodologies have not been developed for using pre-trained embeddings to map competencies to roles by exploiting the existing competence and role taxonomies in organisations. For instance, Amin et al. (2022) employed pre-trained embedding models by exploiting an organizational skill taxonomy to optimise the research request assignment process. Various other studies (e.g., Gugnani & Misra, 2020; Kurek et al., 2024; Skondras et al., 2023) employed these models to improve recommendation systems using data obtained from job postings on social media. However, research has not been conducted on identifying the ideal competence set for a job role by exploiting the latest advances in AI and actual company data based on a ground truth.
In this study, we propose the automated semantic taxonomy enrichment methodology (ASTEM), which leverages LLMs to automate the enrichment of corporate competence taxonomies, addressing the existing gap in the literature. Furthermore, we establish the role-competence embedding-based (RCE) framework to automate the association of roles with a set of relevant competencies by integrating two disconnected business taxonomies.
We conducted a qualitative intrinsic case study involving a company in the aerospace, defence, and security sector to test the ASTEM and RCE frameworks using real business taxonomies (i.e., competence and role taxonomies). This sector is characterised by complex regulatory environments, advanced technological demands, and multi-layered organizational structures, making it crucial to analyse innovative competence management solutions within their real-world setting (Corallo et al., 2012; Morandini et al., 2024). The data collection process involved semi-structured interviews with company experts and the iterative analysis of enriched competence taxonomy descriptions and role-competence associations.
Therefore, in this study, we aim to demonstrate the key concepts of LLMs, making complex AI concepts accessible to HRM professionals. Furthermore, we emphasise the significance of prompt engineering for the optimisation of LLM performance and propose the ASTEM and RCE frameworks as innovative solutions to automate competence-based HRM processes. In particular, the proposed solutions aim to automate the semantic enrichment process of taxonomies and the association of roles with competencies in the organisation. This automation reduces the time and resources required for manual updates, enhances the precision of role-competence matches, and supports data-driven decision-making. Thus, companies can more effectively identify suitable candidates, develop targeted training programs, and maintain a competitive edge by rapidly adapting to market and technological changes.
The paper is structured as follows. Section 2 ‘State-of-the-Art’ presents a detailed description of the state-of-the-art in AI, focusing on LLMs and their architecture, their impact on the competence landscape, and previous studies on the use of LLMs in HRM. Section 3 ‘Research methodology’ outlines the research methodology of the study. Section 4 ‘Proposed frameworks’ presents a detailed description of the ASTEM and RCE frameworks. Section 5 ‘Case study results’ presents the case study, describes the implementation of the proposed solutions in the aerospace company, and highlights the challenges faced by the proposed methods. Section 6 ‘Discussion’ highlights the research and practical contributions of this study, along with the main limitations and future trends. Lastly, Section 7 ‘Conclusions’ presents the conclusion of the study.
State-of-the-Art
AI and competence-based HRM in industry
Technological advancements, combined with process digitisation and automation, are drastically transforming the competence landscape in the labour market, driving the development of new competencies while rendering others obsolete (Frierson et al., 2023). This phenomenon is particularly evident in high-tech industries, where rapidly changing technologies require the constant acquisition of new competencies to ensure a competitive advantage (Kurek et al., 2024). The increasing complexity and dynamism of global markets require companies to be highly flexible and to constantly adapt to changing conditions, thereby requiring new and updated job profiles and related competencies (Frierson et al., 2023).
Based on the dynamic capabilities theory (Teece et al., 1997), organizations must develop and refine their ability to sense, seize, and reconfigure resources, including human capital, to respond effectively to technological shifts and market uncertainties. Competence-based HRM becomes a crucial mechanism in this process, as companies that effectively identify, develop, and allocate talent can better navigate industry disruptions and maintain a competitive advantage. Employee productivity primarily depends on the strategic alignment of competencies with business objectives and technological advancements, thereby reinforcing the requirement for systematic competence management (Bafna et al., 2019; Corallo et al., 2010). The competence-based approach facilitates precise role assignments, while optimising task execution, reducing HR-related costs, and increasing worker motivation and satisfaction (Gupta & Kumar, 2024). Furthermore, by reducing the turnover associated with job dissatisfaction and fostering a meritocratic work environment, firms can enhance their ability to reconfigure workforce capabilities corresponding to external changes, which aligns with the core tenets of dynamic capabilities.
Additionally, the knowledge-based view (KBV) of the firm (Grant, 1996) indicates that knowledge is the most strategically significant resource in an organization. Within this framework, competence identification and mapping become essential for firms that aim to effectively integrate, share, and apply knowledge. The ability to harness employee expertise and align it with evolving business requirements is essential for maintaining a competitive advantage (Shivanjali et al., 2019). However, limited research has been conducted on competence-based management, as workforce knowledge has not been recognised as a key determinant of organizational success in increasingly competitive environments (Cao & Zhang, 2022).
The conventional HRM methods used to analyse text data (e.g., from interviews and questionnaires), such as topic models (e.g., latent Dirichlet allocation) or linear regression techniques, do not consider the underlying context of words in sentences (Arslan & Cruz, 2022; Lin et al., 2023). Furthermore, job role and competence descriptions established within the company are typically subjective, and can present ambiguities (Jorzik et al., 2024).
LLMs comprise an advanced category of solutions in the AI field, characterised by neural networks with billions of parameters trained on large datasets (Lin et al., 2023). These data include text from books, articles, and websites, and enable the models to learn the grammar, context, and nuances of language (Liu et al., 2023).
Vaswani et al. (2017) introduced the transformer architecture, a novel approach to NLP tasks based on the self-attention mechanism and the encoder-decoder architectural framework. The self-attention mechanism enables the model to focus on different parts of the input text sequence while processing each element, thereby capturing dependencies between words based on their position in the sequence. A notable example of the progress of LLMs is the Generative Pre-Trained Transformer (GPT) model, which has demonstrated advanced capabilities in NLP and coherent text generation (OpenAI et al., 2023).
In these LLM architectures, the encoder helps in determining and extracting relevant information from the input text by producing a continuous representation (text embedding) (Iqbal & Qureshi, 2022), which contributes to applications in sentiment analysis, topic classification, spam detection, and text generation (Kurek et al., 2024). When the architecture includes the decoder, the vector representation created by the encoder is used to generate a sequence of words, where each word depends on the previous ones due to the self-attention mechanism (Vaswani et al., 2017). However, not all LLMs adopt the encoder-decoder structure. For example, GPT uses only the decoder architecture, feeding the input data directly into the decoder and still achieving excellent results due to the self-attention mechanism (Kurek et al., 2024).
Adopting LLMs in HRM automation is a strategic application of AI that concurs with both the dynamic capabilities theory and KBV by enhancing the ability of an organization to dynamically analyse, allocate, and develop workforce competencies (Malik et al., 2023). Business applications include matching the job role requirements with the organizational requirements (Jorzik et al., 2024), thereby optimising competence-based job assignments (Martínez & Fernández, 2019), predicting emerging competencies (Sheriff & Sevukan, 2023), and developing training programs corresponding to workforce evolution (Santana & Díaz-Fernández, 2023).
Prior to recent advances in NLP, Chen and Chien (2011) developed a data-mining framework for talent identification and job assignment in the manufacturing sector. Similarly, Dickson and Nusair (2010) analysed the use of AI in HRM in the hospitality industry, demonstrating the ability of these technologies to streamline the recruitment process and improve employee retention. Recently, Mahmoud et al. (2019) evaluated the implementation of LLM-based chatbots for social media screening within HRM systems, thereby reducing the time spent in hiring processes and providing valuable information on the candidates. Conversely, Mathew et al. (2018) analysed the capabilities of LLMs in predicting the suitability of a candidate for filling a vacancy left by a qualified worker. Lastly, Gupta and Kumar (2024) developed a framework to leverage LLM-based AI solutions to overcome bias in recruitment, analyse employee data for competence development programs, and monitor employee well-being through chatbots and wearable devices.
Pre-trained embedding models to associate job roles and competence
Although LLMs with a decoder-only architecture, such as GPT, can produce text representations, they are not ideal for tasks such as text retrieval and matching (Wang et al., 2022). Conversely, pre-trained models comprising only an encoder architecture produce embeddings, which are vector representations that can capture semantic nuances in sentences; they contribute significantly to various NLP tasks, such as the retrieval of large-scale information (Wang et al., 2022). These pre-trained embedding models, trained using large amounts of data and capable of performing various types of tasks, such as classification, clustering, and semantic similarity, also present excellent performance in specific tasks, such as candidate and job position matching (Kurek et al., 2024).
Several organizations already have competence taxonomies that can be enhanced using embedding models, thereby facilitating efficient retrieval and matching with other information, such as job profile descriptions (Kurek et al., 2024) and tasks (Amin et al., 2022). For instance, Kurek et al. (2024) transformed job descriptions and candidate profiles into multidimensional embedding vectors and calculated the scalar product between the vectors to determine the degree of similarity. A higher scalar product indicates greater similarity and thus a stronger match between the job requirements and candidate qualifications. Similarly, Amin et al. (2022) employed an embedding model to match research requests and skills based on a real business taxonomy.
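To make this scalar-product matching concrete, the sketch below encodes a job description and two candidate profiles with a publicly available sentence-embedding model and ranks the candidates by similarity. The model name and texts are illustrative assumptions and do not reproduce the exact pipelines of the cited studies.

```python
# Minimal sketch of embedding-based candidate-job matching (model name and texts are
# illustrative; not the exact setup of Kurek et al., 2024 or Amin et al., 2022).
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("intfloat/e5-large-v2")  # any general-purpose embedding model

job = "Avionics engineer responsible for flight control software integration."
candidates = [
    "Software engineer with experience in embedded flight control systems.",
    "Marketing specialist focused on social media campaigns.",
]

# Encode texts into dense vectors; normalisation makes the dot product a cosine similarity.
job_vec = model.encode(job, normalize_embeddings=True)
cand_vecs = model.encode(candidates, normalize_embeddings=True)

scores = cand_vecs @ job_vec                 # higher score = stronger job-candidate match
best = int(np.argmax(scores))
print(f"Best match: candidate {best} (score {scores[best]:.3f})")
```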
Furthermore, Arslan and Cruz (2022) and Nikishina et al. (2022) employed embeddings to enrich and populate business taxonomies using data obtained from external sources (such as articles and lexical taxonomies), yielding promising results. Several previous studies focused on embedding models to improve systems for recommending ideal candidates for specific job positions. For example, Lin et al. (2020) developed a human resource planning method that suggests suitable candidates for a given position. Similarly, Bafna et al. (2019) established a system to map the competence of a candidate to the corresponding job requirements and suggest specific training in case of a gap, demonstrating the effectiveness of LLMs for analysing the curricula of candidates by capturing the meaning of sentences, even with different terminologies. Furthermore, Clavié and Soulié (2023) proposed a system to extract competence requirements from job vacancies and rank the most suitable candidates. Conversely, Qin et al. (2020) developed a framework called topic-based ability-aware person-job fit neural network (TAPJFNN) for targeted talent search and job recommendation, whereas Martínez and Fernández (2019) established an intelligent system that identifies emerging talent and optimises the hiring process by excluding biases and distortions such as physical appearance, tattoos, gender, or race. Lastly, Dehbozorgi and Parizi (2023) proposed a training course recommendation system that identifies student learning gaps at the course level and recommends suitable courses to prepare them for the desired job positions.
Table 1 presents an overview of the contribution of embedding-based AI approaches to HRM. It highlights their role in enhancing taxonomies, improving candidate-job matching, and optimising the recruitment processes. Although these approaches present increased precision and efficiency, they also present challenges corresponding to the data quality, contextual understanding, and potential biases in training datasets.
Contributions, strengths, and limitations of embedding-based AI approaches in HRM.
| Contribution | Strength | Limitation | References |
|---|---|---|---|
| Enhancing Taxonomies with Embeddings | Organizations can facilitate more precise candidate-task matching | Requires large, well-labelled datasets and depends on the relevance of the data sources used | (Amin et al., 2022; Wang et al., 2022) |
| Embedding-Based Candidate Matching and Job Recommendation | Embedding models transform job descriptions and candidate profiles into vectors, enabling similarity calculations to assess candidate-job fit | The scalar product similarity approach may not fully capture the contextual meaning and human implicit knowledge | (Kurek et al., 2024) |
| Bias-Free Talent Identification and Recruitment Optimization | AI systems reduce hiring bias by focusing on skills and qualifications rather than demographic attributes, ensuring fairer hiring practices | Biases in training data can still affect AI-driven recruitment | (Qin et al., 2020; Ubeda et al., 2017) |
LLMs are widely implemented in generative AI owing to their ability to capture complex word relationships and generate texts that reflect the semantic structure of concepts (Nguyen et al., 2021). This capability is crucial for semantic enrichment, which involves improving the understanding and representation of the meaning of texts by adding semantic information (Arslan & Cruz, 2022). Typically, manual semantic enrichment presents significant challenges in terms of time and effort, with risks of bias or inaccuracy (Arslan & Cruz, 2022), coverage problems for specific information, and the need for frequent updates (August et al., 2022).
The advanced knowledge gained during the training of LLMs enables the generation of accurate and contextually relevant definitions for various terms and concepts (Eriksson & Jönsson, 2023). Large vector representations help in capturing complex relationships between words, contributing to definitions that reflect the semantic structure of concepts (Vartinen et al., 2022). Additionally, the autoregressive approach in text generation enables LLMs to construct long and complex definitions, thereby ensuring consistency and cohesion in the generated text (Eriksson & Jönsson, 2023; Lesage et al., 2024).
Recent studies demonstrated the effective reasoning ability of LLMs, particularly GPT-4, in answering medical rheumatology (Madrid-García et al., 2023) and engineering (Pursnani et al., 2023) questions without requiring specific training. However, limited research has been conducted on leveraging LLMs to simplify the semantic enrichment process. For instance, in fields such as online commerce, these models are used to generate product descriptions on websites (Nguyen et al., 2021) or to automatically identify sustainable product features in e-commerce (Roumeliotis et al., 2023). In particular, no relevant studies have been conducted in the field of competence-based HRM. Magron et al. (2024) proposed a framework to generate synthetic job postings to enhance matching with European skills, competencies, qualifications and occupations taxonomy (ESCO, 2017). Additionally, Skondras et al. (2023) employed a GPT model to generate synthetic curricula to train models to classify them into various occupational categories.
The paucity of studies conducted on the application of these models for text generation in business settings typically corresponds to the possibility of LLMs generating unwanted and inconsistent responses (Chen et al., 2023; Dahlkemper et al., 2023). However, Polverini and Gregorcic (2024) reported that the ineffectiveness of LLMs in performing tasks can be attributed to inadequate question formulation. Extensive research has been conducted on prompt engineering and its impact on the performance of LLM solutions (Chen et al., 2023; White et al., 2023), and OpenAI provides a dedicated web space for prompt engineering guidelines (OpenAI, 2023). Prompt engineering is defined as a methodological strategy for optimising the performance of LLMs through the careful design of textual prompts (Brin et al., 2023; Polverini & Gregorcic, 2024).
Extensive research has been conducted on the definition and development of this concept, demonstrating the significance of prompt design in improving the performance of LLMs (e.g., Liévin et al., 2023; Polverini & Gregorcic, 2024; Ranade et al., 2024; Velásquez-Henao et al., 2023). For instance, the chain-of-thought approach significantly improves model capabilities through the step-by-step resolution of queries, reducing the probability of generating hallucinations (i.e., fabricated or factually incorrect outputs) and illogical responses (Wei et al., 2023). Moreover, the Socratic methodology employs inductive, deductive, and abductive approaches to effectively improve output quality and consistency (Chang, 2023). Furthermore, Velásquez-Henao et al. (2023) developed the goal prompt evaluation iteration (GPEI) framework, a four-step iterative methodology that guides users in designing prompts based on output evaluation. It has also been demonstrated that providing the LLM with contextual information enables it to generate more specific and relevant responses (Nguyen et al., 2021). Another key element during prompt design is the selection of a pattern (i.e., output automater, persona, visualization generator, recipe, or template) that helps in optimising the quality and relevance of the generated responses (White et al., 2023). Lastly, when designing the prompt, parameters that influence the probability distribution of words during text generation, such as temperature (i.e., a parameter that controls the randomness of word selection) and top-p (i.e., a parameter that sets a probability threshold limiting word selection), must be considered (Lesage et al., 2024). A lower temperature produces more concentrated and deterministic responses, whereas a higher temperature encourages more diverse and creative responses. The top-p parameter, which limits word selection to the most likely words, enables the adjustment of the consistency of the responses (Lesage et al., 2024; Tam et al., 2022). The combined use of these parameters helps in optimising the model to generate sequences of sentences of arbitrary length, thereby avoiding unwanted loops and maintaining consistency with the prompt (Eriksson & Jönsson, 2023; Lesage et al., 2024).
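As an illustration of how these decoding parameters are typically exposed, the snippet below sends a single request through the OpenAI Python client with a low temperature and a top-p threshold. The model identifier and prompt wording are placeholders and do not reproduce the prompts used in this study.

```python
# Illustrative use of temperature and top-p with the OpenAI Python client
# (model identifier and prompt are placeholders).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4",  # placeholder model identifier
    messages=[
        {"role": "system", "content": "You write concise, technically accurate definitions."},
        {"role": "user", "content": "Define the competence 'systems integration' in two sentences."},
    ],
    temperature=0.2,  # low value: more deterministic, less random word selection
    top_p=0.9,        # restrict sampling to the most probable tokens
)
print(response.choices[0].message.content)
```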
Table 2 summarises the main strengths and limitations of LLM-based methods in enhancing semantic structures through text generation and prompt engineering.
Contributions, strengths, and limitations of LLM-based text generation methods.
| Contribution | Strength | Limitation | References |
|---|---|---|---|
| Generative AI and Semantic Enrichment | LLMs improve semantic enrichment by capturing complex word relationships and generating definitions that reflect the semantic structure of concepts | LLM-generated definitions may sometimes lack specificity or contextual accuracy | (Eriksson & Jönsson, 2023; Nguyen et al., 2021) |
| Autoregressive Text Generation | LLMs construct long and complex definitions, ensuring consistency through autoregressive generation | May still generate hallucinations or inconsistencies, requiring human oversight | (Lesage et al., 2024; White et al., 2023) |
| Domain-Specific Applications | LLMs automate text generation for product descriptions, competency frameworks, and job postings, enhancing efficiency in various business applications | Limited research on LLMs for HRM. Outputs in business settings may be inconsistent or irrelevant without domain-specific fine-tuning | (Madrid-García et al., 2023; Pursnani et al., 2023; Skondras et al., 2023) |
| Prompt Engineering for LLM Optimization | Well-crafted prompts significantly improve LLM performance, reducing hallucinations and enhancing output relevance | Poorly designed prompts can lead to misleading, verbose, or inconsistent outputs. Requires expertise to craft effective prompts | (B. Chen et al., 2023; Wei et al., 2023; White et al., 2023) |
| Parameter Tuning for Controlled Text Generation | Adjusting temperature and top-p enables control over creativity vs. determinism in text generation, improving coherence and relevance | Incorrect parameter settings may lead to inconsistencies and multiple attempts are needed before finding the optimal combination | (Lesage et al., 2024; Tam et al., 2022) |
In this study, we analysed the latest advances in AI, particularly in the application of generalist LLMs without specific training, for competence-based HRM solutions. We aim to leverage the potential of LLMs for implementing semantic enrichment of competence descriptions and the association between role information and related competence from business taxonomies.
To this end, we defined and designed ASTEM (Barba et al., 2025) to guide LLM practitioners in effectively generating consistent outputs by integrating various Socratic prompt engineering methodologies (Chang, 2023), the GPEI framework (Velásquez-Henao et al., 2023), and OpenAI guidelines (OpenAI, 2023), along with contextual information (Nguyen et al., 2021) and the prompt pattern model (White et al., 2023). Conversely, the RCE framework aims to create semantic connections between different pieces of information, specifically roles and competence, using pre-trained embedding models known for their effectiveness in semantic similarity tasks (Amin et al., 2022; Kurek et al., 2024).
We selected a qualitative investigation approach suitable for addressing the requirements of a specific context to analyse the ASTEM and RCE applications (Stake, 1995; Yin, 2017). A case study is particularly appropriate for analysing contemporary events and non-controllable units of analysis (Yin, 2017). We aim to demonstrate a real practice of particular relevance owing to the characteristics of the industry and the current research trends on HRM and LLM. Therefore, the case study is guided by the pragmatism knowledge claim that is problem-centric. Consequently, we focused on the problem and the methods used to solve it in a real organizational setting (Creswell, 2022). Multiple sources of evidence, such as interviews with managers and direct observations of researchers, were used to increase the construct validity of the case study (Yin, 2017). The case study was conducted from January to June 2024 in collaboration with a large company operating in the aerospace, defence, and security sector. Fig. 1 presents the research steps used in this study based on the guidelines provided by Crowe et al. (2011) and Kekeya (2021).
First, we defined the case study to set the research objectives and case boundaries. We primarily aim to investigate how AI can be used to enhance the alignment between professional competence and roles, as well as to semantically enrich existing corporate taxonomies. A qualitative case study involves an in-depth contextual analysis of a specific instance within its real-life setting (Crowe et al., 2011). This design was well-suited for our study because it helped in analysing the nuanced challenges and opportunities faced by a major aerospace company in implementing advanced AI-driven competence management solutions. The intrinsic nature of the case study helps in obtaining a comprehensive understanding of the subject matter (Baxter & Jack, 2015).
The second step comprises the selection of the case study (Crowe et al., 2011). The aerospace sector is characterised by rapid technological advancements and evolving competency demands. Therefore, it strongly depends on qualitative insights to evaluate the adaptability of innovative AI-based models in dynamic contexts (Morandini et al., 2024). These insights are essential for enhancing competence management and optimising the information management processes. This complexity presents an excellent testing ground for ASTEM and RCE, as advanced AI-based models can effectively capture domain-specific knowledge (Excoffier et al., 2024; Xianming & Jing, 2023).
Therefore, we selected a large company operating in the aerospace, defence, and security sector that employs over 11,000 engineers worldwide, owing to its relevant contribution in terms of interest, data availability, and expertise, and its proximity to the researchers' network (Creswell, 2022). By focusing on real-world documents and engaging with domain experts through interviews, the case study helped in capturing the interplay between information management through taxonomies and AI-based solutions to promote competence management. This approach also facilitated the testing and validation of ASTEM and RCE under authentic conditions. Furthermore, despite the highly technical nature of the aerospace sector, the findings of this case study enable their adaptation to broader use cases, such as corporate training, educational content evaluation, and cross-industry talent development.
The third step involves data collection (Crowe et al., 2011; Kekeya, 2021). A qualitative case study helps in capturing in-depth data using methods such as interviews, observations, and document analysis, thereby uncovering subtleties and contextual factors that may not be obtained through large-scale quantitative surveys (Stake, 1995). In particular, this step comprised a series of interviews, including three unstructured interviews to analyse the company's priorities and challenges, followed by seven semi-structured interviews to guide discussions towards the refinement of the ASTEM and RCE framework. These interviews were conducted online using the Webex platform by a team comprising university researchers and experts in the systems and governance engineering department from the selected company. Additionally, the company documents (i.e., competence and role taxonomies, role-competence matrix) were analysed to capture qualitative information regarding the company’s perspective on competence management requirements.
The fourth step involved data analysis (Crowe et al., 2011; Kekeya, 2021). The insights gained in the previous phase contributed significantly to identifying operational strategies to enhance competence management through the effective use of advanced AI technologies. The third and fourth phases were repeated iteratively to refine the proposed solutions and implement them accurately with the real business data (Crowe et al., 2011; Kekeya, 2021). Subsequently, we developed and implemented ASTEM and RCE using Google Colab (a cloud-based platform to write and run Python code).
The last phase of the methodology involves reporting the results obtained from the case study (Crowe et al., 2011). ASTEM and RCE provided robust evidence from the real data of the company, demonstrating the effectiveness of AI-driven approaches. The implementation of the proposed solutions validated the findings, and established the best practices for future applications in AI-enhanced competence management.
Proposed frameworks
The ASTEM and RCE frameworks are designed to work in tandem (Fig. 2), with each serving distinct but complementary functions within competence-based HRM. In this case study, ASTEM is used to generate enriched, contextually accurate descriptions for each competence within a corporate taxonomy by leveraging the capabilities of LLMs. Once produced, these enriched descriptions serve as the input for the RCE framework, which uses pre-trained embedding models to automatically associate the enriched competencies with the relevant job roles. The interaction between these two frameworks ensures a continuous, automated process for updating and maintaining the accuracy of the role-competence associations. We describe the unique methodologies and technological foundations of ASTEM and RCE in the following sections. This separation provides a clearer understanding of their individual contributions and the specialised processes involved in each, and demonstrates their combined application in the case study.
ASTEM
The ASTEM framework (Fig. 3) is an iterative methodology designed to automate the semantic enrichment of the descriptions in a corporate taxonomy using LLMs. The steps involved in ASTEM include: 1) preliminary taxonomy analysis, 2) prompt design (divided into system prompt and user prompt design), and 3) output evaluation. ASTEM is primarily used to reduce the manual effort involved in semantic enrichment and to improve the contextual linkage of the taxonomy elements to specific job roles and competencies. This represents a significant improvement over conventional methods that employ topic modelling of information retrieved from external sources (e.g., websites, documents) to add it to a taxonomy. This methodology can be applied when the taxonomy under consideration already has starting descriptions that must be enriched with more contextual information. It can also be applied when no descriptions are available; in the latter case, the performance of the model may be worse, as it must generate text based on minimal information, such as the labels of the taxonomy alone.
In the preliminary taxonomy analysis phase, we defined clear objectives for the enrichment process. This phase involves a detailed assessment of the existing taxonomies to identify areas that lack detail or contain errors. Obtaining a comprehensive understanding of the current state of the taxonomy and determining the specific concepts that require semantic improvement are essential steps to prepare the taxonomy for effective enrichment. During this phase, control questions must be established (Table 3) based on the CRAAP test for evaluating information (Blakeslee, 2004). However, they must also be applicable to the qualitative assessment of AI-generated content and prompt design (Newcastle University, 2024).
Control questions from CRAAP test to evaluate GenAI outputs (Blakeslee, 2004; Newcastle University, 2024).
The prompt design phase involves designing effective queries that guide the LLM to generate accurate and relevant content. This includes selecting the most appropriate methodological approach (e.g., Chang, 2023; Velásquez-Henao et al., 2023; White et al., 2023) based on the requirements identified in the first phase. Relevant information, such as the application scenario and end goal, is added to the system prompt to enrich the context, thereby improving the ability of the model to produce useful responses (Nguyen et al., 2021). It is essential to define the required task and specify how it must be performed within the system prompt for the LLM. In the user prompt, we included information on the hierarchical relationships (e.g., category membership and any descriptions already present) of the taxonomic element to be semantically enriched. During this phase, we set parameters such as the temperature at low values, which produces more deterministic and less random word sequences (OpenAI, 2023).
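A minimal sketch of what such a system/user prompt pair might look like is given below. The wording and the helper function are hypothetical and serve only to illustrate how the hierarchical information of a taxonomy branch can be embedded in the user prompt; the prompts actually used in the case study are reported later (Table 7).

```python
# Hypothetical ASTEM-style prompt structure (illustrative wording only).
SYSTEM_PROMPT = (
    "You are an expert in competence management for a high-tech engineering company. "
    "Given a competence from the corporate taxonomy, write an enriched description of "
    "3-4 sentences that is specific, technically accurate, and consistent with the "
    "original wording. Do not introduce competencies that are not mentioned."
)

def build_user_prompt(category: str, competence: str, original_description: str = "") -> str:
    """Embed the hierarchical context of the taxonomy branch in the user prompt."""
    return (
        f"Competence category: {category}\n"
        f"Competence name: {competence}\n"
        f"Original description: {original_description or 'not available'}\n"
        "Write the enriched description."
    )

print(build_user_prompt("Flight Systems", "AFCS development",
                        "Design of automatic flight control systems."))
```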
The evaluation phase involves refining the prompts based on the qualitative analysis of a random sample of output based on the control questions established in the first phase. Additionally, a further check can be implemented by determining the semantic similarity between the enriched descriptions obtained using the LLM and each branch of the source taxonomy based on the method proposed by Excoffier et al. (2024). This helps in verifying that each enriched description was associated with the corresponding competence label via its unique code. Furthermore, if the descriptions are already present in the taxonomy, the outputs can be further verified by comparing the semantic similarity between the representative embeddings of the generated texts and the original ones to demonstrate that the generated text does not vary significantly from the original text (Celikyilmaz et al., 2021).
The prompts are iteratively modified based on the feedback received to continuously improve the quality and accuracy of the generated descriptions, thereby ensuring that the responses are contextually appropriate.
RCE
Fig. 4 depicts the RCE framework, which was developed to automatically associate each job role with a suitable and consistent set of competencies. We used two reference taxonomies for this purpose: the taxonomy of professional roles and the taxonomy of competence. The steps involved in RCE include: 1) data pre-processing, 2) creating embedding vectors for the texts corresponding to the job roles and competencies, 3) calculating the semantic similarity between these texts, which involves computing the dot product between the role and competence embedding vectors, and 4) re-ranking the results to identify the most appropriate competence sets for each role within the corresponding taxonomies.
The proposed framework involves using datasets in the Excel format that represent the taxonomies of the professional roles and competence. In particular, the role taxonomy comprises an identifier code for each role and the associated textual strings (i.e. professional family, macro-role, activity descriptions, and mission), which are all placed in the appropriate columns. Similarly, the competence taxonomy includes an identification code for each competence, membership categories, and descriptions.
Pre-processing must first be performed to effectively apply the embedding models (Amin et al., 2022). For the role taxonomy, we concatenated all the textual strings for each role, including the professional family name, macro-role, role, role description, and activities. This process is crucial as it enables the embedding model to better capture the overall semantic meaning of each role and competence, thereby increasing the probability of an appropriate association. This process produces a two-column dataset comprising the unique role identification code and a concatenated description, where each row represents a specific role. The same procedure is applied to the competence taxonomy.
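A possible implementation of this pre-processing step is sketched below with pandas. The file and column names are assumptions, since the actual taxonomy layouts are company-specific.

```python
# Pre-processing sketch: concatenate the textual fields of each role and competence
# into a single description (file and column names are assumed, not the company's).
import pandas as pd

roles = pd.read_excel("role_taxonomy.xlsx")
competences = pd.read_excel("competence_taxonomy.xlsx")

role_cols = ["professional_family", "macro_role", "role", "role_description", "activities"]
comp_cols = ["category", "competence", "description"]

# One concatenated string per row helps the embedding model capture the overall meaning.
roles["text"] = roles[role_cols].fillna("").astype(str).apply(". ".join, axis=1)
competences["text"] = competences[comp_cols].fillna("").astype(str).apply(". ".join, axis=1)

roles_out = roles[["role_id", "text"]]                # two-column dataset: code + description
competences_out = competences[["competence_id", "text"]]
```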
The embedding model must be carefully selected. Several open-source models are available at https://huggingface.co/spaces/mteb/leaderboard; however, they face various limitations regarding the number of tokens that they can process (from 512 to 4096). Therefore, not all the text may actually be processed, resulting in a significant loss of semantic information and causing inefficiency in the process.
During the creation of the embedding vectors, the embedding models create separate vector representations for each concatenated role and competence description in an n-dimensional space that can range from 64 to 30522 dimensions, depending on the selected model. This results in a set of representative vectors for each individual role and a set of vectors for each individual competence. This step is crucial for ensuring that each role and competence is represented accurately and consistently, thereby facilitating semantic comparison.
Subsequently, we calculated the semantic similarity between the role and competence embedding vectors using the dot product, which is particularly effective in determining the similarity between vectors within a multidimensional space (Kurek et al., 2024). In particular, the embedding of the first role in the professional role taxonomy is compared with all the embeddings of the competence taxonomy, and this process is repeated for each role within the role taxonomy.
Following the semantic similarity calculation, we performed the re-ranking step. For each role embedding, the final dataset was reorganised by placing the competencies in descending order of the similarity score. Lastly, we extracted only the top K competencies for each role, based on the highest similarity scores. This process ensures that the competencies associated with each role are the most relevant and appropriate ones, thereby improving the accuracy and efficiency of competence-based HRM.
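A compact, self-contained sketch of the similarity and re-ranking steps is shown below with toy data. The embedding model and the identifiers are assumptions, and K is set to 2 for brevity (the case study extracts the top 16 competencies per role).

```python
# Sketch of the RCE similarity and re-ranking steps (toy data; model choice assumed).
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("intfloat/e5-large-v2")

# Stand-ins for the concatenated role and competence descriptions from pre-processing.
roles = {"R001": "Flight Software. Develops and verifies on-board flight control software."}
competences = {
    "C101": "Embedded Software. Design and coding of real-time embedded software.",
    "C205": "Web Marketing. Planning of digital marketing and communication campaigns.",
}

role_ids, role_texts = zip(*roles.items())
comp_ids, comp_texts = zip(*competences.items())

role_vecs = model.encode(list(role_texts))           # one vector per role
comp_vecs = model.encode(list(comp_texts))           # one vector per competence

similarity = role_vecs @ comp_vecs.T                 # dot product: roles x competencies
K = 2                                                # the case study extracts the top 16

for i, rid in enumerate(role_ids):
    top_k = np.argsort(similarity[i])[::-1][:K]      # re-rank by descending similarity
    print(rid, [comp_ids[j] for j in top_k])
```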
Case study results
Industrial scenario description
The interviews conducted with the key informants of the selected company highlighted the challenges faced by the company and informed the development of targeted solutions to enhance the integration and usability of the role and competence taxonomies. In this section, we present the key outcomes, emphasising the interplay between the insights derived from collaborative discussions and document analysis. There are two types of taxonomies: a competence taxonomy and a role taxonomy.
The role taxonomy comprises an initial division corresponding to the professional engineering families, a second hierarchical level corresponding to the macro-roles with descriptions, and the third level corresponding to the division of the roles, totalling 246 branches. Each role is accompanied by sufficiently comprehensive descriptions, including both strategic and operational aspects. Enrichment methodologies, such as ASTEM, are unnecessary for this specific area owing to the richness and level of detail of the role taxonomy. Although the role taxonomy is solid and well developed, it operates in isolation from the competence taxonomy.
The competence taxonomy comprises multiple ambiguous and overly sparse descriptions (see example in Table 4), making it ineffective for RCE. Therefore, the definitions for 766 competencies must be enriched using ASTEM. Previous studies and expert interviews demonstrated the complexity of obtaining precise and contextualised definitions for these competencies.
The company manually created a matrix associating each role with a set of up to 20 competencies; however, these taxonomies are subject to revision every two years, and the organizational requirements may change over time, making the manual process inefficient. This matrix supports strategic HRM by clearly defining the requirements across the organization. The matrix comprises a total of 3,954 records, with the average number of competencies associated with roles categorised by job family, as shown in Table 5. The matrix is constructed manually, directly involving division managers and role members through a self-assessment activity, which presents several significant limitations:
- The manual creation of the matrix requires considerable effort from managers, making the process lengthy and resource-intensive.
- The manual method is not sufficiently scalable, particularly for complex taxonomies or growing organizations.
- The matrix cannot be easily updated to reflect changes in the required competencies or role characteristics.
An analysis of the matrix construction process demonstrated that implementing the RCE framework could significantly improve the connections between roles and competencies.
Therefore, two core requirements of the company were identified through interviews and document analysis:
1. Semantic enrichment of the competence taxonomy: A recurring theme was the requirement to enrich the sparse descriptions in the taxonomy to make it more actionable and to facilitate integration with the role taxonomy.
2. Automated role-competence matrix: The company highlighted the significance of automating the process of associating roles with competencies. This automation saves time, while enabling the company to dynamically update these mappings based on changes in the organizational requirements and taxonomy revisions.
Automating these processes reduces the manual effort and errors; furthermore, it enables the company to adapt rapidly to evolving demands in sectors with high levels of complexity.
Frameworks Implementation
For the application of ASTEM and RCE in the case study, we employed advanced technological resources selected to ensure the effectiveness and scalability of the taxonomy enrichment and integration processes. In particular, we adopted technological solutions based on AI models and specific computational resources to satisfy the requirements of the various tests.
Competence taxonomy semantic enrichment using ASTEM
For the semantic enrichment of the competence taxonomy with ASTEM, we employed the GPT-4 model (OpenAI et al., 2023) based on the benchmark studies conducted by Chen et al. (2024) that highlight its advanced reasoning capabilities and ability to generate contextualised descriptions based on instructions, despite the absence of specific training (Excoffier et al., 2024). The GPT-4 model was particularly effective in producing coherent and contextualised definitions, thereby satisfying the specific requirements of the case study and overcoming the limitations of taxonomies lacking detailed textual content (OpenAI et al., 2023).
Therefore, we queried the GPT-4 model (OpenAI et al., 2023) through the OpenAI API to semantically enrich the competence taxonomy descriptions. In this case, GPT-4 generalises its knowledge to generate descriptions for the competencies within the taxonomy, despite the absence of explicit training on those specific elements. We developed a Python script that takes the competence taxonomy as input, extracts the category, competence name, and original description from each taxonomy branch, and then queries GPT-4 to semantically enrich the competence description based on the extracted labels. The documentation provided by OpenAI (OpenAI, 2023) helped in constructing a Python script to automate the API requests and adjust the process parameters, making the system efficient and robust. The system prompt specifies the task to be performed by the model and the method used to perform the task, whereas the user prompt provides only the hierarchical information corresponding to the competence description to be generated.
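A condensed sketch of such a script is shown below; the file name, column names, prompt wording, and parameter values are illustrative assumptions (the actual prompts used in the case study are reported in Table 7).

```python
# Condensed sketch of the enrichment loop over the competence taxonomy
# (file name, column names, prompts, and parameters are illustrative).
import pandas as pd
from openai import OpenAI

client = OpenAI()
taxonomy = pd.read_excel("competence_taxonomy.xlsx")

SYSTEM_PROMPT = ("You are an expert in competence management for the aerospace sector. "
                 "Enrich the given competence description with specific, coherent detail.")

enriched = []
for _, row in taxonomy.iterrows():
    user_prompt = (f"Category: {row['category']}\n"
                   f"Competence: {row['competence']}\n"
                   f"Original description: {row['description']}")
    response = client.chat.completions.create(
        model="gpt-4",  # placeholder model identifier
        messages=[{"role": "system", "content": SYSTEM_PROMPT},
                  {"role": "user", "content": user_prompt}],
        temperature=0.2,  # low temperature for more deterministic output
    )
    enriched.append(response.choices[0].message.content)

taxonomy["enriched_description"] = enriched
taxonomy.to_excel("competence_taxonomy_enriched.xlsx", index=False)
```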
After generating the enriched competence descriptions, we randomly selected 10 outputs, which were subjected to an initial qualitative assessment by the partner company. This assessment utilised the control questions presented in Table 6, which were derived from the CRAAP test (Blakeslee, 2004; Newcastle University, 2024). However, the Currency (timeliness of data) and Authority (credibility of sources) criteria were omitted as they do not directly correspond to the evaluation of the quality of the content generated by GPT-4: the validity of these aspects inherently depends on the training data of the model, for which referencing the documentation provided by OpenAI (OpenAI et al., 2023) was considered sufficient.
Control questions extracted from the CRAAP test used to evaluate ASTEM output.
The remaining criteria, i.e., Relevance, Accuracy, and Purpose, were qualitatively assessed through a detailed review of each output from the randomly selected sample. This review was conducted during structured interviews with key informants from the partner company involved in the case study.
Based on the feedback collected, the model prompt was iteratively refined until it reached its final version, as shown in Table 7.
System Prompt and User Prompt used in the ASTEM case study.
A further analysis was conducted to determine whether the generated outputs (i.e., the descriptions of competencies) were consistent with the input data (i.e., the initial labels of the competence taxonomy).
The original study by Excoffier et al. (2024) was conducted in a hospital setting, where LLMs were used to generate textual datasets from ICD-10-CM code descriptions, a taxonomy widely used in U.S. hospitals; however, the core methodology remains valid in our context. The approach involves using generalist embedding models to verify whether the generated texts align with their corresponding classification codes.
In particular, the method relies on calculating the dot product similarity score using the E5-large-V2 embedding model, which is widely used for tasks requiring single-vector textual representations (Wang et al., 2022). In our case study, each ASTEM-generated competence description was compared against all the entries in the original competence taxonomy, and the competence identification code was assigned based on the highest similarity score. This comparison was performed to obtain the correct competence identification code for each generated description, thereby ensuring consistency between the input data and LLM-generated outputs. The same approach was applied to the original competence descriptions to determine the capability of embedding models to obtain the appropriate identification code, thereby enabling a comparison of the retrievability of ASTEM-generated descriptions versus the original ones.
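A self-contained sketch of this consistency check is given below; the model choice follows the text, while the competence codes and descriptions are toy placeholders.

```python
# Sketch of the retrieval-based consistency check (toy data; identifiers are placeholders).
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("intfloat/e5-large-v2")

original = {   # competence code -> original taxonomy description
    "C101": "Embedded Software. Design of real-time on-board software.",
    "C205": "Web Marketing. Planning of digital marketing campaigns.",
}
generated = {  # competence code -> ASTEM-enriched description
    "C101": "Design, coding, and verification of real-time embedded software for on-board systems.",
    "C205": "Definition and execution of digital marketing and web communication campaigns.",
}

orig_ids = list(original)
orig_vecs = model.encode([original[c] for c in orig_ids])

exact_match = 0
for code, text in generated.items():
    scores = orig_vecs @ model.encode(text)           # dot product against every original entry
    predicted = orig_ids[int(np.argmax(scores))]      # code of the most similar original description
    exact_match += int(predicted == code)

print(f"Exact_Match rate: {exact_match / len(generated):.1%}")
```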
The results of the semantic similarity analysis for validating the output, presented in Table 8, indicated that, out of the 766 enriched competence descriptions, 73.4 % were associated with the correct identification code (Exact_Match), demonstrating the effectiveness of ASTEM and GPT-4 in generating new coherent competence descriptions based on their labels. Additionally, 16.4 % of the enriched descriptions were associated with the wrong identification code but fell within the correct competence category (Exact_Category), indicating some ambiguity among certain competencies within the taxonomy. In the remaining 10.2 % of cases, the association did not match despite high semantic similarity values (ranging from 77 to 83).
We conducted a qualitative content analysis of two random competencies (i.e., radio communication and AFCS development for system engineering) to analyse the No_match failures. The analysis indicated that the No_match enriched descriptions are primarily attributable to semantic ambiguity, loss of contextual specificity, and the limitations of the employed embedding model in differentiating technical nuances. For instance, the enriched description for the Radio Communication competence was incorrectly linked to Network Design, Planning & Engineering because they both share keywords such as ‘radio’ and ‘TETRA’ (see the complete descriptions in Table 9). Similarly, an enriched description corresponding to AFCS development for System Engineering was incorrectly associated with AFCS development for Aeronautic Engineering due to overlapping technical terms such as ‘AFCS’, despite the fact that the enriched descriptions covered different aspects (see the complete descriptions in Table 10). However, this misalignment does not indicate that the enriched descriptions are off-topic or irrelevant. Instead, it highlights a limitation in the semantic analysis capabilities of the embedding model, which prioritises lexical and phrase similarity over precise conceptual alignment. This indicates that while the model can effectively identify typical thematic similarities, it struggles with finer distinctions that require deeper contextual understanding.
Comparison between radio communication and web communication and marketing competence descriptions.
Comparison between AFCS development for system engineering and AFCS development for aeronautic engineering competence descriptions.
Furthermore, we calculated the semantic similarity scores between the embeddings of each generated description and its original counterpart to assess whether the prompt constructed with ASTEM produced coherent outputs. The score distribution presented in Fig. 5 tends to be high, indicating that the semantic enrichment process generated new descriptions very similar to the originals. This is likely because GPT-4, operating on a probabilistic basis, had more information to draw upon, enabling it to produce coherent and consistent outputs. Conversely, the lowest similarity scores are associated with competencies that had very sparse original descriptions, as demonstrated by the example depicted in Table 11. Consequently, GPT-4 generated outputs that varied more significantly from the originals, although not necessarily producing incorrect content.
Example of comparison between original and enriched descriptions with low and high similarity score.
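A brief sketch of this coherence check, in which each enriched description is compared only with its own original counterpart (function and variable names are hypothetical):

```python
# Sketch: row-wise similarity between each original description and its enriched version,
# yielding the kind of score distribution summarised in Fig. 5.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("intfloat/e5-large-v2")

def pairwise_scores(original_texts, enriched_texts):
    orig = model.encode(original_texts, normalize_embeddings=True)
    enr = model.encode(enriched_texts, normalize_embeddings=True)
    # Dot product of each matched pair only, not the full similarity matrix.
    return np.einsum("ij,ij->i", orig, enr)

# scores = pairwise_scores(originals, enriched)
# np.histogram(scores, bins=10) then gives the distribution of coherence scores.
```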
The second step of the case study involves automating the linkage between roles and competence and comparing the performance of two different embedding models: E5-large-V2 (Wang et al., 2022) and mxbai-embed-large-v1 (Xianming & Jing, 2023). The models were selected from the Hugging Face repository based on their ranking in the Massive Text Embedding Benchmark (MTEB), which evaluates models using several key metrics, including model size, embedding dimensions, maximum token capacity, and memory usage (Muennighoff et al., 2023). They were chosen based on two main considerations: minimising computational costs (e.g., reducing memory usage and model size) while maximising the number of processable input tokens. The two pre-trained embedding models were selected to integrate the role and competence taxonomies within the RCE framework due to their strong semantic search capabilities and computational efficiency, even without domain-specific fine-tuning (see Table 12 for details). Although both models share similar strengths and limitations, E5-large-V2 has been extensively evaluated on widely recognised benchmarks, particularly for sentence similarity tasks (Excoffier et al., 2024). Conversely, mxbai-embed-large-v1 has fewer publicly available test results, making it more difficult to evaluate its precise performance without direct experimentation. Consequently, this study empirically compares their effectiveness in linking roles and competences.
Pre-trained embedding models used for RCE application in case study.
| Model #1 | E5-large-V2 (Wang et al., 2022) |
|---|---|
| Model #2 | mxbai-embed-large-v1 (Xianming & Jing, 2023) |
| Description | Transformer-based sentence embedding models trained on multi-task data (e.g., natural language inference, question–answer retrieval, paraphrase detection, etc.) and designed for tasks involving semantic similarity, information retrieval, and other NLP applications where good dense text representations are necessary. |
| Key Features | |
| Limitation | |
These models were employed separately to vectorise the text corresponding to each individual role in the role taxonomy and to each competence in the competence taxonomy. The most appropriate competence set for each role can then be extracted by calculating the semantic similarity through the dot product between the representative vectors.
Following the RCE framework, the available data was pre-processed into the role taxonomy and the competence taxonomy, both in Excel format. All the information corresponding to a role, such as the professional family, macro-role, and other textual information (mission, activities, and descriptions), was concatenated into a single cell to represent the role description (see an example in Fig. 6). A similar process was applied to each competence, creating a single text cell comprising its description and the category it belongs to (see an example in Fig. 7). This approach enables the embedding models to better capture the overall semantic meaning, thereby improving the likelihood of accurate association.
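A minimal pre-processing sketch of this concatenation step, assuming hypothetical file and column names for the two Excel taxonomies:

```python
# Sketch: concatenate all textual fields of a role (and of a competence) into a single
# description cell, as illustrated in Fig. 6 and Fig. 7.
import pandas as pd

roles = pd.read_excel("role_taxonomy.xlsx")            # hypothetical file name
text_cols = ["professional_family", "macro_role", "mission", "activities", "description"]
roles["role_text"] = roles[text_cols].fillna("").astype(str).agg(". ".join, axis=1)

competences = pd.read_excel("competence_taxonomy.xlsx")  # hypothetical file name
competences["competence_text"] = (
    competences["description"].fillna("").astype(str)
    + ". Category: "
    + competences["category"].fillna("").astype(str)
)
```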
The embedding models vectorised the role and competence descriptions in parallel. Subsequently, the representative vector of each role was compared with all the vectors representing the competencies by calculating the dot product score. The associated competencies were sorted in descending order of similarity score for each role, and the top 16 competencies were extracted per role; this cut-off corresponds to the average number of competencies per role in the role-competence matrix created manually by the company (Table 5).
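The ranking step can be sketched as follows, reusing the hypothetical `role_text` and `competence_text` columns from the pre-processing sketch above; `mixedbread-ai/mxbai-embed-large-v1` is the public Hugging Face checkpoint of the second model.

```python
# Sketch: embed roles and competencies, compute the dot-product similarity matrix,
# and keep the 16 highest-scoring competencies per role.
import numpy as np
from sentence_transformers import SentenceTransformer

TOP_K = 16  # average number of competencies per role in the company's manual matrix

def rank_competencies(model_name, role_texts, competence_texts):
    model = SentenceTransformer(model_name)
    role_emb = model.encode(role_texts, normalize_embeddings=True)
    comp_emb = model.encode(competence_texts, normalize_embeddings=True)
    scores = role_emb @ comp_emb.T                    # (n_roles, n_competencies)
    top_idx = np.argsort(-scores, axis=1)[:, :TOP_K]  # descending order per role
    return top_idx, scores

# Run separately for the two pre-trained models compared in the case study:
# e5_idx, _    = rank_competencies("intfloat/e5-large-v2", role_texts, competence_texts)
# mxbai_idx, _ = rank_competencies("mixedbread-ai/mxbai-embed-large-v1", role_texts, competence_texts)
```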
In this phase, we primarily aimed to test whether RCE combined with embedding models is a valid alternative for automatically creating a role-competence matrix. The automatically generated role-competence pairs from the two embedding models (i.e., E5-large-V2 and mxbai-embed-large-v1) were compared against the manually created ground truth (GT) dataset provided by the company.
The records were categorised into job families, and the matching role-competence pairs were identified to evaluate model performance (i.e., precision, recall, and F1-score), as shown in Table 13. This analysis was conducted on both the original competence descriptions and the ASTEM-enriched descriptions to determine whether semantic enrichment effectively enhanced the text, making it more semantically informative and reducing ambiguity in the embedding-based similarity analysis. The results indicate that semantic enrichment with ASTEM, combined with GPT-4, enhances the competence descriptions, making them more useful and interpretable for embedding models compared with the original descriptions in the competence taxonomy. Furthermore, the comparative analysis between the two embedding models demonstrates that mxbai-embed-large-v1 consistently outperforms E5-large-V2 in terms of precision, recall, and F1-score across most job families. Notably, differences and patterns in model performance emerged across the job families. For instance, the TM roles demonstrated higher precision in associating competencies, whereas the EM roles exhibited lower precision, indicating that both models frequently suggested ‘incorrect’ competencies for EM roles despite high similarity scores. This discrepancy raises concerns regarding potential biases, embedding model limitations, and structural differences between the role descriptions.
Performance metrics of role-competence pairing by job family.
JF, Job Family.
GT, Role-competence pairs in the ground truth.
RCE, Role-competence pairs found with RCE framework and embeddings models.
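A minimal sketch of how such per-family metrics can be derived, treating the GT and RCE role-competence pairs as sets (all names are hypothetical):

```python
# Sketch: precision, recall, and F1 per job family from the overlap between the GT pairs
# and the pairs proposed by the RCE framework, as summarised in Table 13.
def prf_by_family(gt_pairs, rce_pairs, family_of_role):
    # gt_pairs, rce_pairs: sets of (role, competence) tuples; family_of_role: dict role -> job family
    families = {family_of_role[r] for r, _ in gt_pairs}
    metrics = {}
    for jf in families:
        gt = {p for p in gt_pairs if family_of_role[p[0]] == jf}
        rce = {p for p in rce_pairs if family_of_role[p[0]] == jf}
        tp = len(gt & rce)
        precision = tp / len(rce) if rce else 0.0
        recall = tp / len(gt) if gt else 0.0
        f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
        metrics[jf] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics
```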
To analyse this, we systematically extracted two EM and TM roles with the highest number of incorrect (false positive) competence associations to perform a qualitative comparative analysis. The number of common competencies between those selected automatically (RCE) and those defined manually (GT) is significantly lower in the EM roles, as shown in Table 14.
The EM roles (see an example in Table 15) present greater semantic complexity, as they include both strategic governance and operational management aspects. This caused ambiguity in the competence selection for the embedding models, where the models incorrectly prioritised operational and task-specific competencies (e.g., supplier management and work package definition) over broader engineering governance ones (e.g., system integration and project control). Conversely, the TM roles presented less ambiguity as they focused on innovation, technology scouting, and research collaboration. The embedding models were more successful in aligning competencies for these roles as their descriptions are clearer and less mixed with operational tasks. In fact, the embedding models are typically trained on datasets that contain rich documentation on technological innovation and business strategy (Excoffier et al., 2024; Wang et al., 2022), whereas they have less detailed coverage on specialised engineering knowledge (Pursnani et al., 2023; Roemer et al., 2024), thereby limiting the ability of the model to capture the EM role-specific competencies.
Example of RCE output for an EM-1 role from different embedding models and GT.
Another analysis was conducted to evaluate the total number of role-competence pairs identified by both the models and their alignment with GT, grouped by semantic similarity scores. Table 16 presents the results, which indicated that both mxbai-embed-large-v1 and E5-large-V2 exhibit similar performance patterns. In particular, the precision of role-competence associations improves significantly with the increase in the semantic similarity score.
E5-large-V2 does not present any matches in the lower score ranges (60–75) because its similarity scores cluster around higher values (0.7 to 1.0) as a consequence of the model architecture (Wang et al., 2022). Even if the scores are normalised (e.g., to a 0–1 scale), this process does not inherently make them comparable across models: the meaning of a normalised score still varies with the specific characteristics and training of each model. Therefore, for tasks such as text retrieval or semantic similarity, the relative ranking of the scores is more important than their absolute values (Wang et al., 2022).
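The score-range analysis behind Table 16 can be sketched as follows; the bin edges below are illustrative placeholders, not the exact ranges used in the study.

```python
# Sketch: group the predicted role-competence pairs into similarity-score bins and compute
# the share of pairs confirmed by the GT within each bin.
def precision_by_score_bin(scored_pairs, gt_pairs, bins=(0.60, 0.75, 0.85, 0.95, 1.01)):
    # scored_pairs: list of ((role, competence), score) produced by one embedding model
    # gt_pairs: set of (role, competence) tuples from the ground truth
    results = {}
    for lo, hi in zip(bins[:-1], bins[1:]):
        in_bin = [pair for pair, score in scored_pairs if lo <= score < hi]
        hits = sum(1 for pair in in_bin if pair in gt_pairs)
        results[f"[{lo:.2f}, {hi:.2f})"] = hits / len(in_bin) if in_bin else None
    return results
```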
The Venn diagram in Fig. 8 presents significant insights regarding the performance and overlap of three sets (i.e., GT, E5-large-V2, and mxbai-embed-large-v1). The GT set contains 2397 unique elements, indicating a significant portion of data that neither model captures. Notably, the two models share 2527 elements, highlighting their effectiveness in capturing certain aspects of the role-competence data. Additionally, there are 1017 common elements across all three sets, indicating areas where both the models align well with the GT.
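The overlaps summarised in the Venn diagram reduce to simple set operations over the three collections of role-competence pairs, as in this brief sketch (set names are hypothetical):

```python
# Sketch: compute the overlap counts among the GT pairs and the pairs proposed by the
# two embedding models, as visualised in Fig. 8.
def venn_counts(gt, e5, mxbai):
    return {
        "GT only": len(gt - e5 - mxbai),
        "shared by the two models": len(e5 & mxbai),
        "common to all three sets": len(gt & e5 & mxbai),
    }
```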
Discussion

Research implications

The results of this study help establish a knowledge base that can serve as a starting point for more in-depth future studies, making innovative AI technologies accessible and promoting their informed use in competence-based HRM.
First, this study presents a detailed overview of LLMs, including the fundamental principles of their architecture and functioning. Extensive research has explained how these models work (e.g., Vaswani et al., 2017; Wang et al., 2022); however, previous studies were typically highly technical, and limited research has applied these models to competence-based HRM (e.g., Amin et al., 2022; Kurek et al., 2024). Therefore, this study helps researchers and professionals without a specific background in computer science, as it elucidates complex concepts and presents a comprehensive understanding of the technologies underlying modern AI. By providing clear and in-depth explanations of the working of LLMs, the transformer architecture (Vaswani et al., 2017), and the role of embeddings in capturing the semantic nuances of texts (Excoffier et al., 2024), this study helps establish a solid knowledge base that can facilitate further studies and applications in this field.
Furthermore, this study proposes ASTEM, which leverages LLMs to generate enriched descriptions for corporate competence taxonomy elements, thereby emphasising the importance of prompt engineering in formulating requests to LLMs to obtain relevant and accurate responses (Chen et al., 2023). Although previous studies demonstrated that LLMs can generate structured text by capturing complex word relationships, they typically suffer from contextual inaccuracy and specificity issues (Vartinen et al., 2022). ASTEM mitigates these limitations by integrating a preliminary taxonomy analysis, clear enrichment objectives that ensure the LLM-generated descriptions align with the corporate taxonomies, and prompt engineering techniques that refine queries to optimise LLM performance and improve contextual accuracy (Chen et al., 2023; White et al., 2023). The iterative evaluation and prompt refinement improve the outputs, reducing hallucinations and factual inconsistencies (Lesage et al., 2024).
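For illustration only, a prompt-engineered enrichment request of this kind could be structured as in the sketch below; the paper does not disclose the exact ASTEM prompt, so the template, function name, and constraints are hypothetical, and the call uses the standard OpenAI Python client.

```python
# Illustrative sketch (not the authors' actual prompt): a constrained enrichment request
# combining the competence label, category, and sparse original description.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def enrich_competence(label, original_description, category):
    prompt = (
        "You are an HR analyst enriching a corporate competence taxonomy in the aerospace sector.\n"
        f"Competence label: {label}\n"
        f"Category: {category}\n"
        f"Current description: {original_description or 'not available'}\n"
        "Write an enriched, factually conservative description (80-120 words) that stays "
        "consistent with the label and category and does not introduce unrelated technologies."
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.2,
    )
    return response.choices[0].message.content
```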
This methodology helps reduce the manual effort required in taxonomy management and improves the contextual integration of various taxonomies (e.g., roles and competence). In particular, using LLMs to generate synthetic competence descriptions highlights a new application of generative AI in the context of HRM. The capability to generate accurate and contextualised enriched data can be extended to other types of taxonomy (e.g., technology, processes), presenting significant opportunities to improve management recommendation systems even when structured labelled datasets are unavailable.
This study also proposes the RCE framework combined with embedding models to automate the association between the professional roles and required competence. The RCE reduces the time and resources required to create and maintain these associations, while also ensuring that the data remains up-to-date, regardless of changes to the company taxonomies. The RCE framework builds upon the previous studies conducted on embedding models for HRM (Amin et al., 2022; Kurek et al., 2024) but extends their application beyond simple job-description matching to create a fully automated taxonomy integration framework.
The automation of the role-competence association process represents a significant innovation in competence-based HRM, thereby providing a replicable model for future research. The comparison of E5-large-V2 (Wang et al., 2022) and mxbai-embed-large-v1 (Xianming & Jing, 2023) pre-trained embedding models demonstrates their differences in the semantic similarity tasks. This comparison serves as a benchmark for future studies aiming to evaluate the performance of other embedding models in role-competence association tasks and similar applications.
The results of the RCE application with two different pre-trained embedding models also demonstrated that the ASTEM enriched competence taxonomy improves the effectiveness of the embedding models in creating accurate associations. This indicates that the embedding models can become fundamental tools to retrieve information with high accuracy if they are supported by adequate and rich textual content. Therefore, the effectiveness of the embedding model in capturing the semantic nuances between texts and creating relevant associations between the roles and competence could be a starting point to analyse other types of applications, such as the competence-technology association.
The ASTEM and RCE frameworks also contribute by extending the dynamic capabilities theory (Teece et al., 1997) and the KBV of the firm (Grant, 1996). Based on the dynamic capabilities perspective, the structured use of LLMs in HRM enhances the ability of an organization to sense, seize, and transform competence-related information. By improving the management of taxonomies and promoting a standardised competence-based approach, companies can dynamically adapt to evolving workforce requirements while avoiding dependence on biased or opaque decision-making processes. By automating the semantic enrichment of corporate competence taxonomies, ASTEM enables organizations to dynamically update and refine their competence datasets, thereby ensuring that newly emerging competencies and technologies are promptly incorporated. RCE further enhances the sensing and transformation capabilities of an organization by automating the identification and matching of competencies to roles, thereby improving the workforce adaptability and strategic alignment.
Additionally, from a KBV perspective, this study employs LLMs as a tool to enhance the ability of the firm to organise, structure, and utilise knowledge assets effectively. The KBV holds that firms gain a competitive advantage through effective knowledge management (Grant, 1996). ASTEM operationalises the KBV by structuring unstructured corporate knowledge into enriched competence descriptions, thereby making knowledge more accessible and actionable. The RCE framework strengthens KBV applications by employing AI-driven embeddings to systematically associate knowledge with organizational roles, thereby reducing dependence on manual and typically subjective knowledge-mapping processes. Rather than delegating critical decision-making to AI-based models, the proposed approach ensures that these models enrich corporate knowledge, making information more accessible and actionable without compromising ethical and legal considerations. By consciously limiting the application of LLMs to non-intrusive but high-value knowledge structuring tasks, organizations can leverage the strengths of AI while mitigating the risks, thereby ensuring a balanced integration of human expertise and technological advancements.
Practical implications

In this study, we demonstrate how adopting LLMs can significantly transform the field of competence-based HRM, presenting new avenues for research and practical application. In particular, the integration of advanced LLMs and prompt engineering represents a crucial step towards the automation and optimisation of competence management, promoting a more efficient use of human resources and improving business performance.
The implementation of ASTEM and the RCE framework automates the semantic enrichment and the association process between roles and competence, enabling companies to significantly reduce the time and resources needed to keep their competence and role taxonomies up to date, thereby improving operational efficiency and information accuracy. These contributions enable companies to save on human resources costs by reducing the need for manual interventions, which are typically time-consuming and resource-intensive. Conventional methods typically require subjective and manual evaluations, which can cause errors and inefficiencies and introduce subjectivity into the evaluation (Arslan & Cruz, 2022). By leveraging LLMs, companies can achieve more accurate and consistent evaluations, eliminating many of the inefficiencies associated with manual processes.
ASTEM includes iterative phases of evaluation and improvement, thereby ensuring that the generated descriptions and associations are constantly refined based on feedback. This continuous improvement process ensures that the adopted solutions remain relevant and effective over time.
In particular, companies can more precisely identify the ideal candidates for specific job positions and develop targeted training plans by using pre-trained embedding models to associate competence and roles through RCE. This process improves employee satisfaction as they are assigned to roles that best match their competence and aspirations. Additionally, targeted training plans help employees to further develop the competence required to excel in their roles, thereby increasing their motivation and reducing the turnover rates.
The ability to keep competence taxonomies updated enables companies to rapidly adapt to market changes and new technological requirements (Frierson et al., 2023). This is particularly relevant in high-tech sectors, where the evolution of the required competence is continuous and rapid. An updated and dynamic system enables the companies to promptly respond to new challenges and opportunities, thereby maintaining a competitive advantage. This supports the decision-making process by providing information based on accurate and current data, thereby improving the quality of strategic decisions corresponding to talent management and competence development (Qin et al., 2020). Decisions based on concrete data and advanced analyses, known as data-driven decision-making, are typically more effective and can achieve better outcomes when compared with decisions based on intuition or outdated data, particularly in the industrial sectors (Bousdekis et al., 2021).
A common concern when discussing LLMs is the bias present in their training datasets, which can affect the quality and neutrality of their outputs (Furizal et al., 2024). These datasets are vast and diverse, incorporating conflicting viewpoints and a wide range of ideological perspectives (Vasanthan et al., 2021). Unlike a single human evaluator, who may unconsciously prioritise certain competencies over others based on subjective preferences, the biases of LLMs can be mitigated through standardised processing methods and prompt engineering, as demonstrated for gender bias (Dwivedi et al., 2023). Despite concerns regarding bias in LLM outputs, their ability to process vast and diverse information makes them more advantageous than conventional HR decision-making, presenting a more systematic, scalable, and objective approach to competence management.
Lastly, this study promotes the competence-based HRM approach, as it facilitates the assignment of the most appropriate roles to each employee, promoting a healthy and meritocratic work environment. This encourages employees to develop the competence necessary to effectively contribute to the objectives of the company, thereby improving both individual and collective productivity. A positive and meritocratic work environment also contributes to employee retention, thereby reducing the costs associated with turnover (Shivanjali et al., 2019). Table 17 presents the strengths and limitations of ASTEM and RCE compared with conventional HRM approaches, focusing on adaptability, bias reduction, scalability, and interpretability.
Comparative analysis of key aspects of HRM practices, ASTEM and RCE.
| | Adaptability and Scalability | Bias & Subjectivity | Interpretability |
|---|---|---|---|
| Description | Updates in HRM information require considerable time and resources to adapt to different contexts, making it difficult to apply them across industries and as job roles evolve (Cao & Zhang, 2022) | HR decisions are typically subjective, influenced by individual experiences, cultural backgrounds, and unconscious biases (Kurek et al., 2024) | AI-based HRM tools typically function as ‘black boxes’, making recommendations difficult to justify or understand (Malik et al., 2023) |
Although this study presents significant research and practical contributions, it faces limitations that may pave the way for future research improvements in the field of LLMs and their applications in competence-based HRM.
One of the main limitations of the study concerns the generalizability of the results. The study is based on a specific case in the aerospace sector, and the solutions and results obtained may not be readily transferable to other sectors without further adaptation and validation. The exclusive focus on the aerospace industry limits the applicability of the findings, and further studies in diverse industries such as healthcare, finance, and manufacturing will help validate and refine the proposed frameworks. Future research must also consider sector-specific competence taxonomies and the varying organizational structures that may influence the effectiveness of the ASTEM and RCE framework.
Furthermore, we demonstrated that the use of pre-trained embedding models (specifically the open-source E5-large-V2 and mxbai-embed-large-v1), supported by adequate semantic enrichment, can significantly improve the automatic association process between roles and competence. This indicates that the quantity and quality of the texts to be analysed are crucial for embedding models to perform well in semantic similarity analysis tasks. However, the overall precision of the two models indicates that there is still room for improvement, particularly in the lower semantic similarity score ranges. Establishing an optimal semantic similarity threshold would reduce the human effort required to validate the results and further optimise internal processes, enabling managers to focus on higher value-added activities. Identifying this threshold requires further research, as embedding models are trained on different datasets and model textual information differently. Analysing larger models, such as SFR-Embedding-Mistral (Meng et al., 2024), is also essential for a more robust assessment of embedding models for creating role-competence associations.
Although we performed some analyses on random samples regarding the bias of GPT-4 for ASTEM using the CRAAP test (Blakeslee, 2004), the bias of the embedding models used for RCE was not comprehensively analysed. Therefore, future work must analyse bias mitigation strategies, including explainable AI techniques, to improve transparency and interpretability (C.-S. Lin et al., 2023). Structured interviews with experts from the partner company would help evaluate the ASTEM-enriched competence descriptions and the RCE role-competence associations from a more quantitative perspective (Taherdoost, 2022). Furthermore, an analysis based on clustering the RCE and GT role-competence pairs could be performed to better understand this gap and the reasons why the models do not fully reflect the GT dataset. However, increasing trust in these models requires continuous collaboration between academia and industry to ensure their responsible development and usage. Implementing continuous evaluation mechanisms is essential for mitigating the risk of amplifying existing societal biases and for promoting fairness and equity in HR practices (Prabhakaran et al., 2022).
In summary, Table 18 presents the key future research directions to enhance the ASTEM and RCE frameworks corresponding to generalizability across various industries, the use of larger models, bias mitigation in LLMs, extensions to competence-technology associations, and improving AI interpretability for HRM applications.
Future research directions and related research questions.
In this study, we analysed the transformative potential of LLMs to improve competence-based HRM by conducting a specific case study in the industrial sector. We introduced two innovative solutions using LLMs: ASTEM and RCE frameworks. These solutions represent significant advancements in automating and enhancing competence-based HRM processes, thereby promoting more efficient and effective competence management.
This study contributes significantly to academic research in the fields of competence-based HRM and the application of LLMs in business contexts. In particular, we elucidated complex AI concepts and made them accessible to HRM professionals by providing a detailed overview of LLMs, including the principles of their architecture and functioning. The emphasis on prompt engineering is particularly notable, highlighting its significance in optimising the LLM performance to generate accurate and relevant responses. From a practical perspective, the proposed solutions help companies to use LLMs to semantically enrich the competence information based on their context of usage and to use embedding models to identify the set of competencies corresponding to a specific professional role. Companies can achieve greater operational efficiency, reduce costs, and improve employee satisfaction by automating the semantic enrichment of taxonomies and associating the roles with the corresponding competencies.
The findings of this study present a solid foundation for future analysis and broader applications of LLMs in HRM and beyond. The aerospace sector serves as a compelling case study owing to its complexity, dependence on high-level expertise, and extensive supply chain networks. However, the principles and methodologies presented in this study can be applied to other high-tech and knowledge-intensive industries. Similar challenges in competence management are observed in other complex manufacturing sectors, where the integration of AI-driven competence frameworks could present comparable benefits. However, further research must be conducted to determine cross-industry applicability, identifying context-specific requirements and adapting LLM-based HRM solutions accordingly.
Future works must focus on optimising prompt engineering techniques, addressing biases in LLMs, and analysing larger and more diverse datasets to improve the robustness and accuracy of the role-competence associations. Additionally, comparative studies across different industrial domains could provide valuable insights on effectively adapting AI-driven competence management frameworks beyond the aerospace sector, extending their relevance to broader HRM and workforce planning strategies.
CRediT authorship contribution statement

Giuliana Barba: Writing – review & editing, Writing – original draft, Visualization, Validation, Software, Methodology, Investigation, Formal analysis, Data curation, Conceptualization. Angelo Corallo: Validation, Supervision, Methodology, Conceptualization. Mariangela Lazoi: Writing – review & editing, Validation, Supervision, Methodology, Conceptualization. Marianna Lezzi: Writing – review & editing, Validation, Supervision, Methodology, Investigation, Conceptualization.
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
The authors thank Massimo Scalvenzi and Chiara Spina for the valuable insights.