A systematic review of studies using translated versions of the Attribution Questionnaire (AQ-27) to measure public stigma towards people with schizophrenia

Thirkettle, Claire; Oduola, Sheri; Black, Sharon; McEntegart, Lucy; Beazley, Peter

doi:10.1016/j.ejpsy.2025.100290

Table 1. Overview of Study Characteristics (Part II).

Table 2. Overview of the Quality of Translation Processes.

Table 3. Reliability and Validity of Translated Versions of the AQ-27.

Abstract

Background and Objectives

The Attribution Questionnaire (AQ-27) is a widely used measure of public mental illness stigma. The AQ-27 was originally developed in the USA in the English language. Since its inception in 2003, several translations of the measure have been produced. This is the first review to explore the use of translated versions of the AQ-27 to measure stigma towards people with schizophrenia.

Methods

A systematic review was conducted. MEDLINE, PsycInfo and Web of Science were systematically searched between 2003 and 2024. The COSMIN Study Design Checklist was adapted to appraise the quality of the translation processes. Data were extracted relating to measurement properties (reliability and validity) of the translated measures.

Results

Forty-one studies were identified, spanning fifteen countries and eleven languages. Most studies (n = 26, 63.4 %) were located in Europe. Twelve original translations of the AQ-27 were identified, four of which came from studies primarily focused on translation and validation of the measure. The Turkish, Italian and Arabic translations were rated highest for methodological quality of the translation process.

Conclusions

Researchers should consider the quality of the methodology used to develop existing translated versions of the AQ-27 before adopting them, as this may have implications for the validity and equivalence of the measure within the target culture. Translation frameworks are available to support the high-quality translation and cross-cultural adaptation of self-report measures.

Keywords: AQ-27; Translation; Cross-cultural; Stigma; Schizophrenia

Full Text

Introduction

Defining stigma

Across countries and cultures, the psychiatric diagnosis of schizophrenia is associated with a high level of public stigma and experienced discrimination.1,2 Stigma has been defined in a variety of ways. Erving Goffman's conceptualisation of stigma as an ‘attribute that is deeply discrediting’3 has been built on by authors such as Link and Phelan,4 who conceptualise stigma as consisting of several interacting components: the labelling of difference, stereotyping, separation of ‘us’ and ‘them’, status loss, and discrimination. Power differences (social, economic and political) are considered crucial to enabling stigmatisation. Other perspectives emphasise the role of culture and the social context in defining stigma, whereby stigma is thought to pose a threat to one's moral standing within the local social world.5,6

The reduction of stigma, discrimination and human rights violations towards people with mental health difficulties has been identified as a key priority within the WHO Comprehensive Mental Health Action Plan (2013–2030).7 Further to this, a recent report by the Lancet Commission outlines eight key recommendations for action worldwide.8 Regarding global mental health and stigma reduction, research suggests that cross-cultural variation exists in public stigma.9 However, there is limited research taking place outside of the Global North to indicate effective, culturally appropriate strategies for stigma reduction.10 Research is needed across different countries and cultural settings, including developing countries and the Global South, to explore the efficacy and feasibility of methods to address stigma.2,11 Additionally, a recent review of interventions to reduce stigma highlighted that few studies have used well-adapted and validated outcome measures for stigma, particularly in Low and Middle Income Countries (LMICs).12 This is important to note given that stigma is strongly influenced by culture, for example with regard to the way in which mental health difficulties are conceptualised, beliefs about the causes of these difficulties, and culturally determined values.8

Measuring stigma

Stigma has been studied extensively over the past several decades. This has evolved from qualitative research methods to include a range of methodologies, including self-report and behavioural measures of stigmatisation.13 One of the key challenges in stigma research relates to the cacophony of approaches to its measurement. Fox et al.14 conducted a systematic review of studies using mental illness stigma measures between 2004 and 2014. Over 400 different stigma measures were identified, over two-thirds of which had been created for a specific study and had not been systematically psychometrically evaluated. This suggests that the field is at saturation point with regards to the development of new measures. Clearly, there is a need for greater convergence within the field and this should include psychometric evaluation and validation of existing, well-used measures.

From a global perspective, a further issue within the literature is the predominance of studies focusing on Western, English-speaking countries and cultures. Thornicroft et al.12 conducted a narrative review of anti-stigma intervention research (1970–2012) and found that 83 % of studies took place in high-income countries, with just 17 % taking place in middle-income countries. Strikingly, fewer than 30 % of studies took place in a country other than the US. This indicates a need for research across a wider range of cultural settings, to better understand cross-cultural differences in stigma.15 Additionally, there is a need for further research within LMICs, given that the generalisation of methods and findings from research conducted in high-income countries is not advisable.12

Progression of such research is, however, a challenge given that most stigma measures have been developed in the English language, for use in English-speaking countries.16 Efforts to measure stigma in non-English speaking countries may either rely on development of a new measure (a potentially time-consuming process), or may take an existing measure to be translated, adapted and psychometrically evaluated within the target cultural context. Research suggests that the latter is more common. Indeed, Yang et al.17 conducted a systematic review of stigma research with non-Western European cultural groups (1990–2012) and found that 77 % (n = 151) of included studies used adaptations of existing, Western-developed stigma measures. While this approach may not account for culturally specific aspects of stigma, and makes assumptions about the generalisability of the underlying theory, the translation and use of existing, standardised measures may facilitate comparisons across linguistic and cultural settings.16

To summarise, it appears that much stigma research has been conducted in high-income Western countries, yet findings are assumed to be universally applicable rather than culturally specific. Further research is required to better understand cross-cultural differences in stigma, and this depends on developing the research base with respect to stigma measurement. Clearly, there is a need for greater convergence within the field of stigma measurement in general, and this should include psychometric evaluation and validation of existing, well-used measures.

The AQ-27

Within Fox et al.’s14 review, Corrigan et al.’s18 Attribution Questionnaire was identified as one of the most widely cited stigma measures. To date, the paper has been cited 1830 times on Google Scholar (checked on 10th March 2024). The AQ-27 is a self-report measure of public stigma which was developed in the USA in 2003. It contains a brief vignette, as follows:

‘Harry is a 30-year-old single man with schizophrenia. Sometimes he hears voices and becomes upset. He lives alone in an apartment and works as a clerk at a large law firm. He has been hospitalized six times because of his illness’.

This is followed by twenty-seven statements which measure nine domains related to stigma: blame, anger, pity, help, dangerousness, fear, avoidance, segregation and coercion. Respondents rate their agreement with each statement on a nine-point Likert scale. Higher scores indicate more stigmatising views towards people with mental illness. A short form version of the AQ-27 (the AQ-9)19 was also developed by the original authors of the measure by selecting the single item that loaded most strongly onto each factor.20
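
To make the structure of the measure concrete, the following is a minimal sketch of how twenty-seven item ratings could be aggregated into nine domain scores. It assumes three items per domain (27 items across 9 domains) and a rating range of 1 to 9. The item-to-domain mapping used here is purely hypothetical (consecutive blocks of three items), and no reverse scoring is applied; the published scoring key of the AQ-27 is not reproduced in this review and should be consulted before any real use.

```python
# Illustrative sketch only: the mapping below is HYPOTHETICAL, not the
# published AQ-27 scoring key, and no items are reverse-scored here.
DOMAINS = [
    "blame", "anger", "pity", "help", "dangerousness",
    "fear", "avoidance", "segregation", "coercion",
]

# Assumption: items 1-3 -> first domain, items 4-6 -> second domain, etc.
ITEM_MAP = {
    domain: [3 * i + 1, 3 * i + 2, 3 * i + 3]
    for i, domain in enumerate(DOMAINS)
}


def score_domains(responses):
    """Sum item ratings per domain.

    `responses` maps item number (1-27) to a rating on the
    nine-point scale (assumed here to run from 1 to 9).
    """
    if set(responses) != set(range(1, 28)):
        raise ValueError("expected ratings for items 1 to 27")
    for item, rating in responses.items():
        if not 1 <= rating <= 9:
            raise ValueError(f"item {item}: rating {rating} is outside 1-9")
    return {
        domain: sum(responses[item] for item in items)
        for domain, items in ITEM_MAP.items()
    }


# Example: a respondent who selects the scale midpoint for every item.
midpoint_responses = {item: 5 for item in range(1, 28)}
print(score_domains(midpoint_responses))
```

Under these assumptions each domain score ranges from 3 to 27.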

The AQ-27 was originally designed to measure stigma towards people with schizophrenia, as the condition is frequently associated with public perceptions of dangerousness.18 Contemporary research suggests that schizophrenia remains one of the most stigmatised psychiatric diagnoses today.21,22 A multinational study by Thornicroft et al.,1 surveying 732 people with a diagnosis of schizophrenia across 27 countries, identified high rates of experienced discrimination, most commonly within friendships, family relationships and in finding and maintaining employment.

Theoretical underpinnings of the AQ-27

The AQ-27 is underpinned by attribution theory, a social cognitive theory which has been applied to understand the relationship between mental health stigma and discriminatory behaviour, in relation to beliefs about causality (personal responsibility for causing one's difficulties) and controllability (the amount of influence an individual can exert over their difficulties).23 These attributions are thought to lead to differential emotional responses (e.g., pity, anger, fear), which lead to helping or punishing behaviour. The AQ-27 is underpinned by a nine-factor path model which suggests that individuals are more likely to respond negatively to a person with a label of mental illness when they are judged to have a high degree of control over their presentation (e.g., with anger, leading to avoidance and withholding help). Additionally, fear has been found to be a strong predictor of avoidance and support for coercive treatment.18

Approaches to questionnaire translation and cross-cultural adaptation

It is important to note that questionnaire translation, cross-cultural adaptation and cross-cultural validation are each distinct concepts. We briefly define these terms here. Translation can be defined as the process of transferring meaning from a ‘source language’ (the primary language in which a measure is written) into a ‘target language’.24 This involves consideration of linguistic elements including accuracy, fluency and conceptual equivalence.25 Cross-cultural adaptation considers both language translation and the identification of differences between the ‘source culture’ and ‘target culture’ to maintain the equivalence of concepts between both cultural groups. Note that cross-cultural equivalence encapsulates several aspects,16 including semantic equivalence (equivalence in the meaning of words), experiential equivalence (the relevance of situations or experiences described for the target population) and conceptual equivalence (the validity of the concept described). Lastly, cross-cultural validation aims to ensure that the translated instrument has the same properties as the original instrument.25 Translated measures need to be psychometrically evaluated within the target cultural context.26

Translation and cross-cultural adaptation are complex processes which require a rigorous, multi-step and collaborative approach. Guiding frameworks have been produced to support the cross-cultural adaptation of self-report measures,16 such as Beaton et al.’s ‘Guidelines for the Process of Cross-Cultural Adaptation of Self-Report Measures’.27 Additionally, a variety of translation frameworks are available and these approaches have been reviewed and critiqued extensively within the literature.24,25,28 The translation framework used will affect the quality and validity of the translated measure.

Research questions

The overarching purpose of this systematic review is to review and synthesise the literature in relation to the translation processes of the AQ-27, including assessment of the quality of the translation processes, and associated psychometric properties of translated versions. The review is preceded by a broader review of research which has adopted a translated version of the AQ-27 (Part I), followed by a more in-depth review and synthesis of studies which have used a primary translation of the AQ-27 (Part II). Taken together, these components allow us to review the way in which research and literature on the use of the AQ-27 is developing outside English-speaking populations.

Part I: Overview of the use of translated versions of the AQ-27. With what populations, and within what cultural contexts have translated versions of the AQ-27 been used? The purpose of this element is not primarily to establish or summarise the main findings from these papers, but rather to identify the countries and populations in non-English speaking countries in which AQ-27 research is active.

Part II: Assessment of the quality of the translation process, within original translation studies, as well as a review of the psychometric validation of the associated translated version. This component was intended to consider in more detail a smaller subset of papers which had developed a primary translation of the AQ-27 into a different language.

a) What languages has the AQ-27 been translated into, from English?
b) What is the quality of the procedures used to translate and adapt the AQ-27?
c) What is known about the reliability and validity of translated versions of the AQ-27?

Method

Registration

This systematic review was registered on the International Register of Prospective Systematic Reviews (PROSPERO) on 29th June 2023 (registration number CRD42023440611).

Search strategy

The systematic search was completed on 19th September 2023, followed by an update search on 14th January 2024. Searches were carried out with a date limitation from July 2003 until 19th September 2023, in three electronic databases: MEDLINE (PubMed), Web of Science and PsycINFO (EBSCO). To increase the chance of retrieving international papers, Google Translate was used to translate key search terms into the ten most common languages spoken worldwide29 (Mandarin Chinese, Spanish, Hindi, Portuguese, Bengali, Russian, Japanese, Yue Chinese, Vietnamese and Turkish) and these were added to the search strategy. Therefore, the search terms used were:

“attribution questionnaire” OR “AQ-27” OR “AQ27” OR "问卷分配" OR "asignación de cuestionario" OR "प्रश्नावली असाइनमेंट" OR "atribuição de questionário" OR "প্রশ্নপত্র নিয়োগ" OR "задание анкеты zadaniye ankety" OR "アンケートの割り当て" OR "bài tập câu hỏi" OR "anket ödevi".
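
As an illustration of how such a search string can be assembled reproducibly, the sketch below joins the reported terms with the Boolean operator OR. The helper name `build_query` is hypothetical and is not part of any database interface; the exact quoting and field syntax accepted by MEDLINE, Web of Science and PsycINFO may differ and would need to be checked for each platform.

```python
# Search terms exactly as reported in the search strategy above.
SEARCH_TERMS = [
    "attribution questionnaire",
    "AQ-27",
    "AQ27",
    "问卷分配",
    "asignación de cuestionario",
    "प्रश्नावली असाइनमेंट",
    "atribuição de questionário",
    "প্রশ্নপত্র নিয়োগ",
    "задание анкеты zadaniye ankety",
    "アンケートの割り当て",
    "bài tập câu hỏi",
    "anket ödevi",
]


def build_query(terms):
    """Wrap each term in double quotes and join the terms with OR.

    Hypothetical helper for illustration; database-specific syntax
    (field tags, truncation symbols) is deliberately left out.
    """
    return " OR ".join(f'"{term}"' for term in terms)


query = build_query(SEARCH_TERMS)
print(query)
```

Keeping the terms in a single list makes it straightforward to document, rerun or extend the search when an update search is carried out.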

Eligibility criteria

Inclusion criteria

Studies included in the review were published, peer-reviewed empirical studies which used a translated version of the AQ-27 (from English, into another language) to measure stigma, primarily towards people with schizophrenia. Studies which translated an existing abbreviated version of the AQ-27, such as the AQ-9, were included.

For Part II, an additional criterion was applied. Only studies carrying out an original translation of the AQ-27 were included (i.e., studies which used an existing translated version of the measure were excluded).

Exclusion criteria

Studies were excluded based on the following criteria:

a) The AQ-27 was explicitly modified to measure stigma towards a condition other than schizophrenia, or stigma towards mental illness in general. Modifications to the wording or structure of the AQ-27 made as part of a translation process were not grounds for exclusion.
b) The study assessed stigma towards multiple conditions (i.e., the primary focus was not schizophrenia).
c) The AQ-27, or an abbreviated version, was not used in full (e.g., only one subscale was used).
d) It was not explicitly stated that the AQ-27 was translated into another language.
e) The article was not available in the English language.
f) For Part II, the study reported carrying out an original translation but provided no description of the translation process (as this prohibited any assessment of the quality of the translation process).

We recognise that exclusion criterion (e) is arguably in tension with the core project aims. However, the use of raw machine translation output alone, without the input of qualified human translators, was ruled out for the purposes of the current review due to concerns around the quality and accuracy of the translations. While neural machine translation (NMT), used by systems such as Google Translate, is widely regarded as the best performing type of machine translation invented to date, NMT can be inaccurate, is known to output words that do not exist in the target language, and can also amplify biases.30 Moreover, despite literature calling for greater emphasis on publication of non-English papers, the reality remains that most scientific literature is published in the English language, arguably limiting the practical impact of this pragmatic decision.31,32

Screening and selection

Studies identified by the searches were extracted into Microsoft Excel. After duplicates were removed, titles and abstracts were screened for eligibility and removed if they clearly did not meet inclusion criteria. The remaining articles were read in full, and those excluded were coded with the primary reason for exclusion. Where multiple exclusion criteria applied, the most fundamental exclusion criterion was cited (e.g., studies which did not use the AQ-27, or did not use a translation of the AQ-27). A subset of full-text articles (20 %) was checked by the fourth author, blind to the ratings of the primary reviewer, to ensure that they met eligibility criteria.

Quality assessment

The COSMIN Study Design Checklist33 was used to assess the methodological quality of the translation processes. Additionally, selected items from the COSMIN were used to assess the validity and key psychometric properties of the translated measures (eTable 1). Each item from the COSMIN is rated on a four-point scale, whereby a score of four indicates the highest methodological quality. Items are weighted according to relative importance. While the COSMIN does not require the use of an overall quality rating, in the present study we calculated a total score by summing the scores for all elements considered. Therefore, the maximum possible overall score was sixty.
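
The summation described above can be sketched as follows. This is an illustration under stated assumptions rather than the scoring procedure used by the authors: the number of rated elements (fifteen) is inferred from the stated maximum of sixty and the maximum item score of four, a rating of zero is allowed for elements that were not reported (as in the note to Table 2), and item weighting is not modelled. The function name is hypothetical.

```python
MAX_ITEM_SCORE = 4  # highest rating on the COSMIN item scale
N_ELEMENTS = 15     # assumption: inferred from 60 / 4, not stated in the text
MAX_TOTAL = MAX_ITEM_SCORE * N_ELEMENTS


def total_quality_score(ratings):
    """Sum the element ratings for one translation study.

    Each rating is expected to lie between 0 (not reported) and 4
    (very good). Weighting of items is not modelled in this sketch.
    """
    if len(ratings) != N_ELEMENTS:
        raise ValueError(f"expected {N_ELEMENTS} ratings, got {len(ratings)}")
    for rating in ratings:
        if not 0 <= rating <= MAX_ITEM_SCORE:
            raise ValueError(f"rating {rating} is outside 0-{MAX_ITEM_SCORE}")
    return sum(ratings)


# Made-up ratings for illustration only; these are not data from the review.
example_ratings = [4] * 10 + [2] * 3 + [0] * 2
print(f"{total_quality_score(example_ratings)} / {MAX_TOTAL}")
```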

Table 1.

Overview of Study Characteristics (Part II).

Authors (year)	Version of AQ; number of citing papers in the current review	Country	Study design	Sample size, age range (mean), % female	Participant occupation	Aims	Main findings
Spanish (n=3)
Muñoz et al. (2015)	Spanish AQ-27, AQ-27-E; 7 citations	Spain	Translation and psychometric evaluation	439, mean age 39 years, 52.6 % female	Residents in Madrid	To translate and analyse the psychometric properties of the Attribution Questionnaire for use in Spanish-speaking populations (AQ-27-E), and to test the dangerousness and responsibility models of mental illness stigma in a Spanish sample.	“The AQ-27-E has acceptable psychometric properties comparable to previous versions, which can be used to assess stigma in Spanish-speaking populations.”
Chamorro Coneo et al. (2022)	Colombian-Spanish adaptation of the AQ-27; 0 citations	Colombia	Cross-sectional	271, 18–79 years (32), 67.4 % female	Community sample	To examine pathogen-disgust sensitivity and danger appraisal mechanisms in responses of stigma towards SMI.	“Pathogen avoidance and danger appraisal systems interplay in the generation of discriminatory behaviour towards SMI.”
Crespo et al. (2008)	Spanish AQ-27; 0 citations	Spain	Cross-sectional	439, mean age 39 years, 52.6 % female	Community sample from Madrid	To analyse the stigma associated with severe and persistent mental illness in the general population of Madrid.	“Most of the participants showed a helping attitude toward the mentally ill persons, and especially, a disposition to coerce them into treatment.”
Chinese (n=2)
Chiu et al. (2021)	Modified Chinese AQ; 1 citation	Taiwan	Cross-sectional	123, mean age 21.7 years, 41.5 % female	Medical students	To compare the differences of public stigma, self-stigma, and social distance associated with schizophrenia between old and new name of schizophrenia in Taiwanese medical students.	“After renaming schizophrenia, we noted significant differences in the scores in the modified AQ, the perceived psychiatric stigma scale, and the modified social distance scale in all participants and the fourth-year students, respectively.”
Ho et al. (2018)	Chinese translation of AQ-9; 0 citations	Hong Kong	Cross-sectional	218, 17–51 years (22.4), 67 % female	University students	To evaluate the latent profiles of social stigma related to mental illness in the under-researched Chinese context through Factor Mixture Analysis.	“Most of the sample belonged to the low-stigmatizing class, with low to moderate expressions of stigma toward PLMI. The high-stigmatizing class was significantly more likely to be male, not working, and younger and to report significantly higher social distance, personal distress, and empathetic concern.”
Italian (n=1)
Pingani et al. (2012)	Italian AQ-27, AQ-27-I; 4 citations	Italy	Translation and psychometric evaluation	214, 18–89 years (40.2), 52.3 % female	Relatives of university students	To translate the Attribution Questionnaire-27 (AQ-27) to the Italian language (AQ-27-I), and to examine the reliability and validity of this new Italian version.	“The AQ-27-I demonstrated acceptable internal consistency. Test–retest reliability was also satisfactory. Fit indices of the model supported the factor structure and paths. The AQ-27-I is a reliable measure to assess stigmatizing attitudes in Italian.”
Arabic (n=1)
Saguem et al. (2021)	Arabic AQ-27; 2 citations	Tunisia	Translation and psychometric evaluation	310, 18–29 years (22.6), 41.9 % female	University students	To translate and validate the AQ in Arabic, by assessing its content validity, construct validity and reliability.	“The Arabic AQ showed acceptable psychometric properties in the assessment of stigma in the Tunisian population. Structural equation models for the responsibility and dangerousness models were mostly supported. The Arabic version of AQ is valid and reliable for the assessment of stigma in Tunisian and Arabic-speaking populations.”
Hebrew (n=1)
Romem et al. (2008)	Hebrew AQ; 1 citation	Israel	Quasi-experimental (pre/post intervention)	136, mean age 26.1 years, 14.7 % female	Third year nursing students	To evaluate the degree to which a four-week psychiatric clinical clerkship alters nursing students’ attitudes toward individuals with mental illness.	“After the clinical clerkship, students became more compassionate and less frightened by psychiatric patients, were more willing to care for individuals with mental illness and expressed less need to segregate them from the community.”
Turkish (n=1)
Akyurek et al. (2019)	Turkish AQ-27; 0 citations	Turkey	Translation and psychometric evaluation	424, mean age 36.9 years, 52.1 % female	Hospital visitors	To translate the AQ-27 into Turkish and evaluate the reliability and validity of the new Turkish version on a multi-centred selected adult sample.	“A good internal consistency was obtained, and a statistically significant test–retest reliability was detected. Fit indices of the model supported the factor structure and paths. AQ-27-T was determined as a reliable and valid questionnaire assessing stigmatization toward mental illness in Turkish population.”
Sinhalese (n=1)
Baminiwatta et al. (2023)	Sinhalese AQ-9; 0 citations	Sri Lanka	Cross-sectional	405, mean age 39.6 years, 90.6 % female	Nurses	To assess whether higher trait mindfulness among Sri Lankan nurses was linked to lower stigma towards psychiatric patients, and whether compassion mediated this relationship.	“Those with higher trait mindfulness were more likely to believe they would help a person with mental illness, and less likely to believe a person with mental illness should be avoided or segregated from the society. Compassion partially mediated the effects of trait mindfulness on helping and avoidance.”
Bengali (n=1)
Giasuddin et al. (2015)	26-item Modified Corrigan Attribution Questionnaire (MCAQ); 0 citations	Bangladesh	Cross-sectional	200, mean age of first years 18.9, mean age of fifth years 23.4, 59 % female	First and fifth-year medical students	To explore stigma among medical students toward persons with mental disorders and their attitudes toward psychiatry.	“Upper medical school year, older age, mother's lower academic level, upper and lower socioeconomic level affiliation and self-consultation for mental or neurological complaints were associated with increased stigma toward PMDs. More favourable attitudes toward psychiatry were found in upper medical school year and were significantly associated with female gender and middle socioeconomic level affiliation.”
Finnish (n=1)
Ihalainen-Tamlander et al. (2016)	Finnish AQ-27; 0 citations	Finland	Cross-sectional	264, mean age 48 years, 98 % female	Nurses in primary healthcare	To describe nurses’ attitudes towards people with mental illness and examine factors associated with their attitudes in primary care health centres.	“Nurses’ attitudes towards people with mental illness in general were positive in primary care health settings. Younger nurses expressed feeling afraid of mentally ill patients. They not only lacked a feeling of safety around these patients but were also often of the opinion that people with mental illness should be segregated from the general population.”

Translation standards outlined within the COSMIN focus on key processes such as completing forward and backward translations, ensuring that the translation is reviewed by a committee and conducting a preliminary pilot study. These processes are critical to achieving linguistic and cross-cultural equivalence and checking the validity of the translated version.27 The COSMIN has been used in a previous systematic review relating to questionnaire translation.34

Using the COSMIN, the first author independently conducted quality assessments. For inter-rater reliability, the fourth author completed quality ratings for 25 % of included studies (n = 4). Any discrepancies were discussed and resolved.

Data extraction

For Part I of the review, the following data were extracted: name of translated measure, language, country, study design, sample size and demographic information, research aims and main findings.

Part II of the review focused on studies which carried out an original translation of the AQ-27. Information relating to the translation method and psychometric properties, including factor structure, internal consistency and test–retest reliability, was extracted. This was guided by the COSMIN and informed by quality criteria reported elsewhere.35 Details of any modifications to the AQ-27 were extracted.

Analysis

For Part I, studies and main findings are presented in a table, grouped by country, and key characteristics are summarised narratively. The intention is to allow an overview of the scope of the extant AQ-27 literature within each country. For Part II, a narrative synthesis approach36 was primarily used, combined with visual synthesis of patterns in relation to the quality appraisal (i.e. colour coding) and tabular representation of psychometric properties. Studies were grouped by language and the version of the measure used. Studies were ordered according to frequency of the translation (most translations first) and year of publication (newest first).

Results

Search results

A PRISMA Flow Diagram is shown in Fig. 1. A total of 1404 papers were identified from the initial searches. Following removal of duplicates, 1099 papers remained to be screened. After title and abstract screening, 273 papers were read in full and assessed against the eligibility criteria.

Fig. 1.

PRISMA Flow Diagram.41

Of note, six papers were excluded due to the full-text articles being published only in a language other than English. These included a German translation of the revised AQ-9, adapted for adolescents,37 an adaptation of the Portuguese version of the AQ-27 for Brazilian speakers,38 an 8-item Spanish translation of the revised AQ-9 for adolescents39 and a Spanish translation of the AQ-14.40 It is not known if these papers would have been included in either Part I or Part II had English translations been available. Of the excluded papers, German is the only language which has not been represented within the current review as a result of this exclusion criterion.

Forty-one studies were identified as eligible for inclusion in Part I of the review. Of those, two papers were obtained during the updated search. The 41 papers were then screened for eligibility for inclusion in Part II of the review. Twelve studies were identified as eligible. For the papers independently checked by LM, there was 100 % agreement.

Part I: Overview of the Use of Translated Versions of the AQ-27: With What Populations, and Within What Cultural Contexts Have Translated Versions of the AQ-27 Been Used?

Study characteristics

Language and country of study

Forty-one studies used a translated version of the AQ-27 to measure stigma towards people with schizophrenia. A summary of the study characteristics and key findings is shown in eTable 2.

Table 2.

Overview of the Quality of Translation Processes.

Note. Colour coding reflects scoring from the quality assessment using the adapted COSMIN Study Design Checklist (0–4). Dark green=4 (very good), light green=3 (adequate), light orange=2 (doubtful), dark orange=1 (inadequate), grey=0 (not reported).

aStudies with a primary aim of translating and analysing the psychometric properties of the AQ-27.

We identified that the AQ-27 has been translated into eleven languages, including Spanish (n = 16 studies), Portuguese (n = 6), Italian (n = 5), Chinese languages (n = 4; note, the specific Chinese languages were not reported), Arabic (n = 3), Hebrew (n = 2), French (n = 1), Turkish (n = 1), Sinhalese (n = 1), Bengali (n = 1) and Finnish (n = 1).

Studies took place across fifteen countries. Most studies took place in Europe (n = 26; 63.4 %), with the most common location being Spain (n = 14), followed by Portugal (n = 5), Italy (n = 5), France (n = 1) and Finland (n = 1). Nine studies (22 %) took place in Asia, including Taiwan (n = 2), Hong Kong (n = 2), Sri Lanka (n = 1), Bangladesh (n = 1), Israel (n = 1) and Turkey (n = 1). Three studies (7.3 %) took place in South America, including Chile (n = 1), Colombia (n = 1) and Brazil (n = 1). Three studies (7.3 %) were carried out in Africa, in Tunisia (n = 3).
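
The regional percentages reported above can be reproduced from the regional study counts with simple arithmetic. The short sketch below uses only the four regional totals given in the text and rounds each share to one decimal place; the variable names are illustrative.

```python
# Number of included studies per region, as reported in the text above.
region_counts = {
    "Europe": 26,
    "Asia": 9,
    "South America": 3,
    "Africa": 3,
}

total_studies = sum(region_counts.values())

# Share of all included studies per region, rounded to one decimal place.
region_shares = {
    region: round(100 * count / total_studies, 1)
    for region, count in region_counts.items()
}

for region, share in region_shares.items():
    print(f"{region}: n = {region_counts[region]} ({share} %)")
print(f"Total: {total_studies} studies")
```

The four regional counts sum to the 41 studies included in Part I.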

The total sample sizes for each country represented in the review are shown in Fig. 2. The largest total samples were obtained from Spain (n = 2597), Italy (n = 1379) and Portugal (n = 703).

Fig. 2.

Total Sample Size for Each Country Represented Within the Review.

Participant characteristics

In total, 8709 participants were recruited. Sample sizes ranged from 22,42 to 2746.43 Most studies (n = 35, 85.4 %) consisted of a majority female sample (≥ 50 %). The mean age of participants, where reported, ranged from 17.8 to 54.9 years. Studies sampled from a range of populations, including university students (n = 17, 41.5 %), the general public (n = 8, 19.5 %), mixed populations (n = 5, 12.2 %), health professionals (n = 4, 9.8 %), high school students (n = 3, 7.3 %), service users (n = 1, 2.4 %), service users’ relatives (n = 1, 2.4 %), school staff (n = 1, 2.4 %) and college students (n = 1, 2.4 %).

Study design

A wide variety of study designs were observed. These included cross-sectional studies (n = 17, 41.5 %), quasi-experimental designs (n = 8, 19.5 %), studies investigating measurement properties of the AQ-27 (n = 7, 17.1 %), correlational studies (n = 6, 14.6 %), randomised controlled trials (n = 3, 7.3 %), and mixed designs (n = 1, 2.4 %).

Part II: Translations of the AQ-27

Assessment of the Quality of the Translation and Adaptation Process, Within Original Translation Studies.

a) What Languages has the AQ-27 Been Translated Into, From English?

Part II of the review focused on a subset of the studies included in Part I, which reported carrying out an original translation of the AQ-27 (i.e., rather than using an existing translation).

Of the 41 studies initially identified, 14 studies produced an original translation. However, two studies [44,45] provided no information about the translation process and were therefore excluded. This left 12 studies remaining for inclusion in Part II of the review. Table 1 provides an overview of the study characteristics.

Language and country of study

The 12 original translation studies spanned nine languages: Spanish [44-46], Chinese languages [47,48], Italian [49], Arabic [50], Hebrew [51], Turkish [52], Sinhalese [53], Bengali [54] and Finnish [55]. The Spanish AQ-27 [44] had the highest number of citing papers within the current systematic review (n = 7 citations), followed by the Italian AQ-27 [49] (n = 4), the Arabic AQ [50] (n = 2), the Chinese AQ [47] (n = 1) and the Hebrew AQ-27 [51] (n = 1). This suggests that the Spanish, Italian and Arabic versions of the AQ-27 are gaining traction.

Studies took place across Asia (Taiwan [47], China [48], Israel [51], Turkey [52], Sri Lanka [53], Bangladesh [54]), Europe (Spain [44,46], Italy [49], Finland [55]), Africa (Tunisia [50]) and South America (Colombia [45]).

Participant characteristics

Across the studies, 3004 participants were recruited. Sample sizes ranged from 123 [47] to 439 [44]. Studies sampled university students [47,48,50,51] (n = 5), the public [44-46,49-52] (n = 5) and nurses [53,55] (n = 2). Most studies (n = 9, 75 %) contained predominantly female samples (≥ 50 %). The mean age of participants ranged from 18.9 years [54] to 48 years [55].

Study designs

Importantly, there was considerable heterogeneity with regard to study designs and aims. Only four studies (33.3 %) had a primary aim of translating and psychometrically evaluating the AQ-27: the Spanish AQ-27 [44], Italian AQ-27 [49], Arabic AQ [50] and Turkish AQ-27 [52]. The remaining studies consisted of cross-sectional designs [45-48,53-55] (n = 7) and a pre/post intervention design [51] (n = 1).

b) What is the Quality of the Procedures Used to Translate and Adapt the AQ-27?

Quality Assessment of the Translation Process

Selected items from the COSMIN Study Design Checklist (eTable 1) were used to assess the quality of the translation method. This informed Research Question II(b). Table 2 provides an overview of the findings and full results are provided in eTable 3.

Table 3.

Reliability and Validity of Translated Versions of the AQ-27.

Authors (year)	Name of measure, location	Participant occupation, sample size, age range (mean), % female	Modifications to items	Modifications to vignette	Changes to factor structure, factor analysis (e.g. CFA, EFA)	Internal consistency (Cronbach's alpha)	Test-retest reliability (e.g. intraclass correlation coefficient)
Spanish (n=3)
Muñoz et al. (2015)a	Spanish AQ-27 (AQ-27-E), Spain	Residents in Madrid; 439, mean age 39.01 years, 52.6 % female	No changes – retained 27-item AQ.	No changes – “AQ-27 includes a neutral vignette that represents a hypothetical person (Harry) who suffers from a severe mental illness.”	No changes – retained the original nine-factor structure. No factor analysis.	Total = 0.855; Fear = 0.896; Anger = 0.577; Help = 0.766; Dangerousness = 0.849; Avoidance = 0.730; Segregation = 0.848; Pity = 0.494; Responsibility = 0.390; Coercion = 0.478	Not reported.
Chamorro Coneo et al. (2022)	Colombian-Spanish adaptation of AQ-27, Colombia	Community sample; 271, 18–79 years (32), 67.37 % female	Reduced the number of items to 20; the process by which this was achieved is not described.	No changes – “The AQ-27 in Colombian Spanish comprised four vignettes describing the story of “Juan”, a man with a SMI. The story in each vignette was different regarding Juan's aggressiveness and causes associated with the cause and exacerbation of his symptoms.”	Factor structure unclear. No factor analysis.	Total alpha not reported. Anger = 0.81; Fear = 0.96; Helping/avoidance = 0.84; Coercion/segregation = 0.86; Responsibility = 0.60; Pity = 0.55	Not reported.
Crespo et al. (2008)a	Spanish AQ-27, Spain	Community sample; 439, mean age 39.01 years, 52.6 % female	No changes – retained 27-item AQ.	No changes – used neutral version of the vignette.	No changes – retained the original nine-factor structure. No factor analysis.	Total = 0.76; subscale alphas not reported	Not reported.
Chinese (n=2)
Chiu et al. (2021)	Modified Chinese AQ (20 items), Taiwan	Medical students; 123, mean age 21.7 years, 41.5 % female	“Due to the similarity after translation into Chinese, we extracted 20 items of the Corrigan's attribution questionnaire according to experts’ opinions for this study” – removed items 4, 12, 19, 21, 22, 24 and 26.	Modified the vignette to compare the old and new name of schizophrenia in Taiwan (“disorder with dysfunction in thought and perception”).	Items were grouped into nine subscales. Exploratory factor analysis yielded a six-factor solution.	Total (old name) = 0.83; Total (new name) = 0.82; subscale alphas not reported	Not reported.
Ho et al. (2018)	Chinese AQ-9, Hong Kong	University students; 218, 17–51 years (22.4), 67 % female	No changes – retained 9-item AQ.	No changes – “John is a single man who lives alone in an apartment and works as a clerk at a large law firm. He was diagnosed with schizophrenia. He often hears voices of unknown origin and becomes upset. He has been hospitalized for two months because of his illness”.	“Preliminary factor mixture analysis supported a one-factor structure for the scale.”	Total = 0.80; subscale alphas not reported	Not reported.
Italian (n=1)
Pingani et al. (2012)a	Italian AQ-27 (AQ-27-I), Italy	Relatives of university students; 214, 18–89 years (40.15), 52.3 % female	No changes – retained 27-item AQ.	No changes – “the vignette described ‘Harry’, a 30-year-old single man with schizophrenia”.	Confirmatory factor analysis (CFA): “Our major goal was to determine whether the Italian model mirrored the American; fit indicators were equivalent on the matter”.	Total = 0.818; Responsibility = 0.615; Pity = 0.676; Anger = 0.521; Dangerousness = 0.755; Fear = 0.912; Help = 0.814; Coercion = 0.570; Segregation = 0.801; Avoidance = 0.570	Total intraclass correlation coefficient (test-retest reliability) = 0.72; subscale ICCs ranged from 0.51 (Anger) to 0.89 (Fear)
Arabic (n=1)
Saguem et al. (2021)a	Arabic AQ, Tunisia	University students; 310, 18–29 years (22.6), 41.9 % female	Translated a 21-item version of the AQ which omitted items for segregation and coercion.	No changes reported – “The questionnaire starts with a short statement about “Harry,” a 30-year-old single man who works as a clerk in a law firm and who has been hospitalized for schizophrenia.”	Describe a seven-factor model for the 21-item Arabic translation: Responsibility, Pity, Help, Avoidance, Dangerousness, Fear, Anger. No factor analysis.	Total = 0.71; Responsibility = 0.78; Pity = 0.82; Help = 0.72; Avoidance = 0.72; Dangerousness = 0.78; Anger = 0.73; Fear = 0.74	Not reported.
Hebrew (n=1)
Romem et al. (2008)	Hebrew AQ, Israel	Third-year nursing students; 136, mean age 26.1 years, 14.7 % female	“One statement was excluded due to difficulties retaining the original meaning following translation into Hebrew.”	No changes – “the final questionnaire included vignettes about four 30-year-old men with schizophrenia, which vary in the level of danger and controllability attributed to the patient”.	Six constructs, with 3–4 items each: Responsibility, Pity, Anger, Fear, Willingness to Help, Segregation. No factor analysis.	Total alpha not reported. Subscales (pre/post intervention): Responsibility = 0.55, 0.86; Pity = 0.87, 0.83; Anger = 0.87, 0.83; Fear = 0.87, 0.82; Willingness to Help = 0.78, 0.80; Segregation = 0.84, 0.87	Not reported.
Turkish (n=1)
Akyurek et al. (2019)a	Turkish AQ-27, Turkey	Hospital visitors; 424, mean age 36.9 years, 52.1 % female	“The wording of items 4, 11, 12, 13, 14, 17, 19, 20, 22, 24, 27 were amended to preserve the original meaning, as part of the cultural adaptation process.” – all wording changes are described in full.	No changes – “Hasan is a 30-year-old single man with schizophrenia. Sometimes he hears voices and becomes upset. He lives alone in an apartment and works as a clerk at a large law firm. He had been hospitalized six times because of his illness.”	CFA indicated that the original nine-factor structure was supported.	Total = 0.88; individual items ranged from 0.866 to 0.892	Pearson correlation coefficient (for total score) = 0.793; item correlation coefficients ranged from 0.35 to 0.77
Sinhalese (n=1)
Baminiwatta et al. (2023)	Sinhalese AQ-9, Sri Lanka	Nurses; 405, mean age 39.6 years, 90.6 % female	No changes – retained 9-item AQ.	No changes – “hypothetical vignette about a man named Harry who has schizophrenia”.	N/A – “each domain in the AQ-9 was measured by only a single item”.	N/A – “each domain in the AQ-9 was measured by only a single item”.	Not reported.
Bengali (n=1)
Giasuddin et al. (2015)	Bengali 26-item Modified Corrigan Attribution Questionnaire (MCAQ), Bangladesh	First- and fifth-year medical students; 200, mean age of first years 18.9, mean age of fifth years 23.4, 59 % female	“One question from the original questionnaire was deleted: ‘If I were in charge of the treatment of Hasib, I would force him to live in a group home’, since this service option is unavailable in the country”.	No changes – “The MCAQ provides a brief vignette about Hasib, a 30-year-old single man with schizophrenia who lives alone and works as a clerk at a large private firm. He had been hospitalized six times because of his illness.”	No factor analysis.	Total = 0.71	Not reported.
Finnish (n=1)
Ihalainen-Tamlander et al. (2016)	Finnish AQ-27, Finland	Nurses; 264, mean age 48 years, 98 % female	No changes – retained 27-item AQ.	No changes – “Harry is a 30-year-old single man with schizophrenia. Sometimes he hears voices and becomes upset. He lives alone in an apartment and works as a clerk at a large law firm. He has been hospitalized six times because of his illness”.	No changes – retained the original nine-factor structure. No factor analysis.	Cronbach's alpha not reported.	Not reported.

a Studies with a primary aim of translating and validating the AQ-27.

Overall quality ratings varied widely, from 25 [47] to 54 [52] out of a maximum of 60. The Turkish AQ-27 [52] was the highest-rated translation, followed by the Italian AQ-27 [49] and Arabic AQ [50], scoring 48 and 44, respectively. All three studies were primarily focused on translation and psychometric evaluation of the AQ-27. However, two-thirds of the translation studies (n = 8) were not focused on translation of the AQ-27 as a research aim and consequently provided limited information about the translation method or framework. This significantly limited our ability to appraise the quality of the translation approach.
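As a rough illustration of how a total out of 60 can arise from item-level ratings on the 0–4 scale, the sketch below sums a set of ratings and reports the maximum attainable score. It assumes 15 rated checklist items (60 divided by 4), which is an inference rather than a detail stated here, and the ratings shown are invented for illustration and do not correspond to any included study.

```python
# Labels for the 0-4 rating scale, as given in the table note.
RATING_LABELS = {
    4: "very good",
    3: "adequate",
    2: "doubtful",
    1: "inadequate",
    0: "not reported",
}
MAX_ITEM_SCORE = 4


def total_quality(ratings):
    """Return (total score, maximum possible score) for a list of item ratings."""
    for rating in ratings:
        if rating not in RATING_LABELS:
            raise ValueError(f"rating must be an integer from 0 to 4, got {rating!r}")
    return sum(ratings), MAX_ITEM_SCORE * len(ratings)


# Hypothetical ratings for 15 checklist items (assumed count; not real data).
example_ratings = [4, 4, 3, 4, 2, 4, 3, 4, 4, 0, 3, 4, 4, 3, 4]
total, maximum = total_quality(example_ratings)
print(f"Total quality score: {total} out of {maximum}")
for item_number, rating in enumerate(example_ratings, start=1):
    print(f"  item {item_number}: {rating} ({RATING_LABELS[rating]})")
```

Listing the item-level ratings alongside the total makes it visible whether a low score reflects poor methods or simply unreported steps (ratings of 0).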

Nonetheless, the quality appraisal highlighted some key themes. Firstly, COSMIN (and indeed most translation guidelines [25]) advises that at least two forward and backward translations are completed by independent translators, so that the translations can be synthesised and any differences resolved. In the current review, most studies (n = 10, 83.3 %) had completed at least one forward and one backward translation, but only four studies [49,50,52,55] (33.3 %) had completed multiple forward and backward translations. A key limitation of the simple ‘direct and back’ method, particularly where only two translations are produced overall, is that it may focus only on linguistic equivalence while neglecting cultural considerations [24].

Questionnaire translation is a complex process which requires a combination of linguistic, cultural and subject matter expertise. As such, it is recommended that forward and backward translators have specific linguistic backgrounds and knowledge [27,33]. In the current review, many studies did not report the profiles of the translators, and three studies did not use professional translators at all, instead taking an ‘ad hoc’ approach. This included the Spanish AQ-27-E, which was the most widely adopted version within the review. While one could speculate about the possible reasons for this (e.g., lack of time or of access to professional translators), this approach is not considered sufficient to produce an accurate and equivalent translation.

A third step which is crucial to the translation process involves carrying out an expert committee review, to consolidate all versions of the questionnaire prior to pilot testing. It is recommended that the multidisciplinary committee should comprise all translators, and language, culture and subject matter experts, ideally including the original developers of the measure [27]. This ‘team-based’ approach is considered essential to establishing cross-cultural equivalence [24]. In the current review, over half of the studies (n = 7, 58.3 %) did not involve an expert committee in the translation process.

The final step of questionnaire translation is to carry out pilot testing within the target setting [27]. The purpose of this is to check respondents’ understanding of the questionnaire items. Within the review, half of the included studies (n = 6, 50 %) did not carry out pilot testing.

While the current systematic review focused on approaches to translation, rather than cross-cultural adaptation, it was interesting to note that only one study (the Turkish AQ-27) [52] referred to cultural adaptation. Akyurek et al. [52] describe in detail a multi-step adaptation method, following Beaton et al.’s [27] widely cited cross-cultural adaptation guidelines. This facilitated auditing of the translation methodology and provides increased assurance of the quality and cross-cultural equivalence of the measure.

c) What is Known About the Reliability and Validity of Translated Versions of the AQ-27?

Data were extracted relating to the reliability and validity of the translated measures, where provided. Results are shown in Table 3.

Reliability

i) Internal Consistency

Internal consistency reflects the extent to which items in a questionnaire, or its subscales, are correlated and therefore measure the same construct [35]. Cronbach's alpha (α, expressed as a number between 0 and 1) is a commonly used measure of internal consistency. Alpha values between 0.7 and 0.95 can be considered indicative of good internal consistency [35].

Eight studies (66.7 %) reported on internal consistency for the AQ-27 as a whole, and all reported values were above the threshold for acceptability. Subscale alpha values were provided for the Spanish [44], Italian [49] and Hebrew [51] translations (41.7 %, n = 5). Low alpha values were reported across several studies for the Responsibility (α = 0.39–0.615) [44,51], Pity (α = 0.494–0.676) [44,45,49] and Anger (α = 0.521–0.577) [44,49] subscales, which may indicate that some subscale items need to be revised or removed. This could be further explored by assessing the extent to which subscale items correlate with each other and with the total score [56]. Internal consistency was not assessed for the Finnish AQ-27 [55] or Sinhalese AQ-9 [53].
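For readers less familiar with the statistic, Cronbach's alpha can be computed directly from item-level responses using the standard formula α = k / (k − 1) × (1 − Σ item variances / variance of total scores), where k is the number of items. The sketch below is a minimal illustration using only the Python standard library; the function names and the response data are hypothetical and are not taken from any study in the review.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    items = list(zip(*scores))  # regroup responses item by item
    sum_item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


def in_good_range(alpha, low=0.7, high=0.95):
    """Check alpha against the 0.7-0.95 range described in the text."""
    return low <= alpha <= high


# Hypothetical responses: five respondents answering a three-item subscale.
responses = [
    [7, 6, 7],
    [3, 4, 2],
    [5, 5, 6],
    [8, 7, 9],
    [2, 3, 3],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.3f}, within 0.7-0.95: {in_good_range(alpha)}")
```

Applying the same function to each subscale separately, rather than only to the full item set, is what allows weak subscales such as those noted above to be detected.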

ii) Test-Retest Reliability

Test-retest reliability refers to the degree to which repeated measurements with the same participants under the same conditions produce consistent results [35]. The intraclass correlation coefficient (ICC) is a widely used measure of test-retest reliability [57]. Values range from 0 to 1, with values closer to 1 indicating stronger reliability.

Only two studies [49,52] (16.7 %) reported on test-retest reliability. For the Italian AQ-27 [49], both total and subscale ICCs were provided. The total ICC (0.72) was within the range for moderate reliability (0.5–0.75) [57], and subscale ICC values ranged from 0.51 (moderate) for Anger to 0.89 (approaching excellent reliability) for Fear. For the Turkish AQ-27 [52], both total and item Pearson correlation coefficients were provided as a measure of test-retest reliability. The total Pearson correlation coefficient (0.793) suggested that the Turkish AQ-27 had adequate test-retest reliability [58].
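The two test-retest statistics discussed above can be contrasted with a small worked sketch. It computes a Pearson correlation and one common form of the ICC, the two-way random-effects, absolute-agreement, single-measurement coefficient, often written ICC(2,1). Which ICC form the included studies used is not stated here, so that choice is an assumption, as are the function names and the invented scores. The 0.5–0.75 'moderate' band comes from the text; the remaining cut-offs (below 0.5 poor, 0.75–0.9 good, above 0.9 excellent) are a commonly used convention assumed for illustration.

```python
from math import sqrt


def icc_2_1(scores):
    """ICC(2,1) from rows of participants and columns of measurement occasions."""
    n = len(scores)      # number of participants
    k = len(scores[0])   # number of occasions (2 for simple test-retest)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares: participants, occasions, residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def interpret_icc(icc):
    """Label an ICC using assumed conventional bands (see lead-in)."""
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc < 0.9:
        return "good"
    return "excellent"


# Hypothetical total scores for five participants at test and retest.
test_scores = [10, 12, 15, 20, 18]
retest_scores = [11, 12, 14, 19, 20]

icc = icc_2_1([list(pair) for pair in zip(test_scores, retest_scores)])
r = pearson_r(test_scores, retest_scores)
print(f"ICC(2,1) = {icc:.3f} ({interpret_icc(icc)}), Pearson r = {r:.3f}")
```

Unlike Pearson's r, an absolute-agreement ICC is lowered by a systematic shift in scores between occasions, which is one reason the two statistics are not directly interchangeable when comparing translations.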

Validity

i) Factor Structure (Structural Validity)

Factor analysis explores the relationships between questionnaire items and the underlying dimensions of the measured construct (i.e., the factor structure) which may explain these relationships [59]. The two main forms of factor analysis are Exploratory Factor Analysis (EFA), which explores the underlying relationships between variables, and Confirmatory Factor Analysis (CFA), which assesses whether the data fit a hypothesised measurement model. The AQ-27 was originally conceptualised as having a nine-factor structure [18].

In the current review, two-thirds of the included studies (n = 8, 66.7 %) did not carry out a factor analysis. CFA was carried out for the Italian [49] and Turkish [52] AQ-27, and in both cases results supported the original nine-factor structure of the AQ-27. EFA was carried out for the 20-item Modified Chinese AQ [47], resulting in a six-factor solution. The Arabic AQ [50] was derived by translating an existing 21-item version of the AQ, and consists of a seven-factor structure. The 21-item measure excluded the Segregation subscale (items 6, 15 and 17) and Coercion subscale (items 5, 14 and 25) due to a lack of support for these subscales in previous translated versions [46].

Discussion

Since its inception in 2003, the AQ-27 has become a well-established measure of public mental illness stigma in the English language. This was the first systematic review to explore the use of translated (non-English language) versions of the AQ-27 to measure stigma towards people with schizophrenia. In Part I, we conducted a review of studies which had used a translated version of the AQ-27 in pursuit of a wider research question, and in Part II we considered in more detail the studies which had conducted a primary translation of the AQ-27. The methodological quality of the translation processes was assessed using COSMIN criteria [33], and psychometric data were reviewed.

Part I of the review identified that, to date, the AQ-27 has been translated into eleven languages and implemented across fifteen countries. As highlighted in eTable 2, it has been used in studies addressing a wide range of research questions, adopting different methodologies and sampling different populations (see also Fig. 2). Few findings from these studies can be synthesised, but it is clear that the AQ-27 is being used in diverse ways, including to determine between-group differences, to assess potential outcomes from interventions, and as an independent variable.

Regarding geographical distribution, Western Europe was heavily over-represented in the review. Most studies (63.4 %) took place in Europe, with the largest samples being obtained from Spain, Italy and Portugal. In particular, the Spanish-language literature (predominantly arising from Spain) appears relatively well advanced, which is possibly related to the fact that three separate efforts appear to have been made to develop a translated AQ-27 in Spanish (a fact that is not without its problems, as considered in more detail within Part II).

Outside of Europe, a smaller proportion of studies took place in Asia (22 %), Africa (7.3 %) and South America (7.3 %). Similar findings were reported in a previous review (1990–2012) by Yang et al. [17]. The current review therefore adds to existing literature which suggests that stigma research is overall skewed towards ‘WEIRD’ (Western, Educated, Industrialised, Rich and Democratic) countries and populations. An important implication of this is the need to avoid making assumptions about the suitability of the AQ-27 in contexts in which stigma research is less well established, i.e. many LMICs (Low- and Middle-Income Countries). In particular, it is noted that the underlying assumptions of wider Attribution Theory, on which the AQ-27 significantly draws, may not necessarily generalise directly to other cultures. One recommendation, therefore, is that future research considering cross-cultural translation and adaptation of instruments such as the AQ-27 should ideally take a more ‘bottom up’ approach, in which the underlying theory behind the measure is first developed and adapted in the relevant cultural context before the translation process begins. For researchers considering adopting the AQ-27 directly in a non-English context, consideration should be given to the cultural equivalence of the relevant underlying theoretical concepts. Additionally, factor analysis is required following development of a translated measure in order to establish the underlying factor structure.

Part II of the review considered, in more detail, the studies which had conducted a primary translation of the AQ-27 from English into another language. Overall, these studies can be divided into a smaller group (n = 4) which was primarily focused on translation and validation of the measure [44,49,50,52], and a larger group (n = 8) in which the translation had occurred in the context of a separate research question. The first group of papers appeared to have notably better rigour and quality of translation methodology. However, the rigour and quality of translation methodology did not necessarily correlate with the extent of research activity. The Spanish and Chinese papers are a case in point: these were the only languages in which more than one set of authors had developed a primary translation, yet there were (relative) gaps and important areas for improvement.

Overall, the Turkish [52], Arabic [50] and Italian [49] versions were rated highest in terms of the quality of the translation processes. While the current systematic review focused on efforts to translate (rather than culturally adapt) the AQ-27, it was interesting to note that Akyurek et al. [52] were the only authors to address cultural considerations as part of the translation and adaptation process. Future researchers wishing to adapt the AQ-27 for non-English-speaking cultures should consider using translation frameworks which incorporate cultural considerations, as this may increase the validity of the AQ-27 as a measure of mental illness stigma within the target culture. Attribution theory is likely to be implicated in cross-cultural adaptation (e.g., the extent to which respondents view mental distress as being controllable and within one's personal responsibility) and this should be considered as part of the translation and adaptation process. Akyurek et al.’s paper provides an example of how this might be achieved using Beaton et al.’s [27] cross-cultural adaptation guidelines. Additionally, researchers should be aware that translation is not equivalent to cross-cultural adaptation and therefore these terms should not be used interchangeably [25]. More widely, we hope that our approach to quality appraisal can help authors seeking to develop translated measures to identify important methodological priorities (including, for instance, pilot testing and use of committees), as well as what information to report in their manuscript.

Unfortunately, these better examples of translation can be contrasted with the majority of the other translation studies, and the review has overall identified many areas in which translation processes were weak or where insufficient information was provided to make a judgement. For instance, several studies appeared to adopt a relatively crude forward-backward translation approach, without committee involvement. It has been argued that forward-backward translation should not be relied upon exclusively as a means of producing an equivalent translation, since this may overemphasise linguistic equivalence while neglecting to account for cultural variation and idiosyncrasies [60]. Consensus within the field is that forward-backward translation should be combined with a committee or team-based approach [24]. As stated by Behr [60]:

A methods description along the lines of ‘We translated and back translated the questionnaire to check for equivalence,’ which is all too common, should not be regarded as sufficient evidence of a flawless and equivalent translation. Efforts should be directed towards ensuring quality in the translation itself – by committee or team approaches; by the involvement of suitable translation, content, and survey experts; and by thorough documentation of the translation process, including problems and intentional deviations from a source questionnaire. (Behr, 2017, p. 582)

This is reflected within cross-cultural adaptation guidelines [16,27] and quality criteria [33], which recommend that translations are reviewed by an expert committee and then pilot tested within the target cultural context. However, within the current review, over half of the included studies (58.3 %) did not involve an expert committee and half did not carry out pilot testing. Furthermore, three studies did not use professional translators. This may have implications for the quality of the data obtained using these translated measures [24,60].

Beyond this specific point, there are many more pieces of important information which translation studies should calculate and report. Very few studies provided information regarding the profiles and expertise of the translators, and most studies did not refer to any standardised translation protocol. Questionnaire translation guidelines [16,27] emphasise the importance of fully documenting each step of the translation process, to enable the quality of the translation approach to be evaluated. Whilst this may be a reflection on overall research quality, an alternative reason for failure to include these components may be authors’ concerns about adding to the length of their journal articles; authors should thus be encouraged to include such material as supplementary material or to publish it in relevant ‘open’ repositories. Such practices make comparative assessment of quality much easier, and allow the literature to build much more effectively on what has gone before.

Following translation of a measure, it is important to assess its psychometric properties in the translated language [17,27]. Again, this is an area where translation studies show significant potential for improvement, and where future authors are strongly encouraged to direct their efforts. Whilst Cronbach's alpha was reported frequently (though even here, four of the studies did not report these data at all), only four studies carried out a factor analysis, and only two studies reported on test-retest reliability. The findings suggesting poor reliability of translated versions of the AQ-27 at subscale level warrant further research.

Beyond the limitations observed in the synthesised data, it is also important to briefly reflect on the limitations inherent in the review methodology. Arguably the largest limitation is that, for pragmatic reasons, non-English publications were excluded from the systematic review. If resources had not been constrained, we would ideally have assembled a research team able to include papers in all of the relevant languages. Whilst there is some evidence to suggest that excluding non-English papers from systematic reviews may have minimal impact (since most scientific papers are published in the English language) [61], we did identify six articles which could not be included because they lacked an English translation. This suggests that future reviews of translated measures may be improved, at least modestly, if attention is given to processes which support the inclusion of non-English language papers, including, where necessary, international collaborative efforts and better inclusion of native speakers or translators.

We deliberately only sought peer-reviewed, published studies as we aimed to identify translated versions of the AQ-27 which were likely to be of a sufficient quality to be of value to future researchers. However, it is possible that the exclusion of grey literature reduced the comprehensiveness of the review. This may be an important consideration for future systematic reviews (e.g., given concerns about Western-centred biases in academic publishing) [62].

Conclusion

This systematic review provides an overview of the use of translated versions of the AQ-27, and an assessment of the methodological quality of the translation approaches. Some relatively robust translation approaches were identified (e.g., for the Turkish [52], Arabic [50] and Italian [49] adaptations), but more widely there was significant scope for improvement in the quality of translation approaches, or at least for better reporting of quality markers in published studies. We hope that the approach to consideration of quality provides a framework on which future researchers can build, and allows a reduction in duplication of research efforts. A stepwise and incremental approach to stigma research is important to reduce the likelihood of replicating the cacophonous situation in relation to stigma measures that exists in the English-speaking world.

For most translated versions, therefore, researchers should avoid making assumptions about the quality of the original translation methodology used to develop existing measures before adopting them. A poor-quality translation could potentially invalidate conclusions drawn from the data [24]. This is particularly important in light of the wider research situation involving use of the AQ-27 in non-English-speaking regions; whilst eTable 2 highlights a relatively broad range of research activity, particularly in some regions, it is a concern that the underpinning translations of the AQ-27 leave room for improvement in several ways. The research situation in Spain (and in Spanish versions more widely) is arguably a particular case in point, where research activity is most advanced, but where three translations of the AQ-27 exist, all of which appear to have room for improvement.

In future, researchers wishing to develop their own translations of the AQ-27 should be aware that a systematic and rigorous approach, based on a robust translation framework and ideally involving a committee approach, is recommended to ensure that the translated measure is valid and equivalent within the target culture [24]. A variety of translation frameworks [24,27] and quality appraisal tools [33] are available to support this. Attention should also be given to culturally inappropriate assumptions which are inherent in any underlying theory.

Considering the context much more broadly, one must remember that stigma is itself a social and cultural construction [6,63]. When considering the cross-cultural adaptation of existing stigma measures, it is important to note that many tools, including the AQ-27, were originally developed and evaluated within Western, English-speaking cultural contexts, such as the UK, USA and Australia, and based on theories that reflect Western assumptions and values [17]. Cultural adaptation is as important as linguistic adaptation, but is arguably a somewhat more elusive ambition. It is likely that this will inform the way in which mental health is conceptualised and represented, and may potentially mean that meaningful efforts to develop [62]. A report by the Lancet Commission [11] highlighted concerns that within the field of global mental health, Western, biomedical models of mental health are being extrapolated to define health, illness and treatment across diverse cultural contexts where a variety of different perspectives may be held [63]. An alternative approach could be to develop culturally specific stigma measures; Yang et al. [17] propose a ‘what matters most’ framework to guide the development of culture-specific measures, which focuses on attempting to understand how stigma threatens the activities that define personhood within the local cultural context. This approach may be better able to capture culture-specific stigma dynamics.

Declarations of interest

None.

Appendix

Supplementary materials

References

[1]

G. Thornicroft, E. Brohan, D. Rose, N. Sartorius, M. Leese.

Global pattern of experienced and anticipated discrimination against people with schizophrenia: a cross-sectional survey.

Lancet, 373 (2009), pp. 408-415

http://dx.doi.org/10.1016/S0140-6736(08)61817-6 | Medline

[2]

G. Ciciurkaite, B.A. Pescosolido.

Mental health literacy and public stigma: examining the link in 17 countries.

Med Res Arch, 12 (2024)

http://dx.doi.org/10.18103/mra.v12i7.5471

[3]

E. Goffman.

Stigma: Notes on the Management of Spoiled Identity.

Simon and Schuster, (1963)

[4]

B.G. Link, J.C. Phelan.

Conceptualizing stigma.

Annu Rev Sociol, 27 (2001), pp. 363-385

http://www.jstor.org/stable/2678626

[5]

A. Kleinman, R. Hall-Clifford.

Stigma: a social, cultural and moral process.

J Epidemiol Community Health, 63 (2009), pp. 418-419

http://dx.doi.org/10.1136/jech.2008.084277 | Medline

[6]

L.H. Yang, A. Kleinman, B.G. Link, J.C. Phelan, S. Lee, B. Good.

Culture and stigma: adding moral experience to stigma theory.

Soc Sci Med, 64 (2007), pp. 1524-1535

http://dx.doi.org/10.1016/j.socscimed.2006.11.013 | Medline

[7]

World Health Organisation. Comprehensive mental health action plan 2013-2030, https://www.who.int/publications/i/item/9789240031029; 2021 [accessed 16 August 2024].

[8]

G. Thornicroft, C. Sunkel, A. Alikhon Aliev, et al.

The Lancet Commission on ending stigma and discrimination in mental health.

Lancet, 400 (2022), pp. 1438-1480

http://dx.doi.org/10.1016/S0140-6736(22)01470-2

[9]

B.A. Pescosolido.

Stigma as a mental health policy controversy: positions, options, and strategies for change.

The Palgrave Handbook of American Mental Health Policy, http://dx.doi.org/10.1007/978-3-030-11908-9_19

[10]

A.C. Krendl, B.A. Pescosolido.

Countries and cultural differences in the stigma of mental illness: the East–West divide.

J Cross-Cultural Psychol, 51 (2020), pp. 149-167

http://dx.doi.org/10.1177/0022022119901297

[11]

V. Patel, S. Saxena, C. Lund, et al.

The Lancet Commission on global mental health and sustainable development.

Lancet, 392 (2018), pp. 1553-1598

http://dx.doi.org/10.1016/S0140-6736(18)31612-X | Medline

[12]

G. Thornicroft, N. Mehta, S. Clement, et al.

Evidence for effective interventions to reduce mental-health-related stigma and discrimination.

Lancet, 387 (2016), pp. 1123-1132

http://dx.doi.org/10.1016/S0140-6736(15)00298-6 | Medline

[13]

A.E.R. Bos, J.B. Pryor, G.D. Reeder, S.E. Stutterheim.

Stigma: advances in theory and research.

Basic Appl Soc Psych, 35 (2013), pp. 1-9

http://dx.doi.org/10.1080/01973533.2012.746147

[14]

A.B. Fox, V.A. Earnshaw, E.C. Taverna, D. Vogt.

Conceptualizing and measuring mental illness stigma: the mental illness stigma framework and critical review of measures.

Stigma Health, 3 (2018), pp. 348-376

http://dx.doi.org/10.1037/sah0000104 | Medline

[15]

G. Becker, R. Arnold.

Stigma as a social and cultural construct.

The Dilemma of Difference: A Multidisciplinary View of Stigma, pp. 39-57

[16]

F. Guillemin, C. Bombardier, D. Beaton.

Cross-cultural adaptation of health-related quality of life measures: literature review and proposed guidelines.

J Clin Epidemiol, 46 (1993), pp. 1417-1432

http://dx.doi.org/10.1016/0895-4356(93)90142-n | Medline

[17]

L.H. Yang, G. Thornicroft, R. Alvarado, E. Vega, B.G. Link.

Recent advances in cross-cultural measurement in psychiatric epidemiology: utilizing 'what matters most' to identify culture-specific aspects of stigma.

Int J Epidemiol, 43 (2014), pp. 494-510

http://dx.doi.org/10.1093/ije/dyu039 | Medline

[18]

P. Corrigan, F.E. Markowitz, A. Watson, D. Rowan, M.A. Kubiak.

An attribution model of public discrimination towards persons with mental illness.

J Health Soc Behav, 44 (2003), pp. 162-179

http://dx.doi.org/10.2307/1519806

[19]

Corrigan P.W. A toolkit for evaluating programs meant to erase the stigma of mental illness. [Internet]. 2015 May 9. Available from: https://medbox.org/pdf/5e148832db60a2044c2d549e.

[20]

P.W. Corrigan, K.J. Powell, P.J. Michaels.

Brief battery for measurement of stigmatizing versus affirming attitudes about mental illness.

Psychiatry Res, 215 (2014), pp. 466-470

http://dx.doi.org/10.1016/j.psychres.2013.12.006 | Medline

[21]

C.M. Hazell, C. Berry, L. Bogen-Johnston, M. Banerjee.

Creating a hierarchy of mental health stigma: testing the effect of psychiatric diagnosis on stigma.

BJPsych Open, 8 (2022), pp. 174

http://dx.doi.org/10.1192/bjo.2022.578

[22]

P. Robinson, D. Turk, S. Jilka, M. Cella.

Measuring attitudes towards mental health using social media: investigating stigma and trivialisation.

Soc Psychiatry Psychiatr Epidemiol, 54 (2019), pp. 51-58

http://dx.doi.org/10.1007/s00127-018-1571-5 | Medline

[23]

P.W. Corrigan.

Mental health stigma as social attribution: implications for research methods and attitude change: science and practice.

Clin Psychol, 7 (2000), pp. 48-67

http://dx.doi.org/10.1093/clipsy.7.1.48

[24]

D. Valdez, M.S. Montenegro, B.L. Crawford, R.C. Turner, W.J. Lo, K.N. Jozkowski.

Translation frameworks and questionnaire design approaches as a component of health research and practice: a discussion and taxonomy of popular translation frameworks and questionnaire design approaches.

Soc Sci Med, 278 (2021)

http://dx.doi.org/10.1016/j.socscimed.2021.113931

[25]

P. Cruchinho, M.D. López-Franco, M.L. Capelas, et al.

Translation, cross-cultural adaptation, and validation of measurement instruments: a practical guideline for novice researchers.

J Multidiscip Healthc, 17 (2024), pp. 2701-2728

http://dx.doi.org/10.2147/JMDH.S419714 | Medline

[26]

A.K. Danielsen, H.C. Pommergaard, J. Burcharth, E. Angenete, J. Rosenberg.

Translation of questionnaires measuring health related quality of life is not standardized: a literature based research study.

PLoS One, 10 (2015)

http://dx.doi.org/10.1371/journal.pone.0127050

[27]

D.E. Beaton, C. Bombardier, F. Guillemin, M.B. Ferraz.

Guidelines for the process of cross-cultural adaptation of self-report measures.

Spine, 25 (2000), pp. 3186-3191

http://dx.doi.org/10.1097/00007632-200012150-00014 | Medline

[28]

C. Acquadro, K. Conway, A. Hareendran, N. Aaronson.

Literature review of methods to translate health-related quality of life questionnaires for use in multinational clinical trials.

Value Health, 11 (2008), pp. 509-521

http://dx.doi.org/10.1111/j.1524-4733.2007.00292.x | Medline

[29]

Ethnologue: Languages of the World, 26th ed.

[30]

D. Kenny.

Human and machine translation.

Machine Translation for Everyone: Empowering Users in the Age of Artificial Intelligence, pp. 23-49 http://dx.doi.org/10.5281/zenodo.6759976

[31]

W. Liu.

The changing role of non-English papers in scholarly communication: evidence from Web of Science's three journal citation indexes.

Learn Publ, 30 (2016), pp. 115-123

http://dx.doi.org/10.1002/leap.1089

[32]

Scientific publishing has a language problem.

Nat Hum Behav, 7 (2023), pp. 1019-1020

http://dx.doi.org/10.1038/s41562-023-01679-6

[33]

Mokkink L.B., Prinsen C.A.C., Patrick D.L., et al. COSMIN study design checklist for patient-reported outcome measurement instruments; 2019. https://www.cosmin.nl/wp-content/uploads/COSMIN-study-designing-checklist_final.pdf.

[34]

J.M. Schellingerhout, M.W. Heymans, A.P. Verhagen, H.C. de Vet, B.W. Koes, C.B. Terwee.

Measurement properties of translated versions of neck-specific questionnaires: a systematic review.

BMC Med Res Methodol, 11 (2011)

http://dx.doi.org/10.1186/1471-2288-11-87

[35]

C.B. Terwee, S.D. Bot, M.R. de Boer, et al.

Quality criteria were proposed for measurement properties of health status questionnaires.

J Clin Epidemiol, 60 (2007), pp. 34-42

http://dx.doi.org/10.1016/j.jclinepi.2006.03.012 | Medline

[36]

J. Popay, H. Roberts, A. Sowden, et al.

Guidance on the Conduct of Narrative Synthesis in Systematic Reviews: A product from the ESRC Methods Programme.

Lancaster University, (2006)

https://www.lancaster.ac.uk/media/lancaster-university/content-assets/documents/fhm/dhr/chir/NSsynthesisguidanceVersion1-April2006.pdf

[37]

M. Meyers, J. Geldmacher, S. Mattausch, et al.

Stigmatisierung psychischer Erkrankung unter Schülern [Stigma of mental illness among students].

Nervenarzt, 88 (2017), pp. 1266-1272

http://dx.doi.org/10.1007/s00115-016-0189-7 | Medline

[38]

AdA Pereira, S.M.E. Santos, R.M.D. de Faria.

Versão brasileira do Attribution Questionnaire—Adaptação transcultural e validação de propriedades psicométricas [Brazilian version of the Attribution Questionnaire—Cross cultural adaptation and validation of psychometric properties].

J Bras Psiquiatr, 65 (2016), pp. 314-321

http://dx.doi.org/10.1590/0047-2085000000139

[39]

A. Rodriguez-Meirinhos, L. Antolín-Suárez.

Estigma social hacia la enfermedad mental: factores relacionados y propiedades psicométricas del Cuestionario de Atribuciones-revisado [Social stigma towards mental illness: related factors and psychometric properties of the revised-Attribution Questionnaire].

Univ Psychol, 19 (2020), pp. 1-13

http://dx.doi.org/10.11144/Javeriana.upsy19.esem

[40]

J. Saavedra, L. Murvartian.

Estigma público en salud mental en la universidad [Mental health public stigma at the university].

Univ Psychol, 20 (2021), pp. 1-15

http://dx.doi.org/10.11144/Javeriana.upsy20.epsm

[41]

A. Liberati, D.G. Altman, J. Tetzlaff, et al.

The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration.

Ann Intern Med, 151 (2009), pp. 65-94

http://dx.doi.org/10.1016/j.jclinepi.2009.06.006

[42]

L.S. Ayuso, A. Ruiz-Hontangas, J.J.G. Cervantes, et al.

The promotion of mental health and prevention of first-episode psychosis: a pilot and feasibility non-randomised clinical trial.

Int J Environ Res Public Health, 20 (2023), pp. 7087

http://dx.doi.org/10.3390/ijerph20227087 | Medline

[43]

C. González-Sanguino, M. Muñoz, M.A. Castellanos, E. Pérez-Santos, T. Orihuela-Villameriel.

Study of the relationship between implicit and explicit stigmas associated with mental illness.

Psychiatry Res, 272 (2019), pp. 663-668

http://dx.doi.org/10.1016/j.psychres.2018.12.172

[44]

M. Muñoz, A.I. Guillén, E. Pérez-Santos, P.W. Corrigan.

A structural equation modeling study of the Spanish mental illness stigma Attribution Questionnaire (AQ-27-E).

Am J Orthopsychiatr, 85 (2015), pp. 243-249

http://dx.doi.org/10.1037/ort0000059

[45]

A. Chamorro Coneo, E. Aristizabal, O. Hoyos de los Rios, D. Aguilar.

Danger appraisal and pathogen-avoidance mechanisms in stigma towards severe mental illness: the mediating role of affective responses.

BMC Psychiatry, 22 (2022)

http://dx.doi.org/10.1186/s12888-022-03951-x

[46]

M. Crespo, E. Pérez-Santos, M. Muñoz, A.I. Guillén.

Descriptive study of stigma associated with severe and persistent mental illness among the general population of Madrid (Spain).

Community Ment Health J, 44 (2008), pp. 393-403

http://dx.doi.org/10.1007/s10597-008-9142-y | Medline

[47]

Y.H. Chiu, M.Y. Kao, K.K. Goh, C.Y. Lu, M.L. Lu.

Renaming schizophrenia and stigma reduction: a cross-sectional study of nursing students in Taiwan.

Int J Environ Res Public Health, 19 (2022), pp. 3563

http://dx.doi.org/10.3390/ijerph19063563

[48]

A.H.Y. Ho, T.C.T. Fong, J.S. Potash, V.F.L. Ho, E.Y.H. Chen, R.T.H. Ho.

Deconstructing patterns of stigma toward people living with mental illness.

Soc Work Res, 42 (2018), pp. 302-312

http://dx.doi.org/10.1093/swr/svy022

[49]

L. Pingani, M. Forghieri, S. Ferrari, et al.

Stigma and discrimination toward mental illness: translation and validation of the Italian version of the Attribution Questionnaire-27 (AQ-27-I).

Soc Psychiatry Psychiatr Epidemiol, 47 (2012), pp. 993-999

http://dx.doi.org/10.1007/s00127-011-0407-3 | Medline

[50]

B.N. Saguem, M. Gharmoul, A. Braham, S.B. Nasr, S. Qin, P. Corrigan.

Stigma toward individuals with mental illness: validation of the Arabic version of the Attribution Questionnaire in a university student population.

J Public Ment Health, 20 (2021), pp. 201-209

http://dx.doi.org/10.1108/JPMH-10-2020-0135

[51]

P. Romem, O. Anson, Y. Kanat-Maymon, R. Moisa.

Reshaping students' attitudes toward individuals with mental illness through a clinical nursing clerkship.

J Nurs Educ, 47 (2008), pp. 396-402

http://dx.doi.org/10.3928/01484834-20080901-01

[52]

G. Akyurek, A. Efe, H. Kayihan.

Stigma and discrimination towards mental illness: translation and validation of the Turkish version of the Attribution Questionnaire-27 (AQ-27-T).

Community Ment Health J, 55 (2019), pp. 1369-1376

http://dx.doi.org/10.1007/s10597-019-00438-0 | Medline

[53]

A. Baminiwatta, H. Alahakoon, N.C. Herath, K.M. Kodithuwakku, T. Nanayakkara.

Trait mindfulness, compassion, and stigma towards patients with mental illness: a study among nurses in Sri Lanka.

Mindfulness (N Y), 14 (2023), pp. 979-991

http://dx.doi.org/10.1007/s12671-023-02108-5

[54]

N.A. Giasuddin, I. Levav, G. Gal.

Mental health stigma and attitudes to psychiatry among Bangladeshi medical students.

Int J Soc Psychiatry, 61 (2015), pp. 137-147

http://dx.doi.org/10.1177/0020764014537237 | Medline

[55]

N. Ihalainen-Tamlander, A. Vähäniemi, E. Löyttyniemi, T. Suominen, M. Välimäki.

Stigmatizing attitudes in nurses towards people with mental illness: a cross-sectional study in primary settings in Finland.

J Psychiatr Ment Health Nurs, 23 (2016), pp. 427-437

http://dx.doi.org/10.1111/jpm.12319 | Medline

[56]

M. Tavakol, R. Dennick.

Making sense of Cronbach's alpha.

Int J Med Educ, 2 (2011), pp. 53-55

http://dx.doi.org/10.5116/ijme.4dfb.8dfd | Medline

[57]

T.K. Koo, M.Y. Li.

A guideline of selecting and reporting intraclass correlation coefficients for reliability research.

J Chiropr Med, 15 (2016), pp. 155-163

http://dx.doi.org/10.1016/j.jcm.2016.02.012 | Medline

[58]

C.P. Dancey, J. Reidy.

Statistics Without Maths For Psychology.

7th ed., Pearson, (2017)

[59]

M. Tavakol, A. Wetzel.

Factor analysis: a means for theory and instrument development in support of construct validity.

Int J Med Educ, 11 (2020), pp. 245-247

http://dx.doi.org/10.5116/ijme.5f96.0f4a | Medline

[60]

D. Behr.

Assessing the use of back translation: the shortcomings of back translation as a quality testing method.

Int J Soc Res Methodol, 20 (2017), pp. 573-584

http://dx.doi.org/10.1080/13645579.2016.1252188

[61]

B. Nussbaumer-Streit, I. Klerings, A.I. Dobrescu, et al.

Excluding non-English publications from evidence-syntheses did not change conclusions: a meta-epidemiological study.

J Clin Epidemiol, 118 (2020), pp. 42-54

http://dx.doi.org/10.1016/j.jclinepi.2019.10.011 | Medline

[62]

S. Oliveira, E. Baggs.

Psychology's WEIRD Problems.

Cambridge University Press, (2023), http://dx.doi.org/10.1017/9781009303538

[63]

Johnstone L., Boyle M., Cromby J., et al. The Power Threat Meaning Framework: towards the identification of patterns in emotional distress, unusual experiences and troubled or troubling behaviour, as an alternative to functional psychiatric diagnosis, https://www.bps.org.uk/guideline/power-threat-meaning-framework-full-version; 2018 [accessed 16 August 2024].
