Revista de Psiquiatría y Salud Mental (English Edition)
Journal Information
Vol. 5. Issue 1.
Pages 24-36 (January - March 2012)
Original article
DOI: 10.1016/j.rpsmen.2012.04.005
Brief Suicide Questionnaire. Inter-rater reliability
Protocolo breve de evaluación del suicidio: fiabilidad interexaminadores
Rebeca García-Nieto (a), Isabel Parra Uribe (b), Diego Palao (b), Jorge Lopez-Castroman (a), Pilar Alejandra Sáiz (c), María Paz García-Portilla (c), Jerónimo Saiz Ruiz (d), Angela Ibañez (d), Thais Tiana (e), Santiago Durán Sindreu (e), Victor Perez Sola (e), Yolanda de Diego-Otero (f), Lucia Pérez-Costillas (f), Rafael Fernández García-Andrade (g), Dolores Saiz-González (g), Miguel Angel Jiménez Arriero (h), Mercedes Navío Acosta (h), Lucas Giner (i), Julio Antonio Guija (i,j), José Luis Escobar (k), Jorge Antonio Cervilla (k,l), Marta Quesada (m), Dolores Braquehais (m), Hilario Blasco-Fontecilla (a), Teresa Legido-Gil (a), Fuensanta Aroca (n), Enrique Baca-García (a,*), GEICS
* Corresponding author: ebaca@fjd.es
a Fundación Jiménez Díaz, Madrid, CIBERSAM, Spain
b Corporación Sanitaria Universitaria Parc Taulí de Sabadell, Barcelona, Spain
c Universidad de Oviedo, CIBERSAM, Spain
d Hospital Universitario Ramón y Cajal, Madrid, IRYCIS, CIBERSAM, Spain
e Hospital de la Santa Creu i Sant Pau, Barcelona, CIBERSAM, Spain
f Hospital Carlos Haya and Fundación IMABIS, Málaga, Spain
g Hospital Universitario Clínico San Carlos, Madrid, CIBERSAM, Spain
h Hospital 12 de Octubre, Madrid, CIBERSAM, Spain
i Universidad de Sevilla, Spain
j Instituto de Medicina Legal, Sevilla, Spain
k Unidad de Salud Mental, Hospital Universitario San Cecilio, Granada, Spain
l Departamento de Psiquiatría, Universidad de Granada, CIBERSAM, Spain
m Hospital Universitario Vall d’Hebron, CIBERSAM, Spain
n Instituto de Matemáticas, Universidad Nacional Autónoma de México, Mexico
Abstract
Introduction

Inter-rater agreement is a crucial aspect in the planning and performance of a clinical trial in which the main assessment tool is the clinical interview. The main objectives of this study are to assess the inter-rater agreement of a tool for the assessment of suicidal behaviour (Brief Suicide Questionnaire) and to examine whether having multiple raters assess a single subject is an efficient method for evaluating the reliability of an instrument.

Method

In the context of designing a multicentre clinical trial, 32 psychiatrists assessed a videotaped clinical interview of a patient with suicidal behaviour. To identify the items with the greatest discordance and to detect the raters whose ratings differed significantly from the average ratings, we used the DOMENIC method (Detection Of Multiple Examiners Not In Consensus).

Results

Inter-rater agreement ranged from poor (<70%) to excellent (90–100%). In Brugha's List of Threatening Experiences it ranged from 48.4% to 97%; in the DSM-IV psychosocial problems scale, from 75.5% to 100%; in the Global Assessment of Functioning (GAF) scale it was 82.58%; in Beck's Suicide Intent Scale it ranged from 67.5% to 97%; in Beck's Scale for Suicide Ideation, from 63.5% to 100%; and in the Lethality Rating Scale it was 88.39%. On the whole, the level of agreement among raters, both in overall scores and in particular items, was appropriate.

Conclusion

The proposed design allows inter-rater agreement to be assessed efficiently (in a single session). In addition, inter-rater agreement for the Brief Suicide Questionnaire was appropriate.

Keywords:
Clinical trials
Clinical rating scales
Statistics
Inter-rater agreement
Psychometrics
Suicide attempt
Suicide
Resumen
Introduction

Inter-rater agreement is a fundamental aspect in planning any research study in which the main diagnostic tool is the clinical interview. The aim of this study is to assess the inter-rater agreement of a tool for assessing suicidal behaviour (Brief Suicide Questionnaire) using the ratings of multiple observers in a single session.

Method

During the pilot phase of a multicentre clinical study focused on monitoring suicide attempts, 32 raters assessed the video of a clinical interview with a simulated patient with suicidal behaviour. To identify the items with the greatest discordance and the raters whose judgement deviated most from the general agreement, the DOMENIC method (Detection Of Multiple Examiners Not In Consensus) was used.

Results

Inter-rater agreement ranged from poor (<70%) to excellent (90–100%). On the List of Threatening Experiences, agreement ranged from 48.4% to 97%; on the DSM-IV psychosocial problems scale, from 75.5% to 100%; on the Global Assessment of Functioning scale it was 82.58%; on the Suicide Intent Scale it ranged from 67.5% to 97%; on the Scale for Suicide Ideation, from 63.5% to 100%; and on the Lethality Rating Scale it was 88.39%. In general, the raters showed an appropriate level of agreement both in the overall scores of each scale and in each individual item.

Conclusions

The proposed design allows inter-rater agreement to be assessed efficiently (in a single session). In addition, inter-rater agreement for the Brief Suicide Questionnaire was appropriate.

Keywords:
Clinical trials
Clinical scales
Statistics
Inter-rater agreement
Psychometrics
Suicide attempts
Suicide
Introduction

Suicidal behaviour is the main cause of health resources use and mortality worldwide, especially among young people,1 and is a public health priority for the European Union. Suicidal behaviour (ideation, attempts, completed suicide) is heterogeneous due to the complex interaction of genetic, biological, psychological and environmental factors.2,28 Research on suicidal behaviour is limited by the difficulties involved in evaluating these aspects, which is why it is often studied as subordinate to axis I diagnoses (affective disorders and substance dependence) or axis II diagnoses (borderline personality disorder) without specific assessment tools, even though its clinical and health impact warrants treating it as an independent nosological entity.3

The gold standard for assessing suicidal behaviour is currently clinical assessment.4 However, using protocols and scales has proven very useful in improving the way information is documented and in increasing the thoroughness of clinical evaluation.5 The fact that clinical protocols and scales are used can also be of legal value and serve as a basis for making clinical decisions.6,7 Some recent studies, however, have revealed that the documents that accompany suicidal behaviour assessment are deficient in our environment.8,9 The Spanish group for suicidal behaviour research (GEICS is the Spanish acronym), aware of this situation, has designed a brief suicide assessment questionnaire, which includes the most widely used scales to assess the range of suicidal behaviour, from ideation up to suicide attempts,9 and examines the most important risk and protective factors (Appendix A).

To construct the brief questionnaire for suicide assessment, we used the scales (preferably self-administered) most widely used in the suicidology literature of the past 40 years. We also used questions covering the socio-demographic factors with the best descriptive and predictive capability.7

One of the essential requirements for assessment tools is their reproducibility.10 This notion overlaps with that of agreement, and the two are used interchangeably to refer both to consistency measures (reliability, reproducibility, repeatability), which concern the agreement between several measurements none of which is the “correct” one, and to conformity measures (validity, accuracy), which concern the agreement between one measurement and another acting as a reference.11 The prototypical design for testing inter-rater reliability uses a small number of independent raters (generally 2) who evaluate a large sample of subjects (more than 30). Reliability is measured using the Kappa coefficient, the weighted Kappa or the intraclass correlation coefficient, depending on whether the tool to be evaluated is a nominal qualitative, ordinal qualitative or quantitative scale.12–16 Using these indexes requires a sample larger than a single subject to perform the reliability study appropriately, given that chance agreement cannot be calculated with a sample of a single patient. Their statistical power depends as much on the number of raters as on the number of subjects, which is a very significant limitation in terms of resources.17
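
The limitation of Kappa with a single subject can be illustrated with a minimal sketch of Cohen's Kappa for 2 raters on a nominal scale. The function name `cohen_kappa` and the example ratings are hypothetical and are not taken from the study; the sketch only shows that, when both raters agree on a single subject, the observed agreement and the chance agreement are both equal to 1 and the coefficient is undefined.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters scoring the same subjects on a nominal scale.

    Returns None when chance agreement equals 1, because kappa is then
    undefined (0 divided by 0).
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must score the same non-empty set of subjects")
    n = len(ratings_a)
    # Observed agreement: proportion of subjects given the same category.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over categories.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return None
    return (observed - expected) / (1.0 - expected)


# Hypothetical example: two raters, six subjects, one dichotomous item.
rater_a = ["yes", "yes", "no", "no", "yes", "no"]
rater_b = ["yes", "no", "no", "no", "yes", "yes"]
print(cohen_kappa(rater_a, rater_b))   # observed 4/6, chance 0.5

# A single subject on which both raters agree: kappa cannot be computed.
print(cohen_kappa(["yes"], ["yes"]))   # None
```

With several subjects the marginal frequencies give a usable estimate of chance agreement; with one subject they do not, which is why a different index is needed for the single-case design described next.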

To estimate the inter-rater agreement of the instrument for assessing suicidal behaviour, we used the strategy of a single case evaluated by multiple researchers. To do so, we used the method proposed by Cicchetti et al.,17 which generates indexes (interpretable both clinically and statistically) for assessing overall rater agreement on each of the items in the scales. It also identifies the raters who diverge from the overall agreement (understood as the mean score, given that no prior reference standard is assumed).

Method

Raters

In this study, 32 raters (psychiatrists and clinical psychologists with at least 2 years of training) participated. They assessed a video-recorded clinical interview of a prototype case, recorded in a single session. Before the interview, they received a brief explanation of the tool and each of the scales it included. This audiovisual format has been used in evaluating the reliability of assessment tools in psychiatry18 and, although it generally yields lower agreement than clinical histories, it is closer to reality and more economical than repeating multiple individual interviews.19

The interview was carried out by 2 of the study participants (LG and JAG). By using this system, we attempted to minimise the factors related to the interview and to the patient that affect any reliability study, given that having a sample from a single patient makes this source of variability disappear.20 Identifying the factors related to the raters was one of the study objectives.

Measurement tools

The different investigative groups designed an assessment questionnaire that examined the following suicidal behaviour-related variables: triggers (stressful life events, psychosocial problems), functionality (previous activity level), objective circumstances related to the suicide attempts, characteristics of suicidal ideation and lethality of the suicide attempt. In addition to examining clinical and socio-demographic data, our brief questionnaire (Appendix A) included the following tools, all translated to Spanish6:

List of threatening experiences (LTE)21

This is an inventory examining the life events experienced by the patient in the last 6 months. It consists of 12 dichotomous items that allow only 2 responses (present/absent).

DSM-IV-TR. Psychosocial problems22

Using this tool, we gathered information on the psychosocial and environmental problems that had been present in the 6 previous months, as described in the DSM-IV (APA, 2000).

Global Assessment of Functioning (GAF)

The Global Assessment of Functioning (GAF) is a clinician-rated tool, proposed by the DSM-III-R (APA, 1987),23 which evaluates the subject's general level of psychological, social and occupational functioning. The scores on this scale range from 0 to 100, in 10-point intervals. The scale is scored based on the overall activity before the suicide attempt.

Beck's Suicide Intent Scale (SIS)24

This clinician-administered tool for assessing the characteristics of suicide intent (SI) consists of 2 subscales. The first groups the objective circumstances in which the suicide attempt was carried out; the second evaluates the patient's attitude towards life and death and how the patient sees the attempt. For this study, we used the first section, which examines the objective circumstances related to the intention of suicide attempts.25 This section comprises 15 items, each scored from 0 to 2. In the studies performed to validate the scale, the mean score for highly serious SI was 16.3; for SI of average seriousness, 10.1; and for low seriousness, 6.7.25 In a later study by Baca-García et al.,7 a cut-off point of 11 was established for distinguishing the patients who, following the suicide attempt, required admission to a psychiatric unit from those who did not.

Scale for Suicidal Ideation (SSI)26

This is a scale that quantifies and assesses the seriousness of suicidal thought, or degree of seriousness and intensity with which someone is thinking about killing themselves. It is a scale of 19 items that have to be filled in by a rater in a semi-structured clinical interview. Divided into 4 sections, it gathers a series of characteristics related to attitude towards life/death, suicidal thoughts or desires, planning the suicide attempt and performing the planned attempt. In the last section, previous suicide attempts are examined. There are 3 alternative answers for each item, indicating an increasing degree of seriousness and/or intensity of the suicidal intentionality.

Lethality Rating Scale27

The suicide attempt method used was coded according to the Lethality Rating Scale and Method Attempt Coding (LRS), which evaluates the various methods utilised and also examines the medical consequences of the attempt.

Statistical analysis

We based the process followed for our statistical analysis on the method proposed by Cicchetti et al.16 In it, global agreement is defined according to the partial agreement levels (the shorter the distance between scores, the greater the agreement). Specifically, the following indexes were calculated:

Normal overall level of inter-rater agreement. This measurement indicates the global agreement of all the raters. The reference values for its interpretation are the following: excellent agreement (a score of 90–100), good (80–89), weak (70–79; labelled “Normal” in the tables) and poor agreement (less than 70).

Agreement level of each rater individually. Raters who gave the same score, and who therefore had the same degree of agreement, were grouped together. For each group we evaluated the agreement level clinically and statistically, using the agreement index and the Z score (which indicates the deviation of each rater from the consensus value, in this case the mean of the scores).

To identify the items for which there was greater discordance and the raters with a low inter-rater reliability, we used the Detection of multiple examiners not in consensus (DOMENIC)17 method.
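
The indexes described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the text does not give the weighting formula, so the sketch assumes linear partial agreement (1 for identical scores, 0 for scores at opposite ends of the scale), counts a blank answer as zero agreement, computes Z from the mean and population standard deviation of the scores, and derives a two-sided P from the normal distribution. All function names are hypothetical.

```python
import math


def partial_agreement(a, b, lo, hi):
    """Linear partial credit (assumed): 1.0 for identical scores, 0.0 at opposite ends."""
    return 1.0 - abs(a - b) / (hi - lo)


def rater_agreement(scores, lo, hi):
    """Mean partial agreement of each rater with every other rater.

    `scores` holds one score per rater; None marks an item left blank, which
    this sketch counts as zero agreement (an assumption).
    """
    n = len(scores)
    indexes = []
    for i, a in enumerate(scores):
        if a is None:
            indexes.append(0.0)
            continue
        total = sum(partial_agreement(a, b, lo, hi)
                    for j, b in enumerate(scores)
                    if j != i and b is not None)
        indexes.append(total / (n - 1))
    return indexes


def agreement_level(percent):
    """Reference values given in the text: excellent, good, weak, poor."""
    if percent >= 90:
        return "excellent"
    if percent >= 80:
        return "good"
    if percent >= 70:
        return "weak"
    return "poor"


def z_and_p(scores):
    """Map each observed score to (Z, two-sided P) relative to the mean score."""
    rated = [s for s in scores if s is not None]
    mean = sum(rated) / len(rated)
    sd = math.sqrt(sum((s - mean) ** 2 for s in rated) / len(rated))
    if sd == 0:
        return {}  # unanimous ratings: no deviation to test
    return {s: ((s - mean) / sd, math.erfc(abs(s - mean) / sd / math.sqrt(2)))
            for s in set(rated)}


# Hypothetical item on a 0-2 scale: 29 raters score 0 and 2 raters score 1.
scores = [0] * 29 + [1] * 2
per_rater = rater_agreement(scores, lo=0, hi=2)
overall = 100 * sum(per_rater) / len(per_rater)
print(round(100 * per_rater[0], 2), round(100 * per_rater[-1], 2), round(overall, 2))
print(agreement_level(overall))
print(z_and_p(scores))
```

For this 29-versus-2 split the sketch gives individual agreement of 96.67% and 51.67%, an overall mean of 93.76%, and Z values of about −0.26 and 3.81, which coincide with the figures reported for SSI item 1 if the two groups are assumed to have adjacent scores. It does not reproduce every entry in the tables, so the weighting should be read as illustrative.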

Results

Stressful life events

The overall mean for inter-rater agreement for each of the items ranged from 48.4% to 97% (Table 1). The agreement level principally fell between good (80%–89%) and excellent (90%–100%), except for items 6, 7, 8 and 10 (6. You have broken off a stable relationship; 7. You have had a serious problem with some close friend, neighbour or relative; 8. You have become unemployed or have looked for employment for over a month without success; and 10. You have had a serious economic crisis.) (Table 2).

Table 1.

Agreement on the various items of each assessment tool.

| Scale | Item | Overall mean inter-rater agreement (%) | Agreement level |
|---|---|---|---|
| Brugha | 1. You yourself have suffered an illness, injury or serious assault. | 100.00 | Excellent |
| | 2. A close relative has suffered an illness, injury or serious assault. | 100.00 | Excellent |
| | 3. One of your parents or children or your partner/spouse has died. | 93.55 | Excellent |
| | 4. A close family friend or some other relative (grandparents, aunts, uncles, cousins) has died. | 100.00 | Excellent |
| | 5. You have separated because of marital problems. | 81.94 | Good |
| | 6. You have broken off a stable relationship. | 60.65 | Poor |
| | 7. You have had a serious problem with some close friend, neighbour or relative. | 62.58 | Poor |
| | 8. You have become unemployed or have looked for employment for over a month without success. | 54.84 | Poor |
| | 9. You have been fired from your job. | 87.53 | Good |
| | 10. You have had a serious economic crisis. | 48.39 | Poor |
| | 11. You have had problems with the police or have appeared in court. | 81.94 | Good |
| | 12. You have been robbed or have lost a valuable item. | 93.55 | Excellent |
| DSM-IV | Problems with the primary support group | 93.01 | Excellent |
| | Social environment | 100.00 | Excellent |
| | School | 87.31 | Good |
| | Work | 89.35 | Good |
| | Home | 83.39 | Good |
| | Finances | 81.45 | Good |
| | Access to health services | 81.29 | Good |
| | Legal system | 83.39 | Good |
| | Other psychosocial problems | 75.48 | Normal |
| GAF | Scale assessment previous overall activity | 82.58 | Good |
| SIS | 1. Isolation | 100.00 | Excellent |
| | 2. Time | 70.62 | Normal |
| | 3. Precautions against discovery/intervention | 67.57 | Poor |
| | 4. Actions to obtain help during and after the attempt | 66.88 | Poor |
| | 5. Final actions, anticipating death (e.g., insurance policies, gifts, will) | 86.02 | Good |
| | 6. Active preparation for the attempt | 69.46 | Poor |
| | 7. Suicide note | 81.89 | Good |
| | 8. Communication of intention before the attempt | 67.20 | Poor |
| | 9. Intention of the attempt | 71.78 | Normal |
| | 10. Expectations about the fatal result | 91.96 | Excellent |
| | 11. Knowledge about the lethality of the method | 89.12 | Good |
| | 12. Seriousness of the attempt | 74.84 | Normal |
| | 13. Attitude towards life/death | 64.13 | Poor |
| | 14. Conception of the medical intervention | 86.15 | Good |
| | 15. Degree of premeditation | 67.96 | Poor |
| | Total (cut-off point=11) | 87.53 | Good |
| SSI | 1. Desire to live | 93.76 | Excellent |
| | 2. Desire to die | 67.89 | Poor |
| | 3. Reasons for living/dying | 100.00 | Excellent |
| | 4. Desire to actively attempt suicide | 93.76 | Excellent |
| | 5. Passive suicide attempt | 77.42 | Normal |
| | 6. Duration of the suicidal ideation/desire | 63.51 | Poor |
| | 7. Frequency of the suicidal ideation/desire | 65.48 | Poor |
| | 8. Attitude towards the suicidal ideation/desire | 87.53 | Good |
| | 9. Control over the suicidal act: acting-out/desire | 87.53 | Good |
| | 10. Deterrents (“brakes”) against making an active attempt (family, consequences if not completed) | 74.80 | Normal |
| | 11. Reasons for the planned attempt | 84.84 | Good |
| | 12. Method: specificity/planning | 74.62 | Normal |
| | 13. Method: availability/opportunity | 88.73 | Good |
| | 14. Feeling of “capability to perform the attempt” | 84.84 | Good |
| | 15. Expectancy/anticipation of the attempt itself | 87.53 | Good |
| | 16. Real preparation | 64.62 | Poor |
| | 17. Suicide note | 88.73 | Good |
| | 18. Last arrangements to prepare for death (insurance policies, will, donations, etc.) | 87.53 | Good |
| | 19. Deception/Hiding the planned attempt | 69.38 | Poor |
| | 20. Previous suicide attempts | 93.76 | Excellent |
| | 21. Intention of dying related with the last attempt | 91.35 | Excellent |
| LRS | Lethality of the suicide attempt | 88.39 | Good |
Table 2.

Inter-rater agreement on the Global Assessment of Functioning (GAF) scale.

| Score | No. | Mean inter-rater agreement | Clinical significance | Z Value | P | Raters who gave the same score |
|---|---|---|---|---|---|---|
| | 2 | 0.24 | Poor | −3.53 | <0.001 | 2, 3 |
| | 3 | 0.80 | Good | −0.54 | 0.59 | 12, 18, 23 |
| | 14 | 0.89 | Good | 0.06 | 0.95 | 5, 6, 7, 11, 13, 17, 19, 22, 24, 26, 27, 30, 31, 32 |
| | 12 | 0.86 | Good | 0.65 | 0.51 | 4, 8, 9, 10, 14, 15, 16, 20, 21, 25, 28, 29 |
DSM-IV. Psychosocial problems

The overall mean for inter-rater agreement for each of the items ranged between 75.5% and 100% (Table 1). The agreement level was mainly good to excellent, except for the item “Other psychosocial problems”, in which agreement was weak.

Global Assessment of Functioning (GAF) for previous overall activity

Score agreement on this scale was good (82.58%). Only 2 raters (numbers 2 and 3) showed poor agreement, differing significantly (P<.001) from the mean of the total scores (reference pattern) (Table 2).

Beck's Suicide Intent Scale Part I: objective circumstances related to the intention of suicide

The overall mean for inter-rater agreement in each of the items ranged from 67.5% to 97% (Table 1). The agreement level for most of the items varied from good to excellent, except for the items 3, 4, 6, 8, 13 and 15 (3. Precautions against discovery/intervention; 4. Actions to obtain help during and after the attempt; 6. Active preparation for the attempt; 8. Communication of intention before the attempt; 13. Attitude towards life/death; and 15. Degree of premeditation), for which significant divergence was detected. The raters whose scores differed most from the others, in each of the items, were raters 5 and 7 (Table 3). Agreement with the total scale score, using a cut-off point of 11, was good (87.5%).

Table 3.

Inter-rater agreement on the items in Beck's Suicide Intent Scale Part I: objective circumstances related to suicide attempt.

| Item | Score | No. | Mean inter-rater agreement | Clinical significance | Z Value | P | Raters who gave the same score |
|---|---|---|---|---|---|---|---|
| SIS1 | | 31 | 100.00 | Excellent | | | 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32 |
| SIS2 | | 2 | 47.33 | Poor | −2.65 | 0.01 | 4, 25 |
| | | 24 | 83.33 | Good | 0.00 | 1.00 | 3, 5, 7, 8, 10, 11, 12, 13, 14, 16, 17, 18, 19, 20, 22, 23, 24, 26, 27, 28, 29, 30, 31, 32 |
| | | 2 | 47.33 | Poor | 2.65 | 0.01 | 2, 21 |
| SIS3 | | 2 | 55.33 | Poor | −2.37 | 0.02 | 23, 27 |
| | | 12 | 66.67 | Poor | −0.75 | 0.45 | 3, 4, 8, 9, 12, 13, 15, 18, 19, 21, 25, 30 |
| | | 16 | 74.00 | Normal | 0.86 | 0.39 | 2, 5, 6, 7, 10, 11, 14, 16, 17, 20, 24, 26, 28, 29, 31, 32 |
| SIS4 | | 19 | 76.67 | Normal | −0.73 | 0.47 | 3, 4, 6, 8, 9, 10, 13, 14, 15, 16, 19, 20, 21, 22, 25, 26, 28, 30, 31 |
| | | 10 | 61.67 | Poor | 1.38 | 0.17 | 2, 5, 7, 11, 17, 23, 24, 27, 29, 32 |
| SIS5 | | 26 | 91.67 | Excellent | −0.44 | 0.66 | 2, 3, 4, 5, 6, 8, 10, 11, 12, 14, 16, 17, 18, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32 |
| | | 5 | 56.67 | Poor | 2.28 | 0.02 | 7, 9, 13, 15, 19 |
| SIS6 | | 14 | 70.00 | Normal | −1.07 | 0.29 | 2, 4, 8, 10, 11, 12, 13, 14, 16, 17, 18, 19, 20, 22 |
| | | 16 | 73.33 | Normal | 0.94 | 0.35 | 3, 5, 6, 7, 9, 15, 21, 23, 24, 25, 27, 28, 29, 30, 31, 32 |
| SIS7 | | 23 | 88.67 | Good | −0.57 | 0.57 | 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 16, 18, 20, 21, 22, 23, 25, 26, 28, 29, 30, 31, 32 |
| | | 2 | 51.67 | Poor | 0.69 | 0.49 | 13, 19 |
| | | 6 | 66.00 | Poor | 1.94 | 0.05 | 9, 11, 15, 17, 24, 27 |
| SIS8 | | 15 | 72.33 | Normal | −0.92 | 0.36 | 3, 4, 5, 7, 10, 11, 12, 13, 14, 16, 17, 18, 19, 20, 23 |
| | | 13 | 68.33 | Poor | 0.70 | 0.48 | 2, 6, 21, 22, 24, 25, 26, 27, 28, 29, 30, 31, 32 |
| | | 2 | 55.00 | Poor | 2.33 | 0.02 | 9, 15 |
| SIS9 | | 14 | 72.33 | Normal | −1.01 | 0.31 | 2, 3, 4, 5, 8, 11, 12, 17, 18, 23, 24, 26, 27, 30 |
| | | 15 | 73.33 | Normal | 0.64 | 0.52 | 6, 7, 9, 10, 14, 15, 16, 20, 21, 22, 25, 28, 29, 31, 32 |
| | | 2 | 56.33 | Poor | 2.29 | 0.02 | 13, 19 |
| SIS10 | | 2 | 61.00 | Poor | −3.58 | 0.00 | 3, 4 |
| | | 1 | 50.00 | Poor | −1.63 | 0.10 | 29 |
| | | 28 | 95.67 | Excellent | 0.31 | 0.75 | 2, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 30, 31, 32 |
| SIS11 | | 2 | 60.67 | Poor | −3.38 | 0.00 | 13, 19 |
| | | 2 | 51.67 | Poor | −1.51 | 0.13 | 5, 24 |
| | | 27 | 94.00 | Excellent | 0.36 | 0.72 | 2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 20, 21, 22, 23, 25, 26, 27, 28, 29, 30, 31, 32 |
| SIS12 | | 13 | 70.00 | Normal | −1.18 | 0.24 | 3, 4, 8, 11, 12, 13, 17, 18, 19, 22, 26, 28, 31 |
| | | 18 | 78.33 | Normal | 0.85 | 0.40 | 2, 5, 6, 7, 9, 10, 14, 15, 16, 20, 21, 23, 24, 25, 27, 29, 30, 32 |
| | | | | | | 1.00 | |
| SIS13 | | 6 | 60.67 | Poor | −1.60 | 0.11 | 5, 11, 13, 17, 19, 30 |
| | | 12 | 66.67 | Poor | −0.27 | 0.79 | 2, 6, 12, 14, 18, 20, 21, 22, 23, 25, 26, 27 |
| | | 12 | 68.67 | Poor | 1.07 | 0.29 | 3, 4, 7, 9, 10, 15, 16, 24, 28, 29, 31, 32 |
| SIS14 | | 26 | 92.00 | Excellent | −0.41 | 0.68 | 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 16, 17, 18, 19, 21, 22, 24, 25, 26, 27, 28, 29, 30, 31, 32 |
| | | 4 | 55.00 | Poor | 1.72 | 0.09 | 9, 14, 15, 20 |
| | | 1 | 58.67 | Poor | 3.85 | 0.00 | 23 |
| SIS15 | | 9 | 60.00 | Poor | −1.49 | 0.14 | 5, 7, 8, 9, 13, 15, 19, 24, 25 |
| | | 20 | 78.33 | Normal | 0.67 | 0.50 | 2, 3, 4, 6, 10, 11, 14, 16, 17, 20, 21, 22, 23, 26, 27, 28, 29, 30, 31, 32 |
Scale for suicidal ideation

The overall mean for inter-rater agreement in each of the items ranged between 63.51% and 100% (Table 1). The agreement level fell principally between good and excellent, except for items 2, 6, 7, 16 and 19 (2. Desire to die; 6. Duration of the suicidal ideation/desire; 7. Frequency of the suicidal ideation/desire; 16. Real preparation; and 19. Deception/Hiding the planned attempt), for which there was significant divergence. The raters whose scores differed most from those of the others in each of the items were numbers 12 and 18 (Table 4).

Table 4.

Inter-rater agreement on the Scale for Suicide Ideation items.

| Item | Score | No. | Mean inter-rater agreement | Clinical significance | Z Value | P | Raters who gave the same score |
|---|---|---|---|---|---|---|---|
| SSI1 | | 29 | 96.67 | Excellent | −0.26 | 0.79 | 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32 |
| | | 2 | 51.67 | Poor | 3.81 | 0.00 | 12, 18 |
| SSI2 | | 11 | 65.67 | Poor | −1.19 | 0.23 | 5, 7, 8, 9, 10, 12, 13, 15, 16, 18, 19 |
| | | 17 | 75.00 | Normal | 0.51 | 0.61 | 2, 3, 4, 6, 11, 14, 17, 20, 21, 23, 24, 25, 26, 27, 29, 30, 32 |
| | | 2 | 53.67 | Poor | 2.22 | 0.03 | 28, 31 |
| SSI3 | | 31 | 100.00 | Excellent | | | 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32 |
| SSI4 | | 29 | 96.67 | Excellent | −0.26 | 0.79 | 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32 |
| | | 2 | 51.67 | Poor | 3.81 | 0.00 | 12, 18 |
| SSI5 | | 10 | 65.00 | Poor | −1.45 | 0.15 | 2, 3, 5, 9, 12, 13, 15, 18, 19, 26 |
| | | 21 | 83.33 | Good | 0.69 | 0.49 | 4, 6, 7, 8, 10, 11, 14, 16, 17, 20, 21, 22, 23, 24, 25, 27, 28, 29, 30, 31, 32 |
| SSI6 | | 8 | 63.67 | Poor | −1.39 | 0.16 | 3, 4, 5, 13, 19, 21, 23, 25 |
| | | 11 | 65.00 | Poor | −0.13 | 0.90 | 7, 9, 14, 15, 20, 27, 28, 29, 30, 31, 32 |
| | | 11 | 67.67 | Poor | 1.14 | 0.25 | 6, 8, 10, 11, 12, 16, 17, 18, 22, 24, 26 |
| SSI7 | | 21 | 78.33 | Normal | −0.58 | 0.56 | 3, 4, 5, 6, 7, 10, 11, 14, 16, 17, 20, 21, 23, 24, 25, 27, 28, 29, 30, 31, 32 |
| | | 7 | 55.00 | Poor | 1.73 | 0.08 | 8, 12, 13, 18, 19, 22, 26 |
| SSI8 | | 28 | 93.33 | Excellent | −0.27 | 0.79 | 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32 |
| | | 2 | 50.00 | Poor | 3.74 | 0.00 | 12, 18 |
| SSI9 | | 28 | 93.33 | Excellent | −0.27 | 0.79 | 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32 |
| | | 2 | 50.00 | Poor | 3.74 | 0.00 | 12, 18 |
| SSI10 | | 22 | 84.67 | Good | −0.56 | 0.57 | 3, 4, 6, 7, 9, 10, 11, 13, 15, 16, 17, 19, 21, 22, 23, 24, 25, 26, 27, 29, 30, 32 |
| | | 4 | 53.33 | Poor | 0.84 | 0.40 | 5, 8, 14, 20 |
| | | 4 | 60.67 | Poor | 2.25 | 0.02 | 12, 18, 28, 31 |
| SSI11 | | 3 | 51.67 | Poor | −3.00 | 0.00 | 3, 4, 27 |
| | | 27 | 91.67 | Excellent | 0.33 | 0.74 | 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 28, 29, 30, 31, 32 |
| SSI12 | | 22 | 83.33 | Good | −0.60 | 0.55 | 7, 8, 10, 12, 13, 14, 16, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32 |
| | | 8 | 60.00 | Poor | 1.66 | 0.10 | 3, 4, 5, 6, 9, 11, 15, 17 |
| SSI13 | | 2 | 59.33 | Poor | −3.74 | 0.00 | 23, 27 |
| | | 28 | 94.00 | Excellent | 0.27 | 0.79 | 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 24, 25, 26, 28, 29, 30, 31, 32 |
| SSI14 | | 27 | 91.67 | Excellent | −0.33 | 0.74 | 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 16, 17, 18, 19, 20, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32 |
| | | 3 | 51.67 | Poor | 3.00 | 0.00 | 9, 15, 21 |
| SSI15 | | 28 | 93.33 | Excellent | −0.27 | 0.79 | 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 30, 31 |
| | | 2 | | Poor | 3.74 | 0.00 | 29, 32 |
| SSI16 | | 12 | 68.33 | Poor | −1.07 | 0.28 | 3, 4, 5, 6, 13, 19, 21, 22, 23, 25, 26, 27 |
| | | 13 | 68.33 | Poor | 0.33 | 0.74 | 8, 9, 10, 11, 12, 15, 16, 17, 18, 28, 29, 31, 32 |
| | | 5 | 59.00 | Poor | 1.72 | 0.08 | 7, 14, 20, 24, 30 |
| SSI17 | | 28 | 94.00 | Excellent | −0.27 | 0.79 | 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32 |
| | | 2 | 59.33 | Poor | 3.74 | 0.00 | 9, 15 |
| SSI18 | | 28 | 93.33 | Excellent | −0.27 | 0.79 | 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32 |
| | | 2 | 50.00 | Poor | 3.74 | 0.00 | 10, 16 |
| SSI19 | | 11 | 65.33 | Poor | −1.24 | 0.21 | 3, 4, 9, 10, 11, 13, 15, 16, 17, 19, 25 |
| | | 18 | 76.67 | Normal | 0.62 | 0.54 | 6, 7, 8, 12, 14, 18, 20, 21, 22, 23, 24, 26, 27, 28, 29, 30, 31, 32 |
| | | | 52.00 | Poor | 2.48 | 0.01 | |
| SSI20 | | 2 | 51.67 | Poor | −3.81 | 0.00 | 12, 18 |
| | | 29 | 65.67 | Excellent | 0.26 | 0.79 | 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32 |
| SSI21 | | | 59.33 | Poor | −4.45 | 0.00 | |
| | | 2 | 51.67 | Poor | −2.07 | 0.04 | 7, 25 |
| | | 28 | 95.33 | Excellent | 0.31 | 0.76 | 2, 3, 4, 6, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 26, 27, 28, 29, 30, 31, 32 |
Lethality Rating Scale for suicide attempts

The agreement reached in the score for this scale was good (88.39%). Only one rater (number 6) showed a lower (“Normal”) level of agreement with the mean of the total scores (reference pattern) (Table 5).

Table 5.

Inter-rater agreement on the Lethality Rating Scale (LRS) for suicide attempts.

| Score | No. | Mean inter-rater agreement | Clinical significance | Z Value | P | Raters who gave the same score |
|---|---|---|---|---|---|---|
| | 15 | 0.89 | Good | −0.98 | 0.33 | 2, 3, 4, 5, 7, 10, 11, 14, 16, 17, 20, 24, 25, 27, 28 |
| | 15 | 0.89 | Good | 0.81 | 0.42 | 8, 9, 12, 13, 15, 18, 19, 21, 22, 23, 26, 29, 30, 31, 32 |
| | 1 | 0.7 | Normal | 2.60 | 0.01 | 6 |
Discussion

The objective of this study was to evaluate the reliability of a questionnaire for assessing suicidal behaviour (Brief Suicide Questionnaire) for research use in a multi-centre project, using the assessments of multiple raters on a single patient. Our results indicate that the clinical scales making up this questionnaire are reliable. Moreover, the disagreement observed is attributable to specific raters and, in the case of the scales with more than 1 item, is related to the fact that some raters left some item answers blank.

It should not be forgotten that, as this is a design with only 1 patient, it may not be possible to generalise the results on tool reliability to the population from which the patient was selected. If the aim were to construct a tool applicable to clinical situations, the approach to estimating its reliability would be different. That would require assessing several videotaped patients (approximately 10 for each observer included) and using other statistical parameters, such as the weighted Kappa or the intraclass correlation coefficient for ordinal or quantitative scales. Our study might show developers of such a project where the areas of low consistency of these tools lie and which areas could therefore be eliminated at the outset.

The single-case design controls the sources of variability related to the examination and to the patient being assessed; assessment variability is thus reduced to the factors that depend on the rater. Indeed, as indicated earlier, identifying the raters whose assessments differed most from the group was one of our study objectives. In the preparation stage of all types of multi-centre studies (including clinical trials), this kind of design (agreement of multiple assessors on a single patient) has proved useful for detecting areas of low consistency and identifying assessors who differ from the group.17 Nevertheless, it is important to note that this type of design is rare in the literature, principally because of the complexity of the statistical treatment that it involves.16 Because the procedures proposed by Cicchetti and Showalter16 solve this problem, the approach that we describe here can make the preparation stage more efficient for multi-centre study researchers. One of the most important features of this preparation is training the examiners until appropriate inter-rater reliability can be guaranteed. Identifying items and raters with low levels of reliability, followed by specific training on the most problematic items, could help to correct potential sources of variability in assessing the participants in a clinical trial. This would, in turn, contribute to increasing the strength of the design without having to enlarge the study sample size.17
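
The idea of identifying raters who differ from the group can be sketched in a few lines. The code below is a simplified illustration and not the exact procedure of Cicchetti et al., which also tests each rater's agreement for statistical significance. The linear agreement weight, the 0 to 4 scale, the 0.5 cut-off, the function names and the scores are all hypothetical.

```python
def mean_agreement_per_rater(scores, scale_min, scale_max):
    """Mean agreement of each rater with every other rater on one subject.

    scores: dict mapping rater id -> score given to the single subject.
    Agreement between two scores is 1 when they match and falls linearly
    to 0 when they sit at opposite ends of the scale (an assumed weighting).
    """
    span = scale_max - scale_min
    if span <= 0:
        raise ValueError("scale_max must be greater than scale_min")
    if len(scores) < 2:
        raise ValueError("at least two raters are required")
    result = {}
    for rater, score in scores.items():
        pairwise = [
            1.0 - abs(score - other_score) / span
            for other, other_score in scores.items()
            if other != rater
        ]
        result[rater] = sum(pairwise) / len(pairwise)
    return result


def flag_low_agreement(agreement, threshold):
    """Return the ids of raters whose mean agreement is below the threshold."""
    return sorted(r for r, value in agreement.items() if value < threshold)


# Hypothetical scores from five raters on an assumed 0-4 scale.
scores = {1: 3, 2: 3, 3: 3, 4: 4, 5: 0}
agreement = mean_agreement_per_rater(scores, scale_min=0, scale_max=4)
for rater, value in agreement.items():
    print(f"rater {rater}: mean agreement {value:.2f}")
print("raters to retrain:", flag_low_agreement(agreement, threshold=0.5))
```

In a preparation stage of the kind described above, the flagged raters would be the ones targeted for additional training before the study begins.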

In summary, as the results of this study show, the technique developed by Cicchetti et al.16 helps to meet these objectives efficiently: it requires a very small sample size (1 subject) and a single assessment session that can be pre-recorded, and it does not require all of the researchers to assess the subject at the same time, given that the indexes can be calculated afterwards. In addition, new technologies (such as videoconferencing) allow the assessments to take place at the same time but from different places. With respect to the Brief Suicide Questionnaire, we can conclude that it presents appropriate inter-rater agreement for research purposes, and the analysis identified the areas of low agreement and the raters who deviated from the overall agreement. To use this tool with greater reliability, measures for investigator training have been implemented.

Ethical responsibilities

Protection of human and animal subjects. The authors declare that the procedures followed were in accordance with the regulations of the responsible Clinical Research Ethics Committee and in accordance with those of the World Medical Association and the Helsinki Declaration.

Confidentiality of data. The authors declare that no patient data appear in this article.

Right to privacy and informed consent. The authors declare that no patient data appear in this article.

Conflict of interest

The authors have no conflict of interest to declare.

Appendix A
A.1
The Spanish group for suicidal behaviour research (GEICS is the Spanish acronym)

Universidad Autónoma de Madrid: Concepción Vaquero Lorenzo.

Corporación Sanitaria Universitaria Parc Taulí de Sabadell, Barcelona: Gemma García-Parés, María Giró Batalla, M. Garrido.

Hospital 12 de Octubre, Madrid, CIBERSAM: M. Aragues.

Hospital Carlos Haya and Fundación IMABIS, Málaga: E. Martín, M. Alba, M.I. Gómez, A. González, M. Maté, M. Romero and N. Cantero.

Hospital de la Santa Creu i Sant Pau, Barcelona, CIBERSAM: J. Hernández and S. Durán Sindreu.

Universidad de Oviedo, CIBERSAM: Maria Teresa Bascarán, Julio Bobes, Manuel Bousoño, P. Burón and Luis Jiménez Treviño.

References
[1]
M.A. Oquendo, E. Baca-García, J.J. Mann, J. Giner.
Issues for DSM-V: suicidal behavior as a separate diagnosis on a separate axis.
Am J Psychiatry, 165 (2008), pp. 1383-1384
[2]
J.J. Mann, A. Apter, J. Bertolote, A. Beautrais, D. Currier, A. Haas, et al.
Suicide prevention strategies: a systematic review.
JAMA, 294 (2005), pp. 2064-2074
[3]
F.M. Gore, P.J. Bloem, G.C. Patton, J. Ferguson, V. Joseph, C. Coffey, et al.
Global burden of disease in young people aged 10–24 years: a systematic analysis.
Lancet, 377 (2011), pp. 2093-2102
[4]
P.S. Links, B. Hoffman.
Preventing suicidal behaviour in a general hospital psychiatric service: priorities for programming.
Can J Psychiatry, 50 (2005), pp. 490-496
[5]
M.A. Oquendo, D. Currier, K. Posner.
Reconceptualización de la nosología psiquiátrica: el caso de la conducta suicida.
Rev Psiquiatr Salud Ment (Barc), 2 (2009), pp. 63-65
[6]
E. Baca-Garcia, C. Diaz-Sastre, E. Garcia Resa, H. Blasco, D. Braquehais Conesa, J. Saiz-Ruiz, et al.
Variables associated with hospitalization decision by emergency psychiatrists after a patient's suicide attempt.
Psychiatr Serv, 55 (2004), pp. 792-797
[7]
E. Baca-Garcia, M.M. Perez-Rodriguez, I. Basurte-Villamor, J. Saiz-Ruiz, J.M. Leiva-Murillo, M. de-Prado-Cumplido, et al.
Using data mining to explore complex clinical decisions: a study of hospitalization after a suicide attempt.
J Clin Psychiatry, 67 (2006), pp. 1124-1133
[8]
M. Miret, R. Nuevo, C. Morant, E. Sainz-Cortón, M.A. Jiménez-Arriero, J.J. López-Ibor, et al.
Calidad de los informes médicos sobre personas que han intentado suicidarse.
Rev Psiquiatr Salud Ment (Barc), 3 (2010), pp. 13-18
[9]
M.P. García-Portilla, M.T. Bascarán, P.A. Sáiz, M. Bousoño, M. Parellada, J. Bobes.
Banco de instrumentos básicos para la práctica de la psiquiatría clínica.
6th ed., Comunicación y Ediciones Sanitarias, SL. Psiquiatría Editores, (2011),
[10]
M.D. Brundage, J.L. Pater, B. Zee.
Assessing the reliability of two toxicity scales: implications for interpreting toxicity data.
J Natl Cancer Inst, 85 (1993), pp. 1138-1148
[11]
R. Müller, P. Büttner.
A critical discussion of intraclass correlation coefficients.
Stat Med, 13 (1994), pp. 2465-2476
[12]
J. Andersen, A. Korner, J.K. Larsen, V. Schultz, B.M. Nielsen, K. Behnke, et al.
Agreement in psychiatric assessment.
Acta Psychiatr Scand, 87 (1993), pp. 128-132
[13]
J.J. Bartko, B. Carpiniello.
On the methods and theory of reliability.
J Nerv Ment Dis, 163 (1976), pp. 307-317
[14]
R.L. Spitzer, J.L. Fleiss.
A re-analysis of the reliability of psychiatric diagnosis.
Br J Psychiatry, 125 (1974), pp. 341-347
[15]
P.E. Shrout, R.L. Spitzer, J.L. Fleiss.
Quantification of agreement in psychiatric diagnosis revisited.
Arch Gen Psychiatry, 44 (1987), pp. 172-177
[16]
D.V. Cicchetti, D. Showalter, R. Rosenheck.
A new method for assessing interexaminer agreement when multiple ratings are made on a single subject: applications to the assessment of neuropsychiatric symptomatology.
Psychiatry Res, 72 (1997), pp. 51-63
[17]
E. Baca-García, C. Blanco, J. Sáiz-Ruiz, F. Rico, C. Diaz-Sastre, D.V. Cicchetti.
Assessment of reliability in the clinical evaluation of depressive symptoms among multiple investigators in a multicenter clinical trial.
Psychiatry Res, 102 (2001), pp. 163-173
[18]
J.M. Bland, D.G. Altman.
Statistical methods for assessing agreement between two methods of clinical measurement.
Lancet, 1 (1986), pp. 307-310
[19]
E.G. Altman, D.R. Hedeker, P.G. Janicak, J.L. Peterson, J.M. Davis.
The Clinician-Administered Rating Scale for Mania (CARS-M): development, reliability, and validity.
Biol Psychiatry, 36 (1994), pp. 124-134
[20]
A. Lobo, F.J. Huyse, T. Herzog, U.F. Malt.
The ECLW collaborative study II: patient registration form (PRF) instrument, training and reliability.
J Psychosom Res, 40 (1996), pp. 143-156
[21]
T.S. Brugha, D. Cragg.
The list of threatening experiences: the reliability and validity of a brief life events questionnaire.
Acta Psychiatr Scand, 82 (1990), pp. 77-81
[22]
American Psychiatric Association.
DSM-IV-TR. Manual diagnóstico y estadístico de los trastornos mentales. Texto revisado.
Masson, (2000),
[23]
American Psychiatric Association.
DSM-III-R. Diagnostic and statistical manual of mental disorders.
APA, (1987),
[24]
A.T. Beck, D. Schuyler, I. Herman.
Development of suicidal intent scales.
The prediction of suicide,
[25]
F.J. Diaz, E. Baca-Garcia, C. Diaz-Sastre, E. García Resa, H. Blasco, D. Braquehais Conesa, et al.
Dimensions of suicidal behavior according to patient reports.
Eur Arch Psychiatry Clin Neurosci, 253 (2003), pp. 197-202
[26]
A.T. Beck, M. Kovacs, A. Weissman.
Assessment of suicidal intention: the Scale for Suicide Ideation.
J Consult Clin Psychol, 47 (1979), pp. 343-352
[27]
A.T. Beck, H.L.P. Resnik, D.J. Lettieri.
The prediction of suicide.
Charles Press Publishers, (1974),
[28]
J.L. Ayuso-Mateos, E. Baca-García, J. Bobes, J. Giner, L. Giner, V. Pérez, et al.
Recomendaciones preventivas y manejo del comportamiento suicida en España.
Rev Psiquiatr Salud Ment (Barc), 5 (2012), pp. 8-23

The group members are listed in Appendix A.

Please cite this article as: García-Nieto R, et al. Protocolo breve de evaluación del suicidio: fiabilidad interexaminadores. Rev Psiquiatr Salud Ment (Barc). 2012;5:24–36.

Copyright © 2011. SEP y SEPB