Many women across the world suffer from endometriosis. This disease is staged by laparoscopy in order to determine its extent. Ultrasound may be a reliable diagnostic tool that could complement laparoscopy for endometriosis staging. The aim of this study is to perform a narrative review of the current status of studies comparing ultrasound findings and laparoscopic staging according to the American Society for Reproductive Medicine (ASRM) and ENZIAN classifications. A search in the PubMed and Web of Science databases from 2004 to 2022 was performed using the following terms: “endometriosis”, “ultrasound”, “laparoscopy”, “ENZIAN” and “ASRM”. We focused on the accuracy of sonography using laparoscopy as the gold standard. Seven studies were ultimately included. We observed that ultrasound is accurate and correlates well with advanced stages in the case of the ASRM classification, and correlates well with the ENZIAN classification. However, some limitations emerged. Little scientific evidence is available on this specific topic. Some of the studies have a retrospective design and one of them has a small sample size. In addition, even if ultrasound could have a relevant role in staging deep endometriosis, this method is highly dependent on the operator's experience. We conclude that the diagnostic performance of transvaginal ultrasound (TVS) for evaluating the extent of disease in women with pelvic endometriosis is high. However, evidence is still limited and further studies are needed.
Endometriosis is an inflammatory disease caused by the implantation of endometrial tissue within the pelvic cavity (or even beyond it). This disease has an estimated 10% prevalence among reproductive-age women worldwide.1 In the United States of America, the primary cause of hysterectomies performed in women between 30 and 34 years old is endometriosis-related pain.1
Even though the pathogenesis of this disease is not yet completely understood, there are three main theories: retrograde menstruation, coelomic metaplasia and transport of endometrial cells through blood or lymphatic vessels.1 The symptoms associated with this disease are highly variable and include pelvic pain, infertility, dysmenorrhea and dyspareunia, among others. Some patients may be asymptomatic, but many women see their quality of life severely diminished by this condition.
For many years, the gold standard for staging endometriosis has been laparoscopy, which is an invasive, stressful and costly method. However, non-invasive approaches such as sonography and magnetic resonance imaging (MRI) are gradually gaining more and more relevance.
It is important to distinguish between superficial and deep endometriosis. Endometrial tissue affecting the peritoneum is referred to as superficial endometriosis, whereas lesions deeply affecting the rectovaginal septum, bladder, rectum or any other pelvic organ are considered deep endometriosis. When this condition affects the ovaries, it is called endometrioma, and it may extend to the fallopian tubes.
There have been many attempts over the years to accurately stage the extent of endometriosis. Currently, two main classifications are used: the ENZIAN and the revised American Society for Reproductive Medicine (rASRM) classifications.
Undoubtedly, the most widely used classification in clinical practice in recent decades is the rASRM classification.2 This classification was originally proposed in 1979 and subsequently revised several times. The most recent version is from 1996. This classification is based on the surgical findings. It analyzes the presence of endometriotic implants (deep and/or superficial), their size, and the presence of adhesions following the scheme in Fig. 1. A total score is assigned with this scoring system, classifying endometriosis into four stages (stage I (minimal): 1–5 points; stage II (mild): 6–15 points; stage III (moderate): 16–40 points; and stage IV (severe): >40 points). This classification has the advantages that it is very widely implemented throughout the world, it is easy to use and understand, and it is useful to explain the extent of the disease.3 However, it also has disadvantages, such as the lack of classification for certain pelvic areas such as the bladder, bowel or vagina, among others.3–5
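The four-stage cut-offs quoted above can be written as a small lookup. The following is only an illustrative sketch: the function name `rasrm_stage` is hypothetical, and it assumes that the total rASRM score has already been obtained from the surgical findings.

```python
def rasrm_stage(score: int) -> str:
    """Map a total rASRM score to a stage, using the cut-offs quoted in the text.

    Assumption: `score` is the total obtained at surgery; computing it from
    implants and adhesions is outside the scope of this sketch.
    """
    if score < 1:
        raise ValueError("the staging scheme quoted in the text starts at 1 point")
    if score <= 5:
        return "I (minimal)"
    if score <= 15:
        return "II (mild)"
    if score <= 40:
        return "III (moderate)"
    return "IV (severe)"


if __name__ == "__main__":
    for points in (3, 15, 16, 52):
        print(points, "->", rasrm_stage(points))
```

The boundaries are inclusive on the upper side, so 15 points is still stage II and 16 points is stage III.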
The ENZIAN score was developed in Austria in 2005.6 The latest update, released in 2020,7 is shown in Fig. 2. This classification divides the pelvic cavity into three compartments: compartment A, the rectovaginal space; compartment B, the uterosacral ligament and walls of the pelvic cavity; and compartment C, the rectum. It also includes the uterus (FA), bladder (FB), intestine (FI), ureter (FU) and other structures, as well as lesions in the ovary (O), fallopian tubes (T) and peritoneum (P). In addition, it assigns a number according to the dimensions of the lesions: 1 (1–3 cm), 2 (3–7 cm) or 3 (more than 7 cm). The main advantage of the ENZIAN score is that it provides a very complete description of the entire pelvic cavity, which facilitates the classification of laparoscopic findings for surgeons. However, the ENZIAN score is not as universally used as the rASRM; it is mainly used in certain European countries. Another limitation of the ENZIAN score is its complexity when compared to the rASRM.
However, the two main limitations associated with these staging systems are their low reproducibility and their lack of accurate correlation with symptoms.8
Treatment of endometriosis is based on either drugs, including oestrogen suppressors and non-steroidal anti-inflammatory drugs (NSAIDs), or surgery.9 Surgery, when indicated, may be complex.10,11 Therefore, knowledge of the nature and location of these lesions prior to surgery is important. The objective of this narrative review is to analyze studies comparing ultrasound findings with laparoscopic staging, according to the ENZIAN and/or ASRM classifications, and to assess whether ultrasound is a reliable diagnostic tool for this purpose.
Methods
This narrative review was performed according to the recommendations of Green et al.12 We searched the PubMed/Medline and Web of Science databases for articles published between January 2000 and December 2022. We used the following terms: “endometriosis”, “ultrasound”, “laparoscopy”, “staging”, “ENZIAN” and “ASRM”. Only articles published in English were considered.
Once the articles were identified, all titles and abstracts were read in order to exclude articles that were irrelevant to the topic under study. The full texts of the remaining articles were then obtained, and the studies to be included were selected according to the following inclusion criteria:
- (1) Prospective or retrospective cohort studies that included patients with pelvic endometriosis evaluated by transvaginal ultrasound to assess the extension of the disease.
- (2) The standard reference was laparoscopic/histological findings.
- (3) The study used the rASRM or ENZIAN classification to report laparoscopic/histological findings.
- (4) The study reported data about sensitivity and specificity.
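Since sensitivity and specificity are the measures required by the inclusion criteria, a minimal sketch of how they are derived from a 2×2 table (ultrasound result against the laparoscopic reference) may be helpful. The class name `TwoByTwo` and the counts used below are hypothetical and are not taken from any of the included studies.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts of index-test results (TVS) against the reference standard (laparoscopy)."""
    tp: int  # TVS positive, disease confirmed at laparoscopy
    fp: int  # TVS positive, no disease at laparoscopy
    fn: int  # TVS negative, disease confirmed at laparoscopy
    tn: int  # TVS negative, no disease at laparoscopy

    def sensitivity(self) -> float:
        # Proportion of diseased patients correctly identified by TVS.
        # Assumes at least one patient with disease (tp + fn > 0).
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of non-diseased patients correctly identified by TVS.
        # Assumes at least one patient without disease (tn + fp > 0).
        return self.tn / (self.tn + self.fp)


# Hypothetical counts, for illustration only.
example = TwoByTwo(tp=45, fp=10, fn=5, tn=90)

if __name__ == "__main__":
    print(f"sensitivity = {example.sensitivity():.0%}")
    print(f"specificity = {example.specificity():.0%}")
```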
The exclusion criteria were:
- (1) Study did not report data about diagnostic performance.
- (2) Study did not use surgical findings as reference standard.
- (3) Study used MRI, not ultrasound.
The Patients, Interventions, Comparators, Outcomes, Study Design (PICOS) criteria were used to describe the included studies.
All primary studies were evaluated independently using the QUADAS-2 tool (Quality Assessment of Diagnostic Accuracy Studies-2),13 an instrument for evaluating the quality of diagnostic accuracy studies. It includes four key domains:
- (1) The selection of patients (patient selection)
- (2) The index test
- (3) The reference test (reference standard)
- (4) The flow of patients through the study and the timing of the index and reference tests (flow and timing)
For each domain, the risk of bias was analyzed and scored as low, high, or unclear risk.
Due to the nature of this review, Institutional Review Board approval was waived.
Results
Search
The search in Web of Science returned 27 citations, whereas the search in PubMed returned 25 citations, yielding a total number of 56 records. Fifteen records were duplicates and were excluded. After reading the full text of the remaining 41 citations, 34 studies were excluded for being irrelevant to the present review, being reviews or case reports, not reporting data about the diagnostic performance of ultrasound, or focusing on MRI. Therefore, seven articles were selected for this review.14–20 Four papers used the ENZIAN classification for surgical findings,15–18 two studies used the rASRM classification19,20 and one study used both classifications.14
Characteristics of the studies
Table 1 shows the main characteristics of the studies included in this review. The studies were performed in different countries all over the world; most of them took place in Europe, but some were performed in India, Brazil and Australia as well. Half of the studies were multicentric, and in almost all of them the series of patients was consecutive. Studies were reported over a twelve-year period, from 2010 to 2022.
Table 1. Characteristics of the studies analyzed.
| Author | Year | Country | Study design | Patient age (years) | N | Index test | IDEA protocol | Number of examiners | Examiner expertise | Reference test | Classification used | Surgeons blinded to US findings | Time from TVS to surgery |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Gonçalves | 2021 | Brazil | Prospective | 18–45 | 120 | TVS | No | 2 | Expert | LPS/Histo | ENZIAN/rASRM | Yes | <3 months |
| Hudelist | 2021 | Austria | Prospective | 28–40 | 195 | TVS | Yes | 1 | Expert | LPS/Histo | ENZIAN | Yes | <2 weeks |
| Bindra | 2022 | India | Retrospective | 20–42 | 50 | TVS | Yes | 1 | Expert | LPS/Histo | ENZIAN | Yes | <4 weeks |
| Enzelsberger | 2022 | Multiple | Prospective | 27–37 | 1057 | TVS | Yes | 32 | Expert and non-expert | LPS | ENZIAN | Yes | <6 months |
| Montanari | 2022 | Multiple | Prospective | 29–41 | 745 | TVS | Yes | Multiple | Expert | LPS/Histo | ENZIAN | Yes | <3 months |
| Holland | 2010 | UK | Prospective | 19–51 | 201 | TVS | No | 4 | Expert | LPS | rASRM | Yes | <2 months |
| Leonardi | 2020 | Australia | Retrospective | 23–41 | 204 | TVS | Yes | 2 | Expert | LPS/Histo | rASRM | No | <6 months |
TVS: transvaginal ultrasound; US: ultrasound; N: number of patients; IDEA: International Deep Endometriosis Analysis; LPS: laparoscopy; Histo: histology.
The sample sizes were usually large, except for the study by Bindra et al., which had only 50 patients.16 The total number of patients assessed was 2670 for all seven studies, 2167 women for studies using the ENZIAN classification and 523 women for studies using the rASRM classification (as stated above, one study including 120 women assessed both classifications).14
All of them were prospective studies, with the exceptions of Bindra et al.16 and Leonardi et al.,20 which had a retrospective design. However, it should be noted that in the study by Gonçalves et al., although data were collected prospectively, data analysis was done retrospectively.14 Laparoscopy was used as the gold standard in all studies. However, almost none of the studies used histological samples consistently to confirm the presence of endometriosis. Gonçalves et al.14 and Enzelsberger et al.17 did so, but not in every single sample. Two studies systematically obtained histological samples to confirm the presence of endometriosis.15,16
Five out of the seven studies utilized multiple ultrasound operators.14,17–20 Those same five studies used multiple surgeons on their laparoscopies and half of the studies had blinded surgeons.14,17–20
The TVS scanning protocol proposed by the International Deep Endometriosis Analysis (IDEA) group, which is an international consensus on nomenclature and measurements in endometriosis ultrasound evaluation,21 was used in five of the seven studies.15–18,20 After reading the articles, it is unclear whether the studies by Gonçalves et al.14 and Holland et al.19 followed this protocol.
Regarding inclusion and exclusion criteria, the main inclusion criterion in these studies was the indication for laparoscopy due to endometriosis following a transvaginal ultrasound (TVS). Holland et al. did include patients between 16 and 18 years old, as long as they could provide informed consent. The main exclusion criteria were previous malignancy, previous pelvic surgery and pregnancy, as well as being unable to undergo TVS, poor-quality laparoscopy recording and menopausal status. Bindra et al. also excluded those cases in which the ENZIAN or IDEA protocol was not followed or in which there was a prior MRI.16 Enzelsberger et al. excluded those patients who cancelled surgery 6 months prior to date.17
Qualitative synthesis
Table 2 shows the QUADAS-2 evaluation.
Table 2. QUADAS-2 analysis of the studies assessed.
| Author | Risk of bias: patient selection | Risk of bias: index test | Risk of bias: reference test | Risk of bias: flow/timing | Applicability concerns: patient selection | Applicability concerns: index test | Applicability concerns: reference test |
|---|---|---|---|---|---|---|---|
| Gonçalves | High | Low | Low | Low | Low | Low | Low |
| Hudelist | High | Low | Low | Unclear | Low | Low | Low |
| Bindra | High | Low | Low | Unclear | Low | Low | Low |
| Enzelsberger | Low | Low | Low | High | Low | Low | Low |
| Montanari | High | Low | Low | Low | Low | Low | Low |
| Holland | Low | Low | Low | Low | Low | Low | Low |
| Leonardi | High | Low | Low | High | Low | Low | Low |
Regarding the risk of bias, there are four domains. In the patient selection domain, two studies were classified as low risk of bias because the study design was prospective, the inclusion and exclusion criteria were clear, the series was consecutive and no inappropriate exclusions were observed.17,19 Five studies were classified as high risk because of a retrospective design16,20 or inappropriate exclusions, for example excluding patients with previous surgery.14,15,18
Regarding the index test domain, it was clearly described in all studies. In all cases, the index test was TVS. In five studies, the IDEA scanning protocol was used,15–18,20 and in the other two, the way to perform and interpret the test was clearly described.14,19 Therefore, all studies were classified as low risk. In all studies except one,17 expert examiners performed all ultrasound evaluations.
Regarding the reference standard domain, all the studies used surgical findings. Most stated whether the surgeons were blinded or not. All seven studies were classified as low risk of bias.
Lastly, regarding the flow and timing domain, five studies referred to the time interval between ultrasound diagnosis and surgery. In three studies, this time was considered as low risk of bias (less than 3 months14,18,19) and two studies were considered as high risk of bias (up to 6 months).17,20 Two studies did not report this information.15,16
Regarding concerns about applicability, there are three domains. In the patient selection and index test domains, all studies were classified as low concern. In the reference standard domain, all studies were also classified as low concern in Table 2, although it was not clear in the articles whether the pathologist was blinded to the ultrasound findings, which could lead to bias.
Quantitative synthesis
Results in terms of diagnostic performance for the ENZIAN classification are shown in Table 3.
Table 3. Diagnostic performance (sensitivity/specificity) of transvaginal ultrasound to detect disease according to the ENZIAN classification.
| Author | Ovary (Sen/Spe) | Peritoneal (Sen/Spe) | Tube (Sen/Spe) | Compartment A (Sen/Spe) | Compartment B (Sen/Spe) | Compartment C (Sen/Spe) | FA (Sen/Spe) | FB (Sen/Spe) | FI (Sen/Spe) | FU (Sen/Spe) | FO (Sen/Spe) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Gonçalves14 | 95%/95% | – | – | 93%/89% | 91%/85% | 96%/98% | – | 75%/99% | – | – | – |
| Hudelist15 | – | – | – | 84%/85% | 91%/73% | 92%/95% | – | 88%/99% | – | – | – |
| Bindra16 | 95%/100% | 86%/100% | 91%/90% | 100%/100% | 93%/90% | 93%/97% | 100%/100% | – | – | 100%/100% | – |
| Montanari18 | 90%/97% | – | 89%/88% | 95%/93% | 87%/91% | 93%/95% | – | 94%/100% | 50%/99% | 78%/100% | 57%/100% |
| Enzelsberger17 | – | – | – | 63%/91% | 46%/86% | 52%/96% | 71%/81% | 32%/99% | 30%/99% | – | 43%/99% |
Three studies used the first ENZIAN classification14,15,17 and two studies used the updated ENZIAN classification.16,18 Overall, sensitivity and specificity for the pelvic compartment A ranged from 63% to 100% and from 85% to 100%, respectively. Sensitivity and specificity for the pelvic compartment B ranged from 46% to 93% and from 73% to 91%, respectively. Sensitivity and specificity for the pelvic compartment C ranged from 52% to 96% and from 95% to 98%, respectively. Sensitivity and specificity for the ovarian involvement ranged from 90% to 95% and from 95% to 100%, respectively. Sensitivity and specificity for the compartment FA ranged from 71% to 100% and from 81% to 100%, respectively. Sensitivity and specificity for the compartment FB ranged from 32% to 100% and from 81% to 100%, respectively. Sensitivity and specificity for the compartment FU ranged from 43% to 100% and from 81% to 100%, respectively. Sensitivity and specificity for the compartment FI ranged from 30% to 50% and from 89% to 99%, respectively.
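The ranges quoted for the three pelvic compartments can be reproduced from the values printed in Table 3. The sketch below is only illustrative: the dictionary `table3_abc` and the helper `sens_spec_range` are hypothetical names, and the figures are transcribed from the Compartment A, B and C columns of Table 3 in the row order of that table.

```python
# (sensitivity %, specificity %) per study, transcribed from Table 3.
# Row order: Gonçalves, Hudelist, Bindra, Montanari, Enzelsberger.
table3_abc = {
    "A": [(93, 89), (84, 85), (100, 100), (95, 93), (63, 91)],
    "B": [(91, 85), (91, 73), (93, 90), (87, 91), (46, 86)],
    "C": [(96, 98), (92, 95), (93, 97), (93, 95), (52, 96)],
}


def sens_spec_range(pairs):
    """Return ((min, max) sensitivity, (min, max) specificity) for one compartment."""
    sens = [s for s, _ in pairs]
    spec = [p for _, p in pairs]
    return (min(sens), max(sens)), (min(spec), max(spec))


if __name__ == "__main__":
    for compartment, pairs in table3_abc.items():
        (s_lo, s_hi), (p_lo, p_hi) = sens_spec_range(pairs)
        print(f"Compartment {compartment}: sensitivity {s_lo}-{s_hi}%, "
              f"specificity {p_lo}-{p_hi}%")
```

The same helper could be applied to the other columns of Table 3, keeping in mind that several of them are reported by only two studies.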
We observed consistency among the studies, except for the study by Enzelsberger et al., which reported poor sensitivity for all compartments.17 Unlike the rest of the studies, Enzelsberger et al. did not exclusively use expert ultrasound operators; this could have affected the results.
Results in terms of diagnostic performance for the rASRM classification are shown in Table 4. We also observed consistency in the results of all three studies. It becomes clear that ultrasound yields poor sensitivity for identifying stages I or II and renders a significant rate of false-positive cases in women with clinical suspicion of endometriosis who ultimately do not have endometriosis at laparoscopy. However, its diagnostic performance for advanced stages is good.
Table 4. Diagnostic performance of transvaginal ultrasound to detect disease according to the rASRM classification.
| Author | No disease: sensitivity | No disease: specificity | Stage I: sensitivity | Stage I: specificity | Stage II: sensitivity | Stage II: specificity | Stage III: sensitivity | Stage III: specificity | Stage IV: sensitivity | Stage IV: specificity |
|---|---|---|---|---|---|---|---|---|---|---|
| Gonçalves14 | 100% | 10% | 41% | 97% | 75% | 97% | 71% | 96% | 91% | 93% |
| Holland19 | 95% | 56% | 3% | 100% | 13% | 97% | 74% | 96% | 85% | 98% |
| Leonardi20 | 92% | 48% | 18% | 95% | 23% | 97% | 63% | 92% | 72% | 97% |
rASRM: revised American Society for Reproductive Medicine.
We have observed that the diagnostic performance of TVS for evaluating the extent of disease in women with pelvic endometriosis is high, especially in the case of the ENZIAN classification. In the case of the rASRM classification, TVS has a good diagnostic performance for detecting advanced stages, but a limited one for early stages.
Interpretation of findings in the clinical context
We believe that the consistency observed throughout all of the studies points towards a clear answer to our initial question: TVS is a reliable tool for pelvic endometriosis staging. Therefore, our results would support the concept that sonography could be established as a mandatory imaging technique in the evaluation of women with suspected endometriosis. We think it is a very useful tool, since it is widely available and could prevent unnecessary invasive and expensive laparoscopies.
Furthermore, some of the studies suggested that TVS might be a more sensitive diagnostic method than laparoscopy for certain locations of deep endometriosis below the peritoneal surface.14
Strengths and limitations
The main strength of our manuscript is that, to the best of our knowledge, it is the first narrative review reporting on this issue.
However, some weaknesses affect this narrative review. In particular, the small volume of research on this specific topic cannot be overlooked. Another important feature that should be mentioned is that most of these studies were performed in tertiary centres with expert examiners. This implies that the ultrasound operators were well above average in both experience and training. Non-expert examiners would probably encounter major difficulties in accurately diagnosing deep endometriosis with this technique.22 It would be interesting to see how much the experience of the operator influences the results of the staging. Given the high prevalence of this disease, the most obvious solution to this obstacle is to support an improvement in ultrasound knowledge and training. In this way, the availability of a reliable and non-invasive endometriosis staging approach would be increased, and many women all over the world could benefit from it by getting an early and accurate diagnosis.23
Conclusions
We can conclude that the diagnostic performance of TVS for evaluating the extent of disease in women with pelvic endometriosis is high. However, evidence is still limited and further studies are needed.
Ethical disclosures
Protection of human and animal subjects
The authors declare that no experiments were performed on humans or animals for this study.
Confidentiality of data
The authors declare that no patient data appear in this article.
Right to privacy and informed consent
The authors declare that no patient data appear in this article.
Funding
This study had no funding.
Conflict of interest
All authors declare having no conflict of interest.









