In clinical research, the epidemiological design forms the core of the study plan and shapes the options available in the rest of the research protocol: the selection of patients, the collection of data, and the subsequent analysis of the results. The design determines the quality of the scientific evidence and the suitability of the recommendations for adopting technologies or healthcare procedures in routine clinical practice.1
The epidemiological approach is characterised by: (I) the use of information on groups of people to assess the distribution of diseases and their causes; (II) the need to compare groups in both analytical and descriptive studies; and (III) the fundamental premise that health problems are not randomly distributed, i.e., the distributions and determinants of disease are not explained by chance.2 The study of these distributions allows us to compare the possible differences in exposure or disease among the evaluated groups.
In the research cycle, which can be taken as consisting of five phases (conception, planning, implementation, analysis, and communication of results), the study design is a structuring element of the entire process.
Accordingly, any investigation in the field of health must start with an initial research question or postulate which, following an exhaustive review of the literature to establish the current state of knowledge, allows the definition of a study hypothesis and a series of objectives.
Basically, two types of questions can be asked: (I) questions which try to clarify the behaviour of a health problem based on the description of variables related to the characteristics of the patient, i.e., questions with a descriptive objective; and (II) questions which try to clarify the behaviour of a health problem based on the analysis of the factors related to the study problem, their measurement, and the magnitude of damage attributable to the presence or absence of these factors, i.e., questions with an analytical objective.2 Thus, a second fundamental step is the choice of the design that best fits the study question.
Types of epidemiological designs
There are a number of ways of classifying the different types of epidemiological designs. Some of the classification criteria employed are the existence of manipulation, randomisation, follow-up, the direction of the study, the timing of the start of the study, or the study unit, among others. In this context, and based on whether or not the investigator manipulates the exposure, a first classification of epidemiological designs is shown in Table 1, where the classical division is established between experimental studies and non-experimental or observational studies. This classification takes into account that non-experimental designs are used when it is not possible to conduct an experimental study.3–5
Table 1. Types of epidemiological designs.
|I. EXPERIMENTAL STUDIES||THE INVESTIGATOR CONTROLS ASSIGNMENT AND INTERVENES IN THE DESIGN. BEST DESIGNS FOR TESTING HYPOTHESES|
|I.A. Controlled clinical trials||Prospective analysis is made of the effect of an intervention in a group of patients randomly selected from a target population|
|I.B. Community-based intervention studies||Interventions are made in large community-based samples|
|I.C. Quasi-experimental studies||Group rather than individual randomised assignment; all subjects in each group receive or do not receive a given intervention|
|II. OBSERVATIONAL STUDIES||THE INVESTIGATOR DOES NOT CONTROL ASSIGNMENT AND DOES NOT INTERVENE IN THE DESIGN|
|II.A. ANALYTICAL||ALTERNATIVE TO EXPERIMENTAL STUDIES FOR TESTING HYPOTHESES|
|II.A.1. Case–control studies||Studies of a retrospective nature, since we start from the effect to study the antecedents of exposure in two groups of subjects called cases and controls, according to whether they have the disease or not|
|II.A.2. Follow-up (cohort) studies||Follow-up of cohorts over time (exposed and non-exposed) with the purpose of assessing the hypothesis of association between a given exposure and some effect|
|II.B. DESCRIPTIVE||USED TO GENERATE HYPOTHESES|
|II.B.1. Ecological studies||Studies using pooled information, i.e., the analytical unit is the group or population, and the information on exposure is the average in each pooled study unit|
|II.B.2. Case reports or case series||Descriptions of clinical observations corresponding to isolated cases or groups of individuals with one same diagnosis|
|II.B.3. Cross-sectional (prevalence) studies||Determination of the proportion of individuals presenting a certain disease or risk factor at a given moment in time|
The fundamental difference between these two types of designs is that in experimental studies the investigator intervenes, assigning the study participants to the different exposure categories, i.e., the investigator decides who will be exposed and who will not. In contrast, in non-experimental studies, subject assignment is not decided by the investigator, who simply observes what happens.
Experimental studies, and particularly the randomised clinical trial, represent the design affording the greatest level of scientific evidence. However, observational studies are the most frequent studies in epidemiology, and are classified according to the “objectives of the study” as either descriptive or analytical:
Descriptive studies: these studies aim to determine the distribution of the disease or exposure in the study population. Among the descriptive studies, a distinction is made among prevalence or cross-sectional studies, case series, and ecological studies, according to whether work is done with individual or grouped data.
Analytical studies: these studies aim to analyse the causes or determinants of the appearance of the epidemiological phenomena. Due to the objective of these studies, they are always of a longitudinal nature. Among the analytical studies, a distinction is made between:
Case–control studies: the comparator groups are determined by the presence or absence of an effect. These are backward-direction studies, moving from the effect back to the exposure.
Cohort studies: the comparator groups are determined by the level of exposure. These are forward-direction studies, moving from the exposure towards the effect.
Regarding guidelines for reporting observational studies, the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) initiative brought together a group of experts (methodologists, statisticians, investigators and journal editors) who, on the basis of empirical evidence and methodological considerations, defined recommendations on what the report of an observational study should contain. These recommendations should be followed in order to improve the quality of published observational studies.6
The different studies, classified by level of evidence from highest to lowest, are as follows:
Randomised clinical trials.
Non-randomised clinical trials.
Cohort or case–control studies.
Descriptive studies, clinical cases, expert reports.
Descriptive studies describe the frequency and characteristics of a health problem, do not involve follow-up, and therefore offer a snapshot of the population at a given point in time. These studies do not allow us to establish cause–effect relationships, since exposure and the evaluated outcome are recorded simultaneously.
The most important requirement in this case is that the selected sample be truly representative of the study population, thereby ensuring the validity of the conclusions drawn from the results obtained.7
The following designs are in turn distinguished according to the study unit involved: (I) individual: cross-sectional or prevalence studies, and case series; and (II) population-level: ecological studies.
Cross-sectional or prevalence studies
Cross-sectional studies aim to determine the prevalence of a given attribute, such as a specific exposure or disease, or any event related to health, in a given population at a specific point in time.
In some cases this type of design is the only option, and it provides initial information from which hypotheses for later studies can be defined.
This type of study is useful for the planning of healthcare services, since it offers an impression of what is happening in a population at a given moment in time. From the population, a sample of individuals is selected in such a way as to be as representative as possible. From this sample we then estimate the frequency of subjects who suffer a given disease, or the frequency of subjects who have been exposed to the variable of interest.
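The estimation step described above can be illustrated with a minimal sketch. The function name `prevalence_with_ci` and the survey figures are hypothetical, and the Wilson score interval is only one of several reasonable choices for the confidence interval of a proportion; it is not taken from the cited studies.

```python
import math


def prevalence_with_ci(cases: int, sample_size: int, z: float = 1.96):
    """Return the point prevalence and a Wilson score interval.

    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    if sample_size <= 0 or not 0 <= cases <= sample_size:
        raise ValueError("need sample_size > 0 and 0 <= cases <= sample_size")
    p = cases / sample_size
    denom = 1 + z ** 2 / sample_size
    centre = (p + z ** 2 / (2 * sample_size)) / denom
    half_width = (
        z
        * math.sqrt(p * (1 - p) / sample_size + z ** 2 / (4 * sample_size ** 2))
        / denom
    )
    return p, centre - half_width, centre + half_width


# Hypothetical survey: 80 children with the disease among 1000 sampled.
p, low, high = prevalence_with_ci(80, 1000)
print(f"prevalence = {p:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

The interval reflects sampling error only. It does not correct for a sample that is unrepresentative of the target population.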
An example of a cross-sectional study is provided by the survey conducted in 2002 among schoolchildren between six and seven years of age in Castellón, Spain to determine the prevalence of asthma and its risk factors.8
These studies are less costly in terms of money and time.
They offer a good representation of the healthcare needs of the population at a given point in time.
They can be used to investigate multiple exposures and outcomes.
This type of design does not allow us to assess causal relationships.
They are based on prevalent cases instead of on incident (new) cases; as a result, they are of limited usefulness in exploring aetiological relationships and for establishing the time sequence of events.
They are not useful for investigating infrequent or short-lasting diseases or exposures.
These designs describe the characteristics of a disease in a patient or in a limited patient group. They usually refer to new diseases, rare cases or adverse effects. The main limitations of these studies are that they do not allow the evaluation of statistical associations, and involve no comparator control group. In summary, clinical cases or case series describe the experience gained with an individual or small group of individuals.
An example of a clinical case is that of a 28-year-old nurse who developed a rash on the hands and forearms after preparing different formulations of donepezil, zolpidem and omeprazole tablets in her workplace. The article described the lesions, their form and the timing of their appearance in one specific case, and the results therefore cannot be extrapolated to other cases.9
Ecological studies
In ecological studies, the observations are not made at individual level but at group level. In these studies both exposure and the disease must be present in each of the groups. The usual frequency measure is the incidence of the disease and the corresponding rates in the compared groups. Exposure is also measured with a global index. Ecological studies comprise the following:
Descriptive or map studies, designed to represent geographical patterns of disease or health determinants.
Ecological correlations among quantitative variables, designed to compare groups, i.e., assessing the relationship between mean exposure level and evaluated effect.
Studies of time series, designed to describe the behaviour of events over time.
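As an illustration of the second type listed above, an ecological correlation relates one summary value per group (for example, per region) for exposure to one summary value for the outcome. The sketch below computes a Pearson correlation coefficient by hand; the function name `pearson_r` and all figures are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences of group-level values."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: one value per region (the unit of analysis is the group).
mean_exposure = [10.0, 20.0, 30.0, 40.0, 50.0]   # mean exposure level per region
disease_rate = [12.0, 15.0, 21.0, 24.0, 30.0]    # cases per 100,000 per region
r = pearson_r(mean_exposure, disease_rate)
print(f"ecological correlation r = {r:.3f}")
```

A high coefficient here describes the regions, not the people living in them, so it should be read as hypothesis-generating rather than as evidence of an individual-level relationship.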
An example of an ecological study is that carried out in maternal hospitals in the United States to characterise the resource consumption burden in the spring of 2009 caused by the H1N1 flu pandemic.10
These studies are inexpensive and rapid, since they generally make use of existing data on exposure and disease. They are the first generators of working hypotheses.
A large number of individuals can be studied, and as a result small risk increments can be assessed.
These studies can include populations with a very broad range of levels of exposure.
The number of variables for which information is available is limited.
It is difficult to evaluate errors in measurement of the variables of exposure and disease.
They are susceptible to ecological misinterpretation: the observation of a relationship between two variables at population level does not necessarily imply that the same relationship is maintained at individual level.
Control of confounding variables: it is sometimes difficult to identify confounding variables at group level; as a result, such control cannot be made.
In contrast to the descriptive studies commented above, analytical studies do allow us to establish causal relationships, i.e., they make it possible to attribute risk to the exposure or treatment under study, discarding the effects of chance in such relationships. Even in the absence of manipulation or randomisation on the part of the investigator, a design of this type allows us to investigate and confirm causal hypotheses.
The main characteristic distinguishing analytical studies from descriptive surveys is the follow-up of subject exposure over time. Only in the presence of follow-up can we speak of the direction of the study, which is determined by the composition of the comparator groups: ill or non-ill persons, or exposed or non-exposed individuals. According to this criterion, studies can have a forward design (starting with the exposure and seeking the effect) or a backward design (starting from the effect and seeking the exposure).
This follow-up is what allows us to establish causal relationships, whereas in descriptive studies exposure and effect are documented simultaneously, so that no time sequence can be established between the exposure and the event of interest.
Follow-up or cohort studies
The subjects participating in a cohort study are classified into two groups (also known as cohorts), according to whether or not they are exposed to the studied risk factor. Initially, these subjects are free of disease; after follow-up, the events that have occurred in each group are analysed and compared to determine which group shows the higher incidence, which in turn makes it possible to draw conclusions about the hypothesis that the effect is due to the studied exposure.11 It is very important that the subjects do not have the study disease at the start of follow-up. If, for example, we design a cohort study to explore the relationship between environmental pollution and the development of respiratory disease in the paediatric population, children with a history of such disease at the start of the study should be excluded, since their inclusion would bias the results obtained.
There are two types of cohort studies, prospective and retrospective, according to whether or not the event of interest has already occurred when the study begins. In a prospective cohort, individuals free of disease are classified according to whether or not they present the risk factor and, after forward follow-up over time, we determine whether or not they present the effect of interest (disease or death).
An example of a prospective cohort is the study of Larsson et al.,12 who examined the possible relationship between exposure to PVC floors and the incidence of asthma in children. Based on a questionnaire, the children were classified according to the presence of possible risk factors, including the presence of PVC floors in the home. After a follow-up period of five years, the incidence of asthma or other respiratory diseases was evaluated. This prospective cohort study assumed that all the subjects were initially disease-free, and the children were classified according to the presence of the risk factor (PVC floors in the home vs. no PVC floors in the home). After follow-up, the frequency of appearance of the disease was recorded. In this case, the prospective design makes it possible to test the starting hypothesis by comparing the incidence in the exposed and non-exposed individuals.
In the case of a retrospective cohort, the subjects are likewise classified according to the exposure of interest, but at the present point in time the study result has already occurred; consequently, both exposure and effect have already taken place at the time the study is conducted (Fig. 1).
A retrospective design was used by Short in a study carried out to analyse the effect of beta-blockers upon mortality, admissions and exacerbations in patients with chronic obstructive pulmonary disease (COPD).13 In this case the study was carried out on a retrospective basis, since the data were collected from hospital records and institutional databases – analysing the records corresponding to an interval of 10 years (2001–2010). In this study the exposed cohort consisted of patients with a diagnosis of COPD and who received beta-blockers, while the non-exposed group consisted of patients not administered such medication. After follow-up (retrospective), the results relating to mortality and admissions were analysed and compared in order to establish which group presented the largest number of incidents. At the time of the study, both exposure and effect had already occurred (Fig. 2).
In cohort studies we can calculate the relative risk as a measure of association, in addition to the odds ratio (OR) and measures of potential impact.14
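A minimal sketch of how these association measures follow from the 2×2 table of a cohort study is given below. The class name `CohortTable`, its field names and the counts are hypothetical and serve only to make the arithmetic explicit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CohortTable:
    """2x2 table of a cohort study (all counts are hypothetical inputs)."""
    exposed_cases: int
    exposed_noncases: int
    unexposed_cases: int
    unexposed_noncases: int

    def incidence_exposed(self) -> float:
        return self.exposed_cases / (self.exposed_cases + self.exposed_noncases)

    def incidence_unexposed(self) -> float:
        return self.unexposed_cases / (self.unexposed_cases + self.unexposed_noncases)

    def relative_risk(self) -> float:
        # Ratio of cumulative incidences: exposed vs. non-exposed.
        return self.incidence_exposed() / self.incidence_unexposed()

    def odds_ratio(self) -> float:
        # Cross-product ratio of the table.
        return (self.exposed_cases * self.unexposed_noncases) / (
            self.exposed_noncases * self.unexposed_cases
        )


# Hypothetical cohort: 30 of 100 exposed and 10 of 100 non-exposed develop the disease.
table = CohortTable(30, 70, 10, 90)
print(f"RR = {table.relative_risk():.2f}, OR = {table.odds_ratio():.2f}")
```

With these figures the odds ratio is further from 1 than the relative risk, which is the usual pattern when the outcome is not rare.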
Cohort studies allow us to confirm causal hypotheses.
Exposure is determined before the start of the disease, minimising the possibility of population bias in relation to development of the disease.
Multiple intermediate and final results can be assessed.
The incidence of the disease can be determined for exposed and for non-exposed individuals.
Cohort studies sometimes consume a great deal of economic resources, since they generally involve large samples.
These studies usually involve long time periods, since follow-up can be quite extensive until the event of interest occurs.
There may be dropouts or losses during follow-up.
Selection bias may exist due to the assumption that the incidence of the disease among the participants is the same as in those who did not participate.
If exposure in the individuals is not correctly measured, classification bias may result when assigning them to one group or the other.
In case–control studies the subjects are classified according to outcome or disease, in contrast to cohort studies, which classify the subjects according to the presence of the risk factor or exposure. A group of subjects presenting the study disease (cases) is selected for comparison with a group of healthy subjects (controls). In this case follow-up is also conducted on a retrospective basis, since both the exposure and the effect have already occurred at the time of the study. In selecting the patient group, the subjects must constitute new cases, and the diagnostic criteria for classifying the individuals and including them in the study must be clearly defined. Selection of the controls is very important, and these subjects must come from the same population as the cases, preferably through probabilistic sampling.
An example of a case–control study (Fig. 3) is that published by Szczepankiewicz et al. with the purpose of analysing the polymorphisms of the HNMT and ABP1 genes in asthma.15 To this end the authors selected a group of asthmatic children (cases) and a control group of healthy children. In both groups a genetic study was made of the polymorphisms of the HNMT and ABP1 genes, comparing the frequency of the polymorphisms of interest in the two groups.
These studies involve a shorter follow-up and lesser economic cost than cohort studies.
They are appropriate for studying infrequent diseases.
The required sample size is smaller than in cohort studies.
These studies are more vulnerable to bias, particularly classification bias, since in the group of cases it is more common to observe memory failure regarding exposure – a fact that can overestimate the differences between groups.
No direct estimation of risk (relative risk) can be made, although the risk can be estimated indirectly through the corresponding odds ratio.
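Because the odds ratio is the measure available in a case–control design, it is usually reported with a confidence interval. The sketch below uses the logit (Woolf) method, which is one common approximation and is not stated in the source; the function name `odds_ratio_ci` and the counts are hypothetical.

```python
import math


def odds_ratio_ci(case_exposed, case_unexposed, control_exposed, control_unexposed, z=1.96):
    """Odds ratio with an approximate confidence interval (Woolf / logit method)."""
    cells = (case_exposed, case_unexposed, control_exposed, control_unexposed)
    if min(cells) <= 0:
        raise ValueError("the Woolf interval needs all four cells to be > 0")
    odds_ratio = (case_exposed * control_unexposed) / (case_unexposed * control_exposed)
    # Standard error of log(OR): square root of the sum of reciprocal cell counts.
    se_log_or = math.sqrt(sum(1 / c for c in cells))
    low = math.exp(math.log(odds_ratio) - z * se_log_or)
    high = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, low, high


# Hypothetical study: 40 of 100 cases and 20 of 100 controls were exposed.
or_, low, high = odds_ratio_ci(40, 60, 20, 80)
print(f"OR = {or_:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

An interval that excludes 1 suggests an association, but it does not protect against the selection and classification biases listed above.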
Of the different types of epidemiological designs described thus far, the option offering the greatest power in determining whether an intervention benefits health is the randomised clinical trial. This design allows us to observe the results of the intervention in a group of individuals and to compare their response with that recorded in a control group, which typically receives the conventional treatment or placebo. The two basic features of the clinical trial are: (I) the comparison of two or more groups of subjects that are identical (homogeneous) in all aspects except for the factor subjected to evaluation (typically a treatment or therapy); and (II) the need for randomisation in order to ensure such comparability and similarity between groups.11 The existence of randomisation is what distinguishes experimental studies from the rest.
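Random assignment can be implemented in several ways. The sketch below shows permuted-block randomisation, chosen here only as an illustration because it keeps the arms balanced in size; the function name `block_randomise`, the arm labels and the block size are assumptions and do not describe the procedure of any study cited in this article.

```python
import random


def block_randomise(n_participants, block_size=4, arms=("treatment", "control"), seed=None):
    """Allocate participants to arms using permuted blocks.

    Each block contains every arm equally often, in random order,
    so group sizes stay balanced as recruitment proceeds.
    """
    if n_participants < 0:
        raise ValueError("n_participants must be non-negative")
    if block_size <= 0 or block_size % len(arms) != 0:
        raise ValueError("block_size must be a positive multiple of the number of arms")
    rng = random.Random(seed)  # a seed makes the list reproducible for auditing
    allocation = []
    while len(allocation) < n_participants:
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        allocation.extend(block)
    return allocation[:n_participants]


# Hypothetical trial with 12 participants.
for number, arm in enumerate(block_randomise(12, seed=2024), start=1):
    print(number, arm)
```

In practice the allocation list is prepared in advance and concealed from the recruiting investigators, so that knowledge of the next assignment cannot influence enrolment.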
In human clinical trials we can differentiate the following phases: Phase I: the objective of this phase is to establish the pharmacokinetics and tolerance of the new treatment in humans; Phase II: this phase explores the effect of the treatment upon patients with the disease under study, and dose adjustment (dose ranging); Phase III: this phase examines the efficacy and safety of the experimental treatment in a large sample of patients; and Phase IV: this study phase is carried out after marketing authorisation has been obtained, with the purpose of assessing the effectiveness of the treatment and of establishing its side effects over the middle and long term (pharmacovigilance).16
Individual randomised clinical trial
The experiment with the ideal control conditions is a clinical trial involving individual randomised assignment, employing chance as the assignment criterion. An example of this is the study conducted in 33 young adults to assess the clinical efficacy and safety of pre-seasonal sublingual immunotherapy with grass pollen, using carbamylated allergoid versus placebo in patients with seasonal rhinoconjunctivitis. Both groups were followed up for two years after treatment, with verification of a decrease in the symptoms of rhinorrhoea, sneezing, and conjunctivitis in the vaccinated group.17
Community-based randomised clinical trial
A community-based randomised clinical trial (CRCT), also known as a field trial, is characterised by the randomised assignment of intact groups of participants rather than individuals. These groups are administrative or healthcare units, and may range in size from families to healthcare centres, hospitals or entire communities. The observations in the individuals of each group are usually correlated, which gives this type of clinical trial less statistical power than studies involving individual randomisation. A CRCT is the most suitable design: (I) in the evaluation of healthcare programmes or educational/training interventions, which are organisational rather than individual in nature; and (II) for minimising the risk of contamination of the control group by the active intervention.18
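The loss of power caused by within-group correlation is commonly quantified with the design effect, 1 + (m − 1) × ICC, where m is the group size and ICC is the intraclass correlation coefficient. The sketch below assumes groups of equal size; the function names and figures are hypothetical.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation caused by randomising groups of equal size."""
    if cluster_size < 1 or not 0 <= icc <= 1:
        raise ValueError("need cluster_size >= 1 and 0 <= icc <= 1")
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(total_n: float, cluster_size: float, icc: float) -> float:
    """Number of independent individuals carrying the same information."""
    return total_n / design_effect(cluster_size, icc)


# Hypothetical trial: 12 centres with 50 patients each and an assumed ICC of 0.05.
deff = design_effect(50, 0.05)
n_eff = effective_sample_size(12 * 50, 50, 0.05)
print(f"design effect = {deff:.2f}, effective sample size = {n_eff:.0f} of 600")
```

Even a small intraclass correlation can reduce the effective sample size substantially when groups are large, which is why the number of groups usually matters more than the number of individuals per group.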
An example of a community-based clinical trial is that carried out in 12 paediatric hospitals in Philadelphia, designed to determine whether the incorporation of a clinical decision support tool integrated in the electronic health record is able to improve adherence to the clinical guidelines of the National Asthma Education and Prevention Program, after one year of follow-up, in a paediatric population with asthma. The study defined intervention and control subgroups and stratified the series into urban and suburban populations.19
Pragmatic trials
Once a new treatment has been found to be effective, pragmatic trials can be used to determine its interest or relevance in the routine clinical practice setting – the purpose being to establish therapeutic decisions. The patients included in trials of this kind are representative of those found in routine clinical practice. In addition, these studies conduct evaluations based on decision analysis, and aim to facilitate transfer of the results to the improvement of patient care.20 An example of a pragmatic trial is afforded by the study conducted in 50 patients with persistent allergic rhinitis, evaluating the efficacy and safety of specific immunotherapy with modified Dermatophagoides pteronyssinus extract.21 This pragmatic trial showed the experimental treatment to offer rapid improvement in nasal allergenic tolerance and in the symptoms score – thus allowing general improvement of well-being among allergic patients.
With the purpose of eliminating, minimising or controlling the possible sources of bias in clinical trials, such as measurement, observation, information or placebo effect bias, among others, the investigator can make use of masking or blinding procedures. A clinical trial is said to be double-blinded when neither the investigator nor the participants know which treatment each participant receives, which is made possible by the use of placebo. However, double blinding is often not possible, and in such cases single blinding can be used (i.e., the investigator knows the treatment group to which each participant belongs, but the participant does not).22
The main association measures in experimental studies are relative risk and the incidence density ratio, while the commonly used measures of impact are absolute risk reduction or the number needed to treat (NNT), among others.14
Quasi-experimental studies
In terms of the degree of evidence, the so-called quasi-experimental studies are at an intermediate level between observational studies and individual randomised clinical trials, offering the possibility of evaluating the results at individual and group level. Two types of quasi-experimental studies are found: (I) before–after designs (or pre-test post-test studies) without a control group, in which we make a first observation of a certain indicator in the population group (pre-test), then introduce the intervention or experimental effect, and finally re-evaluate the indicator (post-test); and (II) before–after designs with a control group, which follow the same three phases (baseline assessment, intervention, and final measurement to determine the changes produced by the intervention), although in this case a control group is introduced as a non-equivalent comparator, since there is no individual randomised assignment.
Some of the points in favour and against experimental studies are summarised below:
These studies offer the best possible design when we aim to confirm a hypothesis.
Provided chance does not produce unexpected surprises, the comparator groups can be expected to be similar.
In relation to a proposed intervention, multiple results can be studied.
Intervention studies tend to be very expensive and take a long time to complete.
Ethical problems may arise in relation to the principles of fairness and autonomy of the subject.17
It is difficult to ensure compliance with the intervention regimen and to avoid contamination between groups during the study.
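The impact measures mentioned earlier for experimental studies, absolute risk reduction and the number needed to treat (NNT), follow directly from the event rates in the two arms. The sketch below makes the arithmetic explicit; the function names and trial figures are hypothetical.

```python
def absolute_risk_reduction(control_events, control_n, treated_events, treated_n):
    """Difference between the event risk in the control arm and in the treated arm."""
    if control_n <= 0 or treated_n <= 0:
        raise ValueError("both arms must contain at least one participant")
    return control_events / control_n - treated_events / treated_n


def number_needed_to_treat(control_events, control_n, treated_events, treated_n):
    """Reciprocal of the absolute risk reduction (defined only when the treatment helps)."""
    arr = absolute_risk_reduction(control_events, control_n, treated_events, treated_n)
    if arr <= 0:
        raise ValueError("NNT is not defined when the absolute risk reduction is <= 0")
    return 1 / arr


# Hypothetical trial: 30 of 100 events with conventional treatment, 20 of 100 with the new one.
arr = absolute_risk_reduction(30, 100, 20, 100)
nnt = number_needed_to_treat(30, 100, 20, 100)
print(f"ARR = {arr:.2f}, NNT = {nnt:.1f}")
```

By convention the NNT is reported rounded up to the next whole patient; the function returns the unrounded value so that rounding remains an explicit reporting decision.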
The enormous volume of articles published in scientific journals can prove too much to handle, making it difficult for investigators to stay abreast of the latest developments in any given field of medicine. Systematic reviews try to solve this problem by synthesising, filtering and summarising the scientific production, and they facilitate handling of the results obtained in previous studies in a specific area. According to the Cochrane Handbook, the term “systematic review” refers to a synthesis of the results of different primary studies using techniques that limit bias and random error.23
Although it is not mandatory, a systematic review can increase its quality by including a meta-analysis, i.e., a statistical analysis in which the data examined are the results of the different studies included in the review, with the aim of integrating their respective findings.24 The Quality of Reporting of Meta-analyses (QUOROM) group has developed a document with a checklist for evaluating the quality of the studies.25
In order to synthesise the global effect of the different studies included in the meta-analysis, the statistical measures used are the conventional epidemiological parameters (relative risk, odds ratio, risk difference, etc.), combined in order to calculate measures of global effect, based on different techniques (random effects model, fixed effects model) and estimators (Mantel-Haenszel, Peto, etc.). A characteristic graphic representation of a meta-analysis is the forest plot, which reflects the estimations of effect of the different studies together with their respective confidence intervals, and the measure of global effect. The present article simply aims to present this type of study to the reader; more specialised sources are recommended for a more in-depth introduction to the subject.26–28
One of the most important considerations in conducting a meta-analysis is that the included studies must satisfy the criterion of homogeneity. In effect, in both the design and in the results obtained, the different studies must show some similarity, in order to guarantee their comparability.
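The pooling techniques and the homogeneity criterion described above can be sketched as follows. The example combines effects on the log scale with inverse-variance weights (fixed effects model), quantifies heterogeneity with Cochran's Q and I², and obtains a random effects estimate with the DerSimonian-Laird estimator. These specific estimators are an illustrative choice and are not the ones named in the text (Mantel-Haenszel, Peto); the function name `pool_effects` and the study data are hypothetical.

```python
import math


def pool_effects(log_effects, standard_errors):
    """Pool study effects given on the log scale (e.g. log odds ratios)."""
    if len(log_effects) != len(standard_errors) or len(log_effects) < 2:
        raise ValueError("need at least two studies with matching inputs")

    # Fixed effects model: inverse-variance weighted mean.
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, log_effects)) / total_w

    # Heterogeneity: Cochran's Q and the I-squared statistic.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_effects))
    df = len(log_effects) - 1
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0

    # Random effects model: DerSimonian-Laird between-study variance.
    c = total_w - sum(w ** 2 for w in weights) / total_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1.0 / (se ** 2 + tau2) for se in standard_errors]
    random_est = sum(w * y for w, y in zip(re_weights, log_effects)) / sum(re_weights)
    se_random = math.sqrt(1.0 / sum(re_weights))

    return {"fixed": fixed, "random": random_est, "se_random": se_random,
            "q": q, "i_squared": i_squared, "tau2": tau2}


# Hypothetical odds ratios and standard errors of their logarithms.
odds_ratios = [1.10, 1.30, 0.95, 1.45]
ses = [0.10, 0.15, 0.20, 0.12]
res = pool_effects([math.log(o) for o in odds_ratios], ses)
low = math.exp(res["random"] - 1.96 * res["se_random"])
high = math.exp(res["random"] + 1.96 * res["se_random"])
print(f"pooled OR (random effects) = {math.exp(res['random']):.2f} ({low:.2f} to {high:.2f})")
print(f"Q = {res['q']:.2f}, I2 = {100 * res['i_squared']:.0f}%")
```

When the estimated between-study variance is zero the two models coincide; as heterogeneity grows, the random effects model weights the studies more evenly and widens the confidence interval.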
An example of a systematic review with meta-analysis is offered by the study of McGwin et al.,29 who examined the relationship between exposure to formaldehyde and childhood asthma – an association which the different publications had been unable to confirm in any consistent manner. By combining the results obtained in seven studies, the authors calculated the global effect, recording a pooled odds ratio of 1.17 (1.01–1.36), based on the random effects model. Although the studies showed some heterogeneity, the investigators concluded that there is a positive association between exposure to formaldehyde and childhood asthma.
One of the main limitations of systematic reviews is that they may introduce publication bias. This arises because there is a tendency not to publish those studies that fail to report the results expected by the investigator, or those in which such results do not reach statistical significance. The presence of this bias can lead the results of the meta-analysis to overestimate the positive impact of the studied effect. Funnel plots are the most widely used graphic tool for detecting the presence of this type of bias.
Final comments
In clinical research, the epidemiological design forms the principal structure of the study-planning phase, and has an impact upon the set of options applicable to the rest of the study protocol. The choice of the design best suited to the proposed study must take into account the type of question considered, the existing scientific evidence on the study subject, the viability of the study in terms of budget, and the time available to the research team.
Conflict of Interests
The authors have no conflicts of interest to declare.
The authors thank Sabina Pérez-Vicente for her contribution to final drafting of the manuscript.