Detecting, Characterizing, and Interpreting Nonlinear Gene–Gene Interactions Using Multifactor Dimensionality Reduction

doi:10.1016/B978-0-12-380862-2.00005-9

Advances in Genetics

Volume 72, 2010, Pages 101-116

https://doi.org/10.1016/B978-0-12-380862-2.00005-9

Abstract

Human health is a complex process that is dependent on many genes, many environmental factors, and chance events that are perhaps not measurable with current technology or are simply unknowable. Success in the design and execution of population-based association studies to identify those genetic and environmental factors that play an important role in human disease will depend on our ability to embrace, rather than ignore, complexity in the genotype-to-phenotype mapping relationship for any given human ecology. We review here three general computational challenges that must be addressed. First, data mining and machine learning methods are needed to model nonlinear interactions between multiple genetic and environmental factors. Second, filter and wrapper methods are needed to identify attribute interactions in large and complex solution landscapes. Third, visualization methods are needed to help interpret computational models and results. We provide here an overview of the multifactor dimensionality reduction (MDR) method that was developed for addressing each of these challenges.

Introduction

Human genetics has a long and rich history of research to understand the role of interindividual variation in the human genome and variation in biological traits. We have progressed rapidly from unmeasured genetic studies in families to the identification of common variation in the DNA sequence that can be used in population-based association studies. This is an exciting time because we now have access to technology that allows us to efficiently measure many DNA sequence variations from across the human genome. Within the next 5 years we will likely have access to cutting-edge technology that will deliver the entire genomic sequence for all subjects in our genetic and epidemiologic studies. Now that we have access to the basic hereditary information, it is time to shift our focus toward the analysis of these data. The focus of this chapter is on the important role of computer science and, more specifically, machine learning for mining patterns of genetic variations that are associated with susceptibility to common human diseases. This approach assumes that the relationship between genotype and phenotype is very complex. Specifically, we will focus on computational methods for identifying gene–gene interactions, or epistasis, which account for part of the complexity of genetic architecture.

Human genetics has been largely successful in identifying the causative mutations in single genes that determine with virtual certainty rare diseases such as sickle-cell anemia. However, the same success has not been had for common human diseases such as sporadic breast cancer, essential hypertension, or bipolar depression. This is because diseases that are common in the population have a much more complex etiology that requires different research strategies than were used to identify genes underlying rare diseases that follow a simpler Mendelian inheritance pattern. Complexity can arise from phenomena such as locus heterogeneity (i.e., different DNA sequence variations leading to the same phenotype), phenocopy (i.e., environmentally determined phenotypes), and the dependence of genotypic effects on environmental factors (i.e., gene–environment interactions or plastic reaction norms) and genotypes at other loci (i.e., gene–gene interactions or epistasis). It is this latter source of complexity, epistasis, that is of interest here. Epistasis has been recognized for many years as deviations from the simple inheritance patterns observed by Mendel (Bateson, 1909) or deviations from additivity in a linear statistical model (Fisher, 1918) and is likely due, in part, to canalization or mechanisms of stabilizing selection that evolve robust (i.e., redundant) gene networks (Waddington, 1942).

Epistasis has been defined in multiple different ways (e.g., Phillips, 1998, Phillips, 2008). We have reviewed two types of epistasis, biological and statistical (Moore & Williams, 2005, Moore & Williams, 2009, Tyler et al., 2009). Biological epistasis occurs when the physical interactions between biomolecules (e.g., DNA, RNA, proteins, enzymes, etc.) are influenced by genetic variation at multiple different loci. This type of epistasis occurs at the cellular level in an individual and is what Bateson (1909) had in mind when he coined the term. Statistical epistasis, on the other hand, occurs at the population level and is realized when there is interindividual variation in DNA sequences. The statistical phenomenon of epistasis is what Fisher (1918) had in mind. The relationship between biological and statistical epistasis is often confusing but will be important to understand if we are to make biological inferences from statistical results (Moore & Williams, 2005, Moore & Williams, 2009, Phillips, 1998, Phillips, 2008, Tyler et al., 2009). Moore (2003) has argued that epistasis is likely to be a ubiquitous phenomenon in complex human diseases. The focus of the present study is the detection and characterization of statistical epistasis in human populations using machine learning and data mining methods.

The fields of genetics and epidemiology are undergoing an information explosion and an understanding implosion. That is, our ability to generate data is far outpacing our ability to interpret it. This is especially true today where it is technically and economically feasible to measure a million or more single nucleotide polymorphisms (SNPs) from across the human genome. An important goal in human genetics is to determine which of the millions of SNPs are useful for predicting who is at risk for common diseases. This “genome-wide” approach is expected to revolutionize the genetic analysis of common human diseases and, for better or worse, is quickly replacing the traditional “candidate-gene” approach that focuses on several genes selected by their known or suspected function.

Moore and Ritchie (2004) have outlined three significant challenges that must be overcome if we are to successfully identify genetic predictors of health and disease using a genome-wide approach. First, powerful data mining and machine learning methods will need to be developed to statistically model the relationship between combinations of DNA sequence variations and disease susceptibility. Traditional methods such as logistic regression have limited power for modeling high-order nonlinear interactions (Moore and Williams, 2002). A second challenge is the selection of genetic features or attributes that should be included for analysis. If interactions between genes explain most of the heritability of common diseases, then combinations of DNA sequence variations will need to be evaluated from a list of thousands of candidates. Filter (SNP selection) and wrapper (SNP searching) methods will play an important role because there are more combinations than can be exhaustively evaluated. A third challenge is the interpretation of gene–gene interaction models. Although a statistical model can be used to identify DNA sequence variations that confer risk for disease, this approach cannot be translated into specific prevention and treatment strategies without interpreting the results in the context of human biology. Making etiological inferences from computational models may be the most important and the most difficult challenge of all (Moore and Williams, 2005).

To illustrate the concept of statistical interaction, consider the following simple example of epistasis in the form of a penetrance function. Penetrance is simply the probability (P) of disease (D) given a particular combination of genotypes (G) that was inherited (i.e., P[D|G]). Let us assume for two SNPs labeled A and B that genotypes AA, aa, BB, and bb have population frequencies of 0.25, while genotypes Aa and Bb have frequencies of 0.5. Let us also assume that individuals have a very high risk of disease if they inherit Aa or Bb but not both (i.e., the exclusive OR or XOR logic function). What makes this model interesting is that disease risk is entirely dependent on the particular combination of genotypes inherited at more than one locus. The marginal penetrance of each single-locus genotype is the same in this model; it is computed by summing, over the genotypes at the other locus, the products of the genotype frequencies and penetrance values. Heritability can be calculated as outlined by Culverhouse et al. (2002). Thus, in this model there is no difference in disease risk among the single-locus genotypes as specified by the penetrance values. This model is labeled M170 by Li and Reich (2000) in their categorization of genetic models involving two SNPs and is an example of a pattern that is not separable by a simple linear function. This model is a special case where all of the heritability is due to epistasis or nonlinear gene–gene interaction.
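The XOR penetrance function described above can be checked numerically. The following is a minimal sketch, not taken from the chapter: the genotype frequencies are those stated in the text, but the penetrance value of 0.1 for the high-risk cells (`HIGH_RISK`) is an assumption chosen for illustration, and the heritability line uses the variance-of-penetrance form that we believe corresponds to the approach of Culverhouse et al. (2002).

```python
# Genotype codes: 0 = AA or BB, 1 = Aa or Bb, 2 = aa or bb.
FREQ = {0: 0.25, 1: 0.5, 2: 0.25}  # genotype frequencies stated in the text
HIGH_RISK = 0.1  # ASSUMED penetrance of the high-risk cells (not given in the text)


def penetrance(a, b):
    """P[D | G]: elevated risk only if exactly one locus is heterozygous (XOR)."""
    return HIGH_RISK if (a == 1) != (b == 1) else 0.0


# Marginal penetrance of each single-locus genotype: sum over the other
# locus of (genotype frequency) x (penetrance).
marginal_a = {a: sum(FREQ[b] * penetrance(a, b) for b in FREQ) for a in FREQ}
marginal_b = {b: sum(FREQ[a] * penetrance(a, b) for a in FREQ) for b in FREQ}

# Population prevalence K: penetrance averaged over all two-locus genotypes.
prevalence = sum(
    FREQ[a] * FREQ[b] * penetrance(a, b) for a in FREQ for b in FREQ
)

# Assumed heritability form: variance of penetrance divided by K(1 - K).
variance = sum(
    FREQ[a] * FREQ[b] * (penetrance(a, b) - prevalence) ** 2
    for a in FREQ
    for b in FREQ
)
heritability = variance / (prevalence * (1 - prevalence))

print("marginal penetrance at A:", marginal_a)
print("marginal penetrance at B:", marginal_b)
print("prevalence:", prevalence)
print("heritability:", round(heritability, 3))
```

Under these assumptions every single-locus genotype has the same marginal penetrance, which is why neither SNP shows a main effect even though the two-locus penetrance table varies.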

Combining this type of statistical interaction with the challenge of variable selection yields what computer scientists have called a needle-in-a-haystack problem. That is, there may be a particular combination of SNPs or SNPs and environmental factors that together with the right nonlinear function are a significant predictor of disease susceptibility. However, individually they may not look any different than thousands of other SNPs that are not involved in the disease process and are thus noisy. Under these models, the learning algorithm is truly looking for a genetic needle in a genomic haystack. It is now commonly assumed that at least 1,000,000 carefully selected SNPs may be necessary to capture all of the relevant variation across the Caucasian human genome. Assuming this is true, we would need to scan approximately 500 billion pairwise combinations of SNPs to find a genetic needle. The number of higher order combinations is astronomical. What is the optimal computational approach to this problem?
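The figure of roughly 500 billion pairs follows directly from counting unordered pairs among one million SNPs. A short sketch of the arithmetic (the variable names are ours):

```python
from math import comb

n_snps = 1_000_000  # number of SNPs assumed in the text

pairs = comb(n_snps, 2)    # unordered two-SNP combinations
triples = comb(n_snps, 3)  # unordered three-SNP combinations

print(f"pairwise combinations:  {pairs:,}")
print(f"three-way combinations: {triples:.3e}")
```

The pairwise count is 499,999,500,000, and the three-way count already exceeds 10^17, which illustrates why exhaustive evaluation of higher order combinations is impractical.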

There are two general approaches to select attributes for predictive models. The filter approach preprocesses the data by algorithmically, statistically, or biologically assessing the quality or relevance of each variable and then using that information to select a subset for analysis. The wrapper approach iteratively selects subsets of attributes for classification using either a deterministic or stochastic algorithm. The key difference between the two approaches is that the learning algorithm plays no role in selecting which attributes to consider in the filter approach. The advantage of the filter is speed while the wrapper approach has the potential to do a better job classifying subjects as sick or healthy. We first discuss a specific machine learning algorithm called multifactor dimensionality reduction (MDR) that has been applied to classifying healthy and disease subjects using their DNA sequence information and then discuss filter and wrapper approaches for the specific problem of detecting epistasis or gene–gene interactions on a genome-wide scale.
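As a rough illustration of the attribute construction idea behind MDR, the sketch below pools two-locus genotype cells into a single binary attribute, labeling a cell high-risk when its case-to-control ratio exceeds a threshold, as MDR is commonly described in the literature. It is a simplified sketch under stated assumptions rather than the authors' implementation: the function names, the default `threshold` of 1.0, and the toy data are hypothetical, and the cross-validation that MDR uses to evaluate models is omitted.

```python
from collections import Counter
from itertools import product


def mdr_high_risk_cells(genotypes, status, threshold=1.0):
    """Return the genotype combinations whose case:control ratio exceeds threshold."""
    cases, controls = Counter(), Counter()
    for g, s in zip(genotypes, status):
        if s == 1:
            cases[tuple(g)] += 1
        else:
            controls[tuple(g)] += 1
    cells = set(cases) | set(controls)
    # A Counter returns 0 for unseen cells, so no division is needed.
    return {g for g in cells if cases[g] > threshold * controls[g]}


def mdr_attribute(genotypes, high_risk):
    """Collapse multilocus genotypes into one attribute: 1 = high risk, 0 = low risk."""
    return [1 if tuple(g) in high_risk else 0 for g in genotypes]


# Toy XOR-like data (hypothetical): genotype codes 0, 1, 2 with 1 = heterozygote.
# Cells where exactly one SNP is heterozygous get 3 cases and 1 control;
# all other cells get 1 case and 3 controls.
genotypes, status = [], []
for a, b in product(range(3), repeat=2):
    xor_cell = (a == 1) != (b == 1)
    n_cases, n_controls = (3, 1) if xor_cell else (1, 3)
    genotypes += [(a, b)] * (n_cases + n_controls)
    status += [1] * n_cases + [0] * n_controls

high_risk = mdr_high_risk_cells(genotypes, status)
predicted = mdr_attribute(genotypes, high_risk)
accuracy = sum(p == s for p, s in zip(predicted, status)) / len(status)

print("high-risk cells:", sorted(high_risk))
print("training accuracy:", accuracy)
```

The constructed attribute recovers the four XOR cells even though neither SNP is informative on its own. The accuracy printed here is measured on the training data only, so it should not be read as an estimate of predictive performance.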

Section snippets

Machine Learning Analysis of Gene–Gene Interactions Using MDR

As discussed above, one of the early definitions of epistasis was deviation from additivity in a linear model (Fisher, 1918). The linear model plays a very important role in modern genetics and epidemiology because it has solid theoretical foundation, is easy to implement using a wide range of different software packages, and is easy to interpret. Despite these good reasons to use linear models, they do have limitations for detecting nonlinear patterns of interaction (e.g., Moore and Williams,

Filter Approaches to MDR

As discussed above, it is computationally infeasible to combinatorially explore all interactions among the DNA sequence variations in a genome-wide association study. One approach is to filter out a subset of variations that can then be efficiently analyzed using a method such as MDR. We review below a powerful filter method based on the ReliefF algorithm and then discuss prospects for using biological knowledge to filter genetic variations.
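To give a sense of how a Relief-style filter can rank interacting attributes, the sketch below scores each attribute by how often it differs between an instance and its nearest neighbors of the other class (misses) versus its nearest neighbors of the same class (hits). This is a simplified two-class variant written for illustration and is not the ReliefF implementation reviewed in the chapter; the function name, the Hamming distance, the choice of `k`, and the toy data are all assumptions.

```python
from itertools import product


def relief_weights(X, y, k=2):
    """Simplified Relief-style attribute weights for discrete attributes.

    For every instance, attributes that differ from the k nearest misses gain
    weight and attributes that differ from the k nearest hits lose weight.
    """
    n, p = len(X), len(X[0])
    weights = [0.0] * p
    for i in range(n):
        def dist(j):
            # Hamming distance between instance i and instance j.
            return sum(u != v for u, v in zip(X[i], X[j]))

        others = [j for j in range(n) if j != i]
        hits = sorted((j for j in others if y[j] == y[i]), key=dist)[:k]
        misses = sorted((j for j in others if y[j] != y[i]), key=dist)[:k]
        if not hits or not misses:
            continue  # cannot score an instance without both neighbor types
        for a in range(p):
            hit_diff = sum(X[i][a] != X[j][a] for j in hits) / len(hits)
            miss_diff = sum(X[i][a] != X[j][a] for j in misses) / len(misses)
            weights[a] += (miss_diff - hit_diff) / n
    return weights


# Toy data (hypothetical): four binary attributes, every combination once.
# The class is the XOR of attributes 0 and 1; attributes 2 and 3 are noise.
X = [list(row) for row in product([0, 1], repeat=4)]
y = [row[0] ^ row[1] for row in X]

weights = relief_weights(X, y, k=2)
print(weights)
```

On this toy data the two interacting attributes receive positive weights and the two noise attributes receive negative weights, although neither interacting attribute has a marginal effect. A filter would then keep the top-ranked attributes for analysis with a method such as MDR.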

Wrapper Approaches to MDR

Stochastic search or wrapper methods may be more powerful than filter approaches because no attributes are discarded in the process. As a result, every attribute retains some probability of being selected for evaluation by the classifier. There are many different stochastic wrapper algorithms that can be applied to this problem (Michalewicz and Fogel, 2004). However, when interactions are present in the absence of marginal effects, there is no reason to expect that any wrapper method would

Statistical Interpretation of MDR Models

The MDR method described above is a powerful attribute construction approach for detecting epistasis or nonlinear gene–gene interactions in epidemiologic studies of common human diseases. The models that MDR produces are by nature multidimensional and thus difficult to interpret. For example, an interaction model with four SNPs, each with three genotypes, summarizes 81 different genotype (i.e., level) combinations (i.e., 3⁴). How do each of these level combinations relate back to biological

Summary

As human genetics and epidemiology move into the genomics age with access to all the information in the genome, we will become increasingly dependent on computer science for managing and making sense of these mountains of data. The specific challenge reviewed here is the detection, characterization, and interpretation of epistasis or gene–gene interactions that are predictive of susceptibility to common human diseases. Epistasis is an important source of complexity in the genotype to phenotype

Acknowledgments

This work was supported by National Institutes of Health (USA) grants LM009012, LM010098, and AI59694.

References (64)

R. Culverhouse et al. A perspective on epistasis: Limits of models displaying no main effect. Am. J. Hum. Genet. (2002)
K. Kira et al. A practical approach to feature selection.
X.Y. Lou et al. A generalized combinatorial approach for detecting gene-by-gene and gene-by-environment interactions with application to nicotine dependence. Am. J. Hum. Genet. (2007)
X.Y. Lou et al. A combinatorial approach to detecting gene–gene and gene–environment interactions in family studies. Am. J. Hum. Genet. (2008)
H. Mei et al. Multifactor dimensionality reduction-phenomics: A novel method to capture genetic heterogeneity with use of phenotypic variables. Am. J. Hum. Genet. (2007)
R.S. Michalski. A theory and methodology of inductive learning. Artif. Intell. (1983)
J. Millstein et al. A testing framework for identifying susceptibility genes in the presence of epistasis. Am. J. Hum. Genet. (2006)
J.H. Moore et al. Epistasis and its implications for personal genetics. Am. J. Hum. Genet. (2009)
J.H. Moore et al. A flexible computational framework for detecting, characterizing, and interpreting statistical patterns of epistasis in genetic studies of human disease susceptibility. J. Theor. Biol. (2006)
M.D. Ritchie et al. Multifactor dimensionality reduction reveals high-order interactions among estrogen metabolism genes in sporadic breast cancer. Am. J. Hum. Genet. (2001)
A.S. Andrew et al. Concordance of multiple analytical approaches demonstrates a complex relationship between DNA repair gene SNPs, smoking, and bladder cancer susceptibility. Carcinogenesis (2006)
A.S. Andrew et al. DNA repair polymorphisms modify bladder cancer risk: A multi-factor analytic strategy. Hum. Hered. (2008)
K. Askland et al. Pathways-based analyses of whole-genome association study data in bipolar disorder reveal genes mediating ion channel activity and synaptic neurotransmission. Hum. Genet.
L.W. Hahn et al. Multifactor dimensionality reduction software for detecting gene–gene and gene–environment interactions. Bioinformatics (2003)

Chapter 5 - Detecting, Characterizing, and Interpreting Nonlinear Gene–Gene Interactions Using Multifactor Dimensionality Reduction

Mendel's Principles of Heredity
Parallel multifactor dimensionality reduction: A tool for the large-scale analysis of gene–gene interactions. Bioinformatics
Alternative contingency table measures improve the power and detection of multifactor dimensionality reduction. BMC Bioinform.
Biofilter: A knowledge-integration system for the multi-locus analysis of genome-wide association studies. Pac. Symp. Biocomput.
Improving strategies for detecting genetic patterns of disease susceptibility in association studies. Stat. Med.
mbmdr: An R package for exploring gene–gene interactions associated with binary or quantitative traits. Bioinformatics
FAM-MDR: A flexible family-based multifactor dimensionality reduction technique to detect epistasis using related individuals. PLoS ONE
Odds ratio based multifactor-dimensionality reduction method for detecting gene–gene interactions. Bioinformatics
Genome-wide association studies: Detecting gene–gene interactions that underlie human diseases. Nat. Rev. Genet.
A screening methodology based on Random Forests to improve the detection of gene–gene interactions. Eur. J. Hum. Genet.
A general framework for formal tests of interaction after exhaustive search methods with applications to MDR and MDR-PDT. PLoS ONE
The correlations between relatives on the supposition of Mendelian inheritance. Trans. R Soc. Edinb.
Spatially uniform ReliefF (SURF) for computationally-efficient filtering of gene–gene interactions. BioData Min.
Optimal use of expert knowledge in ant colony optimization for the analysis of epistasis in human disease. Lect. Notes Comput. Sci.
Enabling personal genomics with an explicit test of epistasis. Pac. Symp. Biocomput.
The informative extremes: Using both nearest and farthest individuals can improve Relief algorithms in the domain of human genetics. Lect. Notes Comput. Sci.
Multifactor dimensionality reduction for graphics processing units enables genome-wide testing of epistasis in sporadic ALS. Bioinformatics
A robust multifactor dimensionality reduction method for detecting gene–gene interactions with application to the genetic analysis of bladder cancer susceptibility. Ann. Hum. Genet.