Biomedical and Translational Informatics Laboratory

MDR-PDT

Method Description:

MDR-PDT – A new hybrid method that merges the strengths of MDR (multifactor dimensionality reduction) and genotype-PDT (genotype pedigree disequilibrium test). One of the major strengths of MDR is its ability to detect gene-gene and gene-environment interactions even when main effects are absent. One of the major strengths of genotype-PDT is its ability to carry out single-locus linkage disequilibrium tests on extended pedigrees. However, MDR cannot be used on pedigree data, and genotype-PDT is not effective at detecting interactions without main effects. The hybrid method, MDR-PDT, can be used to examine pedigrees for gene-gene and gene-environment interactions.

Relevant Publications:

  • Edwards TL, Pericak-Vance M, Gilbert JR, Haines JL, Martin ER, Ritchie MD. An association analysis of Alzheimer disease candidate genes detects an ancestral risk haplotype clade in ACE and putative multilocus association between ACE, A2M, and LRRTM3. American Journal of Medical Genetics Part B, Neuropsychiatric Genetics, 67(3):183-92 (2009).
  • Edwards TL, Torstenson ES, Martin EM, Ritchie MD. A cross-validation procedure for general pedigrees and matched odds ratio fitness metric implemented for the multifactor dimensionality reduction pedigree disequilibrium test MDR-PDT and cross-validation: power studies. Genetic Epidemiology, 34(2):194-9 (2010).
  • Martin ER, Ritchie MD, Kang S, Hahn L, Moore JH. A novel method to identify potential interactions in nuclear families: The MDR-PDT. Genetic Epidemiology, 30:111-23 (2006).
  • Bush WS, Edwards TL, Dudek SM, McKinney BA, Ritchie MD. Alternative contingency table measures improves the power and detection of Multifactor Dimensionality Reduction. BMC Bioinformatics, 9:238 (2008).
  • Motsinger AA, Ritchie MD. The effect of reduction in cross-validation intervals on the performance of multifactor dimensionality reduction. Genetic Epidemiology, 30:546-555 (2006).
  • Edwards TL, Turner SD, Torstenson ES, Dudek SM, Martin EM, Ritchie MD. A general framework for formal tests of interaction after exhaustive search methods with applications to MDR and MDR-PDT. PLOS ONE, 5(2):e9363 (2010).
  • Edwards TL, Wang X, Chen Q, Wormly B, Riley B, O’Neill FA, Walsh D, Ritchie MD, Kendler KS, Chen X. Interaction between interleukin 3 and dystrobrevin-binding protein 1 in schizophrenia. Schizophrenia Research, 106(2-3):208-17 (2008).

Related Links:

  • MDR at the Computational Genetics Laboratory, Dartmouth – http://www.epistasis.org/open-source-mdr-project.html
 

simPEN

Method Description:

simPEN uses a genetic algorithm to evolve a penetrance model that meets the user's specifications. The algorithm minimizes marginal penetrance variance, which yields a model with minimal main effects, while also optimizing heritability, table variance, and average marginal penetrance as selected by the user.
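The quantities named above can be illustrated with a short Python sketch. It is not simPEN's code: it assumes a two-locus, 3x3 penetrance table, Hardy-Weinberg genotype frequencies, a frequency-weighted marginal variance, and the common broad-sense heritability definition (genetic variance divided by K(1 - K), where K is the prevalence). The function names and the example table are hypothetical.

```python
from itertools import product

def genotype_freqs(maf):
    """Hardy-Weinberg frequencies for 0, 1, 2 copies of the minor allele (assumed)."""
    p, q = 1.0 - maf, maf
    return [p * p, 2 * p * q, q * q]

def table_metrics(penetrance, maf_a, maf_b):
    """Summarize a 3x3 table where penetrance[i][j] = P(affected | genotypes i, j)."""
    fa, fb = genotype_freqs(maf_a), genotype_freqs(maf_b)
    cells = list(product(range(3), repeat=2))
    # Prevalence K: penetrance averaged over all two-locus genotypes.
    k = sum(fa[i] * fb[j] * penetrance[i][j] for i, j in cells)
    # Marginal penetrance of each single-locus genotype.
    marg_a = [sum(fb[j] * penetrance[i][j] for j in range(3)) for i in range(3)]
    marg_b = [sum(fa[i] * penetrance[i][j] for i in range(3)) for j in range(3)]
    # Weighted variance of the marginals around K; zero means no main effect.
    var_a = sum(fa[i] * (marg_a[i] - k) ** 2 for i in range(3))
    var_b = sum(fb[j] * (marg_b[j] - k) ** 2 for j in range(3))
    # Variance of the full table around K, and the heritability derived from it.
    table_var = sum(fa[i] * fb[j] * (penetrance[i][j] - k) ** 2 for i, j in cells)
    return {
        "prevalence": k,
        "marginal_variance_a": var_a,
        "marginal_variance_b": var_b,
        "table_variance": table_var,
        "heritability": table_var / (k * (1.0 - k)),
    }

# Hypothetical purely epistatic table: risk only for selected genotype pairs.
EXAMPLE_TABLE = [
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.1, 0.0],
]
metrics = table_metrics(EXAMPLE_TABLE, maf_a=0.5, maf_b=0.5)
```

With both minor allele frequencies at 0.5, every marginal penetrance in this example equals the prevalence (0.05), so the marginal variances are zero while the heritability is about 0.053. That is the kind of "interaction without main effects" model the text describes.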

simPEN can perform the following:

  • Generate penetrance tables representing models specified in the configuration file.
  • Generate a table as above and then generate a case-control dataset using the penetrance table to assign cases and controls.
  • Generate datasets using a previously defined penetrance table.
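As a rough illustration of how a penetrance table can assign cases and controls, the Python sketch below draws two-locus genotypes and labels each individual affected with probability equal to the matching table cell. It is not simPEN or genomeSIM code; the function name, the Hardy-Weinberg sampling, and the rejection-style loop are assumptions made for the example.

```python
import random

def simulate_case_control(penetrance, maf_a, maf_b, n_cases, n_controls, seed=0):
    """Return (genotype_a, genotype_b, status) tuples; status 1 = case, 0 = control.

    Assumes at least one reachable cell has penetrance > 0 and one has
    penetrance < 1; otherwise the loop below would never finish.
    """
    rng = random.Random(seed)

    def draw_genotype(maf):
        # Two independent allele draws give Hardy-Weinberg genotype proportions.
        return int(rng.random() < maf) + int(rng.random() < maf)

    cases, controls = [], []
    while len(cases) < n_cases or len(controls) < n_controls:
        a, b = draw_genotype(maf_a), draw_genotype(maf_b)
        # The penetrance cell is the probability that this individual is affected.
        affected = rng.random() < penetrance[a][b]
        if affected and len(cases) < n_cases:
            cases.append((a, b, 1))
        elif not affected and len(controls) < n_controls:
            controls.append((a, b, 0))
    return cases + controls

# Hypothetical table reused only for illustration.
table = [
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.1, 0.0],
]
dataset = simulate_case_control(table, 0.5, 0.5, n_cases=20, n_controls=20)
```

Individuals are discarded once their group is full, which keeps the requested case and control counts exact at the cost of extra draws when the disease is rare.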

The current version of simPEN links to a data simulator, genomeSIM, that will use the model evolved by simPEN to create case-control datasets as specified. This program also accepts a datasim file that lists the parameters for running the simulator.

The genetic algorithm uses a fitness function to evolve a model. The function evaluates the fitness of each model in the population, and the result depends on the parameters supplied in the configuration file. Marginal penetrance variance and heritability always affect fitness. Table variance and the marginal penetrance target affect fitness only when they are set in the configuration file. Maximum fitness is 1.0. The genetic algorithm terminates when it finds a model with fitness = 1.0 or when the maximum number of generations is reached.
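The termination rule can be sketched in Python as follows. The fitness formula here is hypothetical, since the actual simPEN formula is not given above: it penalizes the spread of the marginal penetrances and the distance from a target heritability, and returns exactly 1.0 once both fall within assumed tolerances. The selection scheme (keep the top half, refill with mutated copies), the fixed genotype frequencies, and all names are likewise assumptions.

```python
import random

WEIGHTS = (0.25, 0.5, 0.25)  # assumed genotype frequencies (MAF 0.5 at both loci)

def score(table, target_h2=0.05, spread_tol=1e-3, h2_tol=5e-3):
    """Hypothetical fitness in [0, 1]; 1.0 means all targets are met within tolerance."""
    cells = [(i, j) for i in range(3) for j in range(3)]
    k = sum(WEIGHTS[i] * WEIGHTS[j] * table[i][j] for i, j in cells)
    if k <= 0.0 or k >= 1.0:
        return 0.0  # heritability is undefined for a degenerate table
    rows = [sum(WEIGHTS[j] * table[i][j] for j in range(3)) for i in range(3)]
    cols = [sum(WEIGHTS[i] * table[i][j] for i in range(3)) for j in range(3)]
    spread = max(rows + cols) - min(rows + cols)  # proxy for marginal variance
    h2 = sum(WEIGHTS[i] * WEIGHTS[j] * (table[i][j] - k) ** 2 for i, j in cells)
    h2 /= k * (1.0 - k)
    penalty = 0.0
    if spread > spread_tol:
        penalty += spread
    if abs(h2 - target_h2) > h2_tol:
        penalty += abs(h2 - target_h2)
    return 1.0 / (1.0 + penalty)

def random_table(rng):
    return [[rng.random() for _ in range(3)] for _ in range(3)]

def mutate(table, rng, step=0.05):
    """Copy the table and nudge one cell, keeping it a valid probability."""
    child = [row[:] for row in table]
    i, j = rng.randrange(3), rng.randrange(3)
    child[i][j] = min(1.0, max(0.0, child[i][j] + rng.gauss(0.0, step)))
    return child

def evolve(fitness, pop_size=40, max_generations=200, seed=0):
    """Stop at fitness 1.0 or after max_generations, whichever comes first."""
    rng = random.Random(seed)
    population = [random_table(rng) for _ in range(pop_size)]
    best = max(population, key=fitness)
    for generation in range(max_generations):
        if fitness(best) >= 1.0:
            return best, generation
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = [mutate(rng.choice(parents), rng) for _ in range(pop_size - len(parents))]
        population = parents + children
        best = max(population, key=fitness)
    return best, max_generations

best_table, generations_used = evolve(score, max_generations=50)
```

Because the parents are carried over unchanged, the best fitness never decreases between generations. Whether a run reaches 1.0 before the generation limit depends on the tolerances and the random seed.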

Grammatical Evolution Neural Networks

Method Description:

GENN – Applying grammatical evolution to optimize neural nets for detection and modeling of gene-gene interactions.

Genetically Programmed Neural Networks (GPNN) is a technique that utilizes genetic programming to optimize neural nets for classification and identification of gene-gene interactions.

Grammatical evolution (GE) is an evolutionary algorithm that uses linear genomes and grammars to define the populations. In GE, each individual consists of a binary genome divided into codons. Mutation takes place on individual bits but crossover only takes place between the codons. An individual or phenotype is produced by translating the codons using the grammar. The resulting individual can then be tested for fitness in the population and the usual evolutionary operators can be carried out. By using a grammar to define the phenotype, GE separates the genotype from the phenotype and allows greater genetic diversity within the population than other evolutionary algorithms.
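The genotype-to-phenotype translation can be sketched in Python. This is a minimal illustration of the general GE mapping, not GENN's implementation: the toy grammar, the 8-bit codon size, the wrapping limit, and the rule that every expansion consumes one codon are all assumptions. Each codon value, taken modulo the number of available productions, selects how the leftmost nonterminal is expanded.

```python
import random

CODON_BITS = 8  # assumed codon width

# Toy grammar (hypothetical): each nonterminal maps to its list of productions.
GRAMMAR = {
    "<expr>": [["<expr>", "<op>", "<expr>"], ["<var>"]],
    "<op>": [["+"], ["*"]],
    "<var>": [["x1"], ["x2"]],
}

def to_bits(values):
    """Encode integer codon values as a flat list of bits (helper for examples)."""
    return [int(b) for v in values for b in format(v, "08b")]

def codons(bits):
    """Split a binary genome into integer codons, ignoring trailing partial codons."""
    usable = len(bits) - len(bits) % CODON_BITS
    return [
        int("".join(str(b) for b in bits[i:i + CODON_BITS]), 2)
        for i in range(0, usable, CODON_BITS)
    ]

def translate(bits, start="<expr>", max_wraps=2):
    """Map a genome to a phenotype string, or None if the codons run out."""
    values = codons(bits)
    limit = len(values) * (max_wraps + 1)  # the genome may be re-read max_wraps times
    symbols, output, used = [start], [], 0
    while symbols:
        symbol = symbols.pop(0)  # always expand the leftmost symbol
        if symbol not in GRAMMAR:
            output.append(symbol)  # terminal: emit as part of the phenotype
            continue
        if used >= limit:
            return None
        productions = GRAMMAR[symbol]
        chosen = productions[values[used % len(values)] % len(productions)]
        used += 1
        symbols = list(chosen) + symbols
    return " ".join(output)

def mutate(bits, rate, rng):
    """Flip individual bits independently with probability `rate`."""
    return [b ^ 1 if rng.random() < rate else b for b in bits]

def crossover(a, b, rng):
    """Single-point crossover restricted to codon boundaries (needs >= 2 codons)."""
    n_codons = min(len(a), len(b)) // CODON_BITS
    point = rng.randrange(1, n_codons) * CODON_BITS
    return a[:point] + b[point:], b[:point] + a[point:]

phenotype_1 = translate(to_bits([0, 1, 0, 1, 1, 1]))
phenotype_2 = translate(to_bits([2, 3, 4, 5, 7, 9]))
```

Both example genomes translate to `x1 * x2` even though their bits differ, because only the codon value modulo the number of productions matters. This many-to-one mapping is one way to see how GE separates genotype from phenotype.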

Since GENN uses a grammar to define the structure of the resulting neural network, we can easily vary the behavior of the program with changes to the grammar. In GPNN, the genetic programming was constrained in code so that only valid neural networks could be produced, and any change to the behavior required changes to the code. The constraints for GENN are provided by the grammar itself and can be modified without touching the code. For example, Boolean operators can be added or removed by changing only the grammar file used as an input to the program.
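To make the point about grammar files concrete, here is a small Python sketch in which the grammar is plain text and the set of operators changes without any change to the parsing code. The BNF-like syntax, the rule names, and the operator tokens are assumptions for illustration; GENN's actual grammar file format is not described here.

```python
def parse_bnf(text):
    """Parse lines like '<name> ::= alt1 | alt2' into {name: [[symbols], ...]}."""
    grammar = {}
    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        name, _, rhs = line.partition("::=")
        grammar[name.strip()] = [alt.split() for alt in rhs.split("|")]
    return grammar

# Hypothetical grammar with arithmetic operators only.
ARITHMETIC_ONLY = """
<expr> ::= <expr> <op> <expr> | <var>
<op>   ::= + | - | *
<var>  ::= x1 | x2
"""

# The same grammar with Boolean operators added to one rule.
WITH_BOOLEAN = """
<expr> ::= <expr> <op> <expr> | <var>
<op>   ::= + | - | * | AND | OR | XOR
<var>  ::= x1 | x2
"""

arithmetic_ops = [alt[0] for alt in parse_bnf(ARITHMETIC_ONLY)["<op>"]]
boolean_ops = [alt[0] for alt in parse_bnf(WITH_BOOLEAN)["<op>"]]
```

The only difference between the two runs is the text of the `<op>` rule, which mirrors the idea that behavior is controlled by the grammar supplied as input rather than by the program itself.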

In addition, GPNN uses a binary tree for the genome, and therefore only two connections between nodes are possible. In GENN, the grammar allows multiple connections to be defined between nodes selected by the algorithm. A variable number of connections allows more complicated neural networks to be evolved and potentially makes GENN more powerful than GPNN.

Publications:

  • Motsinger AA, Ritchie MD. Neural networks for genetic epidemiology: past, present, and future. BMC BioData Mining, 1:3 (2008).
  • Ritchie MD, Bartlett J, Bush WS, Edwards TL, Motsinger AA, Torstenson ES. Exploring epistasis in candidate genes for Rheumatoid Arthritis. BMC Proceedings, 1 Suppl 1:S70 (2007).
  • Motsinger-Reif AA, Fanelli TJ, Davis AC, Ritchie MD. Power of grammatical evolution neural networks to detect gene-gene interactions in the presence of error. BMC Research Notes, 1:65 (2008).
  • Motsinger-Reif AA, Dudek SM, Hahn LW, Ritchie MD. Comparison of approaches for machine learning optimization of neural networks for detecting gene-gene interactions in genetic epidemiology. Genetic Epidemiology, 32(4):325-40 (2008).
  • Motsinger-Reif AA, Reif DM, Fanelli TJ, Ritchie MD. A comparison of analytical methods for genetic association studies. Genetic Epidemiology, 32(8):767-78 (2008).
  • Motsinger AA, Reif DM, Dudek SM, Ritchie MD. Understanding the Evolutionary Process of Grammatical Evolution Neural Networks for Feature Selection in Genetic Epidemiology. IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, 1-8 (2006).
  • Motsinger AA, Reif DM, Fanelli TJ, Davis AC, Ritchie MD. Linkage disequilibrium in genetic association studies improves the performance of grammatical evolution neural networks. IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, 1-8 (2007).
  • Turner SD, Ritchie MD, Bush WS. Conquering the Needle-in-a-Haystack: How Correlated Input Variables Beneficially Alter the Fitness Landscape for Neural Networks. Lect Notes Comput Sci, 5483:80-91 (2009).
  • Turner SD, Dudek SM, Ritchie MD. Grammatical Evolution of Neural Networks for Discovering Epistasis among Quantitative Trait Loci. Lect Notes Comput Sci, 6023:86-97 (2010).
  • Motsinger AA, Dudek SM, Hahn LW, Ritchie MD. Comparison of neural network optimization approaches for studies of human genetics. Lecture Notes in Computer Science, 3907:103-114 (2006).

Related Links:

  • GE – http://grammatical-evolution.org