Composite Genotype Example

The CompositeGenotype example lets a user examine the joint inheritance of two or more markers without considering the phase of the data. A haplotype frequency file is required for this analysis. Quantitative data are also included in this analysis. If a file fails to open, use the browser toolbar's View, Source (or Page, View Source) on the error page to view the file.

Files required to run this example are:

PhasedPedigree.dat - Linkage pre-makeped pedigree file with a header line
PhasedPedigree.hap - Loci haplotype data file
Trait.dat - Trait / Covariate data file
PedGenieCompGenotype.rgen - .rgen parameter file

The output file of this analysis is:

PedGenieCompGenotype.report

To run this analysis:

java -jar Genie.jar PedGenie PedGenieCompGenotype.rgen

PhasedPedigree.dat

The example pedigree file is composed of 16 three-generation families. Each individual has information on two markers.

Trait.dat

The quantitative test requires a quantitative trait file. It contains the trait and covariate information for each individual.

PedGenieCompGenotype.rgen

The set-up of CompositeGenotype.rgen is similar to the set-up of SingleLocus.rgen. Only differences between the two files are described in this section.

  1. top="HapFreqTopSim". Instead of dropping alleles from the pedigree founders, as illustrated in the SingleLocus.rgen example, this example drops haplotypes.
  2. In the locus section, loci are defined, as is the distance (‘dist’) between the loci. The ‘dist’ is the distance between a marker and the preceding marker. A ‘dist’ value <=0.5 is assumed to be a recombination fraction. A ‘dist’ value >0.5 is assumed to be a value in centiMorgans (cM). The default value for ‘dist’ is 0.5, assuming linkage equilibrium between markers.
  3. The ["top-sample"] option takes a value of ‘all’ or ‘founder’. For composite genotype and haplotype analyses, the ["top-sample"] option is ignored. Haplotype frequencies from the ‘.hap’ file are used instead.
  4. In the first cctable analysis, SNP1 and SNP2 are tested jointly, both in a state of dominant inheritance. We did not list specific loci (e.g., loci="1") to test, and thus all markers defined in the locus section will be tested. Within a defined block, the lines listed are joined together by an ‘AND’ statement. Blocks are joined to each other by ‘OR’ statements. Thus in the first block of wt=0, the first line requires that the first locus (SNP1) be a ‘1/1’ genotype AND the second line states that the second locus (SNP2) can be ‘any’ (./.) genotype. The second block of wt=0 defines the scenario where the first locus (SNP1) can be ‘any’ (./.) genotype, but the second locus (SNP2) is required to have the ‘1/1’ genotype. Thus for wt=0, we require that either SNP1 or SNP2 or both have a ‘1/1’ genotype. The comparison group (wt=1) requires a ‘2’ in at least one position for both the first (SNP1) and second (SNP2) loci. The ‘OR’ statement (i.e., ‘|’) within a genotype pattern allows for unphased genotype data. All statistics in the above ‘ccstat’ list will be run, as no ‘stats’ are defined.
  5. In the second cctable analysis, SNP1 and SNP2 are tested jointly, both in a state of recessive inheritance. Under wt=0, we require the first locus (SNP1) to have a ‘1’ in either the first or second position of its genotype. Thus, the SNP1 genotype could be ‘1/1’, ‘1/2’, or ‘2/1’. Any genotype (./.) is permissible for SNP2. Also under wt=0, SNP1 could be any genotype, but a ‘1’ is required in either position for SNP2. We then define wt=1 as requiring SNP1 and SNP2 to both have the ‘2/2’ genotype.
  6. Other combinations of the joint inheritance of SNP1 and SNP2 may be compared. In cctable 3, we illustrate the joint inheritance of a dominant mode of inheritance for SNP1 and a recessive mode for SNP2. In cctable 4, we model the joint inheritance of a recessive mode of inheritance for SNP1 and a dominant mode for SNP2.
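The ‘dist’ rule in item 2 and the weight definitions in items 4 and 5 can be sketched in a few lines of Python. This is only an illustration of the logic described above, not PedGenie's own code: the function names are hypothetical, and the sketch assumes biallelic markers coded ‘1’ and ‘2’ with no missing genotypes, so that everything not assigned wt=1 falls into wt=0.

```python
from typing import Tuple

# An unphased genotype is a pair of allele codes, e.g. (1, 2).
# Assumption: biallelic markers coded 1 and 2, no missing data.
Genotype = Tuple[int, int]


def interpret_dist(dist: float = 0.5) -> Tuple[str, float]:
    """Apply the 'dist' rule: <=0.5 is a recombination fraction, >0.5 is cM."""
    if dist <= 0.5:
        return ("recombination_fraction", dist)
    return ("centimorgans", dist)


def carries(genotype: Genotype, allele: int) -> bool:
    """True if the allele is in either position, so phase does not matter."""
    return allele in genotype


def dom_dom_weight(snp1: Genotype, snp2: Genotype) -> int:
    """First cctable: both loci under a dominant mode of inheritance."""
    # wt=1: a '2' in at least one position at BOTH loci.
    if carries(snp1, 2) and carries(snp2, 2):
        return 1
    # wt=0: SNP1 is 1/1 OR SNP2 is 1/1 (the two OR-ed blocks).
    return 0


def rec_rec_weight(snp1: Genotype, snp2: Genotype) -> int:
    """Second cctable: both loci under a recessive mode of inheritance."""
    # wt=1: both loci are 2/2.
    if snp1 == (2, 2) and snp2 == (2, 2):
        return 1
    # wt=0: a '1' in either position at SNP1 OR at SNP2.
    return 0
```

For example, an individual with SNP1 ‘1/2’ and SNP2 ‘2/1’ gets wt=1 under the dominant-dominant definition but wt=0 under the recessive-recessive definition.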

PhasedPedigree.hap

The PhasedPedigree.hap file lists each haplotype frequency with its corresponding haplotype. The haplotype frequencies listed must sum to ‘1’. Haplotypes with ‘0’ frequency may be listed, but they will be ignored. A haplotype is written as one allele per marker, in the same order as the locus list defined in the ‘*.rgen’ file, with the alleles separated by a ‘-’.
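The rules for the haplotype list can be checked before running an analysis. The sketch below is a hypothetical helper, not part of PedGenie. It takes (frequency, haplotype) pairs that have already been read from the file, because the exact column layout of the ‘.hap’ file is not described here. The tolerance used for the sum-to-one check is also an assumption.

```python
import math
from typing import Dict, List, Tuple


def check_haplotypes(
    entries: List[Tuple[float, str]],
    n_loci: int,
    tol: float = 1e-6,  # assumed tolerance for rounding in the listed frequencies
) -> Dict[Tuple[str, ...], float]:
    """Validate haplotype frequencies and return the haplotypes that are used."""
    # The listed frequencies must sum to 1.
    total = sum(freq for freq, _ in entries)
    if not math.isclose(total, 1.0, abs_tol=tol):
        raise ValueError(f"haplotype frequencies sum to {total}, expected 1")

    usable: Dict[Tuple[str, ...], float] = {}
    for freq, haplotype in entries:
        # Haplotypes with 0 frequency may be listed but are ignored.
        if freq == 0:
            continue
        # One allele per marker, separated by '-', in locus-list order.
        alleles = tuple(haplotype.split("-"))
        if len(alleles) != n_loci:
            raise ValueError(f"{haplotype!r} does not have {n_loci} alleles")
        usable[alleles] = freq
    return usable


# Hypothetical two-marker example; these frequencies are made up.
example = [(0.4, "1-1"), (0.3, "1-2"), (0.3, "2-2"), (0.0, "2-1")]
haplotypes = check_haplotypes(example, n_loci=2)
```

Here the ‘2-1’ haplotype is dropped because its frequency is 0, leaving three usable haplotypes.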

PedGenieCompGenotype.report output file

We refer users to the SingleLocus.report output for detailed information regarding the PedGenieCompGenotype.report output file. Only differences between the two files will be described here.

  1. For Analysis1, Loci is defined as ‘ALL MARKERS’ because we did not define a specific marker to analyze in the CompositeGenotype.rgen file. Our defined Model is ‘Dom-Dom’ (SNP1 and SNP2 both in a state of a dominant mode of inheritance), and the Type of data is ‘genotype’ data. The odds ratio statistic is listed as a ‘-’ because one of the cells in the contingency table has a value of ‘0’.
  2. For Analysis2, where SNP1 and SNP2 were both set up in a recessive mode of inheritance, the number of statistics calculated is less than the number of simulations (i.e., 0 / 2000 for the chi-square and chi-square trend statistics, 1911 / 2000 for the odds ratio, and 1604 / 2000 for the quantitative statistic). Upon examination of the contingency table, we see that there were ‘0’ cases and ‘0’ controls in the observed data that had both a 2/2 genotype for SNP1 and a 2/2 genotype for SNP2. Due to these ‘sparse’ data, no statistics could be calculated.
  3. Results are also shown for the combined dominant (SNP1) - recessive (SNP2) model as well as for the recessive (SNP1) - dominant (SNP2) model.
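The ‘-’ entries in the report follow from how an odds ratio is computed on a 2x2 table. A minimal sketch, assuming the usual cross-product odds ratio and assuming that any zero cell makes the statistic unreportable (the report text only says that one cell was ‘0’). The function names and the counts in the test values are hypothetical, not taken from the report.

```python
from typing import Optional


def odds_ratio(
    case_wt1: int, case_wt0: int, control_wt1: int, control_wt0: int
) -> Optional[float]:
    """Cross-product odds ratio for a 2x2 table, or None if any cell is 0."""
    cells = (case_wt1, case_wt0, control_wt1, control_wt0)
    # Assumption: a zero in any cell means the statistic is not reported.
    if any(count == 0 for count in cells):
        return None
    return (case_wt1 * control_wt0) / (case_wt0 * control_wt1)


def format_stat(value: Optional[float]) -> str:
    """Show an uncomputable statistic as '-', as in the report."""
    return "-" if value is None else f"{value:.3f}"
```

With this sketch, a table with no wt=1 cases gives ‘-’, which is the situation described for the sparse observed data above.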
