Single Locus Example

The single locus example analyzes each marker as a single locus, assumed to be in linkage equilibrium with all other markers. A quantitative data file is also included for this example. To run the analyses without quantitative data, delete the “datafile” sub-element with the “quantitative” and “covar1” lines from the ‘optional modules’ section of the ‘.rgen’ file, and remove “4” from the stats of each cctable. If a file does not open in your browser, view its source from the error page instead (toolbar View, Source, or Page, View Source).

Files required to run this example are:

PhasedPedigree.dat - Linkage pre-makeped pedigree file with a header line
Trait.dat - Trait / Covariate data file
SingleLocus.rgen - .rgen Parameter File

The output file of this analysis is:

SingleLocus.report

To run this analysis:

java -jar Genie.jar PedGenie SingleLocus.rgen

PhasedPedigree.dat

The example pedigree file is composed of 16 three-generation families. Each individual has information on two markers.

Trait.dat file

The quantitative test requires a quantitative trait file. It contains the trait and covariate information for each individual.

SingleLocus.rgen file

Please refer to the .rgen Parameter File Description for a point-by-point description of each section of the ‘.rgen’ file. Here we highlight a few features of the SingleLocus.rgen file that a user may wish to alter when running a Single Locus analysis.

  1. nsims="2000". The number of simulations is 2,000. A user may wish to increase or decrease this value.
  2. Top="AlleleFreqTopSim". Alleles, not haplotypes, will be dropped from the pedigree founders.
  3. In the locus section, we list two loci to be analyzed, and they have the names SNP1 and SNP2.
  4. The statistics of interest listed under ‘optional modules’ for this analysis are the chi-square, chi-square trend, odds ratio, and difference in means (quantitative) statistics. The quantitative statistic uses data from the Trait.dat file, specifically the first column (covar1) of data following the kindred and individual identifier columns.
  5. For this analysis, we selected the option to sample ‘all’ of the individuals in the pedigree to determine allele frequencies (“top-sample”>all<). Other options are ‘founder’ and ‘GeneCountAllele’.
  6. In the first cctable analysis, we set up a dominant mode of inheritance for SNP1. We selected loci="1", as SNP1 is the first locus in the locus list defined above. We selected the statistics that we wanted to run, in this case statistics 1, 3, and 4 from the ‘optional modules’ list (see 4 above), which correspond to the chi-square, odds ratio, and quantitative statistics. The ‘model’ allows a user to define a model name for the analysis. In this case, we called our model ‘Dom’. Next, the dominant model was set up through the use of weights (wt). The lowest weight is the comparison group. Note that a ‘.’ is a wildcard and refers to any non-zero value not previously defined. A ‘|’ is defined as ‘OR’, and in this case allows un-phased genotype data to be analyzed.
  7. In the second cctable analysis, we defined a recessive model for SNP1 using the same statistics as the dominant model (see 6 above).
  8. In the third cctable analysis, we set up an additive test for SNP1, where each count of the rare allele (‘2’) is given its own weight, rather than combining genotypes into groups as we did for the dominant and recessive models described above. Note: we did not list any statistics for this analysis, so all statistics in the ‘optional modules’ will be run.
  9. In the fourth cctable analysis, we set up an allele test for SNP1 by defining type="Allele". The default for ‘type’ is ‘Genotype’, and as such we did not need to specify ‘type’ for the dominant, recessive, and additive models using genotype data described above (see 6, 7, and 8 above). We again define weights (wt) for this analysis. In this case, individuals carrying a ‘2’ at SNP1 will be compared to individuals carrying a ‘1’, double counting homozygous carriers.
  10. In the fifth, sixth, seventh, and eighth cctables, we set up the same dominant, recessive and additive modes of inheritance for genotype data and the allele test for SNP2.
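
The weighting schemes in items 6 to 9 can be illustrated with a short Python sketch. This is not PedGenie code and does not reproduce the ‘.rgen’ syntax; it only shows which weight each un-phased genotype would receive under each model, assuming a biallelic marker coded 1/2 with ‘2’ as the rare allele. The function names and the model labels ‘Rec’ and ‘Add’ are hypothetical (only ‘Dom’ appears in the text above), and the recessive grouping (2/2 versus all other genotypes) is the conventional definition rather than something quoted from the ‘.rgen’ file.

```python
def rare_allele_count(a1, a2, rare=2):
    """Number of copies of the rare allele in an un-phased genotype."""
    return int(a1 == rare) + int(a2 == rare)


def genotype_weight(a1, a2, model):
    """Weight of a genotype under a given model (hypothetical labels).

    The lowest weight (0) is the comparison group in every model.
    """
    n = rare_allele_count(a1, a2)
    if model == "Dom":   # carriers of at least one '2' vs. 1/1
        return 1 if n >= 1 else 0
    if model == "Rec":   # 2/2 homozygotes vs. everyone else (assumed grouping)
        return 1 if n == 2 else 0
    if model == "Add":   # one weight per rare-allele count: 0, 1 or 2
        return n
    raise ValueError(f"unknown model: {model}")


def allele_weights(a1, a2, rare=2):
    """Allele test: each allele is scored separately, so a homozygous
    individual contributes two observations to the same weight group."""
    return [int(a == rare) for a in (a1, a2)]


# 1/2 and 2/1 get the same weight, mirroring the '|' (OR) for un-phased data.
for genotype in [(1, 1), (1, 2), (2, 1), (2, 2)]:
    weights = {m: genotype_weight(*genotype, m) for m in ("Dom", "Rec", "Add")}
    print(genotype, weights, "alleles:", allele_weights(*genotype))
```

The additive weights 0, 1 and 2 correspond to the three columns described for Analysis 3 in the report walkthrough.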

SingleLocus.report output file

  1. The output file begins with a header stating the date and time of the run and the parameters that were defined in the ‘.rgen’ file, including statistics, loci, number of simulations, etc.
  2. Each separate cctable defined in the ‘.rgen’ parameter file creates a separate analysis section in the output. For Analysis1, the locus was defined as ‘SNP1’, the model as ‘Dom’ for a dominant mode of inheritance, and the ‘type’ of data as ‘genotype’. A contingency table is displayed showing the number of cases and controls in each of the weight groups we defined in the ‘.rgen’ file. Here we see that there are 4 cases with wt = 0 (1/1 genotype) and 44 cases with wt = 1 (1/2, 2/1, or 2/2 genotypes), for a total of 48 cases. A similar pattern can be observed for the controls.
     The quantitative table lists the mean trait value for each weight group. Thus, for individuals in the wt = 0 category (1/1 genotype), the mean trait value is 2.13, and for individuals in the wt = 1 category (1/2, 2/1, or 2/2 genotype), the mean trait value is 2.1359.
     Underneath the contingency and quantitative tables, the number of statistics calculated out of the total number of simulations is listed. We ran 2,000 simulations and 2,000 statistics were calculated. If the number of statistics is smaller than the number of simulations, this may suggest sparse data, and caution is advised when interpreting the results.
     The observed statistics and empirical p values are also listed. We remind users that these are simulated data and the results are meaningless. For SNP1, we see that the observed chi-square statistic is 16.8583 and the empirical p value is 0. We encourage users to report this result as p<0.0005, because with 2,000 simulations the smallest non-zero empirical p value is 1/2,000 = 0.0005. The odds ratio of 8.5556 was also highly significant (p=0, reported as p<0.0005), with an empirical 95% confidence interval of 3.7485-19.5325. The quantitative statistic was not significant (means statistic = 0.0369, p=0.9735).
  3. Analysis 2 shows the same results for SNP1 under a recessive mode of inheritance. The chi-square, odds ratio, and quantitative statistics were all non-significant.
  4. Analysis 3 shows the results for the additive model. There are now three weights listed: weight 0 - 1/1 homozygotes as the reference group, weight 1 - 1/2 or 2/1 heterozygotes, and weight 2 - 2/2 homozygotes. The chi-square and chi-square trend analyses were both significant. For the odds ratio, we list results for both column 2 (weight 1) vs. column 1 (weight 0), and column 3 (weight 2) vs. column 1 (weight 0). The lowest weight is always the reference group. The empirical p value and empirical confidence intervals are listed in the same order. The quantitative statistic results list the difference in means statistics for each weight comparison (defined the same as for the odds ratio) and the observed overall statistic is the ANOVA statistic.
  5. Analysis 4 lists the results for SNP1 analyzing alleles rather than genotypes.
  6. Analyses 5, 6, 7, and 8 are for SNP2, and similar to the SNP1 analyses.
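
The Analysis1 figures quoted above can be checked by hand with a small Python sketch. It is not taken from PedGenie; it is a minimal illustration of the standard 2x2 chi-square and odds ratio formulas and of how an empirical p value of 0 becomes a reportable bound. The case counts (4 and 44) come from the text above. The control counts (28 and 36) are NOT given in the text; they are an assumption, chosen only because they reproduce the reported chi-square and odds ratio, so please read them against the actual SingleLocus.report rather than treating them as quoted output. The function names, and the choice to count simulated statistics greater than or equal to the observed one, are likewise assumptions.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
    [[a, b], [c, d]]: rows = cases, controls; columns = wt 0, wt 1."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def odds_ratio(a, b, c, d):
    """Odds of being a case in the wt = 1 group relative to the
    wt = 0 (reference) group."""
    return (b * c) / (a * d)


def empirical_p(observed, simulated):
    """Fraction of simulated statistics at least as large as the observed
    one (assumed definition; the program's exact rule may differ)."""
    hits = sum(1 for s in simulated if s >= observed)
    return hits / len(simulated)


def report_p(p, n_sims):
    """An empirical p of 0 is reported as an upper bound of 1 / n_sims."""
    if p == 0:
        return f"p<{1 / n_sims:g}"
    return f"p={p:.4f}"


cases = (4, 44)      # wt = 0, wt = 1 (from the text)
controls = (28, 36)  # ASSUMED counts, not quoted in the text

print(round(chi_square_2x2(*cases, *controls), 4))
print(round(odds_ratio(*cases, *controls), 4))
print(report_p(0.0, 2000))
```

With these assumed control counts, the two formulas give values that round to the chi-square and odds ratio quoted for Analysis1, and the last line prints the p<0.0005 bound recommended for 2,000 simulations.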
