Most notable changes per version (minor changes and bug fixes are not explicitly listed) v1.07b - fixed segmentation fault when reading gene annotation file during gene analysis - fixed bug in gene meta analysis - added additional columns to per-gene output files created during gene set/gene covariate analysis (for use with QC R scripts) v1.07 - fixed occasional segmentation fault bug occurring on some Mac and Windows systems when using --pval - general restructuring of code and various fixes and tweaks - improved memory performance when using --pval option - complete rewrite of the gene-level analysis component - implementation of a range gene-level (gene set, gene covariate) conditional, joint, and interaction analysis options - moved all existing gene-level model options to the --model flag - added additional per-gene output from gene-level analysis to assist in post-hoc inspection of results - deprecated and removed the gene-level FWER empirical multiple testing correction option - added modifiable variable name abbreviation option v1.06 - added option to read in synonymous SNP IDs from file, to map SNPs designated with different but synonymous SNP IDs in different input files - provided an up-to-date rs ID synonym file for the most recent dbSNP release - included filtered versions of this synonym file with the 1,000 Genomes reference data files (will be auto-detected) - added additional options for dealing with duplicate SNP IDs (including synonyms) in --pval input file - changed default behaviour from terminating with an error to removing duplicate SNPs - added option to condition gene analysis on specified SNPs in the data - added --big-data option, for processing very large data sets much more quickly and using much less memory - added auto-detection, turning on --big-data automatically for sample sizes greater than 25,000 - modified chromosome batch mode to significantly reduce memory usage per batch even when reading from a single genotype data file - changed NPARAM for the SNP-wise (mean) model to align more closely with the NPARAM for the PC regression model - fixed minor bug: in v1.05, the column names P_MAIN and P_INTERACT for the interaction model were switched (fixed in v1.05b) v1.05 - added optimized top1-SNP p-value gene analysis model - added multi-models: running multiple gene analysis models (PC regression, SNP-wise (mean), SNP-wise (top 1) for each gene and combining into a single p-value - changed PC regression gene analysis model from PCA on SNP covariance matrix to PCA on SNP correlation matrix (more balanced in presence of rare variants) - deprecated --snp-wise flag, replaced by 'snp-wise' modifier of --gene-model flag - added gene by covariate interaction model - added multi-burden score mechanism, setting a (modifiable) maximum on the number of variants aggregated into a single burden score (creating multiple burden scores if exceeded) - added automatic burden score mechanism for very low MAC SNPs (SNPs with both MAC <= 25 *and* MAF <= 0.01) - turned on by default, can be turned off if desired - added inverse mean MAC per gene (and log value) as covariate in gene-level analysis, to correct for low power in genes containing mostly very low MAC SNPs - added option to selectively include automatic covariates - changed default imputation of missing gene covariate values from mean imputation to median imputation - added option to perform two-sided test for competitive gene-set analysis - added a batch mode for distributed/parallel analysis of larger data sets - added chromosome batch mode and support for input data split by chromosome - performed major revision and expansion of the user manual v1.04 - removed single imputation for missing genotypes, replaced with analysis with rescaled sufficient statistics computed from observed data - added permutation-based empirical multiple testing correction for gene-set/gene property analysis - renamed --rare to --burden - replaced --burden SNP filter-file options with general SNP filter-file options under --gene-settings - added R-squared and adjusted R-squared values to gene analysis for PC regression output (default model when performing raw data analysis) - added regression coefficients (raw and semi-standardized) and standard error in output for competitive gene-set analysis and for gene property analysis - turned off self-contained gene-set analysis by default (can be turned on with 'self-contained' modifier for --set-annot flag) - added option to filter genes used in gene-set / gene property analysis based on filter file v1.03 - added permutation based multiple-testing corrected p-values for gene-level analyses - added option to override sample size to meta-analysis - added SNP differential missingness filter - expanded functionality for gene property analysis (handling of missing values, conditional analysis) - expanded functionality for conditional analysis (increased maximum conditioned on sets/covariates, added gene set specification through file) v1.02 - additional rare variant analysis options - added meta-analysis options v1.01 - added support for nonhuman genomes - stricter default pruning of genotype data (raw data PC regression model) to improve power - minor adjustments to computation of gene correlations to improve type 1 error rate control in gene-set analysis