The file AD_sumstats_Jansenetal.txt.gz contains the results from the Phase 3 analysis as described in "Genome-wide meta-analysis identifies new loci and functional pathways influencing Alzheimer's disease risk" by Jansen et al. in Nature Genetics. It contains 13,367,301 variants, described by the following columns:
uniqID.a1a2 = unique ID per variant, format -> CHR:BP_A1_A2
CHR = chromosome
BP = base pair position
A1 = allele 1 (effect allele)
A2 = allele 2 (non-effect allele)
SNP = rsID (if available)
Z = Z-statistic
P = p-value
Nsum = total sample size, obtained by simply summing n across cohorts
Neff = effective sample size
dir = directions of effect per cohort (order: ADSP, IGAP, UKB, PGC-ALZ)
MAF = minor allele frequency of allele 1
BETA = effect size
SE = standard error
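
As an illustration of the uniqID.a1a2 format (CHR:BP_A1_A2), the following minimal Python sketch splits an ID back into its parts. The function name parse_uniq_id and the example ID are hypothetical and not taken from the file; the sketch assumes that alleles never contain an underscore.

```python
def parse_uniq_id(uniq_id):
    """Split an ID of the form CHR:BP_A1_A2 into (chr, bp, a1, a2).

    Assumption: alleles contain no underscore, so the part after the
    colon splits into exactly three fields.
    """
    chrom, rest = uniq_id.split(":", 1)
    fields = rest.split("_")
    if len(fields) != 3:
        raise ValueError(f"unexpected ID format: {uniq_id!r}")
    bp, a1, a2 = fields
    # CHR is kept as a string because its coding in the file is not
    # documented here; BP is converted to an integer.
    return chrom, int(bp), a1, a2


# Hypothetical ID, for illustration only.
example = parse_uniq_id("1:12345_A_G")
```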
-----------------------------------------------------------------------------------------------------
Additional comments for specific columns:
Columns 2 to 11 are output by the mvGWAMA code, which does not report MAF or effect sizes. We therefore added the MAF, BETA and SE columns.
MAF: Minor allele frequencies are initially based on HRC and 1000G phase 3 (match for 12,178,387 variants). For the remaining 1,188,914 variants, the MAF was taken from our own datasets (match for 1,131,053 variants). For the final 57,861 variants (0.4% of the total dataset) no MAF is reported, because the original IGAP sumstats lack MAF information (these variants are all based solely on IGAP).
BETA: $Z/sqrt((2*$MAF*(1-$MAF))*($Neff+($Z)^2))
SE: 1/sqrt((2*$MAF*(1-$MAF))*($Neff+($Z)^2))
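
The BETA and SE formulas above can be sketched in Python as follows. This is a minimal illustration, not the code used to produce the file; the function name beta_se_from_z is hypothetical. How a missing MAF is encoded in the file is not stated here, so the sketch simply rejects MAF values outside (0, 1).

```python
import math


def beta_se_from_z(z, maf, neff):
    """Approximate BETA and SE from Z, MAF and Neff.

    BETA = Z / sqrt(2*MAF*(1-MAF) * (Neff + Z^2))
    SE   = 1 / sqrt(2*MAF*(1-MAF) * (Neff + Z^2))
    """
    if not (0.0 < maf < 1.0):
        # Assumption: variants without a usable MAF are skipped by the caller.
        raise ValueError("MAF must be strictly between 0 and 1")
    denom = math.sqrt(2.0 * maf * (1.0 - maf) * (neff + z * z))
    return z / denom, 1.0 / denom


# Made-up input values, for illustration only.
beta, se = beta_se_from_z(z=2.0, maf=0.5, neff=96.0)
```

By construction BETA/SE equals Z, which gives a quick consistency check on any row that has a MAF.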