Generation Scotland


Research publications from 2016.

Quantifying the extent to which index event biases influence large genetic association studies. Yaghootkar, H., Bancks, M. P., Jones, S. E., McDaid, A., Beaumont, R., Donnelly, L., … Kutalik, Z. (2016). Human Molecular Genetics.

As the trend for identifying more gene variants of small effect size continues, ever larger sample sizes are needed to find the missing genetic variance. This carries a potential risk if the study cohort is in any way biased towards more or fewer cases of a given trait than the true population frequency. This study tests this potential source of bias in UK Biobank, the largest available single cohort to date, and checks the findings in GS and other cohorts. At least for diabetes and blood pressure, this analysis shows that differences as low as 10% from the true prevalence can lead to false positive or negative findings.


Access policies in biobank research: what criteria do they include and how publicly available are they? A cross-sectional study. Langhof, H., Kahrass, H., Sievers, S., Strech, D. (2016). European Journal of Human Genetics 25, 293-300.

GS was established to provide managed access to high quality health data to bona fide researchers and has a clear and transparent governance and access policy to ensure this. GS is mentioned favourably in this survey of other research biobanks.


Genome-wide regional heritability mapping identifies a locus within the TOX2 gene associated with Major Depressive Disorder. Zeng, Y., Navarro, P., Shirali, M., Howard, D. M., Adams, M. J., Hall, L. S., … McIntosh, A. M. (2016). Biological Psychiatry.

Most gene association studies consider each common genetic variant (SNP) in turn and then correct for the total number tested to see if any are significantly associated with the trait of interest. However, this may miss important associations if there is more than one genetic risk variant in any one gene, as can be the case. This study adopts a novel method, haplotype-block-based regional heritability mapping (HRHM), to overcome this limitation and in doing so identifies the TOX2 gene as a candidate risk factor in depression. Moreover, these genetic variants regulate the expression of TOX2, a brain-expressed regulatory protein.


SOS2 and ACP1 Loci Identified through Large-Scale Exome Chip Analysis Regulate Kidney Development and Function. Li, M., Li, Y., Weeks, O., Mijatovic, V., Teumer, A., Huffman, J. E., … Chu, A. Y. (2016). Journal of the American Society of Nephrology.

Over 50 common genetic variants had been associated with kidney function before this study, but they do not fully explain the genetic effect. This large study found seven new genes with common variants of small effect and one, SOS2, with rare variants of strong effect. The biological effect of the SOS2 gene was shown to be important in kidney development in zebrafish.


PCSK9 genetic variants and risk of type 2 diabetes: a Mendelian randomisation study. Schmidt, A. F., Swerdlow, D. I., Holmes, M. V., Patel, R. S., Fairhurst-Hunter, Z., Lyall, D. M., … Sattar, N. (2016). The Lancet Diabetes & Endocrinology, pii: S2213-8587(16)30396-5.

Statins are a well proven and safe way to reduce LDL cholesterol and thus the risk of coronary heart disease, but they are associated with modest hyperglycaemia, increased bodyweight, and a modestly increased risk of type 2 diabetes, which in no way offsets their substantial benefits. Genetic variants of PCSK9 also lower LDL cholesterol. This large study, in which GS was a partner, showed that the same PCSK9 variants associated with lower LDL cholesterol were also associated with higher circulating fasting glucose concentration, bodyweight, and waist-to-hip ratio, and an increased risk of type 2 diabetes. In trials of PCSK9 inhibitor drugs, investigators should carefully assess these safety outcomes and quantify the risks and benefits of PCSK9 inhibitor treatment, as was previously done for statins.


KLB is associated with alcohol drinking, and its gene product β-Klotho is necessary for FGF21 regulation of alcohol preference. Schumann, G., Liu, C., O'Reilly, P. F., Gao, H., Song, P., Xu, B., … Elliott, P. (2016). PNAS.

Excessive alcohol consumption is a major public health problem worldwide. Although drinking habits are known to be inherited, few genes have been identified that are robustly linked to alcohol drinking. GS contributed to this study of over 100,000 individuals which identified a gene called β-Klotho (KLB) as associated with alcohol consumption. Laboratory mice lacking this gene have an increased alcohol preference. The KLB gene links the liver and brain to regulate alcohol drinking behaviour. It may provide a unique pharmacologic target for reducing alcohol consumption.


Shared genetics and couple-associated environment are major contributors to the risk of both clinical and self-declared depression. Zeng, Y., Navarro, P., Xia, C., Amador, C., Fernandez-Pujals, A. M., Thomson, P. A., … McIntosh, A. M. (2016). EBioMedicine, 14: 161-167.

A big advantage of GS over many other population cohorts is the family-based recruitment. This allows us to test not just for common genetic and environmental factors influencing health, but also for the influence of the environment shared between couples. We show here that both genetics and couple-based shared environment have a big impact on liability to depression. This has implications for intervention and treatment based on the family, not just the individual.


Dissection of Major Depressive Disorder using polygenic risk scores for Schizophrenia in two independent cohorts. Whalley, H. C., Adams, M. J., Hall, L., Clarke, T.-K., Fernandez-Pujals, A. M., Gibson, J., … McIntosh, A. M. (2016). Translational Psychiatry, 6, e938.

Psychiatric diagnoses of schizophrenia, bipolar disorder and major depressive disorder are recognised as being heterogeneous categories, and also as having overlapping features. In the important effort to determine ways to stratify these conditions into sub-entities, Whalley and colleagues used Generation Scotland to test, and UK Biobank to replicate, one such hypothesis: that the genes that predispose to schizophrenia show enrichment in depression amongst those who also show higher levels of psychological distress and neuroticism.


Genetic variants linked to education predict longevity. Marioni, R. E., Ritchie, S. J., Joshi, P. K., Hagenaars, S. P., Okbay, A., Fischer, K., … Deary, I. J. (2016). PNAS. 2016 Oct 31. pii: 201605334. [Epub ahead of print].

Sometimes a rather simple question produces a rather striking and provocative finding, as in this paper by Marioni et al. Do the same genetic determinants that predict educational attainment (secondary schooling only, or tertiary education and highest degree attained) correlate with living longer? The answer was an emphatic 'yes'. The trick to getting the answer lay in accessing the necessary genetic and phenotypic information in a large number of study participants. By combining Generation Scotland, UK Biobank and Estonian Biobank data (over 130,000 participants in sum), the question was answered: those with the highest educational attainment scores had parents who lived on average 6 months longer, modest perhaps in overall terms but pointing to shared inherited factors that link brain to body and general health.


Rare Functional Variant in TM2D3 is Associated with Late-Onset Alzheimer's Disease. Jakobsdottir, J., van der Lee, S. J., Bis, J. C., Chouraki, V., Li-Kroeger, D., Yamamoto, S., … van Duijn, C. M. (2016). PLoS Genetics, 12(10):e1006327.

The genetics of Alzheimer's Disease has thrown up the classic example of ApoE4 as a common risk variant with a high odds ratio and a short list of rare, highly penetrant mutations in APP and PSEN1 for early onset disease. This study identifies a new, rare functional variant in the TM2D3 gene associated with late onset disease in Icelanders. Being sure that the finding was real relied on careful comparison with exome sequencing data from well-phenotyped control subjects, which Generation Scotland provided.


Data science for mental health: a UK perspective on a global challenge. McIntosh, A. M., Stewart, R., John, A., Smith, D. J., Davis, K., Sudlow, C., … Porteous, D. J. (2016). Lancet Psychiatry, 3(10), 993-998.

A key premise of GS is that rich and deep information on all aspects and measures of health in all members of the cohort can provide a new and powerful way to better understand who is most at risk and why, and which choice of treatment is likely to work best. We call this precision medicine driven by data science (computer science and statistics to extract new knowledge from high-dimensional datasets). Mental health research, diagnosis, and treatment could benefit more than most. This position paper sets out the case for doing so in the UK, drawing on the UK Biobank, Generation Scotland, and the Clinical Record Interactive Search (CRIS) programme. Data science has great potential as a low-cost, high-return catalyst for improved mental health recognition, understanding, support, and outcomes. Lessons learnt from such studies could have global implications.


Trans-ancestry meta-analyses identify rare and common variants associated with blood pressure and hypertension. Surendran, P., Drenos, F., Young, R., Warren, H., Cook, J. P., Manning, A. K., … Munroe, P. B. (2016). Nature Genetics.

High blood pressure is a major risk factor for cardiovascular disease and premature death, but there is still limited knowledge on specific causal genes and pathways. GS combined forces with several other cohorts to find additional genetic risk factors. Thirty new gene loci were found, including three with rare mutations that each had a large effect on blood pressure on their own. These may prove to be useful targets for clinical intervention.


Genetic and environmental risk for chronic pain and the contribution of risk variants for major depressive disorder: a family-based mixed-model analysis. McIntosh, A. M., Hall, L. S., Zeng, Y., Adams, M. J., Gibson, J., Wigmore, E., … Hocking, L. J. (2016). PLoS Medicine 13(8), e1002090.

This important paper takes full advantage of the family data available in GS to show how chronic pain relates to depression. The study shows that both are influenced by a multiplicity of genetic factors (that is to say, they are polygenic traits) and that there is overlap between the sets of genes for each trait. Additionally, there is evidence that if one partner suffers chronic pain, their spouse has a high probability of also suffering, perhaps due to shared environment or lifestyle.


Genome-wide association study of copy number variation with lung function identifies a novel signal of association near BANP for forced vital capacity. Shrine, N., Tobin, M. D., Schurmann, C., Soler Artigas, M., Hui, J., Lehtimäki, T., … Wain, L. V. (2016). BMC Genetics, 17, 116.

Several genetic factors affecting lung function have been described, and GS has been valuable in doing so. To date, these studies have focussed exclusively on single DNA base differences (SNPs), but it is known that there are other types of genetic variation in the genome that involve gain or loss of longer sequences of DNA bases. These are called copy number variants (CNVs). This study looks for evidence in GS and other cohorts for CNVs associated with altered lung function. Of nearly 2,000 CNVs tested, two showed some evidence of association.


Investigating Shared Aetiology Between Type 2 Diabetes and Major Depressive Disorder in a Population Based Cohort. Clarke, T-K., Obsteter, J., Hall, L. S., Hayward, C., Thomson, P., Smith, B. H., … McIntosh, A. M. (2016). American Journal of Medical Genetics: Part B.

Diabetics are often depressed, but why? Both conditions have a genetic component, but are the genetic factors the same or different? Here, the strongest genetic findings for diabetes were tested for their ability to predict depression, but they did not. This could mean the link between the two is more to do with environment than genetics, but at this stage our knowledge of which genes influence the risk of diabetes is far from complete, and the same is true, even more so, for depression. It is a question that needs to be returned to when we have more data.


Genomic insights into the origin of farming in the ancient Near East. Lazaridis, I., Nadel, D., Rollefson, G., Merrett, D. C., Rohland, N., Mallick, S., … Reich, D. (2016). Nature, 536(7617):419-24.

Here is another study where GS was able to help investigators who needed to find good controls for their study, in this case to look for evidence of the origins of farming in DNA extracted from archaeological bone samples.


An empirical comparison of joint and stratified frameworks for studying GxE interactions: systolic blood pressure and smoking in the CHARGE Gene-Lifestyle Interactions Working Group. Sung, Y. J., Winkler, T. W., Manning, A. K., Aschard, H., Gudnason, V., Harris, T. B., … Cupples, L. A. (2016). Genetic Epidemiology, 40(5), 404-415.

This study builds upon earlier collaborative studies between Generation Scotland and the CHARGE (Cohorts for Heart and Aging Research in Genomic Epidemiology) consortium to refine the relationship between smoking and blood pressure using the Mendelian Randomisation approach to remove potential confounding factors.


Genome-wide Association for Major Depression Through Age at Onset Stratification: Major Depressive Disorder Working Group of the Psychiatric Genomics Consortium. Power, R. A., Tansey, K. E., Buttenschon, H. N., Cohen-Woods, S., Bigdeli, T., Hall, L. S., … Lewis, C. M. (2016). Biological Psychiatry.

Major depressive disorder is a common condition with a strong genetic component, but it has proved challenging to identify specific genetic risk factors, most likely because there are many different genetic routes to the same diagnosis. This study asked whether splitting cases of depression into those with an early age of onset (less than 27 years of age) and those with later onset, and then comparing the two groups, helped identify genetic factors that might distinguish early from late onset. One strong finding was made, a small but significant step towards defining sub-categories of illness.


Genome-wide association study identifies 74 loci associated with educational attainment. Okbay, A., Beauchamp, J. P., Fontana, M. A., Lee, J. T., Pers, T. H., Rietveld, C. A., … Benjamin, D. J. (2016). Nature, 533(7604):539-42.

Generation Scotland contributed data to this remarkable and high profile study, which presents strong evidence for a sizeable genetic contribution to educational attainment (~20% of the variance explained) and 74 loci of genome-wide significance. This gene set was preferentially expressed during neonatal brain development.


A Combined Pathway and Regional Heritability Analysis Indicates NETRIN1 Pathway is Associated with Major Depressive Disorder. Zeng, Y., Navarro, P., Fernandez-Pujals, A. M., Hall, L. S., Clarke, T-K., Thomson, P. A., … McIntosh, A. M. (2016). Biological Psychiatry.

Major depressive disorder is a common condition with a strong genetic component, but it has proved challenging to identify specific genetic risk factors, most likely because there are many different genetic routes to the same diagnosis. This study took a novel approach to the problem by measuring the combined effect of genetic variation across each successive region of the genome and, having found a genetic signal, asking whether there was also a signal from genes known to be part of the same biological pathway. This method reduced the number of tests compared to the standard approach and so made it easier to detect a significant signal. The NETRIN1 pathway was implicated. This is plausible, as NETRIN1 guides nerve axons as they innervate and interconnect in the developing brain.


Genetic evidence for a link between favorable adiposity and lower risk of type 2 diabetes, hypertension and heart disease. Yaghootkar, H., Lotta, L. A., Tyrrell, J., Smit, R. A. J., Jones, S. E., Donnelly, L., … Frayling, T. M. (2016). Diabetes.

High body mass index (BMI) is typically associated with higher risk of diabetes, hypertension and heart disease, but for a small sub-set of genetic variants the opposite is apparently true. In this large study, Generation Scotland data was combined with UK Biobank and other cohort data to test 11 'favourable adiposity' alleles. The net conclusion was that although BMI is indeed associated with higher incidence of disease, it matters where fat is stored and that genes that improve fat storage can reduce the risk of illness.


Genome-wide analysis of over 106,000 individuals identifies 9 neuroticism-associated loci. Smith, D. J., Escott-Price, V., Davies, G., Bailey, M. E. S., Conde, L. C., Ward, J., … O'Donovan, M. (2016). Molecular Psychiatry.

Generation Scotland contributed genetic and clinical data to what is by far the largest study to date of the trait of neuroticism. Neuroticism is an important trait to study as it is a key measure of personality and is linked to general and mental illness and wellbeing. The study reports 9 regions of interest in the human genome worthy of further study and confirms a strong genetic association between neuroticism and major depressive disorder.


Whole-exome sequencing in an isolated population from the Dalmatian island of Vis. Jeroncic, A., Memari, Y., Ritchie, G. R. S., Hendricks, A. E., Kolb-Kokocinski, A., Matchan, A., … Perica, V. B. (2016). European Journal of Human Genetics.

Isolated populations such as that of the island of Vis off the coast of Croatia are predicted to act as reservoirs of rare genetic mutations of medical interest. This pilot study used data from Generation Scotland as a reference against which to assess the observed variants and judge whether or not they were specifically enriched in the sample population.


Genome-wide association study of cognitive functions and educational attainment in UK Biobank (N=112 151). Davies, G., Marioni, R. E., Liewald, D. C., Hill, W. D., Hagenaars, S. P., Harris, S. E., … Deary, I. J. (2016). Molecular Psychiatry.

The main study cohort here is the UK Biobank, a large cross-sectional study of 500,000 participants with many measures similar or identical to those in Generation Scotland. GS serves here to replicate UKB findings that ask how educational attainment correlates with a combined (polygenic) measure of general cognitive ability.


Meta-analysis of 49 549 individuals imputed with the 1000 Genomes Project reveals an exonic damaging variant in ANGPTL4 determining fasting TG levels. van Leeuwen, E. M., Sabo, A., Bis, J. C., Huffman, J. E., Manichaikul, A., Smith, A. V., … van Duijn, C. M. (2016). Journal of Medical Genetics.

This study used a very large reference panel, to which Generation Scotland contributed, to conduct a refined test for evidence of genetic variants affecting lipid levels. Four new loci were identified, adding to the 170 already known. The key new finding was of a damaging variant in the ANGPTL4 gene that would not have been detectable without reference to the 10K data set.


General Framework for Meta-Analysis of Haplotype Association Tests. Wang, S., Zhao, J. H., An, P., Guo, X., Jensen, R. A., Marten, J., … Dupuis, J. (2016). Genetic Epidemiology, 40(3), 244-252.

For most human traits, the effect size of a single genetic variant is very small. Methods have been developed to capture more of the net genetic effect by combining the effects of more than one genetic variant across each locus or gene. These approaches are potentially powerful but not straightforward when applied across cohorts of different genetic background. This study, using Generation Scotland: Scottish Family Health Study data and data from other cohorts, sets out an approach to consider and limit these analytical challenges.


Exome-wide analysis of rare coding variation identifies novel associations with COPD and airflow limitation in MOCS3, IFIT3 and SERPINA12. Jackson, V. E., Ntalla, I., Sayers, I., Morris, R., Whincup, P., Casas, J., … Wain, L. V. (2016). Thorax, 71, 501-509.

Chronic obstructive pulmonary disease (COPD) is a debilitating and life-limiting condition that is influenced by both environmental and genetic factors. Generation Scotland was one of several cohorts that combined data to show that rare coding mutations in the genes MOCS3, IFIT3 and SERPINA12 were associated with the severity of airflow limitation in COPD.


Assessing the genetic overlap between BMI and cognitive function. Marioni, R. E., Yang, J., Dykiert, D.,  Mõttus, R., Campbell, A., CHARGE Cognitive Working Group, … Deary, I. J. (2016). Molecular Psychiatry.

Obesity and cognitive function are known to be correlated. Here, we show that genetic markers that predict Body Mass Index (BMI) also predict cognition to a certain extent, and vice versa. That said, the environment remains the dominant factor in determining BMI.


Association of Forced Vital Capacity with the Developmental Gene NCOR2. Minelli, C., Dean, C. H., Hind, M., Alves, A. C., Amaral, A. F. S., Siroux, V., … Burney, P. (2016). PLoS ONE, 11(2), e0147388.

Generation Scotland genetic data contributed to this study that substantiated (replicated) earlier evidence for an effect of the NCOR2 gene on lung vital capacity.


Pedigree- and SNP-Associated Genetics and Recent Environment are the Major Contributors to Anthropometric and Cardiometabolic Trait Variation. Xia, C., Amador, C., Huffman, J., Trochet, H., Campbell, A., Porteous, D., … Haley, C. S. (2016). PLoS Genetics, 12(2), e1005804.

There is good evidence from twin and family studies for a strong genetic component to many physical and metabolic traits, but current methods capture only a proportion of the genetic contribution. This study uses Generation Scotland data to capture more of that evidence and also to exploit the family-based structure of the cohort to test for the effect of shared environment. On average, half of the genetic variance could be explained for the sixteen traits examined. A further 11% was accounted for by the recent shared environment of couples.


Polygenic risk for coronary artery disease is associated with cognitive ability in older adults. Hagenaars, S. P., Harris, S. E., Clarke, T.-K., Hall, L., Luciano, M., Fernandez-Pujals, A. M., … Deary, I. J. (2016). International Journal of Epidemiology.

Coronary artery disease is known to be associated with loss of thinking skills and increased risk of dementia in later years. This study finds evidence that some of this association may be accounted for by shared gene variants. 


Genome-wide association study of sporadic brain arteriovenous malformations. Weinsheimer, S., Bendjilali, N., Nelson, J., Guo, D. E., Zaroff, J. G., Sidney, S., … GEN-AVM Consortium (2016). Journal of Neurology, Neurosurgery & Psychiatry.

Here is another study where GS was able to help investigators who needed to find good controls for their clinical case study, in this case for malformations of blood supply to the brain. GS was able to provide matched controls to minimise possible biases in the study.