Scientific consortium maps the range of genetic diversity in Asia, and traces the genetic origins of Asian populations
by Dr. Vikrant Kumar, Genome Institute of Singapore
As an anthropologist, I have always wanted to know whether Asians, known for their extensive linguistic and ethnic diversity, also show a substantial level of genetic variation. In other words, do they have a single origin or multiple origins? Do the ancestors of the Negritos of the Philippines, Malaysia and Indonesia differ from those of their neighboring Asians? And what binds us more: language or geography? The recent paper published in Science by the HUGO Pan-Asian SNP Consortium, "Mapping Human Genetic Diversity in Asia", addresses these fundamental questions, which have been floating around for years.
To the best of my understanding, this is so far the only paper in which 73 populations scattered across 10 Asian countries have been studied together, through a massive collaborative effort of scientists from 40 institutes, mostly in Asia. About 2,000 samples, covering almost the entire spectrum of linguistic and ethnic diversity, were genotyped for about 50,000 single nucleotide polymorphism markers. Some of the key findings of this paper are:
· East and Southeast Asians share a common origin.
· East Asians mainly originated from Southeast Asian populations, with minor contributions from Central-South Asian groups.
· A common ancestor of the Negrito and non-Negrito populations of Asia entered the continent. This supports the hypothesis of a single wave of migration into Asia, as opposed to two waves of migration from Africa.
· The Taiwan aborigines are derived from Austronesian populations. This stands in contrast to the suggestion that this island served as the ancestral “homeland” for Austronesian speaking populations throughout the Indo-Pacific.
· Genetic ancestry is highly correlated with linguistic affiliations as well as geography.
The paper stands out in its attempt to understand the peopling of Asia and the genetic relationships among its populations. In the process, it not only presents a fantastic genotype database but also provides vital clues to scientists in diverse fields, from linguistics to archeology to human genetics. For example, it may be an interesting proposition for a human geneticist to examine whether East and Southeast Asians share more disease-associated risk alleles than expected. Likewise, it may be time for linguists to take another look at the "birthplace" of the Austronesian linguistic family. I hope the consortium continues its amazing endeavor, includes many more important and isolated populations from the whole of Asia, and moves beyond the analysis of single nucleotide polymorphisms to other kinds of variation, such as structural variations.
A scan of a rack of magazines these days will likely find at least one article discussing the emergence of China onto the world stage, its headline usually following the standard formula of "The (Sleeping/Red) (Dragon/Giant) (Awakes/Rises)". Yet despite China's growing importance in world news, the genetics of the Han people, the dominant ethnic group in China and the largest ethnic group in the world, have been relatively poorly studied. The HapMap project, which characterized 45 Han individuals, not surprisingly revealed that this group is genetically distinct from other ethnic groups, yet it unveiled little about variation within the Han population. Other studies have been small in scale or have followed only maternal and paternal lineages through Y chromosome and mitochondrial studies. These have suggested interesting variation within the Han population, primarily along a north-south axis, but not much more.
Fortunately, two studies recently published in the American Journal of Human Genetics are shedding more light on genome-wide variation within the Han population. The studies, one led by Jin Li of Fudan University and the other by Jianjun Liu of the Genome Institute of Singapore, each independently examined thousands of autosomal SNPs in samples collected from several regions of China (the studies used 160K SNPs in 1,700 individuals and 350K SNPs in 6,000 individuals, respectively). By and large, the studies tell a similar story: while the Han Chinese population is comparatively uniform, significant variation exists, and it largely seems to correspond to the known north-south settlement of China by this ethnic group. This pattern, as measured by the Genome Institute of Singapore's study, accounts for roughly 0.4% of the genetic variance within this population. Though this effect is small, both studies confirm that it could affect genome-wide association studies if a geographically diverse population is used without proper stratification. This information will help guide such studies and avoid false positive candidates.
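To make the stratification point concrete, here is a minimal sketch of how pooling two regions can manufacture an association that neither region shows on its own. Every number in it is invented for illustration: the allele counts, the north/south sampling imbalance, and the function names `odds_ratio` and `mantel_haenszel_or` are assumptions, and the regional difference is exaggerated far beyond the roughly 0.4% of variance reported above. The stratified estimate uses the standard Mantel-Haenszel pooled odds ratio, which is one common way to adjust for strata and not necessarily what either study used.

```python
# Toy illustration only: all counts are hypothetical and the regional
# difference is deliberately exaggerated so the effect is easy to see.

def odds_ratio(table):
    """Odds ratio for one 2x2 table of allele counts.

    table = (cases_with, cases_without, controls_with, controls_without)
    """
    cases_with, cases_without, controls_with, controls_without = table
    return (cases_with * controls_without) / (cases_without * controls_with)


def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio across several strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


# Hypothetical sampling: cases come mostly from the north, controls mostly
# from the south. Within each region the allele frequency is the same in
# cases and controls (0.6 in the north, 0.4 in the south).
north = (480, 320, 120, 80)
south = (80, 120, 320, 480)

# Ignoring region: add the two tables cell by cell.
pooled = tuple(n + s for n, s in zip(north, south))

print("north OR:      ", odds_ratio(north))
print("south OR:      ", odds_ratio(south))
print("pooled OR:     ", round(odds_ratio(pooled), 2))
print("stratified OR: ", round(mantel_haenszel_or([north, south]), 2))
```

In this toy setup the odds ratio is exactly 1 within each region, yet the pooled table gives an odds ratio of about 1.62, purely because the allele is more common in the region that supplied most of the cases. Analysing the regions as separate strata brings the estimate back to 1.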
Another important result from these studies is the flip side of the observed genetic variance: though statistically significant, it is nonetheless relatively small. This is encouraging for a variety of efforts underway in China that presume a relatively flat genetic landscape. Drug development has been carried out, for the most part, in Europe and the United States, and thus the clinical populations have been composed largely of people of European heritage. This raises concern about whether these results translate directly to other populations. To address this bias, as well as to lower costs in the increasingly cost-conscious pharmaceutical market, an increasing number of clinical trials are being carried out within China. If there were strong regional differences in the population's genetic makeup, these trials would have to be carefully constructed either to stratify their cohorts properly or to limit them to a genetically uniform region, especially where there is a potentially significant genetic component to the response to the candidate compound.
While these two studies do not show that intra-ethnic variation within the Han population can be ignored in such trials, the variation appears small enough not to adversely affect current clinical trial strategies. Similarly, efforts to use known associations between drug response and genetic variation in crafting public health care services are not disrupted by this new information. For example, cytochrome P450 2D6 is responsible for the metabolism of a wide variety of commonly prescribed drugs. However, mutant forms of this enzyme, though rare in the West, have been revealed by the HapMap project to affect roughly a third of Han Chinese. This is encouraging efforts to determine whether genotyping before prescribing affected drugs could lead to improved health care delivery and lower costs. If there were strong variation within the Han population, strategies built on the HapMap data would not be applicable. Instead, these recent results will serve to improve ongoing studies and ensure meaningful results. Additional studies of genome-wide variation within the Han population, encompassing more regions of China, will further fill in the picture and allow an even more refined approach to future translational studies.
Dr. Brian Ring received his PhD in Molecular Biology from Cornell University and currently lives and works in Beijing, China.
Yigene is a Beijing-based personal genomics service company. Yigene is working with the Chinese CDC and other public institutions to determine the best methods and practices for translating genetic discoveries to the Chinese public health market.
Image: Wellcome Trust, L0004700, “The doctor is feeling the pulse of a woman patient. Her wrist is supported on a small red bolster. The doctor touches the pulse only with his finger-tips, without looking at the woman,” Watercolour By: Zhao Pei Qun
…or Length Matters
by HUGO President Prof. Edison T Liu
As both an author and a reviewer, I have been confronted with the difficulty of explaining a complex story within a 1500 or even a 5000 word limit for a manuscript. The word limits in many journals are historical in origin and were often set because of print costs. But as science has advanced, this publication boundary has not moved with the times. What exactly have been the trends in the biological sciences?
Genomics has been in the vanguard of these trends. The datasets are massive, the analyses are complex, and the validation is extensive. When a high throughput screen for, let's say, siRNA knockdowns or synthetic lethality on a genome-wide scale is presented, an appropriate description of just the initial setup and analysis may use up the maximum 5000 words. The biological interrogations and functional validations can reasonably require many more words to explain.
To compress the presentation, authors have rendered figures with many subpanels and placed critical data in supplementary sections that can be many times the size of the original manuscript. In one systems biology paper (Pujana MA, et al. Nat Genet. 39(11):1338-49, 2007), one figure (figure 4) had 4 parts (a-d) but presented 9 panels of high analytical complexity. Our paper recently published in Nature (Fullwood MJ, et al. Nature. 462(7269):58-64, 2009) had a supplement of ~10,000 words for a manuscript of fewer than 5,000 words. The HUGO Pan-Asian SNP Consortium will publish in Science (December 2009) a scientific story in 4000 words which will require a supplementary section of 14,000 words.
It can be said that the supplemental section should be the place to explain details not necessary for the core story. However, this is becoming less true, especially in genomics, where the massive nature of the data and the analysis requires compartmentalized analyses, each of which is critical to the main story. Operationally, this is leading to a troubling trend: the supplemental data receive only marginal scrutiny from both readers and reviewers. In many ways, this is like presenting each classical novel as a CliffsNotes summary with a supplement that contains the original chapters. One possible outcome is that a distracted public and even expert review panels will miss serious problems in analyses that are "tucked" away in the supplemental section (as implied in Coombes KR, Wang J & Baggerly KA, Nature Medicine 13, 1276 – 1277, 2007).
Indeed, our schedules are busy and our time is precious, but perhaps we need a publication format that preserves the needed scholarship and yet respects the limited attention span of busy scientists. For example, why not have an extended abstract of ~400-500 words that tells the core story (and even includes a figure or two), but allow the detailed work to be presented in its entirety as a coherent narrative? Given the online nature of journals, this surely cannot be more expensive than the current main-text-with-online-supplement format.
Whether you agree or disagree with this opinion piece, I would encourage readers to comment and to engage their communities in a discussion about change in scientific publishing.
by Dr. Benjamin Capps, Centre for Biomedical Ethics, National University of Singapore.
Listening to the discussions at the 2009 HUGO Symposium on Genomics and Ethics, Law and Society, held in the City of Geneva, I was drawn to the writings of Jean-Jacques Rousseau, born in Geneva in 1712. In particular, one paragraph of the Discourse on Inequality (1754) came to mind:
The first man who, having fenced in a piece of land, said "This is mine," and found people naive enough to believe him, that man was the true founder of civil society. From how many crimes, wars, and murders, from how many horrors and misfortunes might not any one have saved mankind, by pulling up the stakes, or filling up the ditch, and crying to his fellows: ‘Beware of listening to this impostor; you are undone if you once forget that the fruits of the earth belong to us all, and the earth itself to nobody.’
Rousseau could not possibly have known that his ideas would have resonance in the debates on genetic ownership, privacy, and exploitation. As biotech companies trawl isolated regions of the world for their biodiversity, sheltered, until now, by situation and circumstance from the melting pot of migration, international organisations have tried to cultivate a sense of awe in a collective global cultural heritage, while simultaneously asserting on behalf of these communities their rights to be compensated. But now, every individual has an opportunity to demark their uniqueness – to own their genetic sequence – as part of what Linda Avey calls ‘democratising DNA’.
Personalised medicine promises to be a defining advance in allowing drugs to be prescribed effectively and safely. But there were also some words of caution at the Symposium. It should not be forgotten that uncovering our uniqueness brings with it certain risks; most notably ‘bioinformational creep’, a process whereby uncensored access to an individual’s genetic sequence is routinely justified in ‘the public interest’. Moreover, genetic polymorphisms do not always result in the expected phenotypes, as was demonstrated when James Watson’s genome revealed that he carried two normally debilitating, but clearly unexpressed, monogenic disorders.
Some participants at the HUGO Symposium raised these issues as important concerns for sequencing standards and for the premature use of sequence data to make future health projections, particularly when such data are made available to the public through commercial enterprises. One may also question where the responsibility for the access and use of this information lies: do these companies have any enduring responsibility for their clients’ wellbeing, particularly after the test results have been delivered, and to whom are they bound to divulge this information, or from whom to withhold it? Is it merely paternalistic to raise these concerns, or is it a clear warning that vital individual interests are at risk? Addressing these issues will be a task taken forward by the HUGO Ethics Committee over the coming year.