The Indian population, comprising more than a billion people, consists of 4693 communities with several thousand endogamous groups, 325 functioning languages and 25 scripts. To address questions related to ethnic diversity, migrations, founder populations, predisposition to complex disorders or pharmacogenomics, one needs to understand diversity and relatedness at the genetic level in such a diverse population. Against this backdrop, six constituent laboratories of the Council of Scientific and Industrial Research (CSIR), with funding from the Government of India, initiated a network program on predictive medicine using repeats and single nucleotide polymorphisms (SNPs). The Indian Genome Variation (IGV) consortium aims to provide data on validated SNPs and repeats, both novel and reported, along with gene duplications, in over a thousand genes, in 15,000 individuals drawn from Indian subpopulations. These genes have been selected on the basis of their relevance as functional and positional candidates in many common diseases, including genes relevant to pharmacogenomics. This is the first large-scale comprehensive study of the structure of the Indian population with wide-reaching implications. A comprehensive platform for IGV data management, analysis and creation of the IGVdb portal has also been developed. The samples are being collected following the ethical guidelines of the Indian Council of Medical Research (ICMR) and the Department of Biotechnology (DBT), India. This paper describes the structure of the IGV project, highlighting aspects such as its genesis, objectives, strategies for selection of genes, identification of the Indian subpopulations, collection of samples, discovery and validation of genetic markers, data analysis and monitoring, and the project's data release policy.
Bibliographical note
Funding Information:
Acknowledgements: This work was supported by grants from the Council of Scientific and Industrial Research (CSIR), Government of India, for the project "Predictive medicine using repeat and single nucleotide polymorphisms" (CMM0016).
Apart from the CSIR laboratories, a key participant in the project is the Indian Statistical Institute (ISI), Kolkata, which will help us in the analysis of our data. The institute has established expertise in the analysis of human genetic variation data (Tapadar et al. 2000; Basu et al. 2005). The project also involves the active participation of the Anthropological Survey of India (Singh 2002), which has helped in the identification of the various Indian subpopulations. In addition to the institutional facilities, the project has collaborations with The Centre for Genomic Application (TCGA), established through the support of the Department of Science and Technology (DST), CSIR with The Chatterjee Group (TCG), for high-throughput sequencing and genotyping, and with SilicoGene Informatics Private Limited along with LabVantage, India, for development of a comprehensive platform for IGV database management, analysis and portal development.
- Genetic structure
- Indian genome variation database
- Indian population
- Repeat polymorphism
- Single nucleotide polymorphism