Understanding the genetic basis of schizophrenia continues to be a major challenge. Research over the last two decades has produced several candidate genes, which unfortunately have not been consistently replicated across or within populations. Recent genome-wide association studies (GWAS) and copy number variation (CNV) studies have provided important evidence suggesting a role of both common and rare large CNVs in schizophrenia genesis. The burden of rare copy number variations appears to be increased in schizophrenia patients. A consistent observation among the GWAS is the association of schizophrenia with genetic markers in the major histocompatibility complex (6p22.1), a region containing genes including NOTCH4 and histone protein loci. Molecular genetic studies are also demonstrating that there is more overlap between the susceptibility genes for schizophrenia and bipolar disorder than previously suspected. In this review we summarize the major findings of the past decade and suggest areas of future research.
Schizophrenia is a devastating psychiatric disease that has a lifetime prevalence of approximately 1% worldwide. It is characterized by the occurrence of delusions, hallucinations, disorganized speech, alterations in drive and volition, impaired cognition, and mood symptoms. The importance of both environmental and genetic factors in the development of this complex disorder has been demonstrated. Growing up in an urban environment, immigration, cannabis usage, male gender, and perinatal events (hypoxia, maternal infection, stress, and malnutrition) are associated with increased risk of developing schizophrenia. Evidence from family, twin, and adoption studies suggests a strong genetic component. A meta-analysis of twin studies estimated the heritability to be 81% (confidence interval 73% to 90%), and a recent Swedish study of more than 2 million families estimated it to be 64%. Of all the known risk factors for schizophrenia, genetics is the single largest one. No precise mode of inheritance is known, and less than one third of patients with schizophrenia have a family history. Elucidation of etiological factors remains the overwhelming challenge to schizophrenia researchers. In fact, the most effective method for identifying genetic risk factors for schizophrenia is not clear, and this has led to the adoption of a number of approaches, including epigenetics, in an attempt to clarify the genetic etiology.
Since the initial observation that schizophrenia has a polygenic mode of inheritance, and with the availability of polymorphic markers for genetic mapping, many attempts have been made to find susceptibility genes for schizophrenia using the methods of either linkage or association.
One of the early strategies used to begin unraveling the genetic contribution to schizophrenia was the investigation of chromosomal aberrations and familial syndromes with schizophrenia-like phenotypes. Chromosomal aberrations have been reported in families with schizophrenia and other psychiatric disorders. The deletion of a section of 22q11 leads to the phenotype of velocardiofacial (DiGeorge) syndrome, which includes abnormalities of the facial features and palate, and midline heart defects.
Patients with 22q11 deletion syndrome exhibit psychotic symptoms resembling schizophrenia at a frequency of 18%, a rate that is much higher than the overall 1% prevalence of schizophrenia in the general population. On the other hand, when focusing on populations of patients who have the diagnosis of schizophrenia, at least 2% of these individuals are reported to have 22q11 deletions compared with the 0.025% prevalence in the general population. This region of 22q11 harbours the COMT and PRODH genes among others (Table I).
A number of other chromosomal aberrations have been associated with schizophrenia. On chromosome 1, a balanced (1;11)(q42.1;q14.3) translocation, which disrupted the DISC1 gene (Table I), was found in a large Scottish family with a high frequency of psychiatric disorders including schizophrenia. More recently, a mother-daughter pair with schizophrenia was identified carrying a t(9;14)(q34;q13) translocation that disrupts the NPAS3 (neuronal Period Aryl hydrocarbon receptor Single-minded) gene, which codes for a transcription factor implicated in neurogenesis. Another cytogenetic investigation of a patient with schizophrenia revealed a complex chromosomal rearrangement involving regions containing FEZ1 (which interacts with DISC1) and the kainate glutamate receptor gene GRIK4. The gene coding for another DISC1-interacting protein, PDE4B, was found to be disrupted by a translocation in a schizophrenia proband with a family history of psychiatric disorders. In these studies, however, the chromosomal aberrations do not fully cosegregate with the schizophrenia (SCZ) phenotype; thus these chromosomal abnormalities alone are not sufficient to cause SCZ, and they may predispose to several major psychiatric disorders as observed in some of these families, including bipolar disorder, major recurrent depression, addictions, impulse control disorders, and others.
Linkage studies are family-based analyses that use genetic markers and information from multiple affected individuals in a given family to identify linked regions of the genome, that is, regions coinherited or segregating with the disease. Linkage studies were initially carried out using highly informative microsatellite markers (approximately 400 markers to cover the genome). At present, pedigree studies can utilize single-nucleotide polymorphism (SNP) linkage marker sets (eg, the 6056-SNP set from Illumina Inc). Some interesting candidate genes have been identified from linkage scans and have been replicated in independent association studies. These include dystrobrevin binding protein 1 (dysbindin, DTNBP1, 6p22.3), neuregulin 1 (NRG1, 8p12), and D-amino acid oxidase activator (DAOA, 13q24). A recent meta-analysis of 32 genome-wide linkage scans across 3255 pedigrees including 7413 affected individuals identified suggestive evidence of linkage based on the summed rank statistics (PSR<0.0077) in two regions: 5q (5q31.3-35.1; PSR=0.0046) and 2q (2q12.1-21.2; PSR=0.0075). Following secondary analysis, genome-wide evidence of linkage was observed on 2q (PSR=0.00035) after shifting the frame of the 30-centimorgan-wide bins by 50%. The next most significant regions, in descending order, were: 1p13.2-q23.3, 2q33.3-36.3, 2q21.2-31.1, 1p32.2-31.1, 5q35.1-35.3, 8p22-12, 10q26.12-26.3, and 3p14.1-q13.32. Suggestive evidence of linkage with the 8p region (8p22-12, 16-33 Mb; PSR=0.00057) was also observed in the subsample of patients of European ancestry. The 2q, 5q, and 8p regions were found to be linked to schizophrenia in an earlier meta-analysis of a subgroup of 20 studies from this larger set. Furthermore, Holmans et al reported suggestive evidence of linkage with 8p21 in a subset of the above families of European ancestry (707 families, 1615 affected).
Since this 8p21 region does not include the gene NRG1, they concluded that this linkage might be due to the presence of one or more loci with multiple rare risk-associated SNPs and/or structural variants. The utility of linkage studies was further demonstrated in a recent study where the protein kinase C alpha (PRKCA) gene was identified as a schizophrenia susceptibility site. Carroll et al examined a linkage area that they had identified earlier on chromosome 17p11.2-q25.1 and identified a rare four-marker haplotype in the 3' untranslated region of the PRKCA gene. They further demonstrated that this low-frequency haplotype showed a trend of association in a sample of unrelated schizophrenia cases and controls (661 cases, 2824 controls, P=0.078, OR=1.9) and was significant in a pooled sample of schizophrenia, schizoaffective, and bipolar disorder patients (P=0.037, OR=1.9). This association was more significant in a stratified sample of males with schizophrenia than in the pooled sample. However, this association was not replicated in independent samples from Ireland and Bulgaria. Carroll et al also reported that common SNPs in the linkage region showed association with schizophrenia in a United Kingdom sample, although these SNPs did not replicate across other samples. A possible interpretation of this scenario is that both common and rare SNPs in the PRKCA gene region may be associated with schizophrenia and related disorders.
Although some interesting candidate genes have been identified using linkage methods, a major criticism of these studies is that linkage signals are observed on most of the chromosomes and cover thousands of genes. Furthermore, the small effect sizes that are now expected for schizophrenia-associated polymorphisms (OR<1.2) and locus heterogeneity further reduce the chances of finding a truly significant region in linkage studies. In addition, collection of large numbers of families with multiple affected individuals for detecting these small effect sizes is labor-intensive and expensive. However, in contrast to genome-wide association studies (GWAS), large-scale linkage studies have the advantage of being able to detect regions with multiple rare as well as common variants (allelic heterogeneity) in one or more susceptibility genes. Furthermore, focusing on families with multiple affected individuals likely enriches for transmitted genetic factors and reduces etiologic heterogeneity.
Candidate gene and genome-wide association studies
The limited power of linkage studies to identify genes of modest effect led Risch and Merikangas to propose the use of association studies for disease gene identification in disorders with complex architecture such as schizophrenia. The primary advantage of the latter strategy was the possibility of recruiting a sample large enough to have the power to detect loci of moderate effect. However, they recognized that an important limitation was the lack of technology to assay a large number of polymorphisms across the genome (up to 100 000). This limitation was overcome by the development of prototype SNP chips that initially contained only 500 SNPs and progressed rapidly to the present-day commercially available chips containing over a million SNPs. Furthermore, the cost of these chips has come down dramatically over the last few years, making it possible to carry out high-throughput genotyping in very large populations (over 20 000 individuals). The completion of the human genome project, the parallel development of the HapMap database of human SNP variation, and the availability of information on more than 3.1 million SNPs across the human genome have paved the way to effectively carry out large-scale GWAS.
Genetic association studies are based on the common disease common variant hypothesis. This hypothesis proposes that common diseases are a result of interactive contribution of common variants with small effect sizes, and the susceptibility alleles will be shared by a significant proportion of unrelated affected individuals. This is the basis of both hypothesis-based candidate gene association studies as well as the hypothesis-free GWAS method.
In the past 4 years at least 11 GWAS have attempted to identify susceptibility genes for schizophrenia by genotyping individual samples as well as by using DNA pooling-based methods.
GWAS using DNA pooling
DNA pooling was initially proposed as a method to reduce genotyping costs in large-scale association studies. DNA from cases and DNA from controls are pooled into two separate groups, and the difference in allele frequency between the two groups is estimated to assess association. The first pooling-based association study was gene-centric, and analyzed 25 494 SNPs present within 10 kb of each of a large set of genes (Table II). In the initial discovery sample a significant association of the marker rs752016 in intron 11 of the Plexin A2 gene (PLXNA2, 1q32; OR=1.49; P=0.006) was found. A similar association was observed in the replication case-control as well as family-based samples. However, independent replications for this SNP have been mixed.
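The comparison of allele frequencies between a case pool and a control pool can be illustrated with a minimal sketch. It assumes that each pooled frequency estimate behaves like a frequency counted on 2N chromosomes and applies a plain two-proportion z-test; real pooling experiments add measurement error that is ignored here, and this is not necessarily the statistic used in the studies cited. The allele frequencies (0.30 and 0.25) are hypothetical; only the pool sizes (320 cases, 325 controls) are taken from the discovery sample listed in Table II.

```python
import math


def pooled_allele_z_test(freq_cases, freq_controls, n_cases, n_controls):
    """Two-sided z-test for an allele frequency difference between two pools.

    Assumption: each pooled estimate is treated as a frequency counted on
    2 * N chromosomes; pooling-specific measurement error is ignored.
    """
    chrom_cases = 2 * n_cases
    chrom_controls = 2 * n_controls
    # Frequency under the null hypothesis of no difference between pools.
    combined = (freq_cases * chrom_cases + freq_controls * chrom_controls) / (
        chrom_cases + chrom_controls
    )
    std_err = math.sqrt(
        combined * (1 - combined) * (1 / chrom_cases + 1 / chrom_controls)
    )
    z = (freq_cases - freq_controls) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical pooled frequencies; pool sizes as in the 320/325 discovery sample.
z, p = pooled_allele_z_test(0.30, 0.25, n_cases=320, n_controls=325)
print(f"z = {z:.2f}, two-sided P = {p:.3f}")
```

A nominal result of this kind says nothing about genome-wide significance, because the same test is repeated for every SNP on the array.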
Shifman et al conducted a pooling-based GWAS and observed a female-specific association of the SNP rs7341475 G/A in the fourth intron of the reelin gene (RELN; ORGG=2.1, P=9.8x10-5). This was confirmed in a replication sample of patients of European ancestry from the United Kingdom, but not in samples from three other populations (Irish, American, and Chinese). The trend in the last three samples was in the same direction, and the association was significant in a meta-analysis including all samples (ORGG=1.58 [1.31-1.89], P=8.8x10-7). However, none of these observations were significant after correcting for multiple testing. The reelin protein is a serine protease that plays an important role in corticogenesis, and it is associated with an autosomal recessive form of lissencephaly. It has also been implicated in neurotransmitter-related GSK3β signaling and regulation of NMDA receptor activation. Polymorphisms in the RELN gene have been associated with neurocognitive endophenotypes of schizophrenia (eg, working memory and executive functioning). Furthermore, the association of the RELN gene with schizophrenia has been replicated in an independent sample.
In a whole-genome pooling-based study of patients of eastern European ancestry (Bulgaria), a significant association of rs11064768 in intron 1 of the coiled-coil domain containing protein 60 gene (CCDC60, 12q24.23) was observed. The second best SNP was rs11782269, which is present in an intergenic region on 8p23.1, the closest gene being claudin 23 (CLDN23). The third best was rs893703 in intron 2 of the retinol binding protein 1 gene (RBP1, 3q23). Interestingly, the RBP1 gene has been implicated in schizophrenia pathogenesis and inhibits PI3K/Akt signaling. However, these observations are not significant at the genome-wide significance level (P=1.85x10-7) and have not been investigated in other independent samples. In general, the genetic findings support the neurodevelopmental hypothesis of schizophrenia.
GWAS with individual genotyping
In the first GWAS, Lencz et al observed a genome-wide significant association of the SNP rs4129148 near the colony stimulating factor 2 receptor alpha gene (CSF2RA) in the pseudoautosomal region (Table II). Homozygosity for the C-allele of this polymorphism was associated with an over threefold increased risk for schizophrenia. They targeted the exonic sequences and upstream region of CSF2RA and its immediate neighbor, the interleukin 3 receptor alpha gene (IL3RA), for sequencing in an independent patient sample (n=102). They observed that intronic haplotype blocks within CSF2RA and IL3RA were significantly associated with SCZ. Interestingly, one polymorphism, rs6603272, in intron 5 of the IL3RA gene, was also found to be associated with schizophrenia in independent samples of Han Chinese patients. Lencz et al also observed an excess of rare non-synonymous mutations in CSF2RA and IL3RA in schizophrenia patients. No further studies of these two genes in schizophrenia have been reported since the findings of Lencz et al in 2007. A tendency may be noted here: each new GWAS highlights the top-ranking markers that it finds and pays little attention to previously reported findings. This tendency is compounded by the explosion of data available to GWAS investigators; putative new associations are arising in large numbers, providing a wide array of leads to follow.
In the GWAS on schizophrenia subjects from the Clinical Antipsychotic Trials of Intervention Effectiveness (CATIE) study (n=1471), no marker achieved the genome-wide significance level. A possible reason for this was the inclusion of patients of diverse ancestry who were not adequately covered by the genotyping platform that the investigators had utilized. The genomic coverage for common SNP variation was 86% for subjects of European ancestry, 79% for East Asian ancestry, and only 49% for African ancestry (with r2≥0.8).
One of the most interesting candidate genes for schizophrenia and psychosis came from the multistage GWAS analyzing over 20 000 cases and controls. In the initial GWAS (479 cases, 2937 controls), O'Donovan et al observed 12 loci to be moderately associated with schizophrenia (P<10-5). In the first replication sample (1664 cases, 3541 controls), association of 6 of the 12 SNPs was replicated. These six SNPs were genotyped in the second replication sample (4143 cases, 6515 controls). In the complete replication sample set (stage one + two), three loci, namely zinc finger protein 804A (ZNF804A, rs1344706, 2q32.1, P=9.25x10-5) and intergenic regions on 11p14.1 (rs1602565, P=3.22x10-4) and 16p13.12 (rs7192086, P=5.10x10-4), were modestly associated with schizophrenia. In the combined sample from the initial stage plus the two replication sets, ZNF804A showed strong evidence of association (OR=1.12; P=1.6x10-7). Furthermore, when patients with bipolar disorder were included, the observation became more significant, suggesting that ZNF804A might be a susceptibility gene for the broader psychosis phenotype. ZNF804A is a putative transcription factor, and the risk allele of the rs1344706 polymorphism (intron 2, C>A) has recently been shown to be associated with disturbed connectivity between the dorsolateral prefrontal cortex (DLPFC) and the hippocampus, as well as between the left and right hemispheres. There was also altered coupling of the DLPFC with the amygdala. The former may lead to disturbed executive function and the latter can affect the interaction between prefrontal and limbic structures. The association of the ZNF804A marker rs1344706 with schizophrenia was recently replicated in patients from Ireland (n=1021 cases, 626 controls; P=0.01). Increased expression of the A allele compared with the C allele was observed in the dorsolateral prefrontal cortex of postmortem control brain samples.
However, there was no difference between the two alleles in overall mRNA expression between postmortem schizophrenia cases and controls. The SNP rs1344706 was also significant in the GWAS conducted by the International Schizophrenia Consortium (P=0.029, OR [A-allele]=1.08). Additionally, in a large multicenter study, association of rs1344706 with schizophrenia (5164 cases and 20 709 controls; OR=1.08, P=0.0029) and psychosis (OR=1.09, P=0.00065) has been replicated. Based on this replication by three other independent groups, and the demonstration of a functional effect on brain connectivity, ZNF804A is a promising candidate gene for schizophrenia and psychosis in general. The small effect sizes (OR 1.08-1.12) account for only 1% to 2% of the variance in risk for the disease. In general, these replicating small effects are consistent with the common disease common variant hypothesis.
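To give a feel for why odds ratios of this size call for very large samples, the following is a rough sketch of a textbook sample-size approximation for an allelic two-proportion test. It is not the power calculation used by any of the cited studies. The control allele frequency of 0.4 is hypothetical, and the significance level of 5x10-8 and the 80% power target are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def case_allele_freq(control_freq, odds_ratio):
    """Allele frequency in cases implied by the control frequency and an allelic OR."""
    odds = odds_ratio * control_freq / (1 - control_freq)
    return odds / (1 + odds)


def cases_needed(control_freq, odds_ratio, alpha=5e-8, power=0.8):
    """Approximate number of cases (with an equal number of controls) required.

    Uses the standard two-proportion sample-size formula on chromosomes,
    then halves the result because each person contributes two chromosomes.
    """
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    p_bar = (p0 + p1) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))
    ) ** 2
    chromosomes_per_group = numerator / (p1 - p0) ** 2
    return math.ceil(chromosomes_per_group / 2)


# Hypothetical control allele frequency of 0.4; odds ratios chosen for contrast.
for odds_ratio in (1.08, 1.12, 1.5):
    print(odds_ratio, cases_needed(0.4, odds_ratio))
```

Under these assumptions the required sample grows steeply as the odds ratio approaches 1, which is consistent with individual studies of a few thousand subjects failing to reach genome-wide significance.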
In the GWAS conducted by Need et al  no SNP reached the genome-wide significance level; however the strongest association was observed for a SNP in the 3'-UTR of ADAMTS-Like protein 3 (ADAMTSL3, rs2135551, P=1.35x10-7, OR=0.68). An effect on splicing was demonstrated for the SNP rs950169 which is in linkage disequilibrium with rs2135551. The genetic association of rs2135551 was replicated in an independent cohort of patients but was not replicated in three other sample sets. Need et al also analyzed whether association of other current candidate genes for schizophrenia was present in their study. They observed gene-wide association of polymorphisms in Fasciculation and elongation protein zeta-1 (FEZ1) and Notch homolog 4 (NOTCH4).
By the middle of 2009, three GWAS of impressive scale had been conducted by three large consortia for schizophrenia genetics. Each effort examined several thousand cases and controls, and they were published in the same issue of Nature. The study from the International Schizophrenia Consortium (ISC) analyzed 3322 schizophrenia cases and 3587 controls. The most significant association was observed for rs5761163 in the first intron of myosin XVIIIB (MYO18B; 22q11.2-q12.1; P=3.4x10-7). The second strongest association was with SNPs spanning the major histocompatibility complex (MHC, 6p22.1), and the most significant SNP was rs3130375. Following imputation, rs3130297, located in the MHC region 7.1 kilobases (kb) upstream from the Notch homolog 4 gene (NOTCH4), showed genome-wide significance (P=4.79x10-8). A trend for association of this SNP was observed in the Molecular Genetics of Schizophrenia (MGS) sample (P=0.086) but not in the SGENE sample (P=0.14). In the combined sample from the three studies, including both imputed and genotyped SNPs in the MHC region, rs13194053 was genome-wide significant. The study also provided evidence that a large number of common variants have an important role to play in schizophrenia susceptibility, as a group explaining about one third of the total variation in risk for the disease (34% [CI=32%-36%]).
In the second GWAS, performed by the SGENE consortium, Stefansson et al did not observe any genome-wide significant SNPs in their sample of 2663 schizophrenia cases and 13 498 controls. They analyzed the top 1500 SNPs in the ISC (2602 cases and 2885 controls) and the MGS samples (2681 cases and 2653 controls). The top 25 markers (P<1x10-5) in the combined sample were followed up in four independent samples (total of 4999 cases and 15 555 controls). They observed genome-wide significant association of SNPs in the MHC region (6p21.3-22.1; HIST1H2BJ, PRSS16, and NOTCH4), as well as with the marker rs12807809 located 3.4 kb upstream of the neurogranin gene (NRGN, 11q24.2), and an intron 4 SNP in transcription factor 4 (TCF4, 18q21.2). The odds ratios for associated SNPs in the MHC region varied from 1.15 (HIST1H2BJ, rs6913660, P=1.1x10-9), to 1.19 (NOTCH4, rs3131296, P=2.3x10-10), to 1.24 (PGBD1, rs13211507, P=8.3x10-7). The genes associated in this study are involved in immune-related functions (MHC), memory function (NRGN), and neurodevelopment (TCF4).
The Molecular Genetics of Schizophrenia (MGS) study was not able to identify genome-wide significant markers in its sample set. The MGS study had both African-American (AA; 1286 cases, 973 controls) and European ancestry (EA; 2681 cases, 2653 controls) patients. In the EA sample the top SNP was an intron 10 polymorphism, rs13025591, in the AGAP1 gene (ArfGAP with GTPase domain, ankyrin repeat and PH domain 1, 2q37.2, P=4.6x10-7, OR=1.22). Among the top SNPs were an intron 12 SNP, rs16941261, in NTRK3 (neurotrophic tyrosine kinase, receptor, type 3, 15q25.3, P=8.1x10-7, OR=1.25) and an intron 2 SNP, rs10140896, in EML5 (echinoderm microtubule associated protein-like 5, 14q13.3, P=9.5x10-7, OR=1.22). In the African-American subsample, ERBB4 (v-erb-a erythroblastic leukemia viral oncogene homolog 4, 2q34, rs1851196, P=2.4x10-6, OR=0.733) and CBX2 (chromobox homolog 2, 17q25.3, rs3751954, P=4.6x10-6, OR=0.528) were associated. ERBB4 and its ligand neuregulin 1 (NRG1) have been associated with schizophrenia in earlier studies (for discussion see http://www.schizophreniaforum.org). NRG1 is a very large gene that presents challenges for more detailed study due to its size. Certain areas of the gene have been highlighted by the fact that haplotypes of markers in these regions have exhibited some replication across schizophrenia studies. However, the lack of detailed information as to which markers in the gene alter its biological function leads to a need to cover the entire gene with a set of hundreds of markers in order to be able to state that a comprehensive analysis was done. Thus, even in this single locus, one can see that multiple testing challenges arise. Overall, it can be seen that much more work by genome researchers on annotation of the functional significance of variants in any given gene is required.
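The multiple testing burden mentioned above can be made concrete with a small sketch using the Bonferroni correction, which divides the overall significance level by the number of tests. This is a deliberately conservative choice and an assumption of this example, since the studies discussed may have applied other corrections. The figure of 300 markers for a single large gene is hypothetical, and one million tests is the commonly assumed number behind the conventional genome-wide threshold of 5x10-8.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test P-value threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


scenarios = [
    ("single large gene, 300 markers (hypothetical)", 300),
    ("genome-wide scan, 1 000 000 tests (assumed)", 1_000_000),
]
for label, n_tests in scenarios:
    print(f"{label}: P < {bonferroni_threshold(n_tests):.2e}")
```

Because markers within a gene are often correlated through linkage disequilibrium, the effective number of independent tests is usually smaller than the marker count, so this threshold errs on the strict side.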
Shi et al further conducted a meta-analysis of the samples of European ancestry in their MGS sample as well as samples from ISC and SGENE (total sample: 8008 cases, 19 077 controls). Genome-wide significant association was observed with SNPs on chromosome 6p22.1 spanning 209 kb. The strongest association was with the SNPs present within a cluster of five histone genes (HIST1H2BJ, HIST1H2AG, HIST1H2BK, HIST1H4I, and HIST1H2AH). The surrounding region also includes genes related to immunity, chromatin modification, and G protein-coupled receptor signaling. To summarize across these large GWAS investigations, none of the groups found genome-wide significant results when they analyzed their samples individually. They found significant results only when they merged samples from several large studies and conducted a pooled analysis. The overall finding was the association of SNPs in the MHC region (6p22.1) with schizophrenia. These results provide strong evidence that common variants are associated with schizophrenia; however, the effect size of the risk variants is small (OR<1.2). Therefore, studies with small sample sizes are unlikely to detect these common variants, unless these SNPs have larger effect sizes in more refined phenotypes. In a recent GWAS on a Norwegian sample, Athanasiu et al also failed to find genome-wide significant results in their discovery sample (n=506), probably due to the small sample size. However, when they analyzed the top 1000 SNPs of their study (or the surrogates of these SNPs) in the SGENE-plus consortium sample, 16 loci showed marginal association (P<0.05). In the combined analysis they observed three markers to be significant at the genome-wide level.
These were rs7045881 in the phospholipase A2-activating protein gene (PLAA, 9p21, P=2.12x10-6, OR=0.86); rs433598 in acyl-CoA synthetase medium-chain family member 1 (ACSM1, 16p12.3, P=3.27x10-6, OR=1.13); and rs10761482 in the ankyrin 3, node of Ranvier (ankyrin G) gene (ANK3, 10q21, P=3.27x10-6, OR=0.86). The functions of these genes include inflammatory response and membrane integrity (PLAA), endocrine function and dyslipidemia (ACSM1), and involvement in activities such as cell motility, activation, proliferation, contact, and maintenance of specialized membrane domains (ANK3). Interestingly, ANK3 has also been associated with bipolar disorder in a recent meta-analysis. In addition to the above observations, Schulze et al also observed nominal association with bipolar disorder of genes that have been associated with schizophrenia in candidate gene as well as genome-wide association studies (DISC1, NRG1, RELN, and OPCML).
One of the major new observations from the GWAS studies is the fact that many of the positive loci for schizophrenia are also positive in bipolar, and vice-versa. This molecular-genetic overlap may have important implications for diagnostic classification of schizophrenia in the future DSM-5 and beyond. Also in the future, the field will see a transition from SNP methodology to DNA sequencing. Thus all of the DNA variation in a given gene will be detected. For example, the sequencing will detect small insertions and deletions, as well as repeat sequences, that largely would have been missed by the current SNP arrays. The downside of this large increase in the amount of information available will be even more multiple testing challenges, as well as considerable labor to establish the functional status of each variant of the gene in question.
| Study no | Study | Ethnicity | Discovery sample (cases/controls) | Replication sample (cases/controls) | No of SNPs tested | Significant SNPs (P-value) | Gene |
|---|---|---|---|---|---|---|---|
| 1 | Mah et al, 2006 | European ancestry | 320/325 | 200/230c | 25 494 | rs752016 (0.006) | PLXNA2 |
| 2 | Lencz et al, 2007 | European ancestry | 178/144 | 71/31a | 439 511 | rs4129148 (3.7x10-7) | CSF2RA |
| 3 | Sullivan et al, 2008 | European/African ancestry/others | 738/733 | -- | 492 900 | -- | -- |
| 4 | O'Donovan et al, 2008 | European | 479/2937 | 6666/9897b | 362 532 | rs1344706 (1.95x10-7) | ZNF804A |
| 5 | Shifman et al, 2008 | Ashkenazi Jews | 241/964d | 768/2194d | | rs7341475 (2.9x10-5)d | RELN |
| 6 | Kirov et al, 2009 | European ancestry | 574 trios/605 | -- | 433 680 | rs11064768 (1.2x10-6) | CCDC60 |
| 7 | Need et al, 2009 | European ancestry | 871/1460 | 863/12 995 | 312 565 | rs2135551 (1.34x10-6) | ADAMTSL3 |
| 8 | International Schizophrenia Consortium | European ancestry | 3322/3587 | 8008/19 077 (ISC, MGS & SGENE combined) | 739 995 | rs13194053 (9.5x10-9) | 6p22.1 |
| 9 | Stefansson et al, 2009 (SGENE-plus) | European ancestry | 2663/13 498 | 4999/15 555 | 314 868 | rs6913660 (1.1x10-9) | HIST1H2BJ |
| 10 | Shi et al, 2009 (MGS) | European and African ancestry | 3967/3626 | 8008/19 077 (ISC, MGS & SGENE combined) | 696 788 in EA; 843 798 in AA | 6p22.1 (1.06-4.35x10-8) | HIST1H2BJ, HIST1H2AG, HIST1H2BK, HIST1H4I, and HIST1H2AH |
| 11 | Athanasiu et al, 2010 | European ancestry | 201/305 | 2666/13 780 | 572 888 | rs7045881 (2.12x10-6) | PLAA |
Copy number variation and schizophrenia
Considering that two thirds of the cases of schizophrenia are sporadic, a role of rare variation in the development of schizophrenia is not unexpected. Rare variations can include mutations as well as deletions and duplications. Copy number variations (CNVs) are submicroscopic deletions or duplications stretching from a few kilobases to several megabases and covering several or many genes. One of the earliest well-supported deletions relating to schizophrenia is on 22q11. Velocardiofacial/DiGeorge syndrome (VCFS) is caused by a deletion of 1.5 to 3 Mb in this region on one chromosome (thus rendering the person hemizygous for the genes involved). Besides the features observed in patients with VCFS, the 22q11 deletion region is associated with increased risk of schizophrenia and other neuropsychiatric disorders.
Initial studies of CNVs in schizophrenia were carried out using bacterial artificial chromosome (BAC) array comparative genome hybridization (aCGH) with resolutions of 2.3 Mb, 1.4 Mb, 0.7 Mb, and 150 to 200 kb. The more recent studies have utilized SNP arrays for estimation of CNVs with a resolution of approximately 30 to 100 kb.
Using array CGH, several regions harboring deletions and duplications were identified in Korean schizophrenia patients. Two frequent CNVs were observed. The first was a gain in sequence that varies in length across individuals in the Xq23 region (in 52% of schizophrenia cases). The second was a loss at 3q13.12 (in 32% of schizophrenia cases). Deletions as well as duplications were also identified at 22q11.21. Wilson et al analysed 105 postmortem brain samples (n=35 each for schizophrenia, bipolar disorder, and controls) and observed CNVs at 1p34.3 (GLUR7), 5q21.3 (EFNA5), 14q23.3 (AKAP5), and 22q12.3 (including the CACNG2 gene) in cases but not in controls. Furthermore, three of the CNV loci, all except 5q21.3, were replicated in an independent sample of 60 psychiatric patients. Similarly, Mizuguchi et al observed CNVs in six (10%) of the 59 schizophrenia patients they analyzed. If this relatively high percentage of SCZ patients with one of a set of common deletions continues to be observed, then genetic screening for these deletions may become more useful, as discussed recently for 22q11.2 deletions.
The study by Kirov et al  detected 13 CNVs in 93 schizophrenia patients. These were not detected in 372 control individuals. Of these CNVs two were thought to have a possible role in schizophrenia pathogenesis. The first was a 1.4 Mb de novo duplication on chromosome 15q13.1 which includes the gene amyloid ß (A4) precursor protein binding, family A, member 2 (APBA2), and the second was a 0.25Mb deletion on 2p16.3 that includes the Neurexin gene (NRXN1). Gene-gene interaction between APBA2 and NRXN1 is known and both play a role in synaptic development and function. 
An excess of rare CNVs (both deletion and duplications) in schizophrenia and schizoaffective disorder patients (20%) versus controls (5%) was reported by Walsh et al.  Similar excess of rare CNVs was observed in young-onset cases (20%) as well as in a replication sample of childhood-onset schizophrenia. Schizophrenia patients were over three times more likely to carry deletions or duplications compared with controls. However, there was no difference in the distribution of common CNVs between cases and controls. The CNVs present in schizophrenia patients were disproportionately more likely to disrupt genes from signaling networks controlling neurodevelopment, including neuregulin and glutamate pathways. Interestingly, similarly to Kirov et al,  a deletion disrupting NRXN1 was found in identical twins concordant for childhood-onset schizophrenia. ERBB4, a type I transmembrane tyrosine kinase receptor for neuregulin (NRG1), was also found to be disrupted by a 399 kb deletion in one schizophrenia patient. Similarly, a 503 kb deletion that disrupted SLC1A3, a glutamate transporter was also observed. However, this excess of rare CNVs was not replicated in an independent sample of Han Chinese schizophrenia patients (155 cases and 187 controls). 
Xu et al analyzed the presence of de novo CNVs and observed that these are approximately 8 times more frequent in sporadic cases (10%, 15 of 152) than in controls (1.3%, 2 of 159). Furthermore, they did not observe any de novo CNVs in familial cases (n=48). Additionally, sporadic cases of schizophrenia were ~1.5 times more likely to inherit a rare CNV (~30%, 46/152) than unaffected controls (~20%, 32/159). Among the de novo CNVs observed in cases, deletions were seen at 22q11.2 (1.8%) along with deletions at 12p11.23 and 16p12.1-p12.2, which were also observed by Walsh et al.
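The fold differences quoted from Xu et al can be checked directly from the carrier counts. The following is a minimal sketch in Python rather than a reconstruction of the original analysis: the function names are hypothetical, and the use of a two-sided Fisher exact test is an assumption made here for illustration, since the statistical test used in that study is not described in this review.

```python
from math import comb


def fold_enrichment(case_hits, case_n, ctrl_hits, ctrl_n):
    """Ratio of carrier frequencies (cases divided by controls)."""
    return (case_hits / case_n) / (ctrl_hits / ctrl_n)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are cases and controls; columns are carriers and non-carriers.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers among the row1 cases
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Counts as quoted in the text (assumed to be carriers / sample size)
de_novo_ratio = fold_enrichment(15, 152, 2, 159)     # de novo CNVs
inherited_ratio = fold_enrichment(46, 152, 32, 159)  # inherited rare CNVs
p_de_novo = fisher_exact_two_sided(15, 152 - 15, 2, 159 - 2)

print(f"de novo fold enrichment:   {de_novo_ratio:.2f}")
print(f"inherited fold enrichment: {inherited_ratio:.2f}")
print(f"Fisher exact p (de novo):  {p_de_novo:.2g}")
```

With these counts the frequency ratios come out near 8 and 1.5, in line with the figures quoted above.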
The study by Stefansson et al tested an interesting hypothesis: that CNVs which arise in regions of the genome with a high mutation rate, yet remain rare, are likely to be selected against during evolution. These rare CNVs may be associated with disorders such as schizophrenia, autism, and mental retardation, which appear to reduce the fecundity of affected patients. Using SNP arrays to identify de novo rare CNVs in the healthy control population, they identified 66 de novo rare CNVs and tested them for association with schizophrenia in two independent case-control samples (Phase I: 1433 cases and 33 250 controls; Phase II: 3285 cases and 7951 controls). They observed three deletions, at 1q21.1, 15q11.2, and 15q13.3, to be nominally associated with schizophrenia and related psychosis in the Phase I sample. In the Phase II sample, all three deletions were significantly associated with schizophrenia and related psychosis. In the combined sample, the odds ratio (OR) ranged from 14.8 for 1q21.1 and 11.54 for 15q13.3 to 2.73 for 15q11.2. The 15q loci were not significant if patients with psychoses other than schizophrenia were excluded. The 1q21.1 deletion, which varies from 1.35 Mb to 2.19 Mb, was present in 0.23% of cases (11 of 4718) compared with 0.02% of controls (8 of 41 199). The 470 kb deletion on 15q11.2 occurred in 0.55% of cases (26 of 4718) versus 0.19% of controls (79 of 41 194). The 15q13.3 deletion spans 1.57 Mb and was present in 0.17% of cases (7 of 4213) versus 0.02% of controls (8 of 39 800). They also found the 22q11.2 deletion in 0.2% of the cases (8 of 3838). The association with 1q21.1 is interesting, as this region has previously been associated with schizophrenia. The 15q11.2 region has been associated with mental retardation and is also deleted in a minority of cases with Angelman syndrome or Prader-Willi syndrome. The 15q13.3 deletion region contains the gene for the alpha 7 subunit of the nicotinic receptor (CHRNA7). This receptor is targeted to axons by NRG1, and has been implicated in schizophrenia and mental retardation.
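To make the relation between the carrier counts and the odds ratios concrete, the sketch below computes a crude (unadjusted) odds ratio with a Woolf-type 95% confidence interval from the counts quoted for the Stefansson et al study. The helper name is hypothetical, and the crude estimates need not equal the published values (14.8, 11.54, and 2.73); we assume, without confirmation in this review, that the published figures come from an analysis that accounts for the different study sites.

```python
from math import exp, log, sqrt


def crude_odds_ratio(case_carriers, case_n, ctrl_carriers, ctrl_n, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-scale) confidence interval."""
    a, b = case_carriers, case_n - case_carriers   # cases: carriers, non-carriers
    c, d = ctrl_carriers, ctrl_n - ctrl_carriers   # controls: carriers, non-carriers
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# (case carriers, cases, control carriers, controls) as quoted in the text
counts = {
    "1q21.1": (11, 4718, 8, 41199),
    "15q11.2": (26, 4718, 79, 41194),
    "15q13.3": (7, 4213, 8, 39800),
}

results = {locus: crude_odds_ratio(*c) for locus, c in counts.items()}
for locus, (odds_ratio, lower, upper) in results.items():
    print(f"{locus}: crude OR = {odds_ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

The wide intervals obtained for the two rarest deletions illustrate why very large control samples were needed to establish these associations.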
In the study from the International Schizophrenia Consortium (ISC), a genome-wide survey of 3391 patients with schizophrenia and 3181 ancestrally matched controls using SNP arrays revealed 6753 rare CNVs. These rare CNVs, observed in less than 1% of the sample and more than 100 kb in length, were 1.15-fold more frequent in patients with schizophrenia than in controls. The investigators also observed deletions on 1q21.1 (0.29%, OR=6.6), 2p16.3 (0.12%, NRXN1), 7q35 (0.09%, CNTNAP2), 12p11.23 (0.12%), 15q13.1 (0.03%, APBA2), 15q13.3 (0.27%, CHRNA7, OR=17.9), 16p12.2-12.1 (0.12%), and 22q11.2 (VCFS region, 0.38%, OR=21.6).
Further support for the role of rare CNVs in schizophrenia came from a recent study that analysed 471 patients with schizophrenia or schizoaffective disorder and 2792 controls. Kirov et al observed an excess of rare CNVs larger than 1 Mb in cases compared with controls (OR=2.26, P=0.00027). The associations were stronger with deletions (OR=4.53, P=0.00013) than with duplications (OR=1.71, P=0.04). Similar to the above-mentioned studies, these investigators also observed a deletion at 22q11.2 in two schizophrenia patients. A deletion at 17p12 was also observed in two patients but not in controls. Unlike the ISC study, Kirov et al did not observe an overall excess burden of rare CNVs in their investigation. However, rare CNVs >500 kb were enriched in schizophrenia cases (OR=2.18). On combining their results with those of the ISC and SGENE studies, Kirov et al observed that the 17p12 deletion was more common in cases than in controls (OR~10, 0.15% vs 0.015%) and also observed the deletion at 15q11.2 (OR=2.8, 0.62% vs 0.22%). A duplication observed at 16p13.1, which includes the DISC1-interacting gene NDE1, was more common in cases than in controls in the Kirov study as well as in the ISC study.
Need et al, in their GWAS investigation, analysed a subset of patients for the presence of copy number variations and identified large deletions (>2 Mb) in eight cases but not in controls. Among these deletions, one was at 22q11.2, one at 16p13.11-p12.4 (which includes the gene NDE1, a binding partner for DISC1), one at 8p22, and two at 1q21.1 (these are the same as those reported by Stone et al and by Stefansson et al, ref 107). Overall, like Kirov et al, they did not observe an excess of rare CNVs (>100 kb) in schizophrenia cases. Unlike Walsh et al, they did not observe an excess of rare CNVs disrupting genes from the neurodevelopmental pathways. An important difference may have been that the primary investigation samples in the study by Walsh et al were childhood-onset schizophrenia patients.
Similar observations have also been reported in Japanese schizophrenia patients. Ikeda et al did not observe an excess of large rare CNVs (>500 kb) in schizophrenia patients, but observed an overall nonsignificant trend for an excess of rare CNVs in schizophrenia (P=0.087). Similar to previous studies, they identified one deletion at 1q21.1 in a case, a deletion within NRXN1, and four duplications at 16p13.1 in cases and one in a control subject. However, no deletions were observed at the 22q11.2 or 15q13.3 loci. In a reverse trend for the 15q11.2 locus, three deletions were observed in controls compared with one in cases.
To summarize the above studies: until recently, rare CNVs were thought to play an important role only in disorders such as mental retardation and autism. However, it now appears that CNVs also make a substantial contribution to schizophrenia etiology and pathogenesis. Deletions at 1q21.1, 15q11.2, and 15q13.3 might join the ranks of 22q11.2 as uncommon but important chromosomal aberrations that can lead to severe behavioral disturbances including schizophrenia.
What next? Conclusions and future directions
Despite decades of research effort, the genesis of schizophrenia remains an enigma. The methods used for mapping susceptibility genes have progressed enormously over the past several years. Genome-wide studies have pointed to the role of both common and rare variants in schizophrenia susceptibility. However, the effect size associated with common variants is smaller than initially estimated (OR<1.2), and generally only rare variants have a large effect. Furthermore, the total number of susceptibility variants for schizophrenia may be on the order of thousands. Considering the low effect sizes observed for the associated SNPs, the sample size required for replicating these associations with adequate power would theoretically be up to 100 000 each of cases and controls. To achieve such sample sizes with detailed and consistent phenotype measurement is a formidable challenge. It may be that testing broader phenotypes such as psychosis would help the field to collect these large numbers as well as to detect genes that overlap between different disorders. However, the opposite approach may also be valid, that is, to narrow the phenotype to a hopefully more homogeneous subgroup, for example by using brain imaging measures, electrophysiology, or carefully defined symptom subtypes. A smaller number of genes of greater effect size may influence more refined, specific phenotypes.
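The dependence of the required sample size on effect size can be illustrated with a standard normal-approximation formula for comparing allele frequencies between cases and controls. The sketch below is illustrative only: the function name, the genome-wide significance threshold of 5e-8, the 80% power target, the example allele frequencies, and the assumption of equal numbers of cases and controls are all choices made here, not values taken from the studies reviewed, and the sketch is not intended to reproduce the figure of 100 000 quoted above.

```python
from math import ceil, sqrt
from statistics import NormalDist


def cases_needed(odds_ratio, control_allele_freq, alpha=5e-8, power=0.80):
    """Approximate number of cases (equal to the number of controls) needed
    to detect an allelic odds ratio with a two-sided two-proportion z-test."""
    p0 = control_allele_freq
    odds_cases = odds_ratio * p0 / (1 - p0)
    p1 = odds_cases / (1 + odds_cases)      # risk allele frequency in cases
    p_bar = (p0 + p1) / 2

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)

    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p0 * (1 - p0) + p1 * (1 - p1))) ** 2
    alleles_per_group = numerator / (p1 - p0) ** 2
    return ceil(alleles_per_group / 2)      # each person contributes two alleles


# Hypothetical scenarios: (odds ratio, risk allele frequency in controls)
for odds_ratio, freq in [(1.2, 0.20), (1.1, 0.20), (1.05, 0.20), (1.1, 0.05)]:
    n = cases_needed(odds_ratio, freq)
    print(f"OR={odds_ratio}, allele frequency={freq}: about {n} cases and {n} controls")
```

Under these assumptions the required sample grows steeply as the odds ratio approaches 1 or the risk allele becomes rarer, which is the quantitative reason behind the call for very large collaborative samples.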
An interesting outcome of the GWAS data analyses is that there appears to be considerable overlap between schizophrenia and bipolar disorder, consistent with the idea that they exist on a clinical continuum with overlapping symptom dimensions. A recent study in two million Swedish families also observed that schizophrenia and bipolar disorder share susceptibility genes. Furthermore, the GWAS by O'Donovan et al found that inclusion of patients with bipolar disorder along with schizophrenia further strengthened the association with the SNP rs1344706 in the ZNF804A gene. The demonstration of a functional effect that alters connectivity of brain structures associated with both schizophrenia and bipolar disorder lends further support to the overlap of the two diseases. Similarly, genes such as DISC1, NRG1, and ANK3 are associated with both schizophrenia and bipolar disorder. A similar overlap may also exist with autism spectrum disorders, in that deletions in the neurexin 1 gene have been associated with both autism and schizophrenia. One implication of this high degree of overlap is that combining phenotypes to build very large sample sizes may be a useful strategy for finding small-effect genes. These small-effect genes might then be assembled into neurobiological systems that would explain a significant proportion of the pathological mechanisms of schizophrenia as well as other behavioral disorders.
Another complementary strategy for detecting disease-associated genetic variants will be the use of endophenotypes. These can be defined as disease-associated phenotypes that are heritable, state-independent, cosegregate with illness in families, and are also found in unaffected family members. Generally, the terms “alternate phenotype” and “intermediate phenotype” refer to a phenotype that does not meet all of the criteria for an endophenotype, but represents a different and usually more objective measure of part of the schizophrenia phenotype. One of the historically most studied endophenotypes in schizophrenia is abnormal movement of the eyes while tracking a moving object across a screen. Other endophenotypes include impairments in attention, language, and memory (neurocognitive deficits), deficits in sensory gating of auditory information (prepulse inhibition), the P50 event-related potential, the P300 event-related potential, and structural imaging phenotypes (for a detailed review see ref 122). Interestingly, these endophenotypes are generally applicable to both schizophrenia and bipolar disorder, and several genes have been reported to influence them. For example, the catechol-O-methyltransferase gene (COMT) and Reelin are associated with neurocognitive deficits, and variants in the alpha 7 nicotinic receptor subunit gene are associated with P50 deficits. Overall, there is hope that the use of endophenotypes will improve our understanding of the biology of the disease as well as create phenotypically more homogeneous groups of patients, which can reduce the number of samples required for detecting genetic signals.
The effect of environmental factors, including maternal infection (serological evidence of influenza infection during pregnancy) and recreational drug use/abuse, should also be taken into account when conducting association studies in schizophrenia. Cannabis usage is an important risk factor aggravating psychosis, and preonset cannabis use hastens the onset of prodromal symptoms as well as fully developed psychosis. These and likely other environmental factors need to be considered in the design of future genetic studies. Furthermore, epigenetic changes, including both DNA methylation and histone acetylation, have also been suggested to play an important role in the development of major psychoses. These epigenetic mechanisms can be influenced by environmental effects such as stress (cortisol) and hormonal factors. Thus we need a comprehensive systems biology approach to incorporate the effect of genetic and nongenetic factors to understand the genesis of schizophrenia and related disorders.
A possible approach that may take into account most of these variables is high-throughput whole-genome sequencing. Whole-genome sequencing has the potential to detect virtually all SNPs, CNVs (both large CNVs and relatively small deletions <1 kb), and epigenetic modifications (DNA methylation). Small deletions and duplications are relatively common in the human genome and have been shown to affect levels of several brain-expressed genes (eg, DAT1, 5-HTTLPR). High-throughput sequencing is at present rather expensive, although its cost is comparable to the price of first-generation SNP chips. Further technological advances and reductions in the cost of high-throughput sequencing will make the discovery of each and every variant in a given person's genome feasible. While this more extensive and detailed characterization of each subject's DNA will provide comprehensive information at the DNA level, the data analysis, multiple testing, and other bioinformatic challenges are also greatly increased.
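The multiple-testing problem mentioned above can be made concrete with the simplest correction, the Bonferroni method, in which the family-wise error rate is divided by the number of tests. The sketch below is illustrative: the function names are hypothetical and the numbers of tests are assumed round figures, not counts from any study reviewed here. Bonferroni correction is also conservative when variants are correlated, so it serves only as a rough guide.

```python
def bonferroni_threshold(family_wise_alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return family_wise_alpha / n_tests


def expected_false_positives(per_test_alpha, n_tests):
    """Expected number of false positives if every null hypothesis is true."""
    return per_test_alpha * n_tests


# Assumed round numbers of variants tested: SNP-array scale up to
# sequencing scale (illustrative only)
for n_tests in (500_000, 1_000_000, 10_000_000):
    threshold = bonferroni_threshold(0.05, n_tests)
    uncorrected = expected_false_positives(0.05, n_tests)
    print(f"{n_tests:>10} tests: per-test threshold {threshold:.1e}, "
          f"about {uncorrected:.0f} false positives expected at P<0.05 uncorrected")
```

As the number of variants tested grows from array scale to sequencing scale, the per-test threshold becomes correspondingly stricter, which in turn raises the sample sizes needed to detect small effects.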
At this juncture in the research effort, investigations of schizophrenia have demonstrated that genetic factors indeed have an important role to play in its genesis. However, to progress further we need better phenotypic classification of our patients. A major area of development in this regard is the steady advance of neuroimaging technology. We can now obtain phenotypes such as the volume of the dorsolateral prefrontal cortex, or the connectivity between different brain regions. More accurate study of the target organ in schizophrenia research, that is, the brain, should lead to more objective and reliable associations with genetic variants. Also, methods that capture the complete variability of the genome (as well as the epigenome) will add to the comprehensiveness of the measurement of DNA-based information. Furthermore, a concerted effort is needed to understand the biology (the annotation) of the disease-associated genes. These genes will help us in identifying novel targets for drug development and thereby improve the efficacy of treatment.
The discovery of genetic factors also leads to the consideration of whether or not it is useful to perform genetic testing in a clinical setting. In fact, direct-to-consumer genetic testing services for schizophrenia are already available on the internet. Most regulatory authorities (for example, see the International Society for Psychiatric Genetics Web site: www.ispg.org) agree that genetic testing for schizophrenia is premature at this stage, given the small odds ratios and inconsistent replications reviewed above. Until some as-yet unknown approach is found to combine large numbers of small-effect genes into meaningful prediction algorithms, we are likely facing many more years of research before acceptable testing for risk is possible. More hopefully, the uncommon but large-effect CNVs on 22q, 15q, 16p, and 1q that appear to be replicating fairly well in large datasets may lead to tests that could identify molecular subtypes for at least a small percentage of schizophrenia cases. These molecular subtypes may in turn lead to a better understanding of the pathogenesis of schizophrenia, and treatment might become more specific and individualized for these cases. Through continued, careful investigation in the genetics of schizophrenia that embraces the nuances of phenotype and the latest technological developments, the field has an excellent chance to solve the etiologic puzzle of this enigmatic disorder.