Cereal genes are classified into two distinct classes according to the guanine-cytosine (GC) content at the third codon sites (GC3). Natural selection and mutation bias have been proposed to affect the GC content. However, there has been controversy about the cause of GC variation. Here, we characterized the GC content of 1 092 paralogs and other single-copy genes in the duplicated chromosomal regions of the rice genome (ssp. indica) and classified the paralogs into GC3-rich and GC3-poor groups. By referring to out-group sequences from Arabidopsis and maize, we confirmed that the average synonymous substitution rate of the GC3-rich genes is significantly lower than that of the GC3-poor genes. Furthermore, we explored the other possible factors corresponding to the GC variation including the length of coding sequences, the number of exons in each gene, the number of genes in each family, the location of genes on chromosomes and the protein functions. Consequently, we propose that natural selection rather than mutation bias was the primary cause of the GC variation.
Xiaoli ShiXiyin WangZhe LiQihui ZhuJi YangSong GeJingchu Luo