Database for Single Exon Coding Sequences in Mammalian Genomes

SinEx DB provides information on the occurrence, properties and genomic distribution of 31,624 single exon genes (SEGs) out of a total of 248,152 annotated coding sequences (CDS) from ten completely sequenced mammalian genomes.

Occurrence of SEGs in mammals:

On average, SEGs account for 12.9% of the protein-coding genes in the ten mammalian genomes (SD = ± 3.05), with no statistically significant difference between genomes. The occurrence of SEGs ranges from 8.9% in human (Homo sapiens) to 17.3% in rat (Rattus norvegicus) (Fig. 1).

Figure 1: Percentage of SEGs out of the total annotated CDS in mammalian genomes.
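The percentages above can be reproduced with simple arithmetic. The sketch below computes the pooled SEG percentage from the two totals given on this page, then shows how an unweighted mean and standard deviation of per-genome percentages would be obtained. The per-genome counts are hypothetical placeholders, not SinEx DB data, and the use of the sample standard deviation is an assumption, since the page does not say which estimator was used.

```python
from statistics import mean, stdev

# Totals reported on this page.
TOTAL_SEGS = 31_624
TOTAL_CDS = 248_152


def seg_percentage(seg_count, cds_count):
    """Return SEGs as a percentage of all annotated CDS."""
    return 100.0 * seg_count / cds_count


# Pooled percentage over all ten genomes combined.
pooled = seg_percentage(TOTAL_SEGS, TOTAL_CDS)
print(f"Pooled SEG percentage: {pooled:.2f}%")

# HYPOTHETICAL per-genome (SEG count, CDS count) pairs, for illustration only.
# Replace with real counts taken from the database downloads.
hypothetical_counts = {
    "genome_a": (2_000, 22_000),
    "genome_b": (4_000, 24_000),
    "genome_c": (3_000, 25_000),
}

percentages = [seg_percentage(s, c) for s, c in hypothetical_counts.values()]

# Unweighted mean: every genome counts equally, whatever its size.
unweighted_mean = mean(percentages)
# Sample standard deviation (n - 1 denominator); an assumption here.
sd = stdev(percentages)
print(f"Mean of per-genome percentages: {unweighted_mean:.1f}% (SD = {sd:.2f})")
```

The pooled value (about 12.7%) need not equal the 12.9% average quoted above: a mean of per-genome percentages weights each genome equally, while the pooled figure weights genomes by their number of annotated CDS.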

Comparative analysis of SEG functions in mammals:

The distribution of SEGs across functional categories is non-uniform in the different genomes, with statistical support (p < 0.05) in some instances. For example, SEGs are enriched relative to multi exon genes (MEGs) in functions related to: i) chromatin structure and dynamics, including histones; ii) signal transduction mechanisms, including G protein-coupled cell surface receptors (GPCRs); and iii) translation, such as ribosomal proteins (Fig. 2).

Figure 2: SEG/MEG proportion in different KOG functional categories for mammals, represented as a combined z-score from multiple tests.
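The page does not state which test or combination rule lies behind the combined z-score in Fig. 2. As one plausible illustration only, the sketch below compares the SEG and MEG proportions in a functional category with a two-proportion z-test per genome, then combines the per-genome z-scores with Stouffer's unweighted method. The function names and all counts are hypothetical, and the choice of both methods is an assumption rather than the database's documented procedure.

```python
import math


def two_proportion_z(seg_in_cat, seg_total, meg_in_cat, meg_total):
    """z-score for the difference between the SEG and MEG proportions
    in one functional category (pooled-variance two-proportion test).
    Positive values mean the category is enriched in SEGs."""
    p_seg = seg_in_cat / seg_total
    p_meg = meg_in_cat / meg_total
    p_pool = (seg_in_cat + meg_in_cat) / (seg_total + meg_total)
    se = math.sqrt(p_pool * (1.0 - p_pool) * (1.0 / seg_total + 1.0 / meg_total))
    return (p_seg - p_meg) / se


def stouffer_combined_z(z_scores):
    """Combine independent z-scores with Stouffer's unweighted method."""
    return sum(z_scores) / math.sqrt(len(z_scores))


# HYPOTHETICAL counts for one category in three genomes, for illustration:
# (SEGs in category, all SEGs, MEGs in category, all MEGs)
hypothetical_genomes = [
    (100, 1_000, 50, 1_000),
    (90, 1_200, 60, 1_100),
    (80, 900, 55, 1_000),
]

per_genome_z = [two_proportion_z(*counts) for counts in hypothetical_genomes]
combined_z = stouffer_combined_z(per_genome_z)
print(f"Per-genome z-scores: {[round(z, 2) for z in per_genome_z]}")
print(f"Combined z-score: {combined_z:.2f}")
```

Under these assumptions, a combined z-score above roughly 1.96 would correspond to a two-sided p < 0.05 for SEG enrichment in that category.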