Stable IDs

Stable identifiers are labels that databases such as Ensembl assign to the features they hold, such as genes, transcripts, exons or proteins. The identifiers aim to be unambiguous and consistent across Ensembl releases. Unlike gene names, which can change as scientific knowledge improves, a stable identifier should continue to refer to the same genomic feature. We recommend citing the stable identifiers of your features of interest in your publications, so that your readers can be sure exactly which features you are referring to, especially as different databases can hold slightly different data.

When a species is annotated for the first time, these identifiers are auto-generated. When a species is subsequently re-annotated, the identifiers are mapped between the old and new annotations using a combination of location-based mappings and mappings generated by Exonerate, so that equivalent genes for a species can be linked between releases. The process first performs exon-based mapping and then derives the subsequent identifier mappings from those results. Further details about our methods are available in Ruffier M. et al. 2016.

Format

Stable IDs are created in the form ENS[species prefix][feature type prefix][a unique eleven digit number]. For example, a mouse gene might be ENSMUSG###########. This means that we can immediately tell from a stable ID what kind of feature it refers to and which species it belongs to. The prefixes are listed on our prefix page.
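As an illustration of the format, the following Python sketch splits a stable ID into its three parts. It is not part of any Ensembl tool: the names `parse_stable_id` and `StableId` are hypothetical, and the sketch assumes single-letter feature type prefixes (G, T, E and P for genes, transcripts, exons and proteins) and a species prefix made of zero or more capital letters (human IDs, for instance, have no species prefix). Other feature types, such as gene trees, are not handled; the prefix page remains the authoritative list.

```python
import re
from typing import NamedTuple, Optional

# Assumed mapping for the four feature types named in this article.
FEATURE_TYPES = {"G": "gene", "T": "transcript", "E": "exon", "P": "protein"}

# ENS + optional species prefix + one feature type letter + eleven digits.
STABLE_ID_RE = re.compile(
    r"^ENS(?P<species>[A-Z]*)(?P<feature>[GTEP])(?P<number>\d{11})$"
)


class StableId(NamedTuple):
    species_prefix: str  # e.g. "MUS" for mouse, "" when there is no prefix
    feature_type: str    # "gene", "transcript", "exon" or "protein"
    number: str          # the eleven digit number, kept as text


def parse_stable_id(stable_id: str) -> Optional[StableId]:
    """Return the parts of a stable ID, or None if it does not fit the pattern."""
    match = STABLE_ID_RE.match(stable_id)
    if match is None:
        return None
    return StableId(
        species_prefix=match["species"],
        feature_type=FEATURE_TYPES[match["feature"]],
        number=match["number"],
    )
```

For example, `parse_stable_id("ENSMUSG00000017167")` yields the species prefix `MUS` and the feature type `gene`, while a gene name such as `BRCA2` returns `None`.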

Versioning

When reassigning stable identifiers between re-annotations, we can optionally increment the version number assigned to a stable identifier. We do so to indicate an underlying change in the entity. The rules for incrementing the version depend on the entity in question and are detailed below:

  • Genes: increments when the set of transcripts linked to a gene changes
  • Transcripts: increments when there is a change in a transcript's splicing pattern, chromosome location or a sequence change in the cDNA
  • Proteins: increments when the peptide sequence changes
  • Exons: increments when the genomic sequence of the exon changes

Stable identifier versions do not increment when there are changes to the annotation linked to an entity. Gene tree increments follow a more complex system, described here.
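The versioning rules in this section can be sketched in Python as follows. This is a simplified illustration, not the code used by the mapping pipeline: the `Gene` and `Transcript` classes, their fields and the function names are all hypothetical, the splicing pattern is modelled as exon offsets relative to the transcript start, and the stable IDs in the usage lines are made-up placeholders. Gene trees are not covered.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Gene:
    transcript_ids: FrozenSet[str]  # stable IDs of the linked transcripts
    description: str = ""           # linked annotation, ignored for versioning


@dataclass(frozen=True)
class Transcript:
    chromosome: str
    start: int
    exon_offsets: Tuple[Tuple[int, int], ...]  # splicing pattern (assumed model)
    cdna: str


def gene_changed(old: Gene, new: Gene) -> bool:
    # Genes: only the set of linked transcripts matters.
    return old.transcript_ids != new.transcript_ids


def transcript_changed(old: Transcript, new: Transcript) -> bool:
    # Transcripts: splicing pattern, chromosome location or cDNA sequence.
    return (
        old.exon_offsets != new.exon_offsets
        or (old.chromosome, old.start) != (new.chromosome, new.start)
        or old.cdna != new.cdna
    )


def sequence_changed(old_sequence: str, new_sequence: str) -> bool:
    # Proteins (peptide sequence) and exons (genomic sequence).
    return old_sequence != new_sequence


def next_version(version: int, changed: bool) -> int:
    # The version increments only when the underlying entity has changed.
    return version + 1 if changed else version


# Placeholder IDs, for illustration only.
old = Gene(frozenset({"ENSMUST00000000001"}), description="old description")
renamed = Gene(frozenset({"ENSMUST00000000001"}), description="new description")
extended = Gene(frozenset({"ENSMUST00000000001", "ENSMUST00000000002"}))

print(next_version(1, gene_changed(old, renamed)))   # prints 1
print(next_version(1, gene_changed(old, extended)))  # prints 2
```

In this sketch a change to the description alone leaves the gene version untouched, which mirrors the rule that changes to linked annotation do not increment the version.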