Gene Naming

Human genes

Most human protein-coding genes have an associated HGNC symbol from the HUGO Gene Nomenclature Committee. ncRNA genes are given names from miRBase and Rfam. 'Clone-based' identifiers apply to transcripts that cannot be associated with an HGNC symbol.

Human gene names fall into the following categories:

  • HGNC automatic
  • HGNC curated
  • Ensembl stable ID

On occasion, the HUGO Gene Nomenclature Committee (HGNC) reviews the approved names of a number of genes. This review process aims to assign gene names that describe gene function more accurately. Please see the HGNC site for details of which genes are most likely to change. Previous symbols are maintained as 'synonyms'. However, we recommend using the HGNC ID to ensure stability in your pipelines and analyses.
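The recommendation to use HGNC IDs can be illustrated with a minimal Python sketch. It assumes you already hold a table of HGNC records with current and previous symbols. The record layout, the function names and every ID and symbol below are hypothetical placeholders, not real HGNC data or an HGNC API.

```python
# Hypothetical records: these IDs and symbols are made up for illustration.
HGNC_RECORDS = [
    {"hgnc_id": "HGNC:90001", "symbol": "NEWSYM1", "previous": ["OLDSYM1"]},
    {"hgnc_id": "HGNC:90002", "symbol": "GENEB", "previous": []},
]


def build_symbol_index(records):
    """Map every current and previous symbol to its HGNC ID."""
    index = {}
    # Register previous symbols first ...
    for rec in records:
        for old_symbol in rec["previous"]:
            index[old_symbol] = rec["hgnc_id"]
    # ... then current symbols, so an approved symbol always wins a clash.
    for rec in records:
        index[rec["symbol"]] = rec["hgnc_id"]
    return index


def to_hgnc_id(symbol, index):
    """Return the HGNC ID for a symbol, or None if the symbol is unknown."""
    return index.get(symbol)


SYMBOL_INDEX = build_symbol_index(HGNC_RECORDS)

# Results keyed by HGNC ID stay valid even if the symbol is later renamed.
expression_by_symbol = {"OLDSYM1": 12.5, "GENEB": 3.0}
expression_by_id = {
    to_hgnc_id(symbol, SYMBOL_INDEX): value
    for symbol, value in expression_by_symbol.items()
}
```

In this sketch a lookup by the old symbol and by the new symbol both resolve to the same ID, so data stored before and after a rename can be joined without manual curation.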

Other species

Mouse genes are named from MGI, rat genes from RGD and zebrafish genes from ZFIN. Other species have gene names imported from UniProt and NCBI Gene or, if these are not available, from the human, mouse, rat or zebrafish orthologue.
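The order of naming sources described above can be sketched as a simple fallback lookup. This is only an illustration of the rule as stated here, not the Ensembl pipeline itself. The function name and input dictionaries are hypothetical, and the relative priority of UniProt over NCBI Gene, and of one orthologue species over another, is an assumption taken from the order in which they are listed.

```python
# Species with a dedicated naming authority, as described in the text.
SPECIES_AUTHORITY = {"mouse": "MGI", "rat": "RGD", "zebrafish": "ZFIN"}

# Assumption: sources are tried in the order the text lists them.
GENERAL_SOURCES = ("UniProt", "NCBI Gene")
ORTHOLOGUE_SPECIES = ("human", "mouse", "rat", "zebrafish")


def pick_gene_name(species, names_by_source, orthologue_names=None):
    """Return (name, origin) for a gene, or None if no name is available.

    names_by_source:  e.g. {"MGI": "Brca2"} or {"NCBI Gene": "BRCA2"}
    orthologue_names: e.g. {"human": "BRCA2"}, keyed by orthologue species
    """
    orthologue_names = orthologue_names or {}

    authority = SPECIES_AUTHORITY.get(species)
    if authority is not None:
        name = names_by_source.get(authority)
        return (name, authority) if name else None

    # Other species: imported names first ...
    for source in GENERAL_SOURCES:
        name = names_by_source.get(source)
        if name:
            return name, source

    # ... then fall back to the name of an orthologue.
    for ortho_species in ORTHOLOGUE_SPECIES:
        name = orthologue_names.get(ortho_species)
        if name:
            return name, f"{ortho_species} orthologue"

    return None
```

For example, a gene in a species without its own authority and without an imported name would take the name of its human orthologue if one exists, and otherwise that of the next listed species.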

Transcript numbering

Each transcript is named using the gene name followed by a number. This name does not replace the Ensembl gene ID, ENSG..., which remains stable from release to release.
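If you need to recover the gene name from a transcript name, a small helper such as the following may be useful. It is a sketch that assumes the gene name and the number are joined by a hyphen, as in BRCA2-201. The function name is hypothetical. Because gene names can themselves contain hyphens, the split is made at the last hyphen only.

```python
import re

# Assumption: transcript names look like "<gene name>-<number>".
_TRANSCRIPT_NAME = re.compile(r"(.+)-([0-9]+)")


def split_transcript_name(transcript_name):
    """Split a transcript name into (gene_name, number).

    The greedy first group means the split happens at the last hyphen,
    so hyphenated gene names are kept whole.
    """
    match = _TRANSCRIPT_NAME.fullmatch(transcript_name)
    if match is None:
        raise ValueError(f"not a gene-name-plus-number name: {transcript_name!r}")
    gene_name, number = match.groups()
    return gene_name, int(number)
```

Treat the result as a display label only. Gene names can change between releases, so use the stable Ensembl identifiers when storing or joining data.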