Standardisation of gene model nomenclature for reference Brassica genomes
Following discussions within the MBGP steering group and the publication of reference genomes in 2014, the following standard has been adopted for naming gene models assigned to pseudochromosome sequences.
Formally:
The genus/species and genome designations follow the convention of Ostergaard & King, with chromosome numbers written with a leading zero (thus BnaC01 for chromosome C01 in B. napus). Gene models are numbered in steps of ten (e.g. 10, 20, 30), written as 5-digit integers padded with leading zeros, from top to bottom of the correctly orientated pseudochromosome sequence. The gaps allow additional or alternative gene models to be inserted later. A version number, 1 by default, is appended after a "." (e.g. to distinguish different splicing models). Following this, in order to distinguish between reference sequences from different plant genotypes (e.g. TO1000 and O212 for B. oleracea), a single capital letter is allocated (e.g. "T" or "O"). Thus:
<GENUS 1 LETTER>[<species 2 letters>]<GENOME 1 LETTER>|<X>.<Chromosome number (leading zero)>g<5 digit gene model number>.<version number><1 LETTER designating genotype/line/cultivar>
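As an illustration, the following minimal Python sketch assembles a full-form name from its parts, following the prose description above (numbering in steps of ten, 5-digit zero padding, version number after a ".", then the genotype letter). The function name, its parameters and the example values are hypothetical and are not part of the standard; the "<X>" alternative in the template is not handled.

```python
def full_gene_model_name(genus, species, genome, chromosome, ordinal,
                         genotype, version=1, step=10):
    """Assemble a full-form gene model name (illustrative helper only).

    ordinal is the position of the gene model along the pseudochromosome
    (1 for the first model); it is multiplied by `step` so that numbers
    run 10, 20, 30, ... and leave room for later insertions.
    """
    number = ordinal * step
    if not 0 < number <= 99999:
        raise ValueError("gene model number does not fit in 5 digits")
    # Chromosome is padded to 2 digits, gene model number to 5 digits.
    return (f"{genus}{species}{genome}{chromosome:02d}"
            f"g{number:05d}.{version}{genotype}")


# Hypothetical example: first gene model on chromosome C01 of B. napus,
# default version 1, genotype letter "D".
print(full_gene_model_name("B", "na", "C", 1, 1, "D"))
```

Passing the ordinal position rather than the final number keeps the step-of-ten rule in one place; an inserted gene model would need its number supplied directly instead.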
Shortened forms
Shortened forms of the above are also used, in which the version number and its preceding "." delimiter are omitted and the genotype letter is appended directly to the gene model number. Thus:
from Genoscope annotation: BnaC01g010030D
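A name in either the full or the shortened form can be split back into its fields with a regular expression. The sketch below is one possible approach, not an official parser: the pattern, the function name and the field names are assumptions made for illustration. It accepts any run of digits for the gene model number rather than enforcing a fixed width, it reports a missing version as None instead of assuming a value, and it does not handle the "<X>" alternative in the template.

```python
import re

# Hypothetical pattern: genus letter, 2-letter species, genome letter,
# 2-digit chromosome, "g", gene model number, optional ".version",
# and a final capital letter for the genotype.
GENE_MODEL_NAME = re.compile(
    r"^(?P<genus>[A-Z])(?P<species>[a-z]{2})(?P<genome>[A-Z])"
    r"(?P<chromosome>\d{2})g(?P<number>\d+)"
    r"(?:\.(?P<version>\d+))?(?P<genotype>[A-Z])$"
)


def parse_gene_model_name(name):
    """Return the fields of a gene model name as a dict of strings."""
    match = GENE_MODEL_NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognised gene model name: {name!r}")
    return match.groupdict()


fields = parse_gene_model_name("BnaC01g010030D")
print(fields["genome"], fields["chromosome"], fields["number"],
      fields["version"], fields["genotype"])
```

Fields are kept as strings so that leading zeros survive a round trip.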