Gene Regulation

TRANSFAC® Release 7.0 - Documentation

Factor: Explanations

In the present release, no identifiers have yet been assigned to the FACTOR entries; instead, the accession numbers are repeated.

The field Synonyms covers spelling variants (AP-1/AP1) as well as genuine alternative names (HNF-1: HNF-1alpha, APF, LF-B1). In contrast, the field Suggested Homologs indicates the names of other proteins, frequently from evolutionarily more distant species, which may be functionally and/or structurally related to the factor under consideration.

The CL line indicates the major class of DNA-binding domains to which a factor may be assigned. It also contains a systematic decimal classification number referring to the proposed transcription factor classification scheme. Note that this assignment is tentative and may change as insight into the structure-function relationships of this large protein category grows.

The Size field shows the number of amino acid residues of a polypeptide and its molecular weight. The method by which each figure was obtained is indicated in brackets: (cDNA) or (gene) means that it was calculated after cloning, whereas (SDS) or (sedim.) refers to the corresponding experimental approach.
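As a rough illustration of how a Size value could be read programmatically, the following Python sketch splits a value into residue count, molecular weight and method. The textual layout "437 AA; 47.3 kDa (cDNA)" and the names SizeEntry, parse_size and METHOD_KIND are assumptions made for this example; they are not taken from the release itself, and real entries may be formatted differently.

```python
import re
from typing import NamedTuple, Optional


class SizeEntry(NamedTuple):
    residues: Optional[int]    # number of amino acid residues
    mass_kda: Optional[float]  # molecular weight in kDa
    method: Optional[str]      # bracketed method tag, e.g. "cDNA" or "SDS"


# Method tags named in the documentation, mapped to how the figure was obtained.
METHOD_KIND = {
    "cDNA": "calculated after cloning",
    "gene": "calculated after cloning",
    "SDS": "experimental",
    "sedim.": "experimental",
}


def parse_size(text: str) -> SizeEntry:
    """Parse a Size value, ASSUMING a layout like '437 AA; 47.3 kDa (cDNA)'."""
    aa = re.search(r"(\d+)\s*AA", text)
    kda = re.search(r"(\d+(?:\.\d+)?)\s*kDa", text)
    method = re.search(r"\(([^)]+)\)", text)
    return SizeEntry(
        residues=int(aa.group(1)) if aa else None,
        mass_kda=float(kda.group(1)) if kda else None,
        method=method.group(1) if method else None,
    )


if __name__ == "__main__":
    entry = parse_size("437 AA; 47.3 kDa (cDNA)")  # made-up example value
    print(entry, "->", METHOD_KIND.get(entry.method, "unknown"))
```

Each component is searched for independently, so a value that gives only a molecular weight, or only a residue count, still parses; the missing parts are returned as None.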

The Sequence field contains the full amino acid sequence of the transcription factor. It may have been copied from SwissProt or PIR, or conceptually translated from an EMBL/GenBank/DDBJ nucleic acid sequence, as indicated in the Sequence comment (SC) field. Any manual editing is also noted in the SC line.

The Feature Table may contain information on:

regions that are enriched in particular amino acid residues and may therefore represent trans-activating domains; the content is given as (M/N), meaning that M out of N residues are of the enriched amino acid;
positions of the typical DNA-binding/dimerization motifs and the motif structure within the individual molecule; e.g. tryptophan cluster motifs are described in terms of Trp spacing, and the nature of a leucine zipper is given as well (e.g. L4 means that it consists of four leucine residues spaced at 6-AA intervals, L2EL2 indicates a motif such as L-X6-L-X6-E-X6-L-X6-L);
posttranslational modifications (phosphorylation, glycosylation).
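The leucine zipper shorthand and the (M/N) content notation described above can be unpacked mechanically. The Python sketch below expands a code such as L4 or L2EL2 into the spelled-out motif, builds a matching regular expression, and converts (M/N) into a fraction. The function names are hypothetical, and the sketch assumes that a code consists only of single-letter residues, each optionally followed by a repeat count.

```python
import re
from typing import List


def expand_zipper(code: str) -> List[str]:
    """Expand 'L4' -> ['L','L','L','L'] and 'L2EL2' -> ['L','L','E','L','L']."""
    residues: List[str] = []
    for letter, count in re.findall(r"([A-Z])(\d*)", code):
        # A letter without a count stands for a single residue.
        residues.extend(letter * int(count or 1))
    return residues


def zipper_motif(code: str, spacer: int = 6) -> str:
    """Spell out the motif with the 6-AA intervals written as X6."""
    return f"-X{spacer}-".join(expand_zipper(code))


def zipper_regex(code: str, spacer: int = 6) -> "re.Pattern[str]":
    """Regular expression that finds the motif in an amino acid sequence."""
    return re.compile((".{%d}" % spacer).join(expand_zipper(code)))


def enrichment_fraction(content: str) -> float:
    """Turn a content string '(M/N)' into the fraction M/N."""
    m = re.fullmatch(r"\((\d+)/(\d+)\)", content.strip())
    if m is None:
        raise ValueError(f"not an (M/N) content string: {content!r}")
    return int(m.group(1)) / int(m.group(2))


if __name__ == "__main__":
    print(zipper_motif("L4"))     # L-X6-L-X6-L-X6-L
    print(zipper_motif("L2EL2"))  # L-X6-L-X6-E-X6-L-X6-L
```

The spacer is kept as a parameter only for readability; the documentation describes 6-AA intervals, which is the default used here.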

In the HTML version, the positional features are visualized between the FT and the SF lines. The individual features can be identified by the colour code given in the feature table. Overlapping features are displayed using a triangular instead of the rectangular representation.

The field Structural Features gives information about global structural features of the factor.

Data may be referenced; the source of the information is indicated by a bracketed number that points to the corresponding paper at the end of the entry.

Cell Specificity (positive) gives the predominant occurrence of a factor in certain cell types or tissues. Occasionally, cells from which the factor has been isolated are indicated in brackets; this information does not necessarily point to a true cell specificity. Additionally, Cell Specificity (negative) lists cells / tissues which have been proven not to express the corresponding factor. For human and mouse factors, the CP/CN fields are gradually being replaced by the better structured expression patterns (EX), which give the place (organ and/or cell name), system, developmental stage, relative expression level, detection method and the detected molecule type (RNA or protein).
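To make the structure of an expression pattern concrete, the six components listed above could be held in a small record type. The following Python sketch is only an illustration: the class name, the attribute names and the sample values are invented for this example and do not reflect the actual layout of EX lines.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExpressionPattern:
    """One expression pattern; attribute names are hypothetical."""
    place: str                          # organ and/or cell name
    system: Optional[str]               # e.g. an organ system
    developmental_stage: Optional[str]
    relative_level: Optional[str]       # relative expression level
    detection_method: Optional[str]
    molecule: str                       # detected molecule type

    def __post_init__(self) -> None:
        # The documentation names exactly two molecule types.
        if self.molecule not in ("RNA", "protein"):
            raise ValueError(f"molecule must be 'RNA' or 'protein', got {self.molecule!r}")


if __name__ == "__main__":
    # Made-up values, for illustration only.
    ex = ExpressionPattern("liver", "digestive system", "adult",
                           "high", "Northern blot", "RNA")
    print(ex)
```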

Interactions: Since most transcription factors bind to DNA as dimers, the dimerization partners are indicated in this field. If the factor's own name is repeated in this field, the factor forms homodimers. Also given are inhibitory protein-protein interactions such as NF-kappaB - IkappaB.
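The homodimer convention can be checked with a simple comparison of names. The Python sketch below assumes that the interacting factors of an entry are already available as a list of names; the function name and the placeholder factor names are hypothetical.

```python
from typing import List, Tuple


def dimerization_partners(factor_name: str,
                          interacting: List[str]) -> Tuple[bool, List[str]]:
    """Return (forms_homodimer, other_partners) for one factor entry.

    The factor's own name appearing among its interacting factors is
    read as "forms homodimers", as described in the documentation.
    """
    forms_homodimer = factor_name in interacting
    others = [name for name in interacting if name != factor_name]
    return forms_homodimer, others


if __name__ == "__main__":
    # Placeholder names, not real database content.
    print(dimerization_partners("FactorA", ["FactorA", "FactorB"]))
```

Note that this check cannot tell dimerization partners from inhibitory interactors, since both kinds of interaction are recorded in the same field.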

The field Matrix gives the accession number and identifier of each linked MATRIX table entry.

The field External Databases points to corresponding entries within the EMBL, SwissProt, PIR, FlyBase or PDB data libraries. It also indicates whether an EMBL entry is for the factor's gene or its mRNA/cDNA.