TRANSFAC® Release 7.0 - Documentation
Factor: Explanations
In the present release, no identifiers have yet been assigned to
the FACTOR entries; instead, the accession numbers are repeated.
The field Synonyms covers different spellings (AP-1/AP1) as well as real
alternative names (HNF-1: HNF-1alpha, APF, LF-B1). In contrast, the
field Suggested Homologs indicates the names of other proteins,
frequently from evolutionarily more distant species, which may be
functionally and/or structurally related to the factor under
consideration.
The CL line indicates the major class of DNA-binding domains a factor
may be assigned to. It also contains a systematic decimal classification
number referring to the proposed transcription factor classification scheme.
Note that this is a tentative assignment which may change with new
insights into the structure-function relationships of this large protein category.
The Size field shows the number of amino acid residues of a polypeptide
and its molecular weight. The method by which this figure has been
obtained is indicated in brackets: (cDNA) or (gene) means that it has
been calculated after cloning, whereas (SDS) or (sedim.) refers to the
corresponding experimental approach.
The Sequence field contains the full amino acid sequence of the
transcription factor. It may have been copied from SwissProt or PIR or
conceptually translated from an EMBL/GenBank/DDBJ nucleic acid sequence,
as indicated in the Sequence comment (SC) field. If any
manual editing has been done, this is also indicated in the SC line.
The Feature Table may contain information on:
- regions that are enriched in some amino acid residues and may
  therefore represent trans-activating domains; the content is
  given as (M/N), which means that M out of N residues are of the
  enriched amino acid;
- positions of the typical DNA-binding/dimerization motifs and
  the motif structure within the individual molecule; e.g.
  tryptophan cluster motifs are explained with regard to Trp
  spacing, and the nature of a leucine zipper is given as well
  (e.g. L4 means that it consists of four leucine residues spaced
  at 6-AA intervals, L2EL2 indicates a motif such as
  L-X6-L-X6-E-X6-L-X6-L);
- posttranslational modifications (phosphorylation, glycosylation).
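The leucine zipper shorthand described in the list above can be expanded mechanically. The following Python fragment is a minimal sketch, not part of TRANSFAC: the function names zipper_residues, expand_zipper and zipper_regex are hypothetical, and it assumes that every code consists of one-letter residues with optional repeat counts and a fixed spacing of six residues.

```python
import re


def zipper_residues(code):
    """Return the ordered heptad residues encoded by a code such as 'L2EL2'."""
    residues = []
    # Each token is a one-letter residue, optionally followed by a repeat count.
    for letter, count in re.findall(r"([A-Z])(\d*)", code):
        residues.extend(letter * (int(count) if count else 1))
    return residues


def expand_zipper(code, spacing=6):
    """Expand a compact code into the explicit 'L-X6-L-...' notation."""
    return ("-X%d-" % spacing).join(zipper_residues(code))


def zipper_regex(code, spacing=6):
    """Build a regular expression that finds the motif in a protein sequence."""
    gap = ".{%d}" % spacing  # assumption: exactly `spacing` arbitrary residues
    return re.compile(gap.join(zipper_residues(code)))


print(expand_zipper("L4"))     # L-X6-L-X6-L-X6-L
print(expand_zipper("L2EL2"))  # L-X6-L-X6-E-X6-L-X6-L
```

The regular expression returned by zipper_regex can be applied to the content of the Sequence field to locate candidate positions; real entries may use variants of the notation that this sketch does not cover.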
In the HTML version, the positional features are visualized between the
FT and the SF lines. The individual features can be identified by the colour
code given in the feature table. Overlapping features are displayed
using a triangular instead of a rectangular representation.
The field Structural Features gives information about global structural
features of the factor.
Data may be referenced; the source of the information used is indicated by a
bracketed number that points to the corresponding paper at the end of the
entry.
Cell Specificity (positive) gives the predominant occurrence of a factor in
certain cell types or tissues. Occasionally, cells from which the
factor has been isolated are indicated in brackets; this information
does not necessarily point to a true cell specificity. Additionally,
Cell Specificity (negative) lists cells/tissues which have been proven
not to express the corresponding factor. For human and mouse factors,
the CP/CN fields are gradually being replaced by the better-structured
expression patterns (EX), which give the place (organ and/or cell name), system,
developmental stage, relative expression level, detection method and the
detected molecule type (RNA or protein).
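The components of an expression pattern listed above map naturally onto a small record type. The following Python fragment is only a sketch of such a record: the class name ExpressionPattern and all attribute names are hypothetical, the example values are made-up placeholders, and nothing here reflects the actual layout of the EX lines in the flat file.

```python
from dataclasses import dataclass

# The only two molecule types named in the description of the EX field.
MOLECULE_TYPES = {"RNA", "protein"}


@dataclass(frozen=True)
class ExpressionPattern:
    place: str                # organ and/or cell name
    system: str
    developmental_stage: str
    relative_level: str       # relative expression level
    detection_method: str
    molecule: str             # detected molecule type: "RNA" or "protein"

    def __post_init__(self):
        # Reject anything other than the two documented molecule types.
        if self.molecule not in MOLECULE_TYPES:
            raise ValueError("molecule must be 'RNA' or 'protein'")


# Placeholder values for illustration only, not taken from a real entry.
example = ExpressionPattern(
    place="liver",
    system="digestive system",
    developmental_stage="adult",
    relative_level="high",
    detection_method="Northern blot",
    molecule="RNA",
)
print(example.place, example.molecule)
```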
Interactions: Since most transcription factors bind to DNA as dimers,
the dimerization partners are indicated in this field. Repeating the
factor's name in this field means that it forms homodimers. Also given
are inhibitory protein-protein interactions such as NF-kappaB - IkappaB.
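The homodimer convention can be checked with a few lines of code. The following Python fragment is a minimal sketch under stated assumptions: the function name classify_partners is hypothetical, the partner names are assumed to be already extracted from the Interactions field as plain strings, and the case-insensitive comparison is a choice made here, not a rule of the database.

```python
def classify_partners(factor_name, partners):
    """Return (forms_homodimer, other_partners) for one FACTOR entry.

    A factor is taken to form homodimers when its own name is repeated
    among its interaction partners.
    """
    own = factor_name.strip().lower()
    forms_homodimer = any(p.strip().lower() == own for p in partners)
    # Everything that is not the factor itself is another interaction partner.
    others = [p for p in partners if p.strip().lower() != own]
    return forms_homodimer, others


# Illustrative input; the list is not copied from a real entry.
print(classify_partners("c-Jun", ["c-Jun", "c-Fos"]))  # (True, ['c-Fos'])
```

Note that this sketch does not distinguish dimerization partners from inhibitory interactions; both appear in the same field and would need additional information to separate.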
The field Matrix gives the accession number and identifier of the connected
MATRIX table entries.
The field External Databases points to corresponding entries within the
EMBL, SwissProt, PIR, FlyBase or PDB data libraries. Also indicated is
whether an EMBL entry is for the factor's gene or its mRNA/cDNA.