Source | Match | ReputationScore* |
---|---|---|
CLUSTAL-W Alignment Format
CLUSTAL-W Alignment Format is a simple text-based format, often with a *.aln file extension, used for the input and output of DNA or protein sequences into the Clustal suite of multiple alignment programs.
|
|
|
GenBank
GenBank is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences. The complete release notes for the current version of GenBank are available on the NCBI ftp site. A new release is made every two months. G
...
|
|
|
Sequence Alignment Map
The Sequence Alignment/Map (SAM) format is a TAB-delimited text format consisting of a header section, which is optional, and an alignment section.
|
|
|
Sequence Read Archive
The Sequence Read Archive (SRA) stores raw sequencing data from the next generation of sequencing platforms. Data submitted to SRA is organized using a metadata model consisting of six objects: study, sample, experiment, run, analysis and submissi
...
|
|
|
FASTA Sequence Format
FASTA format is a text-based format for representing either nucleotide sequences or peptide sequences, in which nucleotides or amino acids are represented using single-letter codes. The format also allows for sequence names and comments to precede th
...
|
|
|
MEROPS
The MEROPS database is an information resource for peptidases (also termed proteases, proteinases and proteolytic enzymes) and the proteins that inhibit them.
|
|
|
European Nucleotide Archive
The European Nucleotide Archive (ENA) is a globally comprehensive data resource for nucleotide sequence, spanning raw data, alignments and assemblies, functional and taxonomic annotation and rich contextual data relating to sequenced samples and expe
...
|
|
|
Gramene: A curated, open-source, integrated data resource for comparative functional genomics in plants
Gramene's purpose is to provide added value to plant genomics data sets available within the public sector, which will facilitate researchers' ability to understand the plant genomes and take advantage of genomic sequence known in one species for ide
...
|
|
|
NCBI Gene
The Entrez Global Query Cross-Database Search System is a federated search engine, or web portal that allows users to search many discrete health sciences databases at the National Center for Biotechnology Information (NCBI) website. Entrez can effic
...
|
|
|
JASPAR
JASPAR (http://jaspar.genereg.net) is an open-access database of curated, non-redundant transcription factor (TF)-binding profiles stored as position frequency matrices (PFMs) and TF flexible models (TFFMs) for TFs across multiple species in six taxo
...
|
|
|
Insertion Sequence Finder
This database provides a list of insertion sequences (IS) isolated from bacteria and archaea. It is organized into individual files containing their general features (name, size, origin, family, ...) as well as their DNA and potential protein sequence
...
|
|
|
ConoServer
ConoServer is a database specializing in sequences and structures of peptides expressed by marine cone snails. The database gives access to protein sequences, nucleic acid sequences and structural information on conopeptides. ConoServer's data are fi
...
|
|
|
ImMunoGeneTics Information System
IMGT is a high-quality integrated knowledge resource specialized in the immunoglobulins (IG) or antibodies, T cell receptors (TR), major histocompatibility complex (MHC) of human and other vertebrate species, and in the immunoglobulin superfamily (Ig
...
|
|
|
BioSamples at the European Bioinformatics Institute
The BioSamples database aggregates sample information for reference samples (e.g. Coriell Cell lines) and samples for which data exist in one of the EBI's assay databases such as ArrayExpress, the European Nucleotide Archive or PRIDE. It provides lin
...
|
|
|
Nucleic Acids Database
The Nucleic Acids Database contains information about experimentally-determined nucleic acids and complex assemblies. NDB can be used to perform searches based on annotations relating to sequence, structure and function, and to download, analyze, and
...
|
|
|
SwissRegulon
The SwissRegulon Database contains genome-wide annotations of regulatory sites. The predictions are based on Bayesian probabilistic analysis of a combination of input information including i) Experimentally determined binding sites reported in the li
...
|
|
|
NCBI BioSample
The NCBI BioSample database stores submitter-supplied descriptive information, or metadata, about the biological materials from which data stored in NCBI’s primary data archives are derived. NCBI’s archives host data from diverse types of samples fro
...
|
|
|
Yeast Searching for Transcriptional Regulators and Consensus Tracking
YEASTRACT (Yeast Search for Transcriptional Regulators And Consensus Tracking) is a curated repository of more than 48333 regulatory associations between transcription factors (TF) and target genes in Saccharomyces cerevisiae, based on more than 1200
...
|
|
|
Animal Transcription Factor Database
AnimalTFDB is a comprehensive animal transcription factor database. The resource provides a classification of transcription factors from 50 genomes from species including Homo sapiens and Caenorhabditis elegans. The database also has information on co-transc
...
|
|
|
HOmo sapiens transcription factor COmprehensive MOdel COllection
HOmo sapiens COmprehensive MOdel COllection (HOCOMOCO) v10 provides transcription factor (TF) binding models for 601 human and 396 mouse TFs. In addition to basic mononucleotide position weight matrices (PWMs), HOCOMOCO provides a set of dinucleotide
...
|
|
|
Universal PBM Resource for Oligonucleotide Binding Evaluation
The UniPROBE (Universal PBM Resource for Oligonucleotide Binding Evaluation) database hosts data generated by universal protein binding microarray (PBM) technology on the in vitro DNA binding specificities of proteins.
|
|
|
Dfam
The Dfam database is an open collection of DNA Transposable Element sequence alignments, hidden Markov Models (HMMs), consensus sequences, and genome annotations. Dfam represents a collection of multiple sequence alignments, each containing a set of r
...
|
|
|
Genome Sequence Archive
GSA is a data repository specialized for archiving raw sequence reads. It supports data generated from a variety of sequencing platforms ranging from Sanger sequencing machines to single-cell sequencing machines and provides data storing and sharing
...
|
|
|
DataBase of Transcriptional Start Sites
This database includes TSS data from adult and embryonic human tissue. DBTSS now contains 491 million TSS tag sequences collected from a total of 20 tissues and 7 cell cultures.
|
|
|
The ribosomal RNA operon copy number database
The ribosomal RNA operon copy number database is a publicly available, curated resource for ribosomal operon (rrn) copy number information for Bacteria and Archaea.
|
|
|
Database of Sequence Tagged Sites
dbSTS is an NCBI resource that contains sequence data for short genomic landmark sequences or Sequence Tagged Sites.
|
|
|
VISTA Enhancer Browser
Despite the known existence of distant-acting cis-regulatory elements in the human genome, only a small fraction of these elements has been identified and experimentally characterized in vivo. This paucity of enhancer collections with defined activit
...
|
|
|
Open Regulatory Annotation
The Open REGulatory ANNOtation database (ORegAnno) is an open database for the curation of known regulatory elements from scientific literature. Annotation is collected from users worldwide for various biological assays and is automatically cross-ref
...
|
|
|
Polymorphism in microRNAs and their TargetSites
PolymiRTS (Polymorphism in microRNAs and their TargetSites) is a database of naturally occurring DNA variations in microRNA (miRNA) seed regions and miRNA target sites. MicroRNAs pair to the transcripts of protein-coding genes and cause translational
...
|
|
|
UniVec
UniVec is a database that can be used to quickly identify segments within nucleic acid sequences which may be of vector origin (vector contamination). In addition to vector sequences, UniVec also contains sequences for those adapters, linkers, and pr
...
|
|
|
A CLAssification of Mobile genetic Elements
ACLAME is a database dedicated to the collection and classification of mobile genetic elements (MGEs) from various sources, comprising all known phage genomes, plasmids and transposons.
|
|
|
Regulatory Element Database for Drosophila
REDfly is a curated collection of known Drosophila transcriptional cis-regulatory modules (CRMs) and transcription factor binding sites (TFBSs). REDfly seeks to include all experimentally verified fly regulatory elements along with their DNA sequence
...
|
|
|
European Mouse Mutant Archive
The European Mouse Mutant Archive (EMMA) is a non-profit repository for the collection, archiving (via cryopreservation) and distribution of relevant mutant strains essential for basic biomedical research. The laboratory mouse is the most important m
...
|
|
|
LncBook
LncBook is a curated knowledgebase of human lncRNAs that features a comprehensive collection of human lncRNAs and systematic curation of lncRNAs by multi-omics data integration, functional annotation and disease association. It integrates multi-omics
...
|
|
|
RNAcentral
RNAcentral is a free, public resource that offers integrated access to a comprehensive and up-to-date set of non-coding RNA sequences provided by a collaborating group of databases representing a broad range of organisms and RNA types.
|
|
|
Genetic Codes
NCBI takes great care to ensure that the translation for each coding sequence (CDS) present in GenBank records is correct. Central to this effort is careful checking on the taxonomy of each record and assignment of the correct genetic code for each o
...
|
|
|
The Arabidopsis Gene Regulatory Information Server
The Arabidopsis Gene Regulatory Information Server (AGRIS) is a new information resource of Arabidopsis promoter sequences, transcription factors and their target genes. AGRIS currently contains two databases, AtcisDB (Arabidopsis thaliana cis-regula
...
|
|
|
Super-Enhancer Archive
SEA (Super-Enhancer Archive) is a web-based comprehensive resource focusing on the collection, storage and online analysis of super-enhancers. It focuses on integrating super-enhancers in multiple species and annotating their potential roles in the r
...
|
|
|
PROkariotIC Database Of Gene-Regulation
PRODORIC is a comprehensive database about gene regulation and gene expression in prokaryotes. It includes a manually curated and unique collection of transcription factor binding sites.
|
|
|
TransmiR
TransmiR is a database for transcription factor-microRNA regulations, which is free for academic usage.
|
|
|
IRESite
The IRESite database presents information about experimentally studied IRES (Internal Ribosome Entry Site) segments. IRES regions are known to attract the eukaryotic ribosomal translation initiation complex and thus promote translation initiation ind
...
|
|
|
cis-Regulatory Element Database
The cisRED database holds conserved sequence motifs identified by genome scale motif discovery, similarity, clustering, co-occurrence and coexpression calculations. Sequence inputs include low-coverage genome sequence data and ENCODE data.
|
|
|
PAZAR
PAZAR is a software framework for the construction and maintenance of regulatory sequence data annotations; a framework which allows multiple boutique databases to function independently within a larger system (or information mall). The goal of PAZAR
...
|
|
|
Gene Transcription Regulation Database
Gene Transcription Regulation Database (GTRD) is a database of transcription factor binding sites (TFBSs) identified by ChIP-seq experiments for human and mouse.
|
|
|
MAPPER-2
This resource provides information primarily on the upstream non-coding sequence data of genes in 3 genomes which gives insight into the transcription factors binding sites (TFBSs). For each transcript, the region scanned extends from 10,000bp upstre
...
|
|
|
dbSUPER
dbSUPER is the first integrated and interactive database of super-enhancers, which contains 82234 super-enhancers in 102 human and 25 mouse tissue/cell types.
|
|
|
IPD-KIR - Killer-cell Immunoglobulin-like Receptors
The database provides a centralised repository for human KIR sequences. Killer-cell Immunoglobulin-like Receptors (KIR) have been shown to be highly polymorphic at the allelic and haplotypic level. KIRs are members of the immunoglobulin superfamily (
...
|
|
|
CoryneRegNet 6.0 - Corynebacterial Regulation Network
Corynebacterial Regulation Network is a reference database and analysis platform for corynebacterial transcription factors and gene regulatory networks.
|
|
|
IPD-MHC - Major Histocompatibility Complex Database
The IPD-MHC Database provides a centralised repository for sequences of the Major Histocompatibility Complex (MHC) from a number of different species. Through a number of international collaborations IPD is able to provide the MHC sequences of differ
...
|
|
|
The Improved Database Of Chimeric Transcripts and RNA-Seq Data
The ESTs and mRNAs from GenBank have been used to identify chimeric RNAs of two or more different genes. By analyzing thousands of chimeric ESTs by RNA sequencing, we found that the expression level of chimeric ESTs is generally low and they are high
...
|
|
|
GyDB
Gypsy database of mobile genetic elements
|
|
|
CollecTF
CollecTF is a database of transcription factor binding sites (TFBS) in the Bacteria domain. It aims at becoming a reference, highly-accessed database by relying on its ability to customize navigation and data extraction, its relevance to the communit
...
|
|
|
4DNucleome Data Portal
The 4D Nucleome Data Portal (4DN) hosts data generated by the 4DN Network and other reference nucleomics data sets, and an expanding tool set for open data processing and visualization. It is a platform to search, visualize, and download nucleomics d
...
|
|
|
NRG-CING
Validated NMR structures of proteins and nucleic acids.
|
|
|
RegPrecise
Predicted regulons in prokaryotic genomes
|
|
|
FlyFactorSurvey
Drosophila transcription factor binding specificities determined using the bacterial one-hybrid system
|
|
|
Transcription Factor Class
TFClass is a resource that classifies eukaryotic transcription factors (TFs) according to their DNA-binding domains. Combining information from different resources, manually checking the retrieved mammalian TF sequences and applying extensive phyloge
...
|
|
|
A database for spatially resolved transcriptomes
Spatially resolved transcriptomics providing gene expression profiles with positional information is key to tissue function and fundamental to disease pathology. SpatialDB is the first public database that specifically curates spatially resolved tran
...
|
|
|
Telomerase Database
The Telomerase Database is a Web-based tool for the study of structure, function, and evolution of the telomerase ribonucleoprotein. The objective of this database is to serve the research community by providing a comprehensive compilation of informa
...
|
|
|
APPRIS
Annotates variants with biological data such as protein structural information, functionally important residues, conservation of functional domains and evidence of cross-species conservation.
|
|
|
euL1db, the European database of L1-HS retrotransposon insertions in humans
Retrotransposons, which comprise LINE, SINE and LTR-containing elements, account for almost half of our genome (Fig. 1). They are mobile genetic elements, also known as jumping genes, but only the L1-HS subfamily has retained the ability to jump
...
|
|
|
Genome Variation Map
The Genome Variation Map (GVM) is a public data repository of genome variations, including single nucleotide polymorphisms (SNP) and small insertions and deletions (INDEL), with particular focuses on human as well as cultivated plants and domesticate
...
|
|
|
Saccharomyces cerevisiae Transcription Factor Database
ScerTF is a database of position weight matrices (PWMs) for transcription factors in Saccharomyces species. It identifies a single matrix for each TF that best predicts in vivo data, providing metrics related to the performance of that matrix in accu
...
|
|
|
ChimerDB
ChimerDB is a database of fusion sequences encompassing bioinformatics analysis of mRNA and EST sequences in the GenBank, manual collection of literature data and integration with other well known databases. Fusion transcripts with nonoverlapping ali
...
|
|
|
MethBank
MethBank stores DNA methylome data across a variety of species. MethBank integrates consensus reference methylomes (CRMs) compiled from healthy human samples at different ages, single-base resolution methylomes (SRMs) of both plant and animal species
...
|
|
|
T-psi-C
T-psi-C is a database of tRNA sequences and 3D tRNA structures. The T-psi-C database can be continuously updated by any member of the scientific community.
|
|
|
Tandem Repeats Database
Tandem Repeats Database (TRDB) is a public repository of information on tandem repeats in genomic DNA and contains a variety of tools for their analysis.
|
|
|
DDBJ/ENA/GenBank Feature Table
The GenBank, EMBL, and DDBJ nucleic acid sequence data banks have from their inception used tables of sites and features to describe the roles and locations of higher order sequence domains and elements within the genome of an organism. In February,
...
|
|
|
Annotated regulatory Binding Sites from Orthologous Promoters
ABS: A database of Annotated regulatory Binding Sites from known binding sites identified in promoters of orthologous vertebrate genes.
|
|
|
Short Read Archive eXtensible Markup Language
The SRA data model contains the following objects: Study: information about the sequencing project Sample: information about the sequenced samples Experiment: information about the libraries, platform; associated with study, sample(s) and run(s) Run:
...
|
|
|
WebGeSTer DB
WebGeSTer Database (DB) is the largest compilation of intrinsic terminators of transcription. It comprises >2,200,000 bacterial terminators identified from a total of 2036 chromosomes and 1508 plasmids. The database is the storehouse for algorithm
...
|
|
|
COXPRESdb
Coexpressed genes and networks in human and mouse
|
|
|
FlyTF
FlyTF (v2) is a manually curated catalogue of Drosophila site-specific transcription factors (TFs). It integrates proteins identified as DNA-binding TFs by computational prediction based on structural domain assignments, and experimentally verified T
...
|
|
|
Real-time PCR Data Markup Language
The RDML file format is developed by the RDML consortium (http://www.rdml.org) and can be used free of charge. The RDML file format was created to encourage the exchange, publication, revision and re-analysis of raw qPCR data. The core of an RDML fil
...
|
|
|
Ligand Expo
Ligand Expo is a data resource for finding information about small molecules bound to proteins and nucleic acids. Tools are provided to search the PDB dictionary for chemical components, to identify structure entries containing particular small molec
...
|
|
|
TcoF-DB
Database for Human Transcription Co-Factors
|
|
|
3D-Footprint
Estimates of DNA-binding specificity for protein-DNA complexes in PDB
|
|
|
NGSmethDB
Next-generation sequencing single-cytosine-resolution DNA methylation data
|
|
|
DBASS5/3
Database of Aberrant Splice Sites: sequences flanking cryptic and de novo 3' and 5' splice sites
|
|
|
Hardwood Genomics Project
The Hardwood Genomics Project is a database for expressed genes, genetic markers, genetic linkage maps, and reference populations. It provides lasting genomic and biological resources for the discovery and conservation of genes in hardwood trees for
...
|
|
|
BEI Resource Repository
BEI Resources provides reagents, tools and information for studying Category A, B, and C priority pathogens, emerging infectious disease agents, non-pathogenic microbes and other microbiological materials of relevance to the research community.
|
|
|
ODB - Operon database
ODB (Operon DataBase) is a database of known operons among the many complete genomes. Additionally, putative operons that are conserved in terms of known operons are also provided. The first release of ODB contains about 2000 known operons and 13,000
...
|
|
|
ASPicDB
ASPicDB is a database designed to provide access to reliable annotations of the alternative splicing pattern of human genes, obtained by ASPic algorithm (Castrignano' et al. 2006), and to the functional annotation of predicted isoforms.
|
|
|
Feature Annotation Location Description Ontology
The Feature Annotation Location Description Ontology (FALDO) describes the positions of annotated features on linear and circular sequences for data resources represented in RDF and/or OWL. FALDO can be used to describe nucleotide features in sequ
...
|
|
|
R-loopDB
R-loop DB is a collection of R-loop forming sequences (RLFS) predicted computationally in the human genome based on quantitative model of RLFS (QmRLFS). The database additionally includes chromosome coordinates and annotation of many hundred thousand
...
|
|
|
Insect Microsatellite Database
InSatDb, unlike many other microsatellite databases that cater largely to the needs of microsatellites as markers, presents an interactive interface to query information regarding microsatellite characteristics of five fully sequenced insect genomes
...
|
|
|
TassDB
TassDB (TAndem Splice Site DataBase) stores extensive data about alternative splice events at GYNGYN donors and NAGNAG acceptors. These splice events are of subtle nature since they mostly result in the insertion/deletion of a single amino acid or th
...
|
|
|
GenomeTraFaC
GenomeTraFaC is a database of conserved regulatory elements obtained by systematically analyzing the orthologous set of human and mouse genes. It mainly focuses on all of the high-quality mRNA entries of mouse and human genes in the Reference Sequenc
...
|
|
|
SpliceInfo
The database provides a means of investigating alternative splicing and can be used for identifying alternative splicing - related motifs, such as the exonic splicing enhancer (ESE), the exonic splicing silencer (ESS) and other intronic splicing moti
...
|
|
|
Databases of Orthologous Promoters
DoOP is a database of eukaryotic promoter sequences (upstream regions), aiming to facilitate the recognition of regulatory sites conserved between species. Based on the Arabidopsis thaliana and Homo sapiens genome annotation, this resource is also a
...
|
|
|
TFBSshape
TFBSshape provides DNA shape features for transcription factor binding sites (TFBSs) that, in addition to sequence features, usually in the form of position weight matrices (PWMs), characterize DNA binding specificities of transcription factors (TFs) f
...
|
|
|
MetaSRA
MetaSRA is a database of normalized SRA human sample-specific metadata following a schema inspired by the metadata organization of the ENCODE project. This schema involves mapping samples to terms in biomedical ontologies, labeling each sample with a
...
|
|
|
DNAtraffic
A database for systems biology of DNA dynamics during the cell life.
|
|
|
OGRDB
OGRDB is a curated database of immunoglobulin and T cell receptor sequences inferred from immune receptor repertoires, together with supporting information describing the repertoires from which they were derived. Researchers can submit sequences and
...
|
|
|
EBI patent sequences
Non-redundant databases of patent DNA and protein sequences
|
|
|
Pseudogene
This ontology is about human pseudogenes, extending the existing SO framework to incorporate additional attributes. Relationships between pseudogenes and segmental duplications are defined in this standard. To answer research questions and to annotat
...
|
|
|
INSD sequence record XML
The International Nucleotide Sequence Database Collaboration (INSDC) is a long-standing foundational initiative that operates between DDBJ, EMBL-EBI and NCBI. INSDC covers the spectrum of data from raw reads, through alignments and assemblies, to functional
...
|
|
|
Big Data Nucleic Acid Simulations Database
Atomistic Molecular Dynamics Simulation Trajectories and Analyses of Nucleic Acid Structures. BIGNASim is a complete platform to hold and analyse nucleic acids simulation data, based on two noSQL database engines: Cassandra to hold trajectory data an
...
|
|
|
ENA Sequence XML Schema
ENA Sequence XML Schema is a standardised XML schema for nucleotide sequences. All assembled and annotated sequences must conform to this schema.
|
|
|
Genome Variation Format
The Genome Variation Format (GVF) is a very simple file format for describing sequence alteration features at nucleotide resolution relative to a reference genome.
|
|
|
ENA Sequence Flat File Format
ENA Sequence Flat File Format is a standardised plain text format for nucleotide sequences. This format was previously called the EMBL Sequence Flat File Format.
|
|
|
DDBJ Sequence Read Archive
DDBJ Sequence Read Archive (DRA) is an archive database for output data generated by next-generation sequencing machines including Roche 454 GS System®, Illumina Genome Analyzer®, Applied Biosystems SOLiD® System, and others. DRA is a member of the I
...
|
|
|
Ontology for Genetic Interval
Using BFO (Basic Formal Ontology) as its upper-level ontology, the Ontology for Genetic Interval (OGI) represents gene as an entity with its 3D shape, topography, and primary DNA sequence as the foundation for its 3D structure. There is no official h
...
|
|
|
DiProDB
Database for dinucleotide properties
|
|
|
PANDIT
PANDIT is a collection of multiple sequence alignments and phylogenetic trees covering many common protein domains. It contains the seed protein sequence alignments from the Pfam-A (curated families) database; nucleotide sequence alignments derived f
...
|
|
|
UTRdb/UTRsite
The 5' and 3' untranslated regions of eukaryotic mRNAs may play a crucial role in the regulation of gene expression controlling mRNA localization, stability and translational efficiency. For this reason we developed UTRdb, a specialized database of 5
...
|
|
|
DBD
DBD provides transcription factor predictions for more than 150 completely sequenced genomes available for browsing and download. Predictions are based on presence of sequence specific DNA binding domain assignments using hidden Markov models from th
...
|
|
|
SpliceAid-F
A comprehensive knowledge of all the factors involved in splicing, both proteins and RNAs, and of their interaction network is crucial for reaching a better understanding of this process and its functions. A large part of relevant information is buri
...
|
|
|
CEGA
CEGA (Conserved Elements from Genomic Alignments) is a database of conserved vertebrate elements. This database provides access to precomputed sets of conserved sequences from different species and at different levels of the vertebrate phylogeny.
|
|
|
AREsite
AU-rich elements in vertebrate mRNA UTR sequences
|
|
|
IUPAC-IUB Commission on Biochemical Nomenclature - Abbreviations and Symbols for Nucleic Acids, Polynucleotides and their Constituents
The Abbreviations and Symbols for Nucleic Acids, Polynucleotides and their Constituents, created by the IUPAC-IUB Commission on Biochemical Nomenclature, formalizes the naming scheme for simple nucleotides; nucleotide coenzymes and related substance
...
|
|
|
Greglist
G-quadruplex motifs and potentially G-quadruplex regulated genes
|
|
|
GISSD
Group I Intron Sequence and Structure Database
|
|
|
YeTFaSCo
Yeast Transcription Factor binding Site sequence Collection
|
|
|
TESS
TESS (Transcription Element Search System, http://www.cbil.upenn.edu/tess) is a web-based service that searches DNA sequence for transcription factor binding sites. It integrates three databases of transcription factors and binding site models, and p
...
|
|
|
lncRNASNP2 |
|
|
CORG - A database for COmparative Regulatory Genomics
Sequence conservation in non-coding, upstream regions of orthologous genes from man and mouse is likely to reflect common regulatory DNA sites. Motivated by this assumption we have delineated a catalogue of conserved non-coding sequence blocks and pr
...
|
|
|
PRODORIC2 |
|
|
TIGR Plant Transcript Assembly database
The TIGR Plant Transcript Assemblies (TA) database (http://plantta.tigr.org) uses expressed sequences collected from the NCBI GenBank Nucleotide database for the construction of transcript assemblies. The sequences collected include expressed Sequenc
...
|
|
|
Spliceosome Database |
|
|
TrSDB
Transcription factor database
|
|
|
Hollywood
Exon annotation database
|
|
|
SINEBase
A database of short interspersed elements (SINEs)
|
|
|
Factorbook
Human transcription factor binding data from ChIP-seq
|
|
|
L1Base
Functional annotation and prediction of LINE-1 elements
|
|
|
ARED-Plus |
|
|
ECgene
Genome annotation for alternative splicing
|
|
|
RetrOryza
With the availability of the complete genomic sequence of rice, the identification and annotation of LTR-Retrotransposons has become a necessity as they comprise an important part of plant genomes (1). RetrOryza is a database that aims at providing t
...
|
|
|
TTSMI
Triplex Target DNA Sites in the human genome
|
|
|
MachiBase
Drosophila melanogaster 5' mRNA transcription start site database
|
|
|
DoriC
oriC regions in bacterial and archaeal genomes
|
|
|
SNP2TFBS
Regulatory SNPs affecting predicted transcription factor binding sites
|
|
|
RetNet
RetNet provides tables of genes and loci causing inherited retinal diseases, such as retinitis pigmentosa, macular degeneration and Usher syndrome, and related information. This information is provided to the research community and other interested i
...
|
|
|
TiProD
TiProD is a database of human promoter sequences for which some functional features are known. It allows a user to query individual promoters and the expression pattern they mediate, gene expression signatures of individual tissues, and to retrieve s
...
|
|
|
GBshape
DNA shape analysis has been established in recent years as an approach that reveals protein-DNA binding specificity determinants beyond nucleotide sequence. GBshape provides DNA shape annotations of entire genomes. The database currently contains annot
...
|
|
|
ExtraTrain
ExtraTrain is a new database for exploring Extragenic and Transcriptional information in prokaryotic organisms. Transcriptional regulation processes are the principal mechanisms of adaptation in prokaryotes. In these processes, the regulatory signals
...
|
|
|
Synthetic Gene Database
The Synthetic Gene Database (http://www.evolvingcode.net/codon/sgdb/index.php) is a resource that has collected together sequence information on synthetic genes (i.e. genes that were designed conceptually, rather than built from an initial, physical
...
|
|
|
OriDB - The DNA Replication Origin Database
OriDB provides a web-based catalogue of confirmed and predicted DNA replication origin sites. At present this is limited to budding yeast (S. cerevisiae). Each proposed or confirmed origin site appears as a record in OriDB, with each record comprisin
...
|
|
|
U12DB
U12-type introns are spliced by the U12-dependent spliceosome and are present in the genomes of many higher eukaryotic lineages including plants, chordates and some invertebrates. The resource described here, the U12 Intron Database (U12DB), aims to
...
|
|
|
RegTransBase
RegTransBase is a manually curated database of regulatory interactions in prokaryotes that captures the knowledge in public scientific literature using a controlled vocabulary. Although several databases describing interactions between regulatory pro
...
|
|
|
PReMod
The PReMod database describes more than 100,000 computationally predicted transcriptional regulatory modules within the human genome. These modules represent the regulatory potential for 229 transcription factor families and are the first genome-wide/
...
|
|
|
SNPSTR
The SNPSTR database contains the SNP-STR/microsatellite compound markers in the five model species, where sufficient SNP information exists in both of the NCBI and Ensembl databases. These species are human (Homo sapiens), mouse (Mus musculus), rat (
...
|
|
|
CMGSDB
Computational models for gene silencing in C. elegans
|
|
|
CTCF Binding Site Database
Experimentally identified and predicted CTCF binding sites
|
|
|
Plant Stress-Responsive Gene Catalog
Stress-responsive genes in various plant species
|
|
|
ProSAS
Protein Structure and Alternative Splicing: effects of alternative splicing events on protein structure
|
|
|
ooTFD
ooTFD (1) is a database of transcription factors maintained in object-oriented and object-relational database systems. There are, at the time of this writing, about 7500 TF binding site entries in this database, from both prokaryotic and eukaryotic
...
|
|
|
ACTIVITY
ACTIVITY, a database on DNA site sequences with known activity magnitudes, measurement systems and sequence-activity relationships under fixed experimental conditions is additionally adapted to applications to the phylogenetic footprints of known sit
...
|
|
|
SELEXdb
SELEX_DB is an online resource containing both the experimental data on in vitro selected DNA/RNA oligomers (aptamers) and the applets for recognition of these oligomers. In vitro selection of oligomers binding target proteins is a novel technology inte
...
|
|
|
PlantProm
Plant promoter sequences
|
|
|
SCPD - Saccharomyces cerevisiae promoter database
A database of yeast promoters
|
|
|
SpliceNest
A tool for visualizing splicing of genes from EST data
|
|
|
Plant repeat database
Repetitive sequences in plant genomes
|
|
|
SKY/M-FISH and CGH
The NCI and NCBI SKY/M-FISH and CGH Database is a repository of publicly submitted data from Spectral Karyotyping (SKY), Multiplex Fluorescence In Situ Hybridization (M-FISH), and Comparative Genomic Hybridization (CGH), which are complementary fluor
...
|
|
|
EDAS - EST-Derived Alternative Splicing Database
EDAS is a database of alternative splicing derived from the analysis of genomic, protein, mRNA and EST data. It provides classification of elementary alternatives into main types, combined searches for specific alternative variants over tissues and d
...
|
|
|
Ciliate IES-MDS database
Macro- and micronuclear genes in spirotrichous ciliates
|
|
|
NPRD - Nucleosome Positioning Region Database
Nucleosome positioning region database
|
|
|
TRACTOR db
Experimental data on the Escherichia coli transcriptional regulatory system has been used in the past years to predict new regulatory elements (promoters, Transcription Factors (TFs), TFs' binding sites, operons) within its genome. As more genomes of
...
|
|
|
TRED - Transcriptional Regulatory Element Database
Transcriptional regulatory element database
|
|
|
HTPSELEX
Transcription factor binding site sequences obtained using high-throughput SELEX method
|
|
|
STIFDB2
Various genes get upregulated in plants during adverse environmental conditions, which alter the metabolic functions to mitigate the stress effects for adaptation. Therefore, it is important to know the regulatory motifs of stress-induced genes for g
...
|
|
|
UCNEbase
A database of ultraconserved non-coding elements and gene regulatory blocks
|
|
|
HEXEvent
Human Exon Splicing Events
|
|
|
ChIPBase
ChIPBase v2.0 is an open database for studying the transcription factor binding sites and motifs, and decoding the transcriptional regulatory networks of lncRNAs, miRNAs, other ncRNAs and protein-coding genes from ChIP-seq data. Our database currentl
...
|
|
|
uORFdb
Upstream ORFs and their effect on translation of downstream CDSs
|
|
|
BloodChIP
Transcription factor binding profiles in human haematopoietic stem/progenitor cells
|
|
|
DPRP
A database of phenotype-specific regulatory programs derived from transcription factor binding data
|
|
|
OnTheFly
DNA-binding specificities of transcription factors in Drosophila
|
|
|
JuncDB
Exon-exon Junction database
|
|
|
MethSMRT
DNA methylation data from single molecule, real-time sequencing
|
|
|
TFBSbank
Transcription factor binding profiles deduced from ChIP-seq or ChIP-chip data
|
|
|
VectorDB
Data available for download from the SGD site; be aware that the data date from 1997
|
|
|
ASPD
ASPD is a new curated database that incorporates data on full-length proteins, protein domains and peptides that were obtained through in vitro directed evolution process (mainly by means of phage display technique). ASPD database is being compiled b
...
|
|
|
S/MARt DB
The nuclear organization of metaphase and interphase cells has been studied over several decades, and increasing evidence supports the concept that eukaryotic chromatin is organized in the form of functionally independent loop domains [1; 2].
...
|
|
|
GeneNet
The GeneNet system is designed for collection and analysis of the data on gene and metabolic networks, signal transduction pathways, and kinetic characteristics of elementary processes. In the past two years, the GeneNet structure was considerably im
...
|
|
|
TRANSFAC®
The TRANSFAC® database has been constructed to model the interaction of eukaryotic transcription factors with their DNA-binding sites and how this affects gene expression. At its core are the three tables FACTOR, SITE, and GENE. A link between FACTOR
...
|
|
|
TRANSPATH®
TRANSPATH® is a database on signal transduction pathways that are modeled as bipartite graphs with molecules and reactions as node classes [1,2,3,4,5]. The molecule entries include polypeptides, modified forms, multicomponent complexes, high-order ab
...
|
|
|
Yeast Intron Database
This searchable database contains information about the location, structure, and function of spliceosomal introns in the nuclear genome of Saccharomyces cerevisiae. Searches produce reports for each intron satisfying the search criteria, showing key
...
|
|
|
TRANSCompel®
The TRANSCompel® database is devoted to the particular aspect of gene transcriptional regulation [1-7]. It contains information about composite elements - the basic structures of combinatorial gene regulation [7]. Composite regulatory elements consis
...
|
|
|
UgMicroSatdb
UniGene MicroSatellite database: short tandem repeats from various eukaryotic genomes
|
|
|
UTRome
3'UTRs and their functional elements in C. elegans
|
|
|
ECRbase
Evolutionary conservation of DNA sequences provides a tool for the identification of functional elements in genomes. We have created a database of evolutionary conserved regions in vertebrate genomes, entitled ECRbase, which is constructed from a col
...
|
|
|
UCbase and miRfunc
Ultraconserved sequences (UCRs) were first described by Bejerano et al. in 2004. They are highly conserved genome regions that share 100% identity among human, mouse and rat. UCRs are 481 sequences longer than 200 bases. They are frequently located a
...
|
|
|
European Genome-phenome Archive (EGA)
The European Genome–phenome Archive (EGA) is a permanent repository for all types of potentially identifiable genetic and phenotypic data from biomedical research projects. The EGA contains data collected from individuals who have given consen
...
|
|
|
TranspoGene
Influence of transposed elements on the transcriptome of seven vertebrates and invertebrates
|
|
*ReputationScore indicates how established a given datasource is.