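At the end of 2016, Nucleic Acids Research (NAR) published 152 database papers in its annual Database Issue: 54 describe newly developed databases and 98 are updates of existing databases, 16 of which had not previously been featured in NAR. The databases cover genome structure, gene expression and its regulation, proteins, protein domains, and protein-protein interactions. A growing share of recent databases deals with human health, from cancer-causing mutations to drugs and drug targets, which suggests that the field is shifting from basic research toward applications that bear directly on people's health. NAR reviewers and editors also selected three databases as "breakthrough" contributions (denovo-db, the Monarch Initiative, and Open Targets). They cover, respectively, human de novo gene variants, disease-related phenotypes in model organisms, and a bioinformatics platform for therapeutic target identification and validation. These databases are expected to attract many more researchers working in genetics and genomics.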
Below are the URLs and brief descriptions of the 50-plus newly published databases.
| Database name | Brief description | URL |
| 3DSNP | Human noncoding SNPs: interactions with genes and other SNPs | http://biotech.bmi.ac.cn/3dsnp/ |
| AAgAtlas | Human AutoAntigen database | http://aagatlas.ncpsb.org |
| ADPriboDB | ADP-ribosylated proteins and sites | http://adpribodb.leunglab.org/ |
| antiSMASH | antibiotics and Secondary Metabolite Analysis SHell | http://antismash-db.secondarymetabolites.org |
| AraPheno | Phenotypic data for Arabidopsis thaliana | https://arapheno.1001genomes.org |
| ccNET | Co-expression networks for diploid and polyploid Gossypium | http://structuralbiology.cau.edu.cn/gossypium/ |
| CeNDR | C. elegans Natural Diversity Resource | http://www.elegansvariation.org |
| CGDB | Circadian Gene database | http://cgdb.biocuckoo.org/ |
| CistromeDB | ChIP-Seq and DNase-Seq data in human and mouse | http://cistrome.org/db |
| Coexpedia | Gene co-expression data mapped to medical subject headings (MeSH) | http://www.coexpedia.org |
| dbSAP | Single Amino acid Polymorphisms: SNP-derived variation in human proteins | http://www.megabionet.org/dbSAP |
| denovo-db | Human de novo gene variants detected by parent-child sequencing | http://denovo-db.gs.washington.edu |
| DrugCentral | Active ingredients of approved pharmaceutical products, indications and mode of action | http://drugcentral.org |
| EURISCO | European catalogue for plant genetic resources | http://eurisco.ecpgr.org/ |
| ExAC browser | Exome Aggregation Consortium sequence data | http://exac.broadinstitute.org |
| Exposome-Explorer | Biomarkers of exposure to disease risk factors | http://exposome-explorer.iarc.fr |
| FAIRDOMHub | Findable, Accessible, Interoperable and Reusable Data, Operating procedures and Models | https://fairdomhub.org/ |
| FuzDB | Database of fuzzy protein complexes | http://protdyn-database.org |
| GenomeCRISPR | High-throughput screening using the CRISPR/Cas-9 system | http://genomecrispr.org |
| GTRD | Gene Transcription Regulation Database | http://gtrd.biouml.org |
| HieranoiDB | Ortholog groups and trees inferred by Hieranoid2 software | http://hieranoidb.sbc.su.se/ |
| IGSR | International Genome Sample Resource | http://www.1000genomes.org/data-portal |
| IMG/VR | DOE Joint Genome Institute Viral Resource | https://img.jgi.doe.gov/vr/ |
| JET2 Viewer | Joint Evolutionary Trees: protein-protein interaction patches in known structures | http://www.lcqb.upmc.fr/jet2viewer/ |
| jPOSTrepo | Japanese ProteOme STandard repository | https://repository.jpostdb.org/ |
| KERIS | Kaleidoscope of gEne Responses to Inflammation among Species | http://igenomed.org/KERIS |
| LinkProt | Topologically complex protein structures | http://linkprot.cent.uw.edu.pl/ |
| LNCediting | RNA editing sites in lncRNAs from human, monkey, mouse and fly | http://bioinfo.life.hust.edu.cn/LNCediting/ |
| MEGaRes | Mechanisms of antimicrobial resistance | https://meg.colostate.edu/MEGaRes/ |
| Membranome | A database of single-pass membrane proteins | http://membranome.org/ |
| MethSMRT | DNA methylation data from Single Molecule, Real-Time sequencing | http://sysbio.sysu.edu.cn/methsmrt |
| mirDNMR | Background de novo mutation rates in human genes | https://www.wzgenomics.cn/mirdnmr/ |
| Monarch Initiative | Human disease-related genotypes and phenotypes in model organisms | http://monarchinitiative.org |
| MRPrimerV | PCR primer pairs for detecting RNA virus-mediated infectious diseases | http://infolab.dgist.ac.kr/MRPrimerV |
| mutLBSgeneDB | Mutations in Ligand Binding Sites gene DataBase | http://www.zhaobioinfo.org/mutLBSgeneDB/ |
| NSDNA | Nervous System Disease NcRNA Atlas | http://www.bio-bigdata.net/nsdna/ |
| Ontobee | Ontology database server of OBO Foundry | http://www.ontobee.org/ |
| Open Targets | Target validation platform: links between potential drug targets and diseases | https://targetvalidation.org |
| pathDIP | Pathway data integration and analysis portal | http://ophid.utoronto.ca/pathDIP |
| PathoYeastract | Transcription regulation in pathogenic yeasts | http://pathoyeastract.org/index.php |
| PceRBase | Plant competing endogenous RNAs | http://bis.zju.edu.cn/pcernadb/index.jsp |
| Pharos | Data on unstudied and understudied drug targets | https://pharos.nih.gov/idg/index |
| PLaMoM | Plant Mobile Macromolecules: Extracellular siRNAs, microRNAs, mRNAs and proteins in plants | http://www.byanbioinfo.org/plamom/ |
| Plant Reactome | Plant metabolic, regulatory and signaling pathways | http://plantreactome.gramene.org/ |
| PMDBase | Plant microsatellites and marker development | http://www.sesame-bioinfo.org/PMDBase |
| POSTAR | Post-transcriptional regulation by RNA-binding proteins | http://POSTAR.ncrnalab.org |
| proGenomes | Consistently annotated bacterial and archaeal genomes | http://van.embl.de/progene/ |
| Proteome-pI | Pre-computed isoelectric points for >5000 proteomes | http://isoelectricpointdb.org/ |
| REDIportal | A-to-I RNA editing events in human | http://srv00.recas.ba.infn.it/atlas/ |
| RNALocate | RNA localization in the cell | http://www.rna-society.org/rnalocate/ |
| SNP2TFBS | Regulatory SNPs affecting predicted transcription factor binding sites | http://ccg.vital-it.ch/snp2tfbs/ |
| SoyNet | Co-functional networks for soybean Glycine max | http://www.inetbio.org/soynet/ |
| TFBSbank | Transcription Factor Binding Site profiles deduced from ChIP-seq or ChIP-chip data | http://tfbsbank.co.uk/ |
| TSTMP | Target Selection database for human TransMembrane Proteins | http://tstmp.enzim.ttk.mta.hu |
| Uniclust | Clustered protein sequences and multiple sequence alignments | http://uniclust.mmseqs.com/ |
| WERAM | Writers, Erasers and Readers of histone Acetylation and Methylation | http://weram.biocuckoo.org/ |
Note: some of these sites may no longer be available.
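Because some of the links above may have gone stale, a quick script can show which ones still respond. The following is a minimal sketch, not a definitive link checker. The function names `parse_rows` and `is_reachable` are hypothetical, and it assumes the table is available as plain pipe-delimited text like the rows above. It uses only Python's standard `urllib` and sends a `HEAD` request, which some servers reject even when the site is up, so treat a negative result as a hint rather than proof.

```python
import urllib.error
import urllib.request


def parse_rows(table_text):
    """Return (database name, URL) pairs from pipe-delimited table rows.

    The URL is taken from whichever cell starts with http:// or https://,
    so the column order does not matter. Rows without a URL cell
    (headers, group labels) are skipped.
    """
    pairs = []
    for line in table_text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        urls = [c for c in cells if c.startswith(("http://", "https://"))]
        if len(cells) >= 2 and urls:
            pairs.append((cells[0], urls[0]))
    return pairs


def is_reachable(url, opener=urllib.request.urlopen, timeout=10):
    """Return True if the server answers with an HTTP status below 400.

    `opener` is injectable (an assumption made for testability) so the
    function can be exercised without touching the network.
    """
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "link-check-sketch"}
    )
    try:
        with opener(request, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failures, timeouts, HTTP errors and malformed URLs all
        # count as "not reachable" in this sketch.
        return False
```

To use it, pass the table text to `parse_rows` and call `is_reachable(url)` on each returned pair, for example `for name, url in parse_rows(table_text): print(name, url, is_reachable(url))`.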
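Looking back over the past 12 years, the table below lists the databases that have served as authoritative, comprehensive, and convenient resources widely used by the whole community. In this editor's view, a good database not only raises its developers' profile and standing but also speeds up research output to everyone's benefit.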
| Database name | Current URL | Brief description |
| Annual updates | | |
| DDBJ | http://www.ddbj.nig.ac.jp | All known nucleotide and protein sequences |
| ENA | http://www.ebi.ac.uk/ena | All known nucleotide and protein sequences |
| GenBank | https://www.ncbi.nlm.nih.gov/genbank/ | All known nucleotide and protein sequences |
| Ensembl | http://www.ensembl.org/ | Annotated information on eukaryotic genomes |
| Mouse Genome Database | http://www.informatics.jax.org | Mouse genome database |
| UCSC Genome Browser | http://genome.ucsc.edu/ | A universal genome viewing and analysis platform |
| UniProt | http://www.uniprot.org | A universal database of protein sequences (includes Swiss-Prot and TrEMBL) |
| Regular updates | | |
| ArrayExpress | http://www.ebi.ac.uk/arrayexpress | Array-based gene expression data |
| BioCyc | http://biocyc.org/ | Pathway information for sequenced |
| BioGRID | http://www.thebiogrid.org | Genetic and physical interactions in yeast, worm and fly |
| BRENDA | http://www.brenda-enzymes.info | Enzyme names and biochemical properties |
| CGD | http://www.candidagenome.org/ | Candida Genome Database |
| CanSAR | http://cansar.icr.ac.uk | Cancer research and drug discovery resource |
| CATH | http://www.cathdb.info | Protein domain structure database |
| CAZy | http://www.cazy.org | Carbohydrate-Active enZymes database |
| CDD | http://www.ncbi.nlm.nih.gov/cdd | Conserved Domain Database |
| ChEBI | http://www.ebi.ac.uk/chebi | Chemical Entities of Biological Interest |
| ChEMBL | https://www.ebi.ac.uk/chembldb | Interaction of drugs and compounds with their targets |
| ChimerDB | http://ercsb.ewha.ac.kr/fusiongene | Chromosome translocations and gene fusions |
| COG | http://www.ncbi.nlm.nih.gov/COG | Clusters of Orthologous Groups of proteins |
| Comparative Toxicogenomics Database | http://ctdbase.org | A knowledgebase for curated chemical-gene-disease networks |
| COSMIC | http://cancer.sanger.ac.uk | Catalogue of Somatic Mutations in Cancer |
| CyanoBase | http://genome.microbedb.jp/cyanobase | Cyanobacterial genomes |
| dbPTM | http://dbPTM.mbc.nctu.edu.tw/ | Post-translational modification of proteins |
| DBTSS | http://dbtss.hgc.jp/ | Database of transcriptional start sites |
| DEG | http://www.essentialgene.org | Database of essential genes |
| DictyBase | http://dictybase.org | Model organism database for Dictyostelium discoideum |
| DrugBank | http://www.drugbank.ca/ | Drug and drug target database |
| EcoCyc | http://ecocyc.org/ | E. coli K12 genes, metabolic pathways, transporters, and gene regulation |
| eggNOG | http://eggnog.embl.de/ | Evolutionary genealogy of genes: Non-supervised Orthologous Groups |
| ELM | http://elm.eu.org/ | Eukaryotic Linear Motif: functional sites in eukaryotic proteins |
| EMAGE | http://www.emouseatlas.org/emage/ | e-Mouse Atlas of Gene Expression |
| ENCODE project at UCSC | http://genome.ucsc.edu/ENCODE | Encyclopedia of DNA Elements: functional elements in the human genome |
| EPD | http://epd.vital-it.ch | Eukaryotic Promoter Database |
| EuPathDB | http://eupathdb.org/ | Unified genome databases on eukaryotic pathogens (includes PlasmoDB, ToxoDB, ApiDB, TrichDB, TriTrypDB, GiardiaDB, etc.) |
| Expression Atlas | http://www.ebi.ac.uk/gxa/ | Gene expression patterns deduced from microarray and RNA-seq data |
| FANTOM | http://fantom.gsc.riken.jp/ | Functional annotation of mouse full-length cDNA clones |
| FINDBase | http://www.findbase.org | Frequencies of INherited Disorders |
| FlyBase | http://flybase.org/ | Drosophila sequences and genomic information |
| FlyRNAi | http://flyrnai.org/ | Genome-wide RNAi analysis in Drosophila |
| Gene3D | http://gene3d.biochem.ucl.ac.uk | Structural domain assignments for protein sequences |
| Genenames | http://www.genenames.org/ | The HGNC human gene nomenclature database |
| GenomeRNAi | http://www.genomernai.org | RNA interference data for human and Drosophila |
| GEO | http://www.ncbi.nlm.nih.gov/geo/ | NCBI’s Gene Expression Omnibus |
| GO | http://www.geneontology.org | Gene Ontology Database |
| GOA | http://www.ebi.ac.uk/GOA | Gene Ontology annotations for proteins in UniProt |
| GOLD | https://gold.jgi.doe.gov/ | Genomes online database: completed and ongoing genome projects |
| GPCRdb | http://gpcrdb.org/ | Data and tools for studying G protein-coupled receptors |
| Gramene | http://www.gramene.org | Comparative genomics of crops and model plant species |
| GXD | http://www.informatics.jax.org/expression.shtml | Mouse Gene Expression Database |
| HAMAP | http://hamap.expasy.org/ | High-quality Automated and Manual Annotation of Proteins |
| HMDB | http://www.hmdb.ca | Human Metabolome Database |
| IEDB | http://www.iedb.org/ | Immune Epitope Database |
| IMG/M | http://img.jgi.doe.gov/m | JGI’s Integrated Microbial Genomics and Metagenomics |
| IMGT | http://www.imgt.org | International ImMunoGeneTics database |
| InParanoid | http://InParanoid.sbc.su.se | Orthologous relationships between eukaryotic proteomes |
| IntAct | http://www.ebi.ac.uk/intact/ | Protein–Protein INTerACTion data |
| InterPro | http://www.ebi.ac.uk/interpro | Integrated resource of protein families, domains and functional sites |
| IPD | http://www.ebi.ac.uk/ipd | Immuno Polymorphism database (includes IMGT/HLA) |
| JASPAR | http://jaspar.genereg.net/ | PSSMs for transcription factor DNA-binding sites |
| KEGG | http://www.genome.ad.jp/kegg | Kyoto Encyclopedia of Genes and Genomes: genes, proteins, pathways |
| MEROPS | http://merops.sanger.ac.uk/ | Database of proteases (peptidases) |
| MetaCyc | http://metacyc.org/ | Metabolic pathways and enzymes in various organisms |
| miRBase | http://www.mirbase.org/ | MicroRNA sequences, names and predicted targets in animals |
| miRGator | http://mirgator.kobic.re.kr | MicroRNA expression profiles and mRNA targets |
| miRGen | http://www.microrna.gr/mirgen | MicroRNA promoters and transcription start sites |
| miRTarBase | http://miRTarBase.mbc.nctu.edu.tw/ | Experimentally validated microRNA–target interactions |
| MMDB | http://www.ncbi.nlm.nih.gov/Structure | Molecular Modeling Database of protein structures |
| MODOMICS | http://genesilico.pl/modomics/ | RNA modification pathways |
| Mouse Tumor Biology Database | http://tumor.informatics.jax.org/mtbwi/ | Mouse as a model system of human cancers |
| neXtProt | https://www.nextprot.org/ | A database of human proteins |
| NONCODE | http://noncode.org/ | A database of noncoding RNAs |
| OMIM | http://www.omim.org | Online Mendelian inheritance in man: A catalog of human genetic and genomic disorders |
| OrthoDB | http://www.orthodb.org | A hierarchical catalog of orthologous proteins |
| PANTHER | http://www.pantherdb.org | Protein sequence evolution mapped to functions and pathways |
| PATRIC | http://www.patricbrc.org | PathoSystems Resource Integration Center |
| PDB | http://rcsb.org/pdb | Protein DataBank: All biological macromolecular structures |
| PDBe | http://www.ebi.ac.uk/pdbe/ | Protein Databank in Europe |
| PDBsum | http://www.ebi.ac.uk/pdbsum | Summaries and analyses of PDB structures |
| Pfam | http://pfam.xfam.org | Protein families: Multiple sequence alignments and profile hidden Markov models of protein domains |
| PHI-base | http://www4.rothamsted.bbsrc.ac.uk/phibase/ | Genes affecting fungal pathogen–host interactions |
| PIR | http://pir.georgetown.edu/ | Protein Information Resource, part of UniProt |
| PRIDE | http://www.ebi.ac.uk/pride/ | Proteomics peptide identification database |
| PRINTS | http://www.bioinf.man.ac.uk/dbbrowser/PRINTS | Protein fingerprints, conserved motifs used to characterise a protein family |
| Prosite | http://www.expasy.org/prosite | Biologically-significant protein patterns and profiles |
| PubChem | http://pubchem.ncbi.nlm.nih.gov/ | Structures and biological activities of small organic molecules |
| RGD | http://rgd.mcw.edu/ | Rat Genome Database |
| RDP | http://rdp.cme.msu.edu | Ribosomal Database Project: Bacterial and archaeal 16S rRNA and fungal 28S rRNA sequences |
| Reactome | http://www.reactome.org | A database of metabolic and signaling pathways |
| REBASE | http://rebase.neb.com/rebase/ | Restriction enzyme database |
| RefSeq | https://www.ncbi.nlm.nih.gov/refseq/ | NCBI Reference Sequence Database |
| Rfam | http://rfam.xfam.org | RNA families with multiple sequence alignments |
| SCOP | http://scop.mrc-lmb.cam.ac.uk/ | Structural Classification Of Proteins |
| SGD | http://www.yeastgenome.org | Saccharomyces Genome Database |
| SILVA | http://www.arb-silva.de/ | Aligned small- and large subunit rRNA sequences |
| SIMAP | http://mips.gsf.de/simap/ | Similarity Matrix of Proteins |
| SMART | http://smart.embl-heidelberg.de | Simple Modular Architecture Research Tool: signalling, extracellular and chromatin-associated protein domains |
| STITCH | http://stitch-db.org/ | Search Tool for Interactions of Chemicals |
| STRING | http://string.embl.de/ | Predicted functional associations between proteins |
| SUPERFAMILY | http://supfam.org | Genome-wide identification of protein domains of known structure |
| SWISS-MODEL | http://swissmodel.expasy.org/ | 3D models for proteins of unknown structure |
| TAIR | http://www.arabidopsis.org/ | The Arabidopsis information resource |
| TarBase | http://microrna.gr/tarbase | Database of experimentally supported microRNA targets |
| TCDB | http://www.tcdb.org/ | Transporter protein classification database |
| UCSC Cancer Genomics Browser | https://genome-cancer.ucsc.edu/ | Visualization of cancer genomic datasets |
| VectorBase | https://www.vectorbase.org/ | Invertebrate vectors of human pathogens |
| WormBase | http://www.wormbase.org | Community portal on all aspects of C. elegans biology |
| XenBase | http://www.xenbase.org | Xenopus frog database |
| YEASTRACT | http://www.yeastract.com | Transcriptional regulation in Saccharomyces cerevisiae |
| ZFIN | http://zfin.org/ | Zebrafish information network |
References
1. Galperin M Y, Fernández-Suárez X M, Rigden D J. The 24th annual Nucleic Acids Research database issue: a look back and upcoming changes[J]. Nucleic Acids Research, 2017, 45(Database issue): D1-D11.
