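1. SNP2TFBS – a database of regulatory SNPs affecting predicted transcription factor binding site affinity
SNP2TFBS is a database for studying how variants in regulatory elements of the human genome alter the underlying molecular regulatory mechanisms. The English abstract reads: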
SNP2TFBS is a computational resource intended to support researchers investigating the molecular mechanisms underlying regulatory variation in the human genome. The database essentially consists of a collection of text files providing specific annotations for human single nucleotide polymorphisms (SNPs), namely whether they are predicted to abolish, create or change the affinity of one or several transcription factor (TF) binding sites. A SNP's effect on TF binding is estimated based on a position weight matrix (PWM) model for the binding specificity of the corresponding factor. These data files are regenerated at regular intervals by an automatic procedure that takes as input a reference genome, a comprehensive SNP catalogue and a collection of PWMs. SNP2TFBS is also accessible over a web interface, enabling users to view the information provided for an individual SNP, to extract SNPs based on various search criteria, to annotate uploaded sets of SNPs or to display statistics about the frequencies of binding sites affected by selected SNPs.
Database URL: http://ccg.vital-it.ch/snp2tfbs/
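To make the PWM idea concrete, here is a minimal Python sketch of how a SNP's effect on a binding site might be estimated: score the reference and the alternate allele with a log-odds PWM and compare the two. The count matrix, the sequence, the pseudocount and the function names (`log_odds`, `best_score`, `snp_effect`) are all made up for illustration. This is not SNP2TFBS's actual pipeline, and it scans the forward strand only.

```python
import math

# Hypothetical 4-position count matrix (not taken from SNP2TFBS).
COUNTS = {
    "A": [8, 0, 0, 1],
    "C": [0, 8, 0, 1],
    "G": [0, 0, 8, 1],
    "T": [0, 0, 0, 5],
}


def log_odds(counts, pseudocount=1.0, background=0.25):
    """Convert base counts into a log2-odds PWM (one dict per position)."""
    width = len(counts["A"])
    pwm = []
    for i in range(width):
        total = sum(counts[b][i] for b in "ACGT") + 4 * pseudocount
        pwm.append({
            b: math.log2((counts[b][i] + pseudocount) / total / background)
            for b in "ACGT"
        })
    return pwm


def best_score(pwm, seq, snp_pos):
    """Best PWM score over all windows that overlap snp_pos (0-based)."""
    width = len(pwm)
    first = max(0, snp_pos - width + 1)
    last = min(snp_pos, len(seq) - width)
    best = -math.inf
    for start in range(first, last + 1):
        score = sum(pwm[i][seq[start + i]] for i in range(width))
        best = max(best, score)
    return best


def snp_effect(pwm, ref_seq, snp_pos, alt_base):
    """Return (ref_score, alt_score, alt_score - ref_score) for one SNP."""
    alt_seq = ref_seq[:snp_pos] + alt_base + ref_seq[snp_pos + 1:]
    ref = best_score(pwm, ref_seq, snp_pos)
    alt = best_score(pwm, alt_seq, snp_pos)
    return ref, alt, alt - ref


pwm = log_odds(COUNTS)
ref, alt, delta = snp_effect(pwm, "TTACGTTT", snp_pos=3, alt_base="T")
print(f"ref={ref:.2f} alt={alt:.2f} delta={delta:.2f}")
```

A negative delta means the alternate allele weakens the best site overlapping the SNP. A positive delta means it strengthens an existing site or creates a new one.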
2. TFBSTools: an R/bioconductor package for transcription factor binding site analysis
TFBSTools is a tool for genome-wide identification and visualization of transcription factor binding sites. The English abstract reads:
The ability to efficiently investigate transcription factor binding sites (TFBSs) genome-wide is central to computational studies of gene regulation. TFBSTools is an R/Bioconductor package for the analysis and manipulation of TFBSs and their associated transcription factor profile matrices. TFBStools provides a toolkit for handling TFBS profile matrices, scanning sequences and alignments including whole genomes, and querying the JASPAR database. The functionality of the package can be easily extended to include advanced statistical analysis, data visualization and data integration.
Download URL: http://bioconductor.org/packages/TFBSTools/
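TFBSTools itself is an R package, so the following is only a language-neutral illustration in Python of what "scanning a sequence with a profile matrix" involves. It does not use or mirror the TFBSTools API. The matrix `PFM`, the pseudocount, the relative-score cut-off and the function names are assumptions made for this sketch.

```python
import math

# Hypothetical count matrix for a GATAA-like motif (illustrative only).
PFM = {
    "A": [0, 10, 0, 10, 9],
    "C": [0, 0, 0, 0, 0],
    "G": [10, 0, 0, 0, 1],
    "T": [0, 0, 10, 0, 0],
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def to_pwm(pfm, pseudocount=0.8, background=0.25):
    """Turn a count matrix into per-position log2-odds scores."""
    width = len(pfm["A"])
    pwm = []
    for i in range(width):
        total = sum(pfm[b][i] for b in "ACGT") + 4 * pseudocount
        pwm.append({
            b: math.log2((pfm[b][i] + pseudocount) / total / background)
            for b in "ACGT"
        })
    return pwm


def scan(pwm, seq, min_rel_score=0.8):
    """Return (start, strand, score) hits on both strands.

    A window is a hit when its score reaches min_rel_score of the way
    from the lowest to the highest score the matrix can produce.
    'start' is always the leftmost forward-strand coordinate (0-based).
    """
    width = len(pwm)
    max_s = sum(max(col.values()) for col in pwm)
    min_s = sum(min(col.values()) for col in pwm)
    cutoff = min_s + min_rel_score * (max_s - min_s)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        for strand, site in (("+", window), ("-", reverse_complement(window))):
            score = sum(col[base] for col, base in zip(pwm, site))
            if score >= cutoff:
                hits.append((start, strand, score))
    return hits


for start, strand, score in scan(to_pwm(PFM), "CCGATAACCTTATCGG"):
    print(start, strand, round(score, 2))
```

Scanning the reverse complement of each window is what lets a single matrix find sites on either strand. For whole genomes, a real tool would replace these Python loops with something far faster.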
3. TcoF-DB v2: update of the database of human and mouse transcription co-factors and transcription factor interactions
TcoF-DB v2 is a database of human and mouse co-factors that regulate transcription factors.
The English abstract reads:
For the regulation of transcription, interactions of different regulatory proteins known as transcription co-factors (TcoFs) and TFs are essential in forming necessary protein complexes. Although TcoFs themselves do not bind DNA directly, their influence on transcriptional regulation and initiation, although indirect, has been shown to be significant, with the functionality of TFs strongly influenced by the presence of TcoFs. In the TcoF-DB v2 database, we collect information on TcoFs. In this article, we describe updates and improvements implemented in TcoF-DB v2. TcoF-DB v2 provides several new features that enables exploration of the roles of TcoFs. The content of the database has significantly expanded, and is enriched with information from Gene Ontology, biological pathways, diseases and molecular signatures. TcoF-DB v2 now includes many more TFs; has substantially increased the number of human TcoFs to 958, and now includes information on mouse (418 new TcoFs). TcoF-DB v2 enables the exploration of information on TcoFs and allows investigations into their influence on transcriptional regulation in humans and mice.
URL: http://tcofdb.org/
4. PEDLA: predicting enhancers with a deep learning-based algorithmic framework
PEDLA is a deep learning-based tool for predicting enhancers.
The English abstract reads:
Although existing methods have achieved some success in enhancer prediction, they still suffer from many issues. We developed a deep learning-based algorithmic framework named PEDLA, which can directly learn an enhancer predictor from massively heterogeneous data and generalize in ways that are mostly consistent across various cell types/tissues. We first trained PEDLA with 1,114-dimensional heterogeneous features in H1 cells, and demonstrated that PEDLA framework integrates diverse heterogeneous features and gives state-of-the-art performance relative to five existing methods for enhancer prediction. We further extended PEDLA to iteratively learn from 22 training cell types/tissues. Our results showed that PEDLA manifested superior performance consistency in both training and independent test sets. On average, PEDLA achieved 95.0% accuracy and a 96.8% geometric mean (GM) of sensitivity and specificity across 22 training cell types/tissues, as well as 95.7% accuracy and a 96.8% GM across 20 independent test cell types/tissues. Together, our work illustrates the power of harnessing state-of-the-art deep learning techniques to consistently identify regulatory elements at a genome-wide scale from massively heterogeneous data across diverse cell types/tissues.
Download URL: https://github.com/wenjiegroup/PEDLA
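The abstract reports both accuracy and the geometric mean (GM) of sensitivity and specificity. As a quick reminder of how these relate, the sketch below computes them from a confusion matrix. The counts are invented for illustration and have nothing to do with PEDLA's actual results; the function name `accuracy_and_gm` is likewise hypothetical.

```python
import math


def accuracy_and_gm(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity, GM) from confusion counts."""
    sensitivity = tp / (tp + fn)          # fraction of true enhancers recovered
    specificity = tn / (tn + fp)          # fraction of non-enhancers rejected
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    gm = math.sqrt(sensitivity * specificity)
    return accuracy, sensitivity, specificity, gm


# Made-up counts with far more negatives than positives.
acc, sens, spec, gm = accuracy_and_gm(tp=90, fp=50, tn=950, fn=10)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} "
      f"specificity={spec:.3f} GM={gm:.3f}")
```

Accuracy is a weighted average of sensitivity and specificity and so follows whichever class is larger, while GM treats the two classes symmetrically. That is presumably why both figures are reported for a task where enhancers are much rarer than background regions.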
5. PlantTFDB 4.0: toward a central hub for transcription factors and regulatory interactions in plants
PlantTFDB is a database of plant transcription factors and their target genes.
The English abstract reads:
With the goal of providing a comprehensive, high-quality resource for both plant transcription factors (TFs) and their regulatory interactions with target genes, we upgraded plant TF database PlantTFDB to version 4.0 (http://planttfdb.cbi.pku.edu.cn/). In the new version, we identified 320 370 TFs from 165 species, presenting a more comprehensive genomic TF repertoires of green plants. Besides updating the pre-existing abundant functional and evolutionary annotation for identified TFs, we generated three new types of annotation which provide more directly clues to investigate functional mechanisms underlying: (i) a set of high-quality, non-redundant TF binding motifs derived from experiments; (ii) multiple types of regulatory elements identified from high-throughput sequencing data; (iii) regulatory interactions curated from literature and inferred by combining TF binding motifs and regulatory elements. In addition, we upgraded previous TF prediction server, and set up four novel tools for regulation prediction and functional enrichment analyses. Finally, we set up a novel companion portal PlantRegMap (http://plantregmap.cbi.pku.edu.cn) for users to access the regulation resource and analysis tools conveniently.
URL: http://planttfdb.cbi.pku.edu.cn/
6. JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles
JASPAR is a database of curated, non-redundant transcription factor binding profiles.
The English abstract reads:
JASPAR is an open-access database storing curated, non-redundant transcription factor (TF) binding profiles representing transcription factor binding preferences as position frequency matrices for multiple species in six taxonomic groups. For this 2016 release, we expanded the JASPAR CORE collection with 494 new TF binding profiles (315 in vertebrates, 11 in nematodes, 3 in insects, 1 in fungi and 164 in plants) and updated 59 profiles (58 in vertebrates and 1 in fungi). The introduced profiles represent an 83% expansion and 10% update when compared to the previous release. We updated the structural annotation of the TF DNA binding domains (DBDs) following a published hierarchical structural classification. In addition, we introduced 130 transcription factor flexible models trained on ChIP-seq data for vertebrates, which capture dinucleotide dependencies within TF binding sites. This new JASPAR release is accompanied by a new web tool to infer JASPAR TF binding profiles recognized by a given TF protein sequence. Moreover, we provide the users with a Ruby module complementing the JASPAR API to ease programmatic access and use of the JASPAR collection of profiles. Finally, we provide the JASPAR2016 R/Bioconductor data package with the data of this release.
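For readers who have not handled a position frequency matrix before, the sketch below parses one from a small text block and derives the consensus sequence and per-column information content. The bracketed layout is modeled on the JASPAR-style text format as commonly distributed, but the matrix ID `MA0000.0`, the name `EXAMPLE`, the counts and all function names are placeholders invented here. Integer counts are assumed.

```python
import math
import re

# Placeholder profile in a JASPAR-like layout (ID and name are invented).
EXAMPLE = """\
>MA0000.0 EXAMPLE
A  [ 4 19  0  0  0  0 ]
C  [16  0 20  0  0  0 ]
G  [ 0  1  0 20  0 20 ]
T  [ 0  0  0  0 20  0 ]
"""


def parse_pfm(text):
    """Parse one profile into (matrix_id, name, {base: [counts]})."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    header = lines[0].lstrip(">").split()
    matrix_id = header[0]
    name = header[1] if len(header) > 1 else ""
    counts = {}
    for ln in lines[1:]:
        base = ln[0].upper()
        counts[base] = [int(x) for x in re.findall(r"\d+", ln[1:])]
    return matrix_id, name, counts


def column_frequencies(counts):
    """Normalize each column of counts so that it sums to 1."""
    width = len(counts["A"])
    freqs = []
    for i in range(width):
        total = sum(counts[b][i] for b in "ACGT")
        freqs.append({b: counts[b][i] / total for b in "ACGT"})
    return freqs


def consensus(freqs):
    """Most frequent base at each position."""
    return "".join(max("ACGT", key=lambda b: col[b]) for col in freqs)


def information_content(freqs):
    """Per-column information content in bits (uniform background)."""
    return [
        2.0 + sum(p * math.log2(p) for p in col.values() if p > 0)
        for col in freqs
    ]


matrix_id, name, counts = parse_pfm(EXAMPLE)
freqs = column_frequencies(counts)
print(matrix_id, name, consensus(freqs))
print([round(ic, 2) for ic in information_content(freqs)])
```

Columns where one base dominates carry close to 2 bits, while a column with evenly spread counts carries close to 0. This is the quantity that sequence logos draw as letter height.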
References
Kumar S, Ambrosini G, Bucher P. SNP2TFBS – a database of regulatory SNPs affecting predicted transcription factor binding site affinity[J]. Nucleic Acids Research, 2017, 45(D1): D139-D144.
Tan G, Lenhard B. TFBSTools: an R/bioconductor package for transcription factor binding site analysis[J]. Bioinformatics, 2016: btw024.
Schmeier S, Alam T, Essack M, et al. TcoF-DB v2: update of the database of human and mouse transcription co-factors and transcription factor interactions[J]. Nucleic Acids Research, 2017, 45(D1): D145-D150.
Liu F, Li H, Ren C, et al. PEDLA: predicting enhancers with a deep learning-based algorithmic framework[J]. Scientific Reports, 2016, 6.
Jin J, Tian F, Yang D C, et al. PlantTFDB 4.0: toward a central hub for transcription factors and regulatory interactions in plants[J]. Nucleic Acids Research, 2017, 45(D1): D1040-D1045.
Mathelier A, Fornes O, Arenillas D J, et al. JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles[J]. Nucleic Acids Research, 2016, 44(D1): D110-D115.