myTAI

Evolutionary Transcriptomics with R

Motivation

Evolutionary transcriptomics studies can serve as a first approach to screen in silico for the potential existence of evolutionary constraints within a biological process of interest. This is achieved by quantifying transcriptome conservation patterns and their underlying gene sets in biological processes. The exploratory analysis functions implemented in myTAI provide users with a standardized, automated and optimized framework to detect patterns of evolutionary constraints in any transcriptome dataset of interest.

Please find detailed documentation here.

Citation

Please cite the following paper when using myTAI for your own research. This will allow me to continue working on this software tool and will motivate me to extend its functionality and usability in the coming years. Many thanks in advance :)

Drost et al. myTAI: evolutionary transcriptomics with R. Bioinformatics 2018, 34 (9), 1589-1590. doi:10.1093

Installation

Please install the following package dependencies:

# Install core Bioconductor packages
if (!requireNamespace("BiocManager"))
    install.packages("BiocManager")
BiocManager::install()
# Install package dependencies
BiocManager::install("Biostrings")
BiocManager::install("edgeR")

Now users can install myTAI from CRAN:

# install myTAI 0.9.3
install.packages("myTAI", dependencies = TRUE)

Short package description

Using myTAI, any existing or newly generated transcriptome dataset can be combined with evolutionary information (find details here) to retrieve novel insights about the evolutionary conservation of the transcriptome at hand.
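
As a rough sketch of what "combining" expression data with evolutionary information can look like, the base R code below joins a gene age map to an expression table by gene ID. All gene IDs, ages, and expression values are invented, and the column layout (age first, gene ID second, one column per stage afterwards) is an assumption made for illustration rather than a statement about what myTAI requires.

```r
# Hypothetical toy inputs; all IDs and values are invented for illustration.
# Gene age map: one phylostratum (PS) per gene, PS1 being the oldest class.
age_map <- data.frame(
  Phylostratum = c(1, 1, 2, 3),
  GeneID       = c("gene_a", "gene_b", "gene_c", "gene_d")
)

# Expression table: one row per gene, one column per stage.
expression <- data.frame(
  GeneID = c("gene_d", "gene_a", "gene_c", "gene_x"),
  Stage1 = c(10, 200, 35, 7),
  Stage2 = c(40, 180, 30, 9),
  Stage3 = c(90, 150, 20, 8)
)

# Inner join on GeneID: keeps only genes that have both an age assignment
# and expression values (gene_b and gene_x are dropped here).
age_expression <- merge(age_map, expression, by = "GeneID")

# Reorder columns: age first, gene ID second, stages afterwards.
age_expression <- age_expression[, c("Phylostratum", "GeneID",
                                     "Stage1", "Stage2", "Stage3")]
print(age_expression)
```

Genes without an age assignment or without expression values are silently dropped by the inner join, so it is worth comparing the row counts before and after the merge.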

To support large-scale evolutionary transcriptomics studies, the myTAI package implements quantification, statistical assessment, and analytics functionality that allows researchers to study the evolution of biological processes by determining stages or periods of evolutionary conservation or variability in transcriptome data.

We hope that myTAI will become the community standard tool to perform evolutionary transcriptomics studies and we are happy to add required functionality upon request.

Scientific background

Today, phenotypic phenomena such as morphological mutations, diseases, or developmental processes are primarily investigated at the molecular level using transcriptomics approaches. A transcriptome denotes the complete set of quantifiable transcripts present at a specific stage of a biological process. In disease or developmental (defect) studies, transcriptomes are usually measured over several time points. In treatment studies that aim to quantify differences in the transcriptome due to biotic stimuli, abiotic stimuli, or diseases, treatment / disease transcriptomes are usually compared with non-treatment / non-disease transcriptomes. In either case, comparing changes in transcriptomes over time or between treatments allows us to identify genes and gene regulatory mechanisms that might be involved in governing the biological process under investigation. Although transcriptomics studies are based on a powerful methodology, little is known about the evolution of such transcriptomes. Understanding the evolutionary mechanisms that change transcriptomes over time, however, might give us a new perspective on how diseases emerge in the first place or how morphological changes are triggered by changes in developmental transcriptomes.

Evolutionary transcriptomics aims to capture and quantify the evolutionary conservation of genes that contribute to the transcriptome during a specific stage of the biological process of interest. The resulting temporal conservation pattern then makes it possible to detect stages of development or other biological processes that are evolutionarily conserved (Drost et al., 2018). At the highest level, this quantification is achieved through transcriptome indices (e.g. the Transcriptome Age Index or the Transcriptome Divergence Index), which quantify the average evolutionary age or sequence conservation of genes that contribute to the transcriptome at a particular stage. In general, evolutionary transcriptomics can be used to quantify the evolutionary conservation of transcriptomes and to investigate how transcriptomes underlying biological processes are constrained or channeled by events in evolutionary history (Dollo’s law) (Drost et al., 2017).
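
As a minimal sketch of what such an index computes, the Transcriptome Age Index of a stage s can be written as the expression-weighted mean of gene ages, TAI_s = sum_i(ps_i * e_is) / sum_i(e_is), following Domazet-Lošo and Tautz (2010). The base R code below applies this formula to invented toy data; the gene names, ages, and expression values are assumptions for illustration and are not taken from myTAI.

```r
# Toy data (invented for illustration): 4 genes measured at 3 stages.
# Phylostratum 1 is the oldest age class, 3 the youngest.
phylostratum <- c(1, 1, 2, 3)
expression <- matrix(
  c(200, 180, 150,
     50,  50,  50,
     35,  30,  20,
     10,  40,  90),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("gene_a", "gene_b", "gene_c", "gene_d"),
                  c("Stage1", "Stage2", "Stage3"))
)

# Expression-weighted mean gene age per stage:
# TAI_s = sum_i(ps_i * e_is) / sum_i(e_is)
# The age vector has one entry per row, so it is recycled down each column
# and every gene's expression is multiplied by that gene's age.
tai <- colSums(phylostratum * expression) / colSums(expression)
print(tai)
```

In this toy example the index rises from Stage1 to Stage3 because the youngest gene (gene_d) is increasingly expressed; lower values indicate a transcriptome dominated by older genes.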

Please note that myTAI relies on gene age inference, and the best approaches for gene age inference have been debated extensively in recent years. Please follow my updated discussion about the gene age inference literature.

Install Developer Version

Some bug fixes or new functionality will not be available on CRAN yet, but in the developer version here on GitHub. To download and install the most recent version of myTAI run:

if (!requireNamespace("BiocManager"))
    install.packages("BiocManager")
BiocManager::install()
# Install package dependencies
BiocManager::install("Biostrings")
BiocManager::install("edgeR")
# install developer version of myTAI
BiocManager::install("drostlab/myTAI")

NEWS

The current status of the package as well as a detailed history of the functionality of each version of myTAI can be found in the NEWS section.

Tutorials

The following tutorials provide use cases and detailed explanations of how to quantify transcriptome conservation with myTAI and how to interpret the results generated with this software tool.

Example

Load example data

library(myTAI)
# example dataset covering 7 stages of A. thaliana embryo development
data("PhyloExpressionSetExample")
# transform absolute expression levels to log2 expression levels
ExprExample <- tf(PhyloExpressionSetExample, log2)

Quantify transcriptome conservation using TAI

# visualize global Transcriptome Age Index pattern
PlotSignature(ExprExample)

Quantify expression level distributions for each gene age category

# plot expression level distributions for each age (=PS) category 
# and each developmental stage 
PlotCategoryExpr(ExprExample, "PS")

Quantify mean expression of individual gene age categories

# plot mean expression of each age category separated by old (PS1-3)
# versus young (PS4-12) genes
PlotMeans(ExprExample, Groups = list(1:3, 4:12))

Quantify relative mean expression of each age category separated by old versus young genes

# plot relative mean expression of each age category separated by old (PS1-3)
# versus young (PS4-12) genes
PlotRE(ExprExample, Groups = list(1:3, 4:12))
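
To make "relative mean expression" concrete, here is a base R sketch of the rescaling as it is commonly defined (Domazet-Lošo and Tautz, 2010): the mean expression of each age category is computed per stage, and each category's profile is then rescaled to the range 0 to 1 via (f_s - f_min) / (f_max - f_min). The toy data are invented, and whether PlotRE computes exactly this quantity internally is an assumption here, not something this README states.

```r
# Toy data (invented for illustration): 6 genes in 3 age categories, 3 stages.
phylostratum <- c(1, 1, 2, 2, 3, 3)
expression <- matrix(
  c(100, 80, 60,
     60, 40, 20,
     10, 30, 20,
     30, 50, 20,
      5, 10, 25,
     15, 20, 35),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("gene_", letters[1:6]),
                  c("Stage1", "Stage2", "Stage3"))
)

# Mean expression of each age category at each stage
# (rows = age categories, columns = stages).
ps_means <- apply(expression, 2, function(x) tapply(x, phylostratum, mean))

# Rescale each age category's profile to [0, 1].
# Note: a completely flat profile would divide by zero in this sketch.
rescale <- function(f) (f - min(f)) / (max(f) - min(f))
relative_expression <- t(apply(ps_means, 1, rescale))
print(relative_expression)
```

Because every profile is rescaled independently, relative expression shows when each age category peaks across stages but hides differences in absolute expression level between categories.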

# plot the significant differences between gene expression distributions 
# of old (=group1) versus young (=group2) genes
PlotGroupDiffs(ExpressionSet = ExprExample,
               Groups        = list(group_1 = 1:3, group_2 = 4:12),
               legendName    = "PS",
               plot.type     = "boxplot")

Getting started with myTAI

Users can also read the tutorials within R (RStudio):

# load the myTAI package
library(myTAI)

# look for all tutorials (vignettes) available in the myTAI package
# this will open your web browser
browseVignettes("myTAI")

# or as single tutorials

# open tutorial: Introduction to Phylotranscriptomics and myTAI
vignette("Introduction", package = "myTAI")

# open tutorial: Intermediate Concepts of Phylotranscriptomics
vignette("Intermediate", package = "myTAI")

# open tutorial: Advanced Concepts of Phylotranscriptomics
vignette("Advanced", package = "myTAI")

# open tutorial: Age Enrichment Analyses
vignette("Enrichment", package = "myTAI")

# open tutorial: Gene Expression Analysis with myTAI
vignette("Expression", package = "myTAI")

# open tutorial: Taxonomic Information Retrieval with myTAI
vignette("Taxonomy", package = "myTAI")

In the myTAI framework users can find:

Phylotranscriptomics Measures:

Visualization and Analytics Tools:

A Statistical Framework and Test Statistics:

All functions also include visual analytics tools to quantify the goodness of test statistics.

Differential Gene Expression Analysis

Taxonomic Information Retrieval

Minor Functions for Better Usability and Additional Analyses

Studies that successfully used myTAI to quantify transcriptome conservation:

Discussions and Bug Reports

I would be very happy to learn more about potential improvements of the concepts and functions provided in this package.

Furthermore, in case you find some bugs or need additional (more flexible) functionality of parts of this package, please let me know:

https://github.com/drostlab/myTAI/issues

References

Domazet-Lošo T. and Tautz D. A phylogenetically based transcriptome age index mirrors ontogenetic divergence patterns. Nature (2010) 468: 815-8.

Quint M, Drost HG, et al. A transcriptomic hourglass in plant embryogenesis. Nature (2012) 490: 98-101.

Drost HG, Gabel A, Grosse I, Quint M. Evidence for Active Maintenance of Phylotranscriptomic Hourglass Patterns in Animal and Plant Embryogenesis. Mol. Biol. Evol. (2015) 32 (5): 1221-1231.

Drost HG, Bellstädt J, Ó’Maoiléidigh DS, Silva AT, Gabel A, Weinholdt C, Ryan PT, Dekkers BJW, Bentsink L, Hilhorst H, Ligterink W, Wellmer F, Grosse I, and Quint M. Post-embryonic hourglass patterns mark ontogenetic transitions in plant development. Mol. Biol. Evol. (2016) doi:10.1093/molbev/msw039

Acknowledgement

I would like to thank several individuals for making this project possible.

First, I would like to thank Ivo Grosse and Marcel Quint for providing me with a place and the environment to work on fascinating topics of Evo-Devo research and for the fruitful discussions that led to projects like this one.

Furthermore, I would like to thank Alexander Gabel and Jan Grau for valuable discussions on how to improve some methodological concepts of some analyses present in this package.

I would also like to thank my past Master's students Sarah Scharfenberg, Anne Hoffmann, and Sebastian Wussow, who worked intensively with this package and helped me to improve the usability and logic of the package environment.