3.5. Detection of Highly Age-Related Transcripts: AgingGenes

The first analysis identified transcripts highly related to age. To reduce noise in each time series (one per transcript), we applied a moving average with a 5-year window to the ageCollapsed dataset, using the TTR package for R (“CRAN - Package TTR,” [s.d.]). The dimensionality was then reduced through two procedures so that only the most informative probes entered the analysis. Finally, probes highly correlated with age were selected.
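The smoothing step can be illustrated with a minimal Python sketch. It assumes a trailing simple moving average over consecutive yearly ages, so that a 5-year window spans 5 points; the text does not state which TTR function or alignment was used, and the function name and example values are hypothetical.

```python
from statistics import fmean


def moving_average(values, window=5):
    """Trailing simple moving average.

    The first window - 1 positions are None because they do not yet
    have enough preceding points (assumed alignment, not from the text).
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    smoothed = []
    for i in range(len(values)):
        if i + 1 < window:
            smoothed.append(None)
        else:
            smoothed.append(fmean(values[i + 1 - window : i + 1]))
    return smoothed


# Hypothetical expression values of one transcript at consecutive ages.
expression_by_age = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2]
print(moving_average(expression_by_age, window=5))
```

A trailing window shortens the usable series by window - 1 points; a centered window would instead trim both ends.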

3.5.1. Removal of Unexpressed and/or Constant Probes

Because UPC normalization yields the probability that a transcript is expressed, every value less than or equal to 0.5 was classified as not expressed, as suggested by (PICCOLO et al., 2013). A transcript that was never expressed, that is, whose expression value was less than or equal to 0.5 at all ages, was removed. In addition, transcripts with a standard deviation (sd) close to zero (sd < 0.01) were also removed.
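The two filters above can be sketched as follows in Python. The cutoffs 0.5 and 0.01 come from the text; the function name, the probe names and the values are hypothetical, and the sample standard deviation (n - 1 denominator) is an assumption.

```python
from statistics import stdev


def keep_probe(upc_values, expressed_cutoff=0.5, sd_cutoff=0.01):
    """Return True if the probe passes both filters.

    A probe is kept when it is expressed (UPC > cutoff) at one or more
    ages and its sample standard deviation is not close to zero.
    """
    ever_expressed = any(v > expressed_cutoff for v in upc_values)
    varies = stdev(upc_values) >= sd_cutoff
    return ever_expressed and varies


# Hypothetical UPC values per probe, one value per age.
probes = {
    "probe_a": [0.1, 0.2, 0.3, 0.4],  # never above 0.5: removed
    "probe_b": [0.9, 0.9, 0.9, 0.9],  # constant: removed
    "probe_c": [0.2, 0.5, 0.7, 0.9],  # expressed and variable: kept
}
kept = [name for name, values in probes.items() if keep_probe(values)]
print(kept)
```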

3.5.2. Agreement of Correlations between Genders: SexGenes

Before any further analysis, we evaluated whether the relationship between gene expression profiles and age agreed between the sexes. To this end, two datasets similar to ageCollapsed were constructed, each consisting only of samples from one sex. The Spearman correlation between the expression of each transcript and age was calculated in each dataset. Among transcripts at least minimally correlated with age (absolute value of Spearman's correlation coefficient, |Rho| > 0.35), we identified those whose direction of correlation differed between the sexes, that is, transcripts correlated positively with age in men but negatively in women, or vice versa. Such transcripts were named SexGenes and removed from further analysis.
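A minimal Python sketch of this check is given below. It assumes that the |Rho| > 0.35 threshold must hold in both sexes, which the text leaves open. Spearman's Rho is computed as the Pearson correlation of average ranks; all function names and data are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with ties receiving the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's Rho; undefined (division by zero) for constant input."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


def is_sex_gene(rho_male, rho_female, min_abs_rho=0.35):
    """Correlated with age in both sexes (assumption), with opposite signs."""
    both_correlated = abs(rho_male) > min_abs_rho and abs(rho_female) > min_abs_rho
    return both_correlated and rho_male * rho_female < 0


# Hypothetical transcript: rises with age in men, falls in women.
ages = [20, 30, 40, 50, 60]
rho_m = spearman_rho(ages, [1.0, 2.0, 3.0, 4.0, 5.0])
rho_f = spearman_rho(ages, [5.0, 4.0, 3.0, 2.0, 1.0])
print(is_sex_gene(rho_m, rho_f))
```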

3.5.3. Selection of Age-Correlated Transcripts

Age-correlated transcripts were identified using Spearman's correlation (|Rho| > 0.75) and the Maximal Information Coefficient (MIC > 0.75). The MIC is a measure that detects non-linear relationships between two variables; it is implemented in the minerva package for R (RESHEF et al., 2011) and is explained further in Section 3.6 of the Methodology. Transcripts that passed these criteria were selected for the following analyses and named AgingGenes. AgingGenes comprise three classes of transcripts: positive linear correlation (pos, Rho > 0.75), negative linear correlation (neg, Rho < -0.75) and non-linear (mic, MIC > 0.75). As the MIC can also identify transcripts linearly correlated with age, only those identified exclusively by the MIC were defined as non-linear.
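The class assignment can be sketched in Python as below. The sketch takes Rho and MIC as precomputed inputs (it does not compute the MIC itself) and checks the linear classes first, so that the mic class holds only transcripts identified exclusively by the MIC. The function name and values are hypothetical.

```python
def classify_transcript(rho, mic, cutoff=0.75):
    """Return 'pos', 'neg', 'mic', or None for a non-selected transcript."""
    if rho > cutoff:
        return "pos"
    if rho < -cutoff:
        return "neg"
    if mic > cutoff:
        return "mic"  # selected by the MIC only
    return None


# Hypothetical (Rho, MIC) pairs per transcript.
transcripts = {
    "t1": (0.90, 0.95),
    "t2": (-0.80, 0.85),
    "t3": (0.10, 0.80),
    "t4": (0.30, 0.40),
}
classes = {name: classify_transcript(rho, mic) for name, (rho, mic) in transcripts.items()}
print(classes)
```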

[Figure legend: pink - LinearNeg; green - mic.]

AgingGenes Enrichment Analysis

Each AgingGenes class underwent an Over Representation Analysis (ORA) against the Reactome database (version 2016; CROFT et al., 2011) using the enrichR package for R (KULESHOV et al., 2016).
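For orientation, ORA is commonly formulated as a one-sided hypergeometric test: given a gene universe, a pathway and a selected gene list, it asks how likely an overlap at least as large as the observed one would be by chance. The Python sketch below illustrates that general formulation only; it does not reproduce the statistics returned by enrichR (such as the Combined Score), and the function and parameter names are hypothetical.

```python
from math import comb


def ora_p_value(n_universe, n_pathway, n_selected, n_overlap):
    """P(X >= n_overlap) for a hypergeometric X.

    n_universe: genes in the background
    n_pathway:  background genes annotated to the pathway
    n_selected: genes in the tested list (e.g. one AgingGenes class)
    n_overlap:  tested genes that belong to the pathway
    """
    max_overlap = min(n_pathway, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts: 8 of 50 selected genes fall in a 100-gene pathway.
print(ora_p_value(n_universe=20000, n_pathway=100, n_selected=50, n_overlap=8))
```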

Even though the AgingGenes are separated into classes according to their transcriptional profile, we assume that each class represents transcriptional alterations in different processes of the same system and that the classes are therefore related at the level of biological processes. Taking advantage of the hierarchy among Reactome pathways, we developed a methodology that gives a global view of the enriched hierarchical structures among the AgingGenes. First, all pathways with an enrichment value (Combined Score) greater than 0 were selected and combined into a tree structure through their hierarchical relationships. Then, all “daughter” pathways (terminal nodes of the tree structure) with a Combined Score < 5 were removed. This procedure produces several enrichment structures (trees) of widely varying sizes (number of component pathways in the structure). After selecting only the large structures, those consisting of more than 6 components, we represented them as networks using the visNetwork package for R (“visNetwork: Network Visualization using 'vis.js' Library,” [s.d.]) to aid interpretation.
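The selection of enrichment structures can be sketched in Python as follows. The sketch assumes that terminal pathways are pruned in a single pass (the text does not say whether pruning is repeated once a parent becomes terminal) and that the hierarchy is supplied as (parent, child) pairs. The function name, pathway names and scores are hypothetical.

```python
from collections import defaultdict


def enrichment_structures(scores, edges, leaf_cutoff=5.0, min_size=7):
    """Return the sets of pathways that form the large structures."""
    # 1. Keep pathways with a Combined Score greater than 0.
    kept = {p for p, s in scores.items() if s > 0}
    edges = [(a, b) for a, b in edges if a in kept and b in kept]

    # 2. Remove terminal pathways (no children) scoring below the cutoff.
    parents = {a for a, _ in edges}
    weak_leaves = {p for p in kept if p not in parents and scores[p] < leaf_cutoff}
    kept -= weak_leaves

    # 3. Group the remaining pathways into connected structures.
    neighbours = defaultdict(set)
    for a, b in edges:
        if a in kept and b in kept:
            neighbours[a].add(b)
            neighbours[b].add(a)

    structures, seen = [], set()
    for start in sorted(kept):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        # 4. Keep only structures with more than 6 components.
        if len(component) >= min_size:
            structures.append(component)
    return structures


# Hypothetical scores and hierarchy.
scores = {"root": 20.0, "weak": 2.0, "zero": 0.0, "small_root": 15.0, "small_child": 8.0}
scores.update({f"child{i}": 10.0 for i in range(7)})
edges = [("root", f"child{i}") for i in range(7)]
edges += [("root", "weak"), ("root", "zero"), ("small_root", "small_child")]
large = enrichment_structures(scores, edges)
print([sorted(s) for s in large])
```

Grouping pathways into connected components treats each tree as one structure, so the size filter applies to whole trees and not to individual branches.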
