
Identification of multiple TAR DNA binding protein retropseudogene lineages during the evolution of primates
ABSTRACT The TAR DNA Binding Protein (TARDBP) gene has become relevant after the discovery of several pathogenic mutations. Yet little is known about its evolutionary history, in contrast to the number
of studies found in the literature. This study investigated the evolutionary dynamics associated with the retrotransposition of the TARDBP gene in primates. We identified novel
retropseudogenes that likely originated in the ancestors of anthropoids, catarrhines, and lemuriformes, i.e. the strepsirrhine clade that inhabits Madagascar. We also found species-specific
retropseudogenes in the Philippine tarsier, Bolivian squirrel monkey, capuchin monkey and vervet. The identification of a retropseudocopy of the TARDBP gene overlapping a lncRNA that is
potentially expressed opens a new avenue to investigate TARDBP gene regulation, especially in the context of TARDBP-associated pathologies.
INTRODUCTION The availability of whole-genome sequences has accelerated research on the evolution of different
genetic elements. Together with genomic DNA-based gene duplication, an important source of evolutionary innovation is the retrotranscription of RNA and the insertion of the resulting copy into the
genome1,2. In mammals, retrotranscription depends on the enzymatic machinery encoded by long interspersed nuclear element 1 (L1 or LINE1) retrotransposable elements, which generates an
intronless gene duplicate that could produce a protein similar to the parental counterpart3. However, most retrotranscribed sequences are inserted at a random position in the genome, lack
the necessary transcription elements and become pseudogenes, a phenomenon called “dead on arrival”3. Nevertheless, because a significant number of retrocopies are located in introns of other
genes, they have the potential to regulate their host genes by functioning as antisense transcripts4,5,6. In addition, in the human genome, a number of retrocopies overlap with long noncoding
RNAs (lncRNAs)7, which are regulatory noncoding RNAs of > 200 nucleotides8. lncRNAs can establish specific interactions with nucleic acids and proteins, acting in diverse fashions as
critical regulators of gene expression in several biological processes, including pathological conditions such as cancer and neurodegenerative disorders9. Retrocopies are abundant in
placental mammals, especially in primates10. Extensive evidence indicates that their presence is related to several types of diseases, including neurodegenerative disorders11. Because
retrocopies have the potential to produce harmful effects on genomes and transcriptomes, silencing mechanisms seem to have evolved to restrict retrotransposition. Intriguingly, during the life
of healthy humans the brain is the only known somatic tissue where retrotransposition is de-repressed12. Therefore, identifying retrocopies/retropseudogenes is not only
important for obtaining a complete picture of the evolution of any particular gene, but is also necessary to fully understand human health. The TAR DNA Binding Protein
(TARDBP) gene, which encodes the Transactive response DNA-binding protein 43 kDa (TDP-43), has gained considerable attention after the initial discovery that its mutations can cause familial
amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD), two major forms of neurodegenerative disorders13,14, with ALS being the most frequent motor neuron disorder in
adults15. To date, more than 50 pathogenic missense mutations have been characterized13,14. TDP-43 is an RNA-binding protein with a variety of RNA metabolism functions, including
transcription, mRNA transport and stabilization, miRNA biogenesis, lncRNA processing, and translation16. More recent findings indicate that TDP-43 participates in the pathogenesis of other
neurodegenerative proteinopathies, such as Parkinson’s disease and Alzheimer’s disease, which are conditions characterized by toxic protein aggregation17. In human
cells, under physiological conditions, TDP-43 mainly localizes in the nucleus, but in neurons and glial cells of ALS and FTD patients it shuttles to and accumulates in the cytoplasm, where
it eventually aggregates and contributes to the onset and progression of these diseases18,19,20,21,22. The TARDBP gene is conserved in species that share a common ancestor deep in time23,
suggesting that this gene carries out essential functions. This gene underwent an event of positive selection in the ancestor of mammals24, suggesting functional adaptations for the group.
More recently in evolutionary time, it has been shown that during the evolution of humans, genes related to diseases like Alzheimer's also underwent positive selection25. Although
events of positive selection are seen as conferring selective advantage, as a by-product, they can also have adverse effects26. In this regard, it is proposed that human susceptibility to
neurodegenerative disorders could be a consequence of improving our cognitive function27,28. Besides these studies, not much is known regarding the evolution of TARDBP in primates.
Understanding the evolutionary history of genes represents a critical piece of information, among other things, to perform meaningful comparisons and to understand the variation in function
in different species. However, evolutionary studies have been primarily directed to the functional copy, while much less is known of other phenomena that could potentially impact the
functions associated with the gene. In this regard, the evolution of primates is characterized by a peak of retrotransposition activity in the anthropoid ancestor29,30, which left a
signature of intronless copies, functional or not, of a number of genes in the genome. The aim of this study is to investigate the retrotransposition dynamics associated with the TARDBP gene
in primates. According to our phylogenetic and synteny analyses, we identified retropseudogenes that originated at different times during the evolution of primates. TARDBP retropseudogenes
originated in the anthropoid ancestor, between 67 and 43.2 million years ago, in the ancestor of catarrhines, between 43.2 and 29.4 million years ago, and in the ancestor of lemuriformes,
i.e. the strepsirrhine clade that inhabits Madagascar, between 59.3 and 55 million years ago. We also found species-specific retropseudogenes in the Philippine tarsier (_Carlito syrichta_),
Bolivian squirrel monkey (_Saimiri boliviensis_), capuchin monkey (_Cebus capucinus imitator_) and vervet (_Chlorocebus sabaeus_). Although the annotated sequences are putatively non-functional,
the identification of a retropseudocopy overlapping a lncRNA opens a new avenue to investigate TARDBP gene regulation.
RESULTS MULTIPLE RETROPSEUDOGENE LINEAGES CHARACTERIZE THE EVOLUTION
OF THE TARDBP GENE IN PRIMATES According to our phylogenetic and synteny analyses, we identified retropseudogenes of the TARDBP gene that originated at different times during the evolution
of primates. We identified retropseudogenes that originated in the ancestor of anthropoids, between 67 and 43.2 million years ago, in the ancestor of catarrhines, between 43.2 and 29.4 million
years ago, and in the ancestor of lemuriformes, i.e. the strepsirrhine clade that inhabits Madagascar, between 59.3 and 55 million years ago (Fig. 1). More recently in evolutionary time, we
found species-specific retropseudogenes in the Philippine tarsier (_Carlito syrichta_), Bolivian squirrel monkey (_Saimiri boliviensis_), capuchin monkey (_Cebus capucinus imitator_) and
vervet (_Chlorocebus sabaeus_). None of them had intron sequences, and all were identified on a different autosome from the chromosomal location of the functional copy (Fig.
1). Our gene tree did not significantly deviate from the most updated phylogenetic hypotheses for the main groups of primates31,32,33, suggesting that the functional copy of the TARDBP gene
was present as a single-copy gene in the ancestor of the group (Fig. 1). We recovered three highly supported monophyletic groups containing representative species of all major groups of
anthropoids, i.e. apes, Old World monkeys and New World monkeys (light blue, brown and green lineages, Fig. 1), indicating that these retropseudogenes originated in the ancestor of the group,
between 67 and 43.2 million years ago, and were maintained in representative species of all descendant primate groups. The retropseudogene lineage shaded in purple (Fig.
1) was not recovered as monophyletic; however, our synteny analyses suggest that it indeed belongs to a single lineage (Fig. 1). Representative species of the three purple clades possess
the same flanking genes, DIAPH3 at the 5´ side and TDRD3 at the 3´ side of the retropseudogene, strongly suggesting that the lack of monophyly could be attributed to a phylogenetic artifact
(Fig. 1). The small number of changes, as illustrated by the short branches that define the sister group relationships of the main clades, could be the main cause (Fig. 1). We also found a
retropseudogene lineage that according to our phylogenetic tree originated in the ancestor of catarrhine primates, the group that includes apes and Old World monkeys (blue lineage, Fig. 1),
between 43.2 and 29.4 million years ago. In this case, we recovered a clade containing the functional copy of the TARDBP gene in catarrhines (upper pink lineage, Fig. 1), sister to a group
containing a retropseudogene in the same primate group (blue lineage, Fig. 1). The clade containing TARDBP functional sequences from New World monkeys was recovered sister to the above
mentioned clade (Fig. 1). In this clade, in addition to the functional TARDBP copy, we found New World monkey-specific retropseudogenes for which the evolutionary history is difficult to
resolve given the shortness of the branches (Fig. 1). We identified three retropseudogenes, two in the capuchin monkey (_Cebus capucinus imitator_) and one in the Bolivian squirrel monkey
(_Saimiri boliviensis_). Finally, we recovered a sequence from the Angola colobus (_Colobus angolensis_) (yellow branch, Fig. 1), which was placed sister to a clade containing the TARDBP
functional copy (pink lineage, Fig. 1) and three retropseudogene lineages (blue, purple and green clades, Fig. 1). The phylogenetic position of this branch in our gene tree suggests that it
represents a retropseudogene that originated in the anthropoid ancestor but was conserved only in this species. In support of this claim, the single flanking gene (PRMT2) found in the genomic fragment
containing the TARDBP retropseudogene in the Angola colobus is not shared with any other gene lineage described in this study (Fig. 1). In all cases, the retropseudogenes identified in
anthropoid primates have premature stop codons, insertions and/or deletions (supplementary Figs. 1–5). We also identified retropseudogenes in tarsiers and
strepsirrhines (Fig. 1). We found a single retropseudogene in the Philippine tarsier (_Carlito syrichta_), which shows the hallmark of a sequence free from selective constraints, i.e., a
long branch as a signal of an accelerated rate of evolution in comparison to the functional copy (Fig. 1). In the strepsirrhine clade we identified a highly supported lineage containing the
TARDBP functional copy in three species, the greater bamboo lemur (_Prolemur simus_), Coquerel's sifaka (_Propithecus coquereli_) and the mouse lemur (_Microcebus murinus_), which in turn
was recovered sister to an also highly supported clade containing retropseudogenes in the greater bamboo lemur (_Prolemur simus_) and Coquerel's sifaka (_Propithecus coquereli_) (Fig.
1). This tree topology suggests that this retrocopy originated in the ancestor of lemuriformes, i.e. the strepsirrhine clade that inhabits Madagascar, between 59.3 and 55 million years ago,
and it has been maintained in the genome of descendant species. Finally, the functional copy of the bushbaby (_Otolemur garnettii_) was recovered sister to the lemuriformes clade. Similar to
the case of anthropoids, all retropseudogenes identified in tarsiers and strepsirrhines have premature stop codons, insertions and/or deletions (supplementary Figs. 6 and 7). Regarding the
location of the retropseudocopies, most of them are within intergenic regions (Table 1). The exception is the one located on chromosome 2, which overlaps with a lncRNA gene (Fig. 2).
Specifically, the retropseudocopy starts at position 198 of the first lncRNA exon and ends at position 375 of the first lncRNA intron.
DISCUSSION In this study we revealed that the
evolutionary history of TARDBP, a gene that in humans encodes TDP-43, an RNA-binding protein involved in several neurodegenerative disorders13,14, is characterized by the presence of
retropseudogenes that originated at different ages during the evolutionary history of primates. An important fraction of the retropseudogenes originated in the anthropoid ancestor, between
67 and 43.2 million years ago, and has remained in the genome of the species (Fig. 3). This phenomenon fits the expectation of a peak of retrocopy formation around 40 million years ago,
which coincides with an increased activity of L1 retroelements that produced an increment in SINE/Alu retrocopy repeat amplification29,30. Interestingly, this period of time represents a key
moment during the evolutionary history of primates, the radiation of the anthropoid lineage, where significant morphological and physiological traits arose34. Thus, this period of Vesuvian
mode of evolution could be seen as a source of evolutionary novelty that fueled the origin of the phenotypes that define the anthropoid lineage2,35,36. Other retropseudogenes originated in
the catarrhine ancestor, between 43.2 and 29.4 million years ago, and in other primate groups (Fig. 3). In agreement with the literature, and given the nature of the process originating
retrocopies, none of them seems to function as canonical TARDBP3, as shown by the presence of insertions, deletions and/or premature stop codons (supplementary Figs. 1
and 7). The identification of several retropseudogenes for the TARDBP gene in primates is not surprising, as this gene meets all the requisites to be a gene with multiple
retropseudogenes38,39,40, i.e., short transcripts (coding for 61 to 414 amino acids)41, wide and high expression42, low GC-content (47%, average among 23 primate species) and high
conservation (3.4%, maximal divergence among primates). Furthermore, in agreement with the slow rate of pseudogene length shortening over time, the identified retropseudogenes possess a length
(mean 1128 bp, median 1193 bp) similar to the functional TARDBP gene (1245 bp). Among apes, the number of TARDBP retrotransposition events appears to be higher than the average
number of retrocopies per parental gene in their genomes43. On average, ape genomes possess 2.9 retrocopies per parental gene43, whereas in our study we identified five TARDBP
retropseudogenes in each examined ape species. Consistent with previous evidence, we also found a higher number of retropseudogenes in New World monkeys43. Although it is not clear why this
group of primates has more retrocopies compared to catarrhines, it is suggested that a specific lineage expansion of L1PA1 and L1P3 subelements could be related to the observed pattern2,43.
TDP-43 binds to long clusters of GU-rich RNA sequences, which in humans are found in one-third of transcribed genes44. This allows TDP-43 to regulate the processing of thousands of
transcripts, including that of its own transcript45. In fact, TDP-43 establishes a tightly regulated feedback loop46. It has been demonstrated that a twofold increase or decrease in TDP-43
levels is sufficient to promote neurodegeneration45. Thus, TARDBP retropseudogenes could represent an additional layer of regulation of TDP-43 levels and activity. In this regard, it seems
interesting that one of the retropseudocopies is located in a lncRNA (Fig. 2), a pattern that appears not unusual in the human genome47. This fact opens the possibility that this
retropseudocopy regulates the expression of the functional copy of TARDBP48,49,50. In fact, BLAST searches against the expressed sequence tag (EST) database, which represents a snapshot of
genes expressed in a given tissue and/or at a specific developmental stage, show at least one record (BI825397) that has an identity value of 90.5% with the retropseudocopy located
on chromosome 2 and is expressed in the medulla of an adult male. In contrast, its identity value with the TARDBP functional copy is 70.6%. It will be important to determine whether in humans
the levels of TDP-43 are affected by the levels of this lncRNA, in particular in the brain of patients suffering the aforementioned neurodegenerative disorders. In conclusion, in this work,
we demonstrate that the TARDBP gene in primates has an evolutionary history characterized by the presence of multiple retropseudogene lineages. A significant increase in
retrotransposition activity occurred in the ancestor of anthropoids, which led to intronless sequences that cannot give rise to functional proteins. However, the fact that one of the retropseudocopies is
present in a lncRNA and is transcribed opens the opportunity to investigate further its role in regulating the expression of the functional TARDBP gene copy, and its influence on the outcome
or fate of the associated neurodegenerative disorders.
METHODS DNA SEQUENCES AND PHYLOGENETIC ANALYSES We performed searches for TAR DNA Binding Protein (TARDBP) genes in
primate genomes in Ensembl v.10241. We retrieved primate orthologs, using the human (_Homo sapiens_) entry, based on the ortholog prediction function of Ensembl v.10241. We identified TARDBP
retropseudogenes in primate species by performing BLASTN searches51 against the whole genome sequence in Ensembl v.10241 using default settings. In each case the query sequence (TARDBP)
was from the same species as the genome in which retropseudogenes were being searched for. In our searches, a retropseudogene was recognized as a sequence containing all exons together and found on
a different chromosome from the functional copy. Genomic fragments containing retropseudogenes were extracted and manually annotated by comparing the coding sequence of the same
species using the program Blast2seq v2.552 with default parameters. Accession numbers and details about the taxonomic sampling are available in Supplementary Table S1. Nucleotide sequences
were aligned using MAFFT v.753, allowing the program to choose the alignment strategy (FFT-NS-i). We used the proposed model tool of IQ-Tree v.1.6.1254 to select the best-fitting model of
nucleotide substitution, which selected GTR + F + R3. We used the maximum likelihood method to obtain the best tree using the program IQ-Tree v1.6.1255. We assessed support for the nodes
using three strategies: a Bayesian-like transformation of aLRT (aBayes test)56, SH-like approximate likelihood ratio test (SH-aLRT)57 and the ultrafast bootstrap approximation58. We carried
out 20 independent runs to explore the tree space, and the tree with the highest likelihood score was chosen. TARDBP sequences from the African elephant (_Loxodonta africana_), blue whale
(_Balaenoptera musculus_) and red fox (_Vulpes vulpes_) were used as outgroups.
ASSESSMENT OF CONSERVED SYNTENY We examined genes found upstream and downstream of functional copies and
retropseudogenes. We used the estimates of orthology and paralogy derived from the Ensembl Compara database59; these estimates are obtained from a pipeline that considers both synteny and
phylogeny to generate orthology mappings. These predictions were visualized using the program Genomicus v100.0160. Our assessments were performed in representative species for each lineage.
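The recognition rule stated in the Methods (a sequence containing all exons together, found on a different chromosome from the functional copy) can be sketched as a filter over BLASTN tabular output. This is only an illustration under stated assumptions and not the authors' pipeline, since the annotation in this study was done manually. The function names, the 0.9 coverage threshold used to approximate "all exons together", the chromosome labels and the example hits are all hypothetical; only the 1245 bp length of the functional TARDBP coding sequence comes from the text.

```python
# Column order of BLAST tabular output (-outfmt 6) with its default fields.
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def parse_hits(lines):
    """Parse BLAST tabular lines into dictionaries with integer coordinates."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(FIELDS, line.split("\t")))
        for key in ("qstart", "qend", "sstart", "send"):
            hit[key] = int(hit[key])
        hits.append(hit)
    return hits


def candidate_retrocopies(hits, cds_length, parental_chrom, min_coverage=0.9):
    """Keep hits that look like intronless copies on another chromosome.

    One contiguous hit spanning most of the coding sequence implies that the
    exons lie together in the genome. min_coverage is an assumed threshold
    chosen for illustration; it is not a value reported in the paper.
    """
    candidates = []
    for hit in hits:
        if hit["sseqid"] == parental_chrom:
            continue  # same chromosome as the functional copy
        covered = abs(hit["qend"] - hit["qstart"]) + 1
        if covered / cds_length >= min_coverage:
            candidates.append(hit)
    return candidates


# Made-up hits: one exon of the functional copy on chrA, one near-full-length
# intronless hit on chrB, and one short fragment on chrC.
example_lines = [
    "TARDBP_cds\tchrA\t100.0\t180\t0\t0\t1\t180\t5000\t5179\t1e-90\t330",
    "TARDBP_cds\tchrB\t92.0\t1231\t80\t5\t10\t1240\t70000\t71230\t0.0\t1500",
    "TARDBP_cds\tchrC\t85.0\t200\t30\t0\t300\t499\t1000\t1199\t1e-40\t150",
]
hits = parse_hits(example_lines)
candidates = candidate_retrocopies(hits, cds_length=1245, parental_chrom="chrA")
print([hit["sseqid"] for hit in candidates])  # ['chrB']
```

Candidates returned by such a filter would still need the manual comparison against the coding sequence described above, because a coverage threshold alone cannot detect premature stop codons, insertions or deletions.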
REFERENCES
1. Kaessmann, H., Vinckenbosch, N. & Long, M. RNA-based gene duplication: Mechanistic and evolutionary insights. _Nat. Rev. Genet._ 10, 19–31 (2009).
2. Casola, C. & Betrán, E. The genomic impact of gene retrocopies: What have we learned from comparative genomics, population genomics, and transcriptomic analyses? _Genome Biol. Evol._ 9, 1351–1373 (2017).
3. Zhang, J. Evolution by gene duplication: an update. _Trends Ecol. Evol._ 18, 292–298 (2003).
4. Pace, J. K. 2nd. & Feschotte, C. The evolutionary history of human DNA transposons: evidence for intense activity in the primate lineage. _Genom. Res._ 17, 422–432 (2007).
5. Tam, O. H. _et al._ Pseudogene-derived small interfering RNAs regulate gene expression in mouse oocytes. _Nature_ 453, 534–538 (2008).
6. Watanabe, T. _et al._ Endogenous siRNAs from naturally formed dsRNAs regulate transcripts in mouse oocytes. _Nature_ 453, 539–543 (2008).
7. Kubiak, M. R., Szcześniak, M. W. & Makałowska, I. Complex analysis of retroposed genes’ contribution to human genome. _Proteome Trans. Genes_ 11, 542 (2020).
8. Nie, L. _et al._ Long non-coding RNAs: versatile master regulators of gene expression and crucial players in cancer. _Am. J. Transl. Res._ 4, 127–150 (2012).
9. Aliperti, V., Skonieczna, J. & Cerase, A. Long non-coding RNA (lncRNA) roles in cell biology, neurodevelopment and neurological disorders. _Noncoding RNA_ 7, 36 (2021).
10. Mighell, A. J., Smith, N. R., Robinson, P. A. & Markham, A. F. Vertebrate pseudogenes. _FEBS Lett._ 468, 109–114 (2000).
11. Ciomborowska-Basheer, J., Staszak, K., Kubiak, M. R. & Makałowska, I. Not So dead genes-retrocopies as regulators of their disease-related progenitors and hosts. _Cells_ 10, 912 (2021).
12. Terry, D. M. & Devine, S. E. Aberrantly high levels of somatic LINE-1 expression and retrotransposition in human neurological disorders. _Front. Genet._ 10, 1244 (2019).
13. Sreedharan, J. _et al._ TDP-43 mutations in familial and sporadic amyotrophic lateral sclerosis. _Science_ 319, 1668–1672 (2008).
14. Kabashi, E. _et al._ TARDBP mutations in individuals with sporadic and familial amyotrophic lateral sclerosis. _Nat. Genet._ 40, 572–574 (2008).
15. Chiò, A. _et al._ Global epidemiology of amyotrophic lateral sclerosis: a systematic review of the published literature. _Neuroepidemiology_ 41, 118–130 (2013).
16. Hanson, K. A., Kim, S. H. & Tibbetts, R. S. RNA-binding proteins in neurodegenerative disease: TDP-43 and beyond. _Wiley Interdiscip. Rev. RNA_ 3, 265–285 (2012).
17. Klim, J. R., Pintacuda, G., Nash, L. A., Juan, I. G. S. & Eggan, K. Connecting TDP-43 Pathology with Neuropathy. _Trends Neurosci_ https://doi.org/10.1016/j.tins.2021.02.008 (2021).
18. Neumann, M. _et al._ Ubiquitinated TDP-43 in frontotemporal lobar degeneration and amyotrophic lateral sclerosis. _Science_ 314, 130–133 (2006).
19. Arai, T. _et al._ TDP-43 is a component of ubiquitin-positive tau-negative inclusions in frontotemporal lobar degeneration and amyotrophic lateral sclerosis. _Biochem. Biophys. Res. Commun._ 351, 602–611 (2006).
20. Robberecht, W. & Philips, T. The changing scene of amyotrophic lateral sclerosis. _Nat. Rev. Neurosci._ 14, 248–264 (2013).
21. Heyburn, L. & Moussa, C.E.-H. TDP-43 in the spectrum of MND-FTLD pathologies. _Mol. Cell. Neurosci._ 83, 46–54 (2017).
22. Pinarbasi, E. S. _et al._ Active nuclear import and passive nuclear export are the primary determinants of TDP-43 localization. _Sci. Rep._ 8, 7083 (2018).
23. Wang, H.-Y., Wang, I.-F., Bose, J. & Shen, C.-K.J. Structural diversity and functional implications of the eukaryotic TDP gene family. _Genomics_ 83, 130–139 (2004).
24. Zhao, L. _et al._ TDP-43 facilitates milk lipid secretion by post-transcriptional regulation of Btn1a1 and Xdh. _Nat. Commun._ 11, 341 (2020).
25. Vamathevan, J. J. _et al._ The role of positive selection in determining the molecular cause of species differences in disease. _BMC Evol. Biol._ 8, 273 (2008).
26. Holt, R. D., Nesse, R. M. & Williams, G. C. Why we get sick: The new science of Darwinian medicine. _Ecology_ 77, 983 (1996).
27. Gearing, M., Rebeck, G. W., Hyman, B. T., Tigges, J. & Mirra, S. S. Neuropathology and apolipoprotein E profile of aged chimpanzees: Implications for Alzheimer disease. _Proc. Natl. Acad. Sci. USA_ 91, 9382–9386 (1994).
28. Keller, M. C. & Miller, G. Resolving the paradox of common, harmful, heritable mental disorders: Which evolutionary genetic models work best? _Behav. Brain Sci._ 29, 385–404 (2006).
29. Ohshima, K. _et al._ Whole-genome screening indicates a possible burst of formation of processed pseudogenes and Alu repeats by particular L1 subfamilies in ancestral primates. _Genome Biol._ 4, R74 (2003).
30. Marques, A. C., Dupanloup, I., Vinckenbosch, N., Reymond, A. & Kaessmann, H. Emergence of young human genes after a burst of retroposition in primates. _PLoS Biol._ 3, e357 (2005).
31. Pozzi, L. _et al._ Primate phylogenetic relationships and divergence dates inferred from complete mitochondrial genomes. _Mol. Phylogenet. Evol._ 75, 165–183 (2014).
32. Finstermeier, K. _et al._ A mitogenomic phylogeny of living primates. _PLoS One_ 8, e69504 (2013).
33. Perelman, P. _et al._ A molecular phylogeny of living primates. _PLoS Genet._ 7, e1001342 (2011).
34. Kay, R. F., Ross, C. & Williams, B. A. Anthropoid origins. _Science_ 275, 797–804 (1997).
35. Long, M., Betrán, E., Thornton, K. & Wang, W. The origin of new genes: glimpses from the young and old. _Nat. Rev. Genet._ 4, 865–875 (2003).
36. Kaessmann, H., Vinckenbosch, N. & Long, M. RNA-based gene duplication: mechanistic and evolutionary insights. _Nat. Rev. Genet._ 10, 19–31 (2009).
37. Kumar, S., Stecher, G., Suleski, M. & Hedges, S. B. Timetree: A resource for timelines, timetrees, and divergence times. _Mol. Biol. Evol._ 34, 1812–1819 (2017).
38. Zhang, Z. & Gerstein, M. Large-scale analysis of pseudogenes in the human genome. _Curr. Opin. Genet. Dev._ 14, 328–335 (2004).
39. McDonell, L. & Drouin, G. The abundance of processed pseudogenes derived from glycolytic genes is correlated with their expression level. _Genome_ 55, 147–151 (2012).
40. Gonçalves, I., Duret, L. & Mouchiroud, D. Nature and structure of human genes that generate retropseudogenes. _Genome Res._ 10, 672–678 (2000).
41. Yates, A. D. _et al._ Ensembl 2020. _Nucleic Acids Res._ 48, D682–D688 (2020).
42. Uhlén, M. _et al._ Proteomics: Tissue-based map of the human proteome. _Science_ 347, 1260419 (2015).
43. Navarro, F. C. P. & Galante, P. A. F. A genome-wide landscape of retrocopies in primate genomes. _Genome Biol. Evol._ 7, 2265–2275 (2015).
44. Rengifo-Gonzalez, J. C. _et al._ The cooperative binding of TDP-43 to GU-rich RNA repeats antagonizes TDP-43 aggregation. _eLife_ https://doi.org/10.7554/eLife.67605 (2021).
45. Weskamp, K. & Barmada, S. J. TDP43 and RNA instability in amyotrophic lateral sclerosis. _Brain Res._ 1693, 67–74 (2018).
46. Ayala, Y. M. _et al._ TDP-43 regulates its mRNA levels through a negative feedback loop. _EMBO J._ 30, 277–288 (2011).
47. Milligan, M. J. _et al._ Global intersection of long non-coding RNAs with processed and unprocessed pseudogenes in the human genome. _Front. Genet._ 7, 26 (2016).
48. Milligan, M. J. & Lipovich, L. Pseudogene-derived lncRNAs: Emerging regulators of gene expression. _Front. Genet._ 5, 476 (2014).
49. Nam, J.-W., Choi, S.-W. & You, B.-H. Incredible RNA: Dual Functions of Coding and Noncoding. _Mol. Cells_ 39, 367–374 (2016).
50. Zhu, Y. _et al._ Discovery of coding regions in the human genome by integrated proteogenomics analysis workflow. _Nat. Commun._ 9, 903 (2018).
51. Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. Basic local alignment search tool. _J. Mol. Biol._ 215, 403–410 (1990).
52. Tatusova, T. A. & Madden, T. L. BLAST 2 Sequences, a new tool for comparing protein and nucleotide sequences. _FEMS Microbiol. Lett._ 174, 247–250 (1999).
53. Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. _Mol. Biol. Evol._ 30, 772–780 (2013).
54. Kalyaanamoorthy, S., Minh, B. Q., Wong, T. K. F., von Haeseler, A. & Jermiin, L. S. ModelFinder: Fast model selection for accurate phylogenetic estimates. _Nat. Methods_ 14, 587–589 (2017).
55. Trifinopoulos, J., Nguyen, L.-T., von Haeseler, A. & Minh, B. Q. W-IQ-TREE: a fast online phylogenetic tool for maximum likelihood analysis. _Nucleic Acids Res._ 44, W232–W235 (2016).
56. Anisimova, M., Gil, M., Dufayard, J.-F., Dessimoz, C. & Gascuel, O. Survey of branch support methods demonstrates accuracy, power, and robustness of fast likelihood-based approximation schemes. _Syst. Biol._ 60, 685–699 (2011).
57. Guindon, S. _et al._ New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. _Syst. Biol._ 59, 307–321 (2010).
58. Hoang, D. T., Chernomor, O., von Haeseler, A., Minh, B. Q. & Vinh, L. S. UFBoot2: Improving the ultrafast bootstrap approximation. _Mol. Biol. Evol._ 35, 518–522 (2018).
59. Herrero, J. _et al._ Ensembl comparative genomics resources. _Database_ 2016, bav096. https://doi.org/10.1093/database/bav096 (2016).
60. Nguyen, N. T. T., Vincens, P., Roest Crollius, H. & Louis, A. Genomicus 2018: Karyotype evolutionary trees and on-the-fly synteny computing. _Nucleic Acids Res._ 46, D816–D822 (2018).
ACKNOWLEDGEMENTS This work was supported by Fondo Nacional de Desarrollo Científico y Tecnológico from Chile (FONDECYT 1210471) and Millennium Nucleus of Ion Channel Associated
Diseases (MiNICAD), Iniciativa Científica Milenio, Ministry of Economy, Development and Tourism from Chile to JCO, Fondo Nacional de Desarrollo Científico y Tecnológico from Chile (FONDECYT
1180957) to FJM and LVC and Fondo Nacional de Desarrollo Científico y Tecnológico from Chile (FONDECYT 1211481) to GAM. AUTHOR INFORMATION AUTHORS AND AFFILIATIONS * Integrative Biology
Group, Universidad Austral de Chile, Valdivia, Chile Juan C. Opazo, Luis Vargas-Chacoff, Francisco J. Morera & Gonzalo A. Mardones * Instituto de Ciencias Ambientales y Evolutivas,
Facultad de Ciencias, Universidad Austral de Chile, Valdivia, Chile Juan C. Opazo & Kattina Zavala * Millennium Nucleus of Ion Channel-Associated Diseases (MiNICAD), Valdivia, Chile Juan
C. Opazo * Instituto de Ciencias Marinas y Limnológicas, Universidad Austral de Chile, Valdivia, Chile Luis Vargas-Chacoff * Centro Fondap de Investigación de Altas Latitudes (IDEAL),
Universidad Austral de Chile, Valdivia, Chile Luis Vargas-Chacoff * Applied Biochemistry Laboratory, Facultad de Ciencias Veterinarias, Instituto de Farmacología y Morfofisiología,
Universidad Austral de Chile, Valdivia, Chile Francisco J. Morera * Department of Physiology, School of Medicine, Universidad Austral de Chile, Valdivia, Chile Gonzalo A. Mardones * Center
for Interdisciplinary Studies of the Nervous System (CISNe), Universidad Austral de Chile, Valdivia, Chile Gonzalo A. Mardones
CONTRIBUTIONS J.C.O. and G.A.M. designed the study. K.Z. and J.C.O. collected and analyzed data.
J.C.O. and G.A.M. wrote the manuscript. L.V.C. and F.J.M. reviewed and edited the manuscript. All authors contributed to the article and approved the submitted version. CORRESPONDING AUTHORS
Correspondence to Juan C. Opazo or Gonzalo A. Mardones. ETHICS DECLARATIONS COMPETING INTERESTS The authors declare no competing interests. ADDITIONAL INFORMATION PUBLISHER'S NOTE
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. SUPPLEMENTARY INFORMATION Supplementary Information 1. Supplementary
Figure 1. Supplementary Figure 2. Supplementary Figure 3. Supplementary Figure 4. Supplementary Figure 5. Supplementary Figure 6. Supplementary Figure 7. Supplementary Table 1. Supplementary
Information 2. RIGHTS AND PERMISSIONS OPEN ACCESS This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation,
distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and
indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit
line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use,
you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. ABOUT THIS
ARTICLE CITE THIS ARTICLE Opazo, J.C., Zavala, K., Vargas-Chacoff, L. _et al._ Identification of multiple TAR DNA binding protein retropseudogene lineages during the evolution of primates.
_Sci Rep_ 12, 3823 (2022). https://doi.org/10.1038/s41598-022-07908-8 * Received: 25 May 2021 * Accepted: 22 February 2022 * Published: 09 March 2022 * DOI:
https://doi.org/10.1038/s41598-022-07908-8