Chloroplast genome analysis and evolutionary insights in the versatile medicinal plant _Calendula officinalis_ L.

Nature


ABSTRACT _Calendula officinalis_ L. is a versatile medicinal plant with numerous applications in various fields. However, its chloroplast genome structure, features, phylogeny, and patterns

of evolution and mutation remain largely unexplored. This study examines the chloroplast genome, phylogeny, codon usage bias, and divergence time of _C. officinalis_, enhancing our

understanding of its evolution and adaptation. The chloroplast genome of _C. officinalis_ is a 150,465 bp circular molecule with a G + C content of 37.75% and comprises 131 genes.

Phylogenetic analysis revealed a close relationship between _C. officinalis_, _C. arvensis_, and _Osteospermum ecklonis_. A key finding is the similarity in codon usage bias among these

species, which, coupled with the divergence time analysis, supports their close phylogenetic proximity. This similarity in codon preference and divergence times underscores a parallel

evolutionary adaptation journey for these species, highlighting the intricate interplay between genetic evolution and environmental adaptation in the Asteraceae family. Moreover, unique

evolutionary features in _C. officinalis_, possibly associated with certain genes, were identified, laying a foundation for future research into the genetic diversity and medicinal value of

_C. officinalis_.

INTRODUCTION _Calendula officinalis_ L., a short-lived annual herbaceous

species of the genus _Calendula_ in the Asteraceae family, garners global recognition for its ubiquity and resilience. Predominantly located in the United States and Europe, it thrives in

sunlit or partially shaded environments, necessitating minimal cultivation and management1. The plant stands 12–30 inches tall, characterized by its yellow to orange hermaphrodite flowers

usually 2–3 inches in diameter, which bloom in a head-shaped inflorescence2. The leaves of _C. officinalis_ are oblanceolate, alternate, sessile, and bright green, with stems adorned with

umbel-like branches. The plant yields a curved, ring-shaped, and sickle-shaped achene2. The European Union has funded multiple research projects focused on _C. officinalis_ due to its

multifaceted role. Its diverse colors and aroma make it a favored decorative plant, and its bioactive compounds—carotenoids, saponins, amino acids—have found significant applications in

chemical and pharmacological domains, offering anti-inflammatory, anti-viral, anti-genotoxic properties among others2,3. Despite its extensive utility and recognition as a versatile

medicinal plant, _C. officinalis_ L. remains a subject of scientific curiosity, particularly regarding its genetic makeup and evolutionary history. Previous studies have laid the groundwork

by identifying its pharmacological benefits and some aspects of its bioactive compounds2,3. However, research into its chloroplast genome structure, phylogenetic relationships, and

evolutionary dynamics has been notably sparse. This gap signifies a substantial opportunity to deepen our understanding of _C. officinalis_'s genetic underpinnings and evolutionary

trajectory. Consequently, a comprehensive understanding of the chloroplast genome structure, phylogeny, and evolutionary mutation patterns of _C. officinalis_ remains elusive. Chloroplasts,

critical plant organelles, govern photosynthesis, biosynthesis, and carbon sequestration4. These organelles possess an independent genetic system from the nuclear genome, and since the first

chloroplast genome from _Nicotiana tabacum_ was sequenced, the structure and function of chloroplast genomes have been progressively elucidated5. A typical chloroplast genome measures

between 100 and 200 kb and has a four-part structure encompassing the large single-copy (LSC) region, the small single-copy (SSC) region, and two inverted repeat (IR) regions6. The chloroplast

genome plays a crucial role in elucidating the evolutionary dynamics and phylogenetic relationships of plant species. By analyzing chloroplast DNA, particularly its conserved and variably

evolving noncoding regions, researchers have gained insights into plant diversity, evolutionary rates, and lineage-specific evolutionary patterns7,8,9. Codons form the fundamental link

between nucleic acids and proteins, and every amino acid except methionine and tryptophan is encoded by two or more synonymous codons10,11. Codon usage bias, a phenomenon prevalent across organisms,

refers to the variability in the frequency of synonymous codons coding for the same amino acid12. This bias not only mirrors the species' or genes' origin, evolution, and mutation

patterns but also significantly impacts gene function and protein expression13. Despite previous studies focusing on the codon usage bias in nuclear genomes14,15, the chloroplast

genomes' genetic code varies from the standard genetic code16, thereby necessitating an analysis of the codon usage bias in the chloroplast genomes. High-throughput sequencing

technologies have facilitated the sequencing of numerous plant chloroplast genomes, and over two thousand have been deposited in GenBank at the National Center for Biotechnology Information

(NCBI), thereby bolstering systematic evolutionary research based on chloroplast genome codon analysis. Codon usage bias has been analyzed in the chloroplast genomes of many plant

species, clarifying their phylogenetic status, evolution, and mutation patterns. For instance, multiple species of the _Oryza_ and _Gynostemma_ genera have had their

phylogenetic status, evolution, and mutation patterns elucidated through phylogenetic analysis and codon usage bias analysis based on chloroplast genomes. In the present investigation, we

characterized the chloroplast genome of _C. officinalis_ and performed a comprehensive phylogenetic analysis anchored on this genome. Furthermore, we conducted a detailed exploration of the

codon usage bias in _C. officinalis_ and its closely related species _Calendula arvensis_ and _Osteospermum ecklonis_. This approach facilitated our understanding of the plant's genomic

attributes, evolutionary adaptation mechanisms, and phylogenetic positioning. MATERIALS AND METHODS PLANT MATERIALS AND GENOME SEQUENCING Fresh leaves were picked from _C. officinalis_

(Fig. 1) planted near the Changde Vocational Technical College, Changde, Hunan province, China (N29°02′29.74", E111°38′05.31", 34 m). The voucher specimens were deposited at the

College of Life and Environmental Sciences, Hunan University of Arts and Sciences (Contact Person: Kerui Huang, [email protected], voucher number JZH007). The library was constructed

using the DNAsecure Plant Kit (TIANGEN Biotech Co., Ltd., Beijing) and the sequencing was performed on an Illumina HiSeq 2500 platform (San Diego, CA), both outsourced to Shanghai

Personalbio Technology Co., Ltd. (China). CHLOROPLAST GENOME ASSEMBLING AND ANNOTATION After filtering out the low-quality reads using fastp17, 81,419,412 clean reads were retained for

further analysis. The chloroplast genome of _C. officinalis_ was de novo assembled using GetOrganelle v1.7.5 (ref. 18) with parameters set as -R 15 -k 21,45,65,85,105 -F embplant_pt. Subsequently,

the assembled chloroplast genome was annotated using CPGAVAS2 (ref. 19) with default settings, and a circular genome map was visualized using CPGView (http://www.1kmpg.cn/cpgview/). PHYLOGENETIC

ANALYSIS A total of 44 chloroplast genomes closely related to _C. officinalis_, along with 2 outgroups, were downloaded from GenBank for phylogenetic analysis. Among them, 74 protein-coding

genes shared by all genomes were selected for subsequent analysis. Sequence alignment of each gene was performed separately using MAFFT v7.313 (ref. 20). Gblocks 0.91b was then utilized to

remove poorly aligned regions of each gene. The filtered gene sequences were concatenated head-to-tail into supergenes21. Maximum likelihood phylogenies were generated using IQ-TREE

v1.6.12 (ref. 22). The TVM + F + I + G4 model was selected based on the Bayesian Information Criterion (BIC) in ModelFinder. This process was further strengthened with 5000 ultrafast bootstrap

replications for robust statistical support along with Shimodaira-Hasegawa-like approximate likelihood ratio test. CODON USAGE BIAS ANALYSIS AND IR BORDER ANALYSIS In this study, the

chloroplast genome sequences of _C. officinalis_, _C. arvensis_, and _O. ecklonis_ were used to analyze codon usage bias. Coding sequences (CDS) were meticulously screened to meet specific

criteria: multiples of three in base count, sequence length ≥ 300 bp, inclusion of only A, T, C, G bases, presence of start (ATG) and stop codons (TAG, TGA, TAA), and absence of internal

stop codons and duplicate sequences, retaining 53 CDS for each species. Using CodonW and CUSP online software, metrics such as ENc, RSCU, CAI, CBI, Fop, and GC content were calculated.
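The screening criteria and two of the listed metrics can be sketched in a few lines of Python. This is a minimal illustration, not the CodonW or CUSP implementation: the function names are hypothetical, the standard codon-to-amino-acid assignments are assumed, and GC1, GC2, and GC3 are computed here over all non-terminal codons, which may differ slightly from the GC3s definition that CodonW reports.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard codon assignments in TCAG order (assumed for this sketch).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
STOPS = {"TAA", "TAG", "TGA"}


def passes_filter(cds: str, min_len: int = 300) -> bool:
    """Apply the screening criteria described in the text to one CDS."""
    cds = cds.upper()
    if len(cds) < min_len or len(cds) % 3 != 0:
        return False
    if set(cds) - set("ATCG"):  # only unambiguous bases allowed
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    if codons[0] != "ATG" or codons[-1] not in STOPS:
        return False
    return not any(c in STOPS for c in codons[:-1])  # no internal stops


def rscu(cds_list):
    """RSCU = observed count / mean count within the synonymous family."""
    counts = Counter()
    for cds in cds_list:
        for i in range(0, len(cds) - 3, 3):  # skip the terminal stop codon
            counts[cds[i:i + 3]] += 1
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa not in "MW*":  # Met, Trp, and stop codons are excluded
            families.setdefault(aa, []).append(codon)
    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        for c in codons:
            values[c] = counts[c] * len(codons) / total if total else 0.0
    return values


def gc_by_position(cds_list):
    """Percent G+C at codon positions 1, 2, and 3 (terminal stop skipped)."""
    gc, n = [0, 0, 0], 0
    for cds in cds_list:
        for i in range(0, len(cds) - 3, 3):
            n += 1
            for pos in range(3):
                gc[pos] += cds[i + pos] in "GC"
    return tuple(100 * g / n for g in gc) if n else (0.0, 0.0, 0.0)


# Toy input; real use would pass the retained CDS of one species.
toy = ["ATGGCTGCTGCCTAA"]
kept = list(dict.fromkeys(s for s in toy if passes_filter(s, min_len=15)))
print(rscu(kept)["GCT"], gc_by_position(kept))
```

An RSCU above 1 marks a codon used more often than expected under equal usage within its synonymous family, and a value below 1 marks a codon used less often.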

Codons for Met, Trp, and stop codons were excluded. Analyses including ENc-plot, PR2-plot, neutrality plot23,24, and correspondence analysis based on RSCU values25,26 were conducted,

assessing the influence of mutation pressure and natural selection on codon usage bias. The comparative analysis of the boundaries separating the IRs, SSC, and LSC regions within chloroplast

genomes was conducted utilizing the online tool IRscope, which is available at https://irscope.shinyapps.io/irapp (accessed on May 29, 2024). DIVERGENCE TIME ESTIMATION The divergence times

for the species included in our phylogenetic analysis were estimated using the Markov chain Monte Carlo (MCMC) approach implemented in the PAML software package, specifically utilizing its

MCMCtree program27. The optimal phylogenetic tree topology for our dataset was determined using IQ-TREE. For calibrating the molecular clock, we incorporated three fossil-based calibration

points derived from previous studies27,28,29,30,31,32,33,34. These calibration points were as follows: (F1) between 22.7 and 38.8 million years ago (Ma), (F2) between 17.4 and 44.7 Ma, and

(F3) between 1.13 and 33.48 Ma. These points were strategically chosen to constrain each corresponding node in the phylogenetic tree. Our analysis employed the independent rates model (IRM),

which assumes a lognormal distribution for rate variation among lineages. The HKY85 model was selected for nucleotide substitution, with the alpha parameter for gamma-distributed rate

variation across sites set at 0.5. The birth–death process model was used to establish priors for node ages within the phylogenetic tree. We adhered to the default settings for this model,

with the parameters λ (birth rate) and μ (death rate) both set to 1, and the sampling proportion (s) set to 0. For the MCMC analysis, posterior probabilities of the parameters were

estimated. The initial 10% of trees generated were discarded as burn-in to ensure sampling from a stationary distribution. Subsequent trees were sampled every 10 iterations, culminating in a

total of 10,000 sampled trees for the final analysis. STATEMENT OF PERMISSION AND COMPLIANCE We confirm that _Calendula officinalis_ materials used in this study were collected with

permission from the parterre of Changde Vocational Technical College. The collection complied with all relevant local legislation, and appropriate permissions were granted before the samples

were collected. All experiments and field studies on _Calendula officinalis_ in this research complied with local legislation and were carried out in accordance with relevant institutional,

national, and international guidelines and legislation. The experimental protocols were approved by the relevant ethics committee. RESULTS THE CHLOROPLAST FEATURE OF _C. OFFICINALIS_ The

chloroplast genome of _C. officinalis_ is a circular molecule of 150,465 bp in length (Fig. 2a), which consists of four parts: a large single-copy region (LSC) with a length of 83,056 bp; a

small single-copy region (SSC) with a length of 17,911 bp; and two inverted repeat regions (IRs) of 24,749 bp (Fig. 2). The G + C content of the whole chloroplast was 37.75%, and the IRs

were 43.11%, which was higher than that of the LSC and the SSC regions (35.84% and 31.81%). The schematic representation of the entire chloroplast genome of _Calendula officinalis_ is

depicted in Fig. 2b, and Fig. S1 illustrates the uniform mapping depth across the entire genome, indicating an absence of heteroplasmy. The genome contains 131 genes, including 86

protein-coding genes, eight rRNA genes, and 37 tRNA genes (Fig. 2). The cis-splicing genes and the trans-splicing gene _rps12_ in the chloroplast genome of _Calendula officinalis_ can be found

in Fig. S2. PHYLOGENETIC ANALYSIS Based on the chloroplast genome of _C. officinalis_, the Maximum-likelihood (ML) tree was constructed (Fig. 3) using 74 protein-coding genes, which helps to

determine _C. officinalis_’ phylogenetic status. Phylogenetic analysis indicates that, broadly, the support values for each clade of the phylogenetic tree exceed 50%, with the majority

reaching 100%, demonstrating the reliability of our phylogenetic tree (Fig. 3). Further, _C. officinalis_ and _C. arvensis_ were within one clade with a support of 100%, and also, from a

local point of view, the relationship between _C. officinalis_, _C. arvensis_, and _O. ecklonis_ was very close, although _O. ecklonis_ does not belong to the _Calendula_ genus, which is

consistent with the previous study35. Interestingly, _Crassocephalum crepidioides_, _Gynura japonica_, _Jacobaea vulgaris_, and _Senecio vulgaris_ were found to be more closely related to

_C. officinalis_ in our study (Fig. 3), compared to previous reports. This close relationship represents a novel finding, likely attributed to the differences in sequence data and

phylogenetic methods employed in this study, as the protein-coding genes extracted from complete chloroplast genomes contain richer information compared to previous marker genes used. CODON

COMPOSITION ANALYSIS The codon composition for CDS of the three species (_Calendula officinalis_, _Calendula arvensis_, and _Osteospermum ecklonis_) was analyzed (Table 1), and the GC

content of chloroplast-encoded genes in the three Asteraceae plants is 38.49%, 38.50%, and 38.54%, respectively. The GC content varies at different positions, with the first, second, and

third positions in the codons all having a GC content below 50%. The highest content is at the first base, and the lowest content is at the third base, showing a trend of GC1 > GC2 >

GC3. This indicates that the chloroplast genome sequences of the three Asteraceae plants are rich in A/T bases, particularly at the third position of the codons. The ENc values of

chloroplasts in the three Asteraceae plants (_C. officinalis_, _C. arvensis_, and _O. ecklonis_) are 37.69–59.17, 38.74–59.17, and 39.1–58.98, with average values of 47.34, 47.41, and 47.65,

respectively. All of these values are significantly greater than 35, indicating that the codon usage bias in the chloroplast genomes is relatively weak. CODON USAGE BIAS ANALYSIS By using

GC3 and ENc as the X-axis and Y-axis, respectively, for the ENc-plot analysis, the influence of nucleotide composition on codon preference can be detected. When genes are distributed along the

standard curve or near it, it indicates that the codon preference of the gene is affected only by mutations. However, when genes fall far below the standard curve, it indicates that the

codon preference of the gene is affected by selection. From the result of this study, it can be observed that the ENc-plot diagrams (Fig. 4) of the chloroplast genomes of the three

Asteraceae plants (_C. officinalis_, _C. arvensis_, and _O. ecklonis_) are similar. Some genes are distributed along the standard curve or close to it, indicating that their codon preference

is mainly influenced by nucleotide mutations. However, some other genes deviate from the standard curve, suggesting that nucleotide mutations are not the main factor affecting their codon

preference and that they may be affected by other factors such as natural selection (Fig. 4). Besides these similarities, there are a few differences: for example, photosynthesis-related

genes of _C. officinalis_ and _C. arvensis_ are mainly concentrated near the standard curve, while those of _O. ecklonis_ deviate below slightly, indicating that the photosynthesis-related

genes of _O. ecklonis_ might be more influenced by selection, while those of _C. officinalis_ and _C. arvensis_ are primarily affected by mutations. Parity rule 2 plots (PR2 plot) were

generated separately for the three Asteraceae plants (_C. officinalis_, _C. arvensis_, and _O. ecklonis_) using the chloroplast protein-coding genes (Fig. 5). All three plots are

highly similar. First, the majority of their coordinate points are not uniformly distributed across the four regions but are mainly concentrated in

the region where G3/(G3 + C3) > 0.5 and A3/(A3 + T3) < 0.5 (Fig. 5). Second, ribosomal small-subunit genes of all three species tend to use A more, while photosynthesis-related

genes lean towards using T. However, overall, the usage frequency of the third base T in the codon is higher than A, and the usage frequency of G is higher than C. If codon usage bias were

solely caused by nucleotide mutations, the usage frequencies of A/T and G/C should be equal. Therefore, the PR2-plot analysis results, combined with the ENc-plot analysis, indicate that the

codon usage bias in the chloroplast genomes of the three Asteraceae plants is formed by the combined effects of nucleotide mutations and natural selection. The similarity of the PR2-plot

analysis results reflects the similarity of their phylogenetic relationships. The correlation between codon GC12 and GC3 for the chloroplast genomes of the three Asteraceae plants (_C.

officinalis_, _C. arvensis_, and _O. ecklonis_) is quite similar, as the neutrality plot shows (Fig. 6). The codon GC12 values of the three plants' chloroplast genomes are distributed

between 27.85 and 58.88, while GC3 values are distributed between 18.40 and 36.74, indicating that the frequency of using A/T at the third codon position is higher than G/C (Fig. 6). The

slope of the regression line fitted with GC12 and GC3 ranges from 0.13 to 0.22, with R² > 0, suggesting a positive correlation between GC12 and GC3 values. However, the two-tailed test did

not reach significant levels (P > 0.05) for all three species, indicating that the mutation patterns of the first and second bases are different from the third base, and the codon usage

bias is more affected by natural selection than by nucleotide mutations (Fig. 6). Additionally, the regression coefficient of _O. ecklonis_ is closest to 0, indicating that its chloroplast

genome codon preference is most influenced by natural selection, while _C. officinalis_ has the furthest regression coefficient from 0, suggesting that its chloroplast genome codon

preference is least influenced by natural selection compared to the other two Asteraceae plants. In the correspondence analysis, the genes (CDS sequences) of the chloroplast

genomes of the three Asteraceae plants (_C. officinalis_, _C. arvensis_, and _O. ecklonis_) are distributed on the figure with the first major factor axis as the x-coordinate and the second

major factor axis as the y-coordinate (Fig. 7). The origin represents the average RSCU values of all genes relative to the first axis and the second axis. The sum of the proportions of the

total variation accounted for by the first four principal factor axes in the three Asteraceae plants are 34.80%, 34.69%, and 33.66%, respectively (Fig. 7). The proportion of the total

variation accounted for by the first principal factor axis is 9.93%, 9.93%, and 9.23%, respectively (Fig. 7), indicating that the first axis contributes the most to the variation, and the

contribution of the remaining factor axes decreases successively. This again suggests that the formation of codon usage bias characteristics in the chloroplast genes of the three Asteraceae

plants is not influenced by a single factor but is the result of the combined action of multiple factors. To explore the factors affecting the distribution of each gene in the correspondence

analysis plot of the chloroplast genomes of the three Asteraceae plants, correlation analysis between the first axis and GC3s, ENc, CAI, CBI, and Fop, respectively was performed. As can be

seen from Table 2, the GC3s and CAI values of _C. officinalis_ are significantly correlated with the first axis (P < 0.05); the GC3s and CAI values of _C. arvensis_ are also highly

significantly correlated with the first axis (P < 0.05), and the ENc value is significantly correlated with the first axis; the ENc value of _O. ecklonis_ is significantly correlated with

the first axis (Table 2). It can be found that, when not considering the slope direction, the correlation between the first axis and various indicators of _C. officinalis_ and _C. arvensis_

is closer, while there is a larger difference with _O. ecklonis_. This pattern is consistent with that shown in the phylogenetic tree (Fig. 3), reflecting the similarity and differences

among the three. This indicates that correspondence analysis may reveal the commonalities and subtle differences in codon usage bias among the three species, which may be an important

characteristic reflecting the differences in their phylogenetic relationships, even if their relationships are relatively close. DIVERGENCE TIME ANALYSIS Our divergence time analysis, as

illustrated in Fig. 8, indicates that the divergence of _C. officinalis_ took place approximately 0.25 Ma, situating it in the recent Quaternary period. Additionally,

the genus _Calendula_ is estimated to have originated around 2.38 Ma, also during the Quaternary. Furthermore, the common ancestor of _Calendula_ and _Osteospermum_ is traced back to

roughly 18.77 Ma, placing this divergence squarely within the Miocene epoch of the Neogene period, an era known for significant environmental and climatic shifts that likely influenced

their evolutionary paths. To further explore the subtle evolutionary differences among the chloroplast genomes of three species, we conducted a comparative analysis of the boundaries of the

LSC, SSC, and IR regions across these species (Fig. 9). The result reveals a notable consistency across three species, primarily reflected in the genes adjacent to IR boundaries.
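To make the junction names in this comparison concrete, the sketch below derives approximate junction coordinates for _C. officinalis_ from the region lengths reported in the Results (LSC 83,056 bp, IR 24,749 bp, SSC 17,911 bp). It assumes the conventional LSC, IRb, SSC, IRa order with position 1 at the start of the LSC. That layout and the function name are assumptions of this illustration, not output from IRscope, and the deposited record may start at a different coordinate.

```python
def junction_positions(lsc: int, ir: int, ssc: int) -> dict:
    """Return the last base before each junction, assuming the order
    LSC -> IRb -> SSC -> IRa on a circular genome starting at the LSC."""
    jlb = lsc          # LSC / IRb boundary
    jsb = jlb + ir     # IRb / SSC boundary
    jsa = jsb + ssc    # SSC / IRa boundary
    jla = jsa + ir     # IRa / LSC boundary (closes the circle)
    return {"JLB": jlb, "JSB": jsb, "JSA": jsa, "JLA": jla}


# Region lengths reported for C. officinalis in this study.
junctions = junction_positions(lsc=83_056, ir=24_749, ssc=17_911)
print(junctions)
```

Under these assumptions the JLA coordinate equals the total genome length, so the four reported region lengths add up to the reported 150,465 bp.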

Specifically, the genes near the JLA (junction of the LSC and the IRa region) consistently include _rpl2_, _rps19_, and _rpl22_ across the species examined. Additionally, _psbA_, _trnH_, and

_rpl22_ genes are entirely located within the LSC region, while two copies of the _rpl2_ gene are fully situated within IRa and IRb, respectively. The _ycf1_ gene spans the IRb and the SSC

regions, positioned at the JSB junction. Despite these overarching similarities, specific differences in gene placement and IR boundary dynamics are evident among the species. A notable

distinction is observed in the placement of the _ycf1_ gene, which spans the JSA junction (junction of the SSC and IRa region) exclusively in _C. officinalis_, with the majority of its

sequence within IRa (extending 7 bp into the SSC), indicating a significant expansion/contraction event. This occurrence is not mirrored in the other two species. Furthermore, the _ndhF_

gene in _C. officinalis_ predominantly resides within the SSC, marginally spanning the JSA junction by 5 bp, whereas in the other species, it is completely contained within the SSC,

showcasing a unique trait of _C. officinalis_. In another aspect, the _rps19_ gene is located entirely within the IRb near the JLB (junction of the IRb and the LSC region) and near the JLA

in the LSC for _O. ecklonis_, while in _C. officinalis_ and _C. arvensis_, it appears only once, situated near the JLA in the LSC and IRa, respectively (Fig. 9). DISCUSSION _C. officinalis_

is a versatile plant with applications in various fields, including ornamentation, chemistry, and pharmacology. Despite its widespread use, its chloroplast genome structure, features,

phylogeny, and patterns of evolution and mutation have remained largely unexplored. This study aims to address this knowledge gap by examining the chloroplast genome, phylogeny, and codon

usage bias of _C. officinalis_, thereby enhancing our understanding of its evolution, adaptation, and potential uses. The chloroplast genome of _C. officinalis_ was found to be a 150,465 bp

circular molecule, containing a large single-copy region (LSC), a small single-copy region (SSC), and two inverted repeat regions (IRs). The genome's G + C content is 37.75%, and it

comprises 131 genes, including 86 protein coding, eight rRNA, and 37 tRNA genes. A Maximum-likelihood (ML) tree was constructed using 74 protein-coding genes to establish the phylogenetic

status of _C. officinalis_. Phylogenetic analysis revealed that _C. officinalis_ and _C. arvensis_ form a clade with 100% support, and their relationship with _O. ecklonis_ is close, in

accordance with a previous study. The analysis also indicated a closer relationship between _C. officinalis_ and four other species than previously reported35. This discrepancy could be

attributed to differences in sequences and methods employed for phylogenetic analysis, warranting further investigation. Codon usage bias is a vital element of evolution across diverse

genomes, which is influenced by multiple biological factors including gene expression, gene length, tRNA abundance, mutation bias, and GC composition, as evidenced by a wealth of

studies36,37,38,39,40,41,42. Nevertheless, it is the interplay between directional mutation pressure and natural selection that primarily governs codon usage bias across diverse

organisms, forming the bedrock of interspecies and intragenomic codon usage disparities43. Plant genomes further demonstrate the complexity of codon usage bias; the nuclear gene codon

preference is largely shaped by nucleotide composition constraints, whereas in chloroplast and mitochondrial genomes, natural selection takes precedence44,45. The effective

number of codons (ENc) is a common metric to quantify the degree of deviation in codon usage from random selection. Ranging in value from 20 to 61, ENc helps evaluate the strength of codon

usage bias in genomes or genes. Smaller ENc values signify stronger codon preference, while larger values indicate weaker codon preference. Notably, when the ENc value is less than or equal

to 35, the codon usage bias phenomenon is considered more significant. In our research, we found that the chloroplast genome sequences of the three Asteraceae plants _C. officinalis_, _C.

arvensis_, and _O. ecklonis_ were rich in A/T bases, with ENc values ranging from 37.69 to 59.17, 38.74 to 59.17, and 39.1 to 58.98, and average values of 47.34, 47.41, and 47.65,

respectively. As all these values are significantly greater than 35, this suggests that the codon usage bias in the chloroplast genomes of these species is relatively weak. Further analysis

using ENc-plot (Fig. 4), PR2-plot (Fig. 5), Neutrality plot (Fig. 6), and correspondence analysis (Fig. 7) revealed that the codon usage bias in _C. officinalis_, _C. arvensis_, and _O.

ecklonis_ is a result of the combined effects of mutation pressure and natural selection. In addition, the codon usage bias patterns in these three species are highly similar, providing a

robust explanation for their phylogenetic similarities. This finding suggests that these species may have been subjected to similar environmental conditions and selection pressures during

their evolutionary process. This similarity in environmental conditions and selection pressures could also account for the close phylogenetic relationship between the species _O. ecklonis_

and _C. officinalis_ and _C. arvensis_. Moreover, we discovered that the correlation of the first axis of the correspondence analysis with GC3s, ENc, CAI, CBI, and Fop can effectively

reflect the subtle differences in codon usage bias patterns among the three species. Interestingly, these differences are consistent with the results of the phylogenetic tree, indicating

that this phenomenon is worthy of further study. The divergence of the genus _Calendula_ around 2.38 Ma (Fig. 8), within the Quaternary period, corresponds to a phase of Earth's history

marked by intense climatic fluctuations. This period, characterized by repeated glacial and interglacial cycles46,47, would have imposed strong selective pressures on plant species, driving

adaptive responses. The speciation of _C. officinalis_ during this time suggests its evolutionary resilience and adaptability to changing environments. This aligns with our findings of weak

codon usage bias in the chloroplast genome, indicative of a balanced selection-mutation dynamic possibly influenced by these environmental shifts. The emergence of the common ancestor of

_Calendula_ and _Osteospermum_ around 18.77 Ma (Fig. 8) in the Miocene epoch of the Neogene period coincides with significant global climate changes from warmer to cooler conditions48. The

Miocene epoch, known for its extensive tectonic activities and consequent ecological shifts49, likely provided diverse niches and selective pressures that catalyzed speciation events. The

similarity in codon usage bias patterns among _Calendula_ and _Osteospermum_ species provides further evidence of their phylogenetic relationship and shared evolutionary history. This

similarity suggests a parallel adaptation route, possibly as a response to similar environmental pressures over time. However, despite the many evolutionary similarities among the three

species, our comparative analysis of the boundaries of the LSC, SSC, and IR regions highlights distinct differences in the expansion and contraction of the _ycf1_ and _ndhF_ genes in _C.

officinalis_, compared to the other two species. In the evolutionary progression of angiosperms, the alteration, reduction, and enlargement of IR regions represent frequent events. Such

changes often take place at the junctions between IRs and LSC and SSC, facilitating the movement of specific genes into either IR or single-copy regions50. The unique characteristics of

these two genes may reflect the distinct nature of _C. officinalis_ as a species that emerged relatively recently in evolutionary terms (0.25 Ma, compared to the divergence time of 18.77 Ma

among these three species). This distinctiveness warrants further in-depth investigation to understand its evolutionary implications and the adaptive significance of these genomic features.

These findings improve our understanding of the evolutionary relationships and adaptation mechanisms of _C. officinalis_ and related species, and the estimated divergence time and

evolutionary adaptations of _C. officinalis_ provide a basis for exploring its potential applications in pharmacology and agriculture. Future research could examine how specific adaptations in the chloroplast genome have contributed to its

medicinal and ornamental properties. CONCLUSION In summary, our research provides a comprehensive analysis of the chloroplast genome, phylogenetic relationships, codon usage bias, and

divergence time of _C. officinalis_. It highlights the close evolutionary kinship between _C. officinalis_, _C. arvensis_, and _O. ecklonis_, supported by similarities in codon usage

patterns and divergence timelines. These findings suggest that shared environmental selection pressures have played a significant role in their evolutionary paths. We also

identified unique evolutionary features in _C. officinalis_, possibly associated with genes at the boundaries of the IR regions. Together, these results deepen the understanding of the genetic makeup of _C.

officinalis_ and lay a foundation for future research into its genetic diversity and medicinal value, with potential applications in pharmacology and agriculture. DATA AVAILABILITY The complete chloroplast genome

sequence of _C. officinalis_ has been deposited in the GenBank database under the accession number OP161555 (https://www.ncbi.nlm.nih.gov/nuccore/OP161555.1/). The associated BioProject and

BioSample numbers are PRJNA1019102 and SAMN37474090, respectively. REFERENCES

1. Bayat, H., Alirezaie, M. & Neamati, H. Impact of exogenous salicylic acid on growth and ornamental characteristics of calendula (_Calendula officinalis_ L.) under salinity stress. _J. Stress Physiol. Biochem._ 8, 258–267 (2012).
2. Jan, N., Andrabi, K. I. & John, R. _Calendula officinalis_ - an important medicinal plant with potential biological properties. _Proc. Indian Natl. Sci. Acad._ 83, 769–787 (2017).
3. Ashwlayan, V. D., Kumar, A. & Verma, M. Therapeutic potential of _Calendula officinalis_. _Pharm. Pharmacol. Int. J._ 6, 149–155 (2018).
4. Green, B. R. Chloroplast genomes of photosynthetic eukaryotes. _Plant J._ 66, 34–44 (2011).
5. Sugiura, M., Shinozaki, K., Zaita, N., Kusuda, M. & Kumano, M. Clone bank of the tobacco (_Nicotiana tabacum_) chloroplast genome as a set of overlapping restriction endonuclease fragments: Mapping of eleven ribosomal protein genes. _Plant Sci._ 44, 211–217 (1986).
6. Sugiura, M. The chloroplast genome. _Plant Mol. Biol._ 19, 149–168 (1992).
7. Zhou, J. _et al._ Chloroplast genomes in Populus (Salicaceae): Comparisons from an intensively sampled genus reveal dynamic patterns of evolution. _Sci. Rep._ 11, 9471 (2021).
8. Li, E. _et al._ Insights into the phylogeny and chloroplast genome evolution of Eriocaulon (Eriocaulaceae). _BMC Plant Biol._ 23, 1–14 (2023).
9. Song, Y. _et al._ Chloroplast genome evolution and species identification of Styrax (Styracaceae). _BioMed Res. Int._ 2022, 1–13 (2022).
10. Buhr, F. _et al._ Synonymous codons direct cotranslational folding toward different protein conformations. _Mol. Cell_ 61, 341–351 (2016).
11. Zhou, Z. _et al._ Codon usage is an important determinant of gene expression levels largely through its effects on transcription. _Proc. Natl. Acad. Sci._ 113, E6117–E6125 (2016).
12. Peden, J. F. Analysis of codon usage. _BioSystem_ 5, 73–74 (2000).
13. Sharp, P. M., Stenico, M., Peden, J. F. & Lloyd, A. T. Codon usage: Mutational bias, translational selection, or both? _Biochem. Soc. Trans._ 21, 835–841 (1993).
14. Subramanian, S. Nearly neutrality and the evolution of codon usage bias in eukaryotic genomes. _Genetics_ 178, 2429–2432 (2008).
15. Qin, H., Wu, W. B., Comeron, J. M., Kreitman, M. & Li, W. H. Intragenic spatial patterns of codon usage bias in prokaryotic and eukaryotic genomes. _Genetics_ 168, 2245–2260 (2004).
16. Xing, Z. B., Cao, L., Zhou, M. & Xiu, L. S. Analysis on codon usage of chloroplast genome of _Eleutherococcus senticosus_. _Chin. J. Chin. Mater. Med._ 38, 661–665 (2013).
17. Chen, S., Zhou, Y., Chen, Y. & Gu, J. fastp: An ultra-fast all-in-one FASTQ preprocessor. _Bioinformatics_ 34, i884–i890 (2018).
18. Jin, J. J., Yu, W. B., Song, Y., dePamphilis, C. W. & Yi, T. S. GetOrganelle: A fast and versatile toolkit for accurate de novo assembly of organelle genomes. _Genome Biol._ 21, 241 (2020).
19. Shi, L. C. _et al._ CPGAVAS2, an integrated plastome sequence annotator and analyzer. _Nucleic Acids Res._ 47, W65–W73 (2019).
20. Rozewicki, J., Li, S., Amada, K. M., Standley, D. M. & Katoh, K. MAFFT-DASH: Integrated protein sequence and structural alignment. _Nucleic Acids Res._ 47, W5–W10 (2019).
21. Guo, S. _et al._ A comparative analysis of the chloroplast genomes of four Polygonum medicinal plants. _Front. Genet._ 13, 764534 (2022).
22. Nguyen, L. T., Schmidt, H. A., von Haeseler, A. & Quang Minh, B. IQ-TREE: A fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. _Mol. Biol. Evol._ 32, 268–274 (2015).
23. Wei, L. _et al._ Analysis of codon usage bias of mitochondrial genome in _Bombyx mori_ and its relation to evolution. _BMC Evol. Biol._ 14, 1–12 (2014).
24. Wen, Y., Zou, Z., Li, H., Xiang, Z. & He, N. Analysis of codon usage patterns in Morus notabilis based on genome and transcriptome data. _Genome_ 60, 473–484 (2017).
25. James, F. C. & McCulloch, C. E. Multivariate analysis in ecology and systematics: Panacea or Pandora’s box? _Annu. Rev. Ecol. Syst._ 21, 129–166 (1990).
26. Wang, Z. _et al._ Comparative analysis of codon usage patterns in chloroplast genomes of six Euphorbiaceae species. _PeerJ_ 8, e8251 (2020).
27. Puttick, M. N. MCMCtreeR: Functions to prepare MCMCtree analyses and visualize posterior ages on trees. _Bioinformatics_ 35(24), 5321–5322 (2019).
28. Li, H. T. _et al._ Origin of angiosperms and the puzzle of the Jurassic gap. _Nat. Plants_ 5(5), 461–470 (2019).
29. Kim, K. J., Choi, K. S. & Jansen, R. K. Two chloroplast DNA inversions originated simultaneously during the early evolution of the sunflower family (Asteraceae). _Mol. Biol. Evol._ 22(9), 1783–1792 (2005).
30. Mandel, J. R. _et al._ A fully resolved backbone phylogeny reveals numerous dispersals and explosive diversifications throughout the history of Asteraceae. _Proc. Natl. Acad. Sci._ 116(28), 14083–14088 (2019).
31. Zhang, C. _et al._ Phylotranscriptomic insights into Asteraceae diversity, polyploidy, and morphological innovation. _J. Integr. Plant Biol._ 63(7), 1273–1293 (2021).
32. Zhang, Q. _et al._ New insights into the formation of biodiversity hotspots of the Kenyan flora. _Divers. Distrib._ 28(12), 2696–2711 (2022).
33. Verboom, G. A., Stock, W. D. & Cramer, M. D. Specialization to extremely low-nutrient soils limits the nutritional adaptability of plant lineages. _Am. Nat._ 189(6), 684–699 (2017).
34. Foster, C. S. P. _et al._ Evaluating the impact of genomic data and priors on Bayesian estimates of the angiosperm evolutionary timescale. _Syst. Biol._ 66(3), 338–351 (2017).
35. Fu, Z. X., Jiao, B. H., Nie, B., Zhang, G. J. & Gao, T. G. A comprehensive generic-level phylogeny of the sunflower family: Implications for the systematics of Chinese Asteraceae. _J. Syst. Evol._ 54, 416–437 (2016).
36. Wang, B., Yuan, J., Liu, J., Jin, L. & Chen, J. Q. Codon usage bias and determining forces in green plant mitochondrial genomes. _J. Integr. Plant Biol._ 53, 324–334 (2011).
37. Blake, W. J., Kaern, M., Cantor, C. R. & Collins, J. J. Noise in eukaryotic gene expression. _Nature_ 422, 633–637 (2003).
38. Ingvarsson, P. K. Gene expression and protein length influence codon usage and rates of sequence evolution in Populus tremula. _Mol. Biol. Evol._ 24, 836–844 (2007).
39. Duret, L. & Mouchiroud, D. Expression pattern and surprisingly, gene length shape codon usage in Caenorhabditis, Drosophila, and Arabidopsis. _Proc. Natl. Acad. Sci._ 96, 4482–4487 (1999).
40. Rao, Y. _et al._ Mutation bias is the driving force of codon usage in the _Gallus gallus_ genome. _DNA Res._ 18, 499–512 (2011).
41. Sueoka, N. & Kawanishi, Y. DNA G+C content of the third codon position and codon usage biases of human genes. _Gene_ 261, 53–62 (2000).
42. Wan, X. F., Xu, D., Kleinhofs, A. & Zhou, J. Quantitative relationship between synonymous codon usage bias and GC composition across unicellular genomes. _BMC Evol. Biol._ 4, 1–11 (2004).
43. Sharp, P. M., Emery, L. R. & Zeng, K. Forces that influence the evolution of codon bias. _Philos. Trans. R. Soc. B_ 365, 1203–1212 (2010).
44. Liu, Q. & Xue, Q. Comparative studies on codon usage pattern of chloroplasts and their host nuclear genes in four plant species. _J. Genet._ 84, 55–62 (2005).
45. Morton, B. R. & Wright, S. I. Selective constraints on codon usage of nuclear genes from _Arabidopsis thaliana_. _Mol. Biol. Evol._ 24, 122–129 (2007).
46. Richter, C. _et al._ New insights into Southern Caucasian glacial–interglacial climate conditions inferred from Quaternary gastropod fauna. _J. Quat. Sci._ 35(5), 634–649 (2020).
47. Brown, S. C. _et al._ Persistent Quaternary climate refugia are hospices for biodiversity in the Anthropocene. _Nat. Clim. Change_ 10(3), 244–248 (2020).
48. Holbourn, A. E. _et al._ Late Miocene climate cooling and intensification of southeast Asian winter monsoon. _Nat. Commun._ 9(1), 1584 (2018).
49. Lin, C. _et al._ Himalayan Miocene adakitic rocks, a case study of the Mayum pluton: Insights into geodynamic processes within the subducted Indian continental lithosphere and Himalayan mid-Miocene tectonic regime transition. _Bulletin_ 133(3–4), 591–611 (2021).
50. Raubeson, L. A. _et al._ Comparative chloroplast genomics: Analyses including new sequences from the angiosperms _Nuphar advena_ and _Ranunculus macranthus_. _BMC Genom._ 8, 174 (2007).

ACKNOWLEDGEMENTS We thank Jun Yan and Mi He for their assistance with

the phylogenetic analysis as well as Lixuan Xiang for her help. FUNDING This work was supported by the Natural Science Foundation of Hunan Province (2023JJ30436, 2022JJ50249 and

2022JJ40291), the Scientific Research Foundation of Hunan Provincial Education Department (22A0487), Key Research Project of Hunan University of Arts and Science (E06022005), and the

Scientific Research Youth Foundation of Education Department of Hunan Province (21B0610). AUTHOR INFORMATION AUTHORS AND AFFILIATIONS * Agricultural Products Processing and Food Safety Key

Laboratory of Hunan Higher Education, Hunan Provincial Key Laboratory for Molecular Immunity Technology of Aquatic Animal Diseases, College of Life and Environmental Sciences, Hunan

University of Arts and Science, Changde, Hunan, China Ningyun Zhang, Kerui Huang, Peng Xie, Aihua Deng, Xuan Tang, Ming Jiang, Ping Mo, Hanbin Yin, Rongjie Huang, Jiale Liang, Fuhao He,

Yaping Liu, Haoliang Hu & Yun Wang CONTRIBUTIONS Y.W. and K.H. conceived and designed the study. Y.W., H.H., and K.H.

identified the plant material. N.Z., Y.W., K.H., P.X., A.D., M.J., and P.M. collected the samples. K.H. and N.Z. performed genome assembly and data analysis. N.Z. drafted the manuscript.

N.Z., X.T., H.Y., R.H., J.L., F.H., Y.L., and H.H. revised the manuscript. All authors discussed the results, critically reviewed the manuscript, and approved the final version.

CORRESPONDING AUTHORS Correspondence to Kerui Huang, Haoliang Hu or Yun Wang. ETHICS DECLARATIONS COMPETING INTERESTS The authors declare no competing interests. ADDITIONAL INFORMATION

PUBLISHER'S NOTE Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. SUPPLEMENTARY INFORMATION SUPPLEMENTARY FIGURES.

RIGHTS AND PERMISSIONS OPEN ACCESS This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and

reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes

were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material.

If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to

obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. ABOUT THIS ARTICLE CITE THIS

ARTICLE Zhang, N., Huang, K., Xie, P. _et al._ Chloroplast genome analysis and evolutionary insights in the versatile medicinal plant _Calendula officinalis_ L. _Sci. Rep._ 14, 9662 (2024).

https://doi.org/10.1038/s41598-024-60455-2 * Received: 26 January 2024 * Accepted: 23 April 2024 * Published: 26 April 2024

KEYWORDS * _Calendula officinalis_ * Chloroplast genome * Codon

usage bias * Evolution * Adaptation