Multi-omics reveal the prevalence of Thaumarchaeota and their biogeochemical roles in coastal low oxygen zones

1、Introduction:

Dissolved oxygen (DO) is a critical environmental factor in marine ecosystems, with its spatial heterogeneity and vertical stratification fundamentally establishing the biogeochemical zonation of redox boundaries (Wang et al., 2024). DO controls the predominance of aer- obic versus anaerobic metabolic pathways, thereby structuring micro- bial community assemblages and modulating energy transfer efficiency across trophic levels (Song et al., 2019). Under the pressures of climate warming and anthropogenic eutrophication, the global expansion of coastal hypoxic zones (DO <2 mg/L) and oceanic oxygen minimum zones has been documented with increasing concern (Breitburg et al., 2018; Oschlies, 2021). The ocean deoxygenation imposes physiological

constraints on aerobic biota through habitat compression, leading to systematic biodiversity erosion and functional homogenization (Rose et al., 2024), destabilizing energy flux patterns and nutrient cycling regimes (Long et al., 2021). Such alterations progressively impair ecosystem services, ultimately threatening the resilience and long-term sustainability of marine ecosystems.
Microorganisms act as fundamental biogeochemical engineers in marine systems (Sun, 2024). They orchestrate elemental biogeochem- ical cycles and maintain ecosystem homeostasis through coupled metabolic networks, while also regulating global climate via produc- tion/consumption of greenhouse gases (He et al., 2023). The expansion of oxygen-depleted zones creates oxic-anoxic gradient along the water column, triggering profound restructuring of microbial communities

(Lin et al., 2025), which in turn alters the mediated carbon and nitrogen cycling processes, resulting in a series of environmental and climatic feedbacks (Sveen et al., 2024). Typically, photoautotrophic phyto- plankton dominate the oxygenated surface layer and participate in carbon dioxide fixation (Avrahami et al., 2025). As depth increases and DO declines, chemolithoautotrophs and facultative anaerobes become the dominant functional groups (Han et al., 2022). This transition is primarily reflected in the nitrogen cycle pathways, shifting from assimilation-dominated to nitrate reduction processes, synchronously driving nitrogen transformation and dark carbon fixation, thereby coupling carbon and nitrogen cycles (Tang et al., 2025). Meanwhile, heterotrophic microorganisms are widely distributed throughout the water column, participating in the mineralization of organic matter and providing inorganic nutrients to other microbes (Chen et al., 2025). This metabolic adaptation not only optimizes energy acquisition strategies for microorganisms in low-oxygen environments, but also regulates greenhouse gas fluxes and organic carbon export, forming critical biogeochemical feedbacks that ultimately influence the global climate system (Sveen et al., 2024). Deciphering microbial adaptation mecha- nisms under oxygen stress has emerged as a hotspot in marine micro- biology, with implications for predicting biogeochemical feedbacks under expanding marine deoxygenation.
Among these microbes, the Thaumarchaeota phylum, comprising all currently described ammonia-oxidizing archaea (Offre et al., 2013), constitutes a pivotal microbial group, accounting for 20–50% of pelagic archaeal populations across all oceanic provinces (Qin et al., 2020; Stahl and De La Torre, 2012). Among the primary ecotypes of Thaumarch- aeota, the genera Nitrosopumilus is the only broadly adaptable lineage capable of thriving in diverse habitats, including freshwater, estuarine environments, hypoxic zones and from surface to abyssal waters (Liu et al., 2023; Zhang et al., 2024a), demonstrating varied adaptive stra- tegies across environmental gradients (Qin et al., 2020). Particularly in nutrient-rich estuarine ecosystems, Nitrosopumilus functions as a che- molithoautotrophic opportunist, effectively occupying a distinct ecological niche. Estuarine ecosystems typically receive substantial terrestrial nutrient inputs and experience intense sediment resus- pension, creating favorable conditions for the proliferation of photo- autotrophs and heterotrophic bacteria, which compete for inorganic nitrogen and organic matter, respectively (Han et al., 2022). Facing such a complex and competitive environment, Nitrosopumilus possesses unique adaptive strategies, including high affinity for ammonium (Straka et al., 2019), broad salinity tolerance (Qin et al., 2017), and the expression of ammonium transporters and nitrite reductase to facilitate nitrogen acquisition (Reyes et al., 2020). Additionally, the 3-hydroxy- propionate/4-hydroxybutyrate cycle for chemolithoautotrophic carbon fixation, synergistic interactions with heterotrophic bacteria (which decompose organic nitrogen to release ammonium as a substrate for Nitrosopumilus), and enhanced antioxidant capacity (Han et al., 2022) collectively support their survival and prevalence. Due to substrate competition and light inhibition (Wan et al., 2021), the ecological niches of Thaumarchaeota and photoautotrophs are often vertically separated, as light intensity decreases and photoautotrophic activity declines, Thaumarchaeota gradually play the critical ecological roles in nitrogen biogeochemical cycle and chemolithoautotrophic carbon fixation (Qin et al., 2020). As keystone nitrifiers, Thaumarchaeota govern nitrogen speciation particularly in oligotrophic gyres (Semedo et al., 2021). In addition, their chemolithoautotrophic metabolism establishes them as major contributors to dark carbon fixation (Martens-Habbena and Qin, 2022). Furthermore, they are largely involved in the production and consumption of nitrous oxide and methane (Wan et al., 2023), biologi- cally linking the microbial-scale processes with planetary-scale climate dynamics.
Although an increasing number of studies are investigating microbial community composition and their biogeochemical functions in response to deoxygenation across various spatial and temporal scales (Lin et al., 2025; Sun, 2024). However, the adaptive mechanisms of specific

microbial taxa (such as Thaumarchaeota) to marine oxygen fluctuations, particularly the transition from oxic to anoxic, and their associated biogeochemical roles remain poorly understood. Driven by anthropo- genic eutrophication and climate warming, the Yangtze River Estuary and adjacent East China Sea have been experiencing seasonal hypoxia from May to November annually, with peak intensity in August when average dissolved oxygen in the core zone can decrease to as low as 1.5 mg/L (Sheng et al., 2024; Zhang et al., 2023). By applying an integrated multi-omics approach (16S rRNA gene sequencing, metagenomics, and metaproteomics), we conduct comparative analyses of microbial com- munity structures and functional gene expression across oxic-low oxy- gen gradients in East China Sea to (1) elucidate the dynamic shifts in microbial taxonomic diversity along oxygen gradients and (2) reveal the metabolic responses and adaptation mechanisms of key microbial groups, particularly Thaumarchaeota, under deoxygenation stress.

  1. Experimental procedures
    2.1. Sample collection and preservation
    The sampling cruise was conducted on July 25–26, 2020 in the seasonal hypoxic zone off the Yangtze River Estuary (Fig. 1). Seawater samples were collected for nutrient and microbial analysis at site C1 (30.12◦N, 122.74◦ E) by using 10 L Niskin bottles attached to a conductivity-temperature-depth (CTD) rosette sampler. To measure ni-
    trate (NO -N), nitrite (NO -N), ammonium (NH-N), dissolved inor-
    ganic phosphorus (DIP), and dissolved silicon (DSi), water samples were collected via 0.45 μm membrane at five layers, with depths of 0.5 m (C1T), 5 m (C1S), 15 m (C1M), 40 m (C1L) and 50 m (C1B). Based on the dissolved oxygen (DO) profile measured in the field, nearly 100 L of seawater was collected at both surface (5 m, aerobic, C1S) and bottom (50 m, low oxygen, C1B) layers for prokaryotes (<3 μm; (Teeling et al., 2012)) analysis. Briefly, seawater was successively passed through a 20 μm screen and a 3 μm polycarbonate filter membrane (142 mm, Merck, Germany), then ~10 L of filtrate was further filtered via a 0.22 μm polycarbonate filter membrane (47 mm, Merck, Germany) for high- throughput 16S rRNA gene sequencing and metagenomics analysis (Genomics BioTech). For metaproteomic analysis, the other 90 L of filtrate was filtered using Sterivex-GV pressure filter protein tubule (0.22 μm, Merck, Germany) and then 1.8 mL of lysis buffer (0.75 mol L_ 1 sucrose, 40 mmol L_ 1 EDTA, 50 mmol L_ 1 Tris, pH = 8.3) was added to protein tubules (Hawley et al., 2013). Additionally, the filtrate filtered through a 20 μm screen was collected and stored with 1% glutaralde- hyde at _ 80 ◦ C for the determination of the total abundance of prokaryotes.

2.2. Chemical measurements
Seawater temperature, salinity, DO, chlorophyll a (Chl a) and turbidity were measured in situ by using the corresponding probe car- ried by CTD, and DO concentrations were corrected by the Winkler
method-derived measurements. The concentrations of NO3 -N, NO2 -N,
NH-N, DIP and DSi were determined by the AA3 continuous flow
analyzer (Dai et al., 2008).

2.3. Prokaryotic abundance measurements
The abundance of prokaryotes, mainly including archaea and bac- teria was detected by using Flow CytoMetry (Jiao et al., 2002). The data were analyzed by FCS Express software, and samples were distinguished according to cell size, granularity, chlorophyll and other parameters. The distribution area of the scatter plot of the instrument was manually delineated. The cell number of the circle plot was automatically counted by the software, and the cell abundance of the sample was calculated as follows,

Multi-omics reveal the prevalence of Thaumarchaeota and their biogeochemical roles in coastal low oxygen zones
Fig. 1. The sampling site in (a) hypoxic zone, East China Sea, with dissolved oxygen in surface and bottom water showing in (b) and (c), respectively.

where C is the cell abundance of prokaryotes (cells mL_ 1); N is the number of cells detected by the instrument; V and T are injection flow rate (mL h_ 1) and flow analysis time (s), respectively. Volsample stands for sample volume (mL); Vol total is the total volume (mL) of the test sample.

2.4. Sequencing of high-throughput 16S rRNA genes and data processing
For the relative quantification, microbial community genomic DNA was extracted using the soil genomic DNA extraction Kit (DNeasy PowerSoil Kit, MoBio, USA). PCR amplification of the V4 variable region of the 16S rRNA genes in archaea and bacteria was performed using modified 515F mod (5 ’-GTGYCAGCMGCCGCGGTAA-39) and 806R mod (5 ’-GGACTACNVGGGTWTCTAAT-39) primers (Walters et al., 2016). The PCR products were then recovered with 2% agarose Gel and purified by AxyPrep DNA Gel Extraction Kit (Axygen Biosciences, Union City, CA, USA). After agarose gel electrophoresis, the product was quantified with Quantus™ Fluorometer (Promega, USA), and then quantified with the NEXTFLEX Rapid DNA-Seq Kit. Finally, it was sequenced by Illumina Miseq platform (Han et al., 2022). For the absolute quantification of 16S rRNA genes, 2–3 of spike-in sequences are inserted into the DNA sample pool with four different concentration gradients of 103, 104, 105, and 106 copy numbers for each sequence, and taken as an internal standard for absolute quantification and comparative analysis of samples (Jiang et al., 2019; Tkacz et al., 2018). The following amplification and sequence of 16S rRNA genes in archaea and bacteria were the same as aforementioned (Walters et al., 2016).
The original sequencing data were processed according to the description of Huang et al. (2015). Adaptor sequence and primer sequence were removed by TrimGalore and motherur, respectively. Double-ended reads were spliced and low-quality reads were discarded (< 200 bp, average quality score < 20, ambiguous base calls >0). The USEARCH (version 10) was used to cluster operational taxonomic units

(OTUs), clustering sequences with similarity >97% into the same OTU classification unit (Huang et al., 2015). Species taxonomic annotation was performed by comparing the RDP v11.5 database with the com- mand classify.seqs of Mothur v1.41.1 and by setting a confidence threshold of 80%. The absolute quantitative copy number of the pro- karyotic OTUs was calculated from the standard curve established by the copy number of the OTU containing the spike-in sequence. The raw data were stored in the NCBI Sequence Read Archive (SRA) with entry numbers SRR11179217-SRR11179228.
2.5. Metagenomics analysis
Sample DNA was extracted from the QIAGEN Large-Construct Kit (QIAGEN, Germany), and then purified by the QIAGEN DNe purification kit (QIAGEN, Germany). After purification, the samples were tested for Qubit® 2.0 Fluorometer (Invitrogen, USA) concentration and 1.2% agar- gel electrophoresis (80 V constant pressure, 40 min) to confirm that the DNA master band was clear with no RNA contamination. Following DNA extraction from the metagenome, detection was performed via electro- phoresis on a 1% agarose gel. The DNA was fragmented using a Covaris M220 focused-ultrasonicator (Covaris, USA) and PE libraries were constructed using the TruSeq™ DNA Sample Prep Kit. Subsequently, bridge PCR amplification of the DNA fragments was conducted using the cBot Truseq PE Cluster Kit v3-cBot-HS. Finally, sequencing was per- formed on the Illumina NovaSeq 6000 platform employing the Truseq SBS Kit (300 cycles).
Following quality assessment of the raw sequence data using FastQC v0.11.9 (Thrash et al., 2018), adapter sequences were removed and reads were quality-trimmed using Trimmomatic v0.38 (Bolger et al., 2014) to generate clean reads. Taxonomic classification of the meta- genomic data was subsequently performed using Kraken2 v2.1.1 with the Genome Taxonomy Database (GTDB, vR202; https://gtdb.ecogen omic.org/) classification system. Sequence assembly was conducted using the MEGAHIT assembler (https://github.com/voutcn/megahit). Protein-coding genes were then predicted from the assembled contigs

using METAProdigal (http://prodigal.ornl.gov/). The predicted gene sequences were clustered using CD-HIT (http://www.bioinformatics.or g/cd-hit/) to construct a non-redundant gene catalog, with the longest sequence within each cluster serving as the representative gene. Gene abundance was quantified using Salmon (https://github.com/ COMBINE-lab/salmon). Finally, the gene catalog was functionally an- notated via BLASTP alignment against multiple reference databases, including the Non-Redundant Protein (NRP) Database, the Kyoto Encyclopedia of Genes and Genomes (KEGG) database, the evolutionary genealogy of genes: Non-supervised Orthologous Groups (eggNOG) database, and the Carbohydrate-Active enZymes (CAZy) database. This process yielded functional annotations, taxonomic annotations, and Clusters of Orthologous Groups (COG) functional classifications for the predicted genes.
To conduct the metagenomic assembly genomics and BLAST, the raw sequencing data were similarly processed using FastQC and Trimmo- matic for quality control, followed by taxonomic classification of the metagenomic data based on the k-mer-based Kraken2 reference genome database. The quality-filtered sequences were then assembled using MetaSPAdes v3.15.2 with a minimum contig length threshold of 200 bp. High-quality reads were mapped to the assembled genomic sequences using Bowtie2 v2.3.4.3, and genome binning was performed on the as- sembly results using MetaBAT v2.12.1. The reconstructed genomes were subsequently evaluated with CheckM v1.0.7, and taxonomic annotation of the gene clusters were performed using GTDB-Tk v1.5.0. For BLAST analysis, seven selected bins were subjected to open reading frame (ORF) prediction using MetaProdigal v2.6.3. The predicted protein- coding genes were functionally annotated with eggNOG-mapper v2.1.6. The ORFs were then aligned against the custom-built DNA-seq reference library derived from the study area, and the number of map- ped reads was calculated for each alignment.
2.6. Metaproteomics analysis
The proteins were extracted and the purity was evaluated by applying sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) (Hawley et al., 2013). The concentration of protein was determined by using BCA kit. By adding trypsin to the proteins, and peptides were obtained via enzymolysis. The peptides were then isolated and detected using the QExactive似 Plus mass spectrometry system (Thermo Scientific, USA) equipped with an ultra-high performance liquid chromatography (EASY-nLC 1000, Thermo Scientific, USA).
The metaproteomic data were processed using MaxQuant v1.6.15.0 for database searching against the custom database constructed from metagenomic assemblies of the Yangtze River Estuary. This custom database was constructed using all metagenomic contigs and contains approximately 1,590,301 non-redundant protein sequences, broadly covering the domains of bacteria and archaea. The overall functional gene annotation rate is 89.6%, and the coverage of key functional genes (annotated by the KEGG database) is 71.9%. Common contaminant databases were also included to eliminate the impact of contaminant proteins in identification results. Quality control was performed at both peptide and protein levels based on the database search results. The identified proteins were functionally annotated by using Gene Ontology (GO), Protein domain, KEGG, and COG classifications (Schiebenhoefer et al., 2020). In addition, quantitative analysis of proteins was per- formed, including quantitative distribution and repeatability analysis. Protein abundance was determined based on unique peptides and rep- resented by LFQ intensity values correcting from the original intensity value of the protein.

相关推荐