Profiling the full-length transcriptome of plasma cell mastitis via nanopore sequencing

Abstract

Introduction

In this study, we aimed to determine the transcriptomic profile of plasma cell mastitis (PCM) and elucidate its underlying mechanisms using Oxford Nanopore Technologies (ONT) sequencing.

Methods and results

By comparing the non-redundant transcripts with the known reference genome annotation, we identified 39,408 novel transcripts and 20,980 genes. Using the full-length transcriptome data, we characterized alternative splicing events, alternative polyadenylation events, and simple sequence repeat (SSR) sites, which improved our understanding of genome annotation and gene structure in plasma cell mastitis. We also predicted transcription factors and lncRNAs and screened the differentially expressed ones for further investigation. GO and KEGG enrichment analyses of differentially expressed genes (DEGs) and differentially expressed transcripts (DETs) revealed subtle distinctions between the two, with the main enrichments in immune response and intercellular interactions. Protein–protein interaction (PPI) analysis of hub proteins from DETs indicated that up-regulated genes are involved in immune response and down-regulated genes in cell adhesion. Furthermore, we assessed immune cell infiltration in plasma cell mastitis and observed various immune cells, such as B cells, T cells, and dendritic cells (DCs).

Conclusion

These preliminary findings offer novel insights into the pathogenesis of plasma cell mastitis and present promising ideas for optimizing personalized treatment approaches, warranting further exploration and follow-up studies.

Introduction

Plasma cell mastitis (PCM) is a rare breast disease characterized by abnormal proliferation of plasma cells in breast tissue [1]. Although it is clinically uncommon, studies of its epidemiological features have gradually increased in recent years. The reported prevalence of plasma cell mastitis varies widely worldwide, but most research suggests that the disease primarily affects middle-aged women [2, 3]. Clinically, the main symptoms include breast swelling, pain, and erythema [4]. Nipple discharge and abnormal lumps in the breast tissue may also occur. Diagnosis usually relies on imaging, such as breast ultrasound and breast magnetic resonance imaging, together with breast aspiration biopsy [5,6,7,8]. The exact cause of the disease remains unknown, but some studies propose that breast trauma, infection, and obstruction of the milk ducts may be associated with its development [9, 10]. Additionally, certain immune-related disorders and changes in hormone levels have been suggested as potential risk factors [11]. Further research is needed to unravel how these factors interact and how they influence the development of plasma cell mastitis.

Currently, there is limited research on plasma cell mastitis, mostly focusing on its clinical features and treatment. We aim to conduct a comprehensive investigation of the disease using advanced research techniques. In this regard, ONT technology plays a crucial role in studying plasma cell mastitis [12, 13]. Additionally, by analyzing differential mRNA profiles, hub proteins, and immune infiltrating cells, we can further understand the pathogenic pathways, key genes, and cells involved in the disease. This knowledge can help reveal the underlying disease mechanisms, monitor disease progression, and guide therapeutic decisions. The application of ONT technology provides precise information for the accurate diagnosis and individualized treatment of plasma cell mastitis. Furthermore, it accelerates research progress in this field, enhancing our understanding of the disease and paving the way for more effective therapeutic approaches.

Materials and methods

Clinical sampling

The study adhered to ethical principles and regulations and received approval from the relevant ethical committee. Breast tissue specimens were obtained from four patients diagnosed with plasma cell mastitis (named the Sm group in the subsequent study) and four normal subjects (named the Ctrl group in the subsequent study) at the Thyroid Breast Surgery Department of Henan Provincial Hospital of Traditional Chinese Medicine. Specimens were collected through surgical excision of breast tissue or puncture biopsy to ensure sample integrity and quality. The study's purpose and methods were explained in detail to the patients, and their informed consent was obtained. To protect patient privacy, the data were anonymized and encrypted. The objective of this study was to analyze the full-length transcriptome of plasma cell mastitis using ONT technology in order to obtain comprehensive gene expression information and uncover the molecular features and pathogenesis of the disease.

RNA extraction, cDNA library preparation and nanopore sequencing

Total RNA was extracted from the samples using Trizol reagent (Invitrogen, Carlsbad, CA, USA). cDNA libraries were prepared using the cDNA-PCR sequencing kit (SQK-PCS109, Oxford Nanopore Technologies, Oxford, UK). The resulting libraries were loaded onto FLO-MIN109 flow cells and sequenced on the PromethION platform (Oxford Nanopore Technologies, Oxford, UK). All procedures were conducted in accordance with the manufacturer's protocol.

Structural analysis of transcripts

In each sample, we identified alternative splicing (AS) types using AStalavista software (4.0.1). The main AS types included intron retention (IR), exon skipping (ES), alternative donor (AD), alternative acceptor (AA), and mutually exclusive exons (MEE); alternative polyadenylation (APA) events were also identified. For simple sequence repeat (SSR) analysis, we employed the MISA (MIcroSAtellite identification tool) program [14].
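
To make the SSR step more concrete, the following is a minimal Python sketch of how perfect repeats with unit lengths of 1 to 6 bases can be located with regular expressions. It is not the MISA program itself: the function names are hypothetical, and the minimum repeat counts are an assumption made for illustration, since the settings used in this study are not stated. Compound SSRs are not handled.

```python
import re

# Minimum number of repeats required for each unit length (1-6 bases).
# ASSUMPTION: illustrative thresholds only; the study does not report
# the MISA parameters that were actually used.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(unit):
    """True if the unit is not itself a repetition of a shorter motif."""
    return unit not in (unit + unit)[1:-1]


def find_ssrs(seq):
    """Return sorted (start, unit, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A unit of unit_len bases followed by at least (min_rep - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip e.g. "AA" repeats, which are really mononucleotide runs.
            if is_primitive(unit):
                repeats = len(match.group(0)) // unit_len
                hits.append((match.start(), unit, repeats))
    return sorted(hits)
```

Grouping the hits by `len(unit)` gives per-class counts analogous to the p1 to p6 categories reported for SSRs.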

ncRNA analysis

For the identification of lncRNA candidates in transcripts, we applied specific criteria, including a minimum length of 200 nucleotides and a minimum of 2 exons. To further enhance the accuracy of our predictions, we employed four methods: coding potential calculator (CPC), coding-noncoding index (CNCI), combined coding potential assessment tool (CPAT), and protein structural domain analysis (Pfam). Each of these methods was used in combination with the aforementioned criteria to predict lncRNAs in the newly identified transcripts.
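
The filtering logic described above can be sketched in Python as follows. This is an illustrative outline only: the `Transcript` class, the function name, and the input layout are hypothetical, and taking the intersection of the four tools' non-coding calls is an assumption about how the results were combined.

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    tid: str         # transcript identifier
    length: int      # transcript length in nucleotides
    exon_count: int  # number of exons


def lncrna_candidates(transcripts, noncoding_calls):
    """Return ids passing the structural filters and called non-coding by all tools.

    noncoding_calls maps a tool name (e.g. "CPC") to the set of transcript
    ids that the tool classified as non-coding.
    """
    if not noncoding_calls:
        return set()
    # Structural criteria from the text: >= 200 nt and >= 2 exons.
    passing = {
        t.tid for t in transcripts
        if t.length >= 200 and t.exon_count >= 2
    }
    # ASSUMPTION: a transcript is kept only if every tool agrees.
    consensus = set.intersection(*noncoding_calls.values())
    return passing & consensus
```

Requiring agreement among all tools trades sensitivity for specificity; a union or majority vote would be an equally simple alternative under different assumptions.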

Differential expression analysis of gene/transcript

The DESeq2 software package was used for differential expression analysis of samples with biological replicates. Differentially expressed genes were selected with Fold Change ≥ 2 and a False Discovery Rate (FDR) < 0.01, and differentially expressed transcripts with log2FC ≥ 1 and P < 0.05. To control false positives, the p-values from the initial hypothesis tests were adjusted using the Benjamini–Hochberg method. FDR was ultimately adopted as the primary index for selecting differentially expressed genes.
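
For readers unfamiliar with the Benjamini–Hochberg procedure, the sketch below shows the adjustment and the subsequent thresholding in plain Python. DESeq2 reports adjusted p-values itself, so this is only an illustration of the arithmetic; the function names and the record layout are hypothetical, and applying the fold-change cutoff symmetrically (|log2FC| ≥ 1, equivalent to a twofold change in either direction) is an assumption.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(records, min_abs_log2fc=1.0, max_fdr=0.01):
    """Classify genes as 'up' or 'down' from (gene, log2fc, pvalue) records.

    ASSUMPTION: the fold-change cutoff is applied in both directions.
    """
    fdr = benjamini_hochberg([p for _, _, p in records])
    result = {}
    for (gene, log2fc, _), q in zip(records, fdr):
        if q < max_fdr and abs(log2fc) >= min_abs_log2fc:
            result[gene] = "up" if log2fc > 0 else "down"
    return result
```

The backward pass with a running minimum is what guarantees that a smaller raw p-value never receives a larger adjusted value than a bigger one.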

Annotation of functionality and enrichment analysis

Gene functions were annotated using several databases, including NR (NCBI, http://ftp.ncbi.nih.gov/blast/db/), Swiss-Prot (UniProtKB/Swiss-Prot, SIB Swiss Institute of Bioinformatics, Expasy), Pfam (http://pfam.xfam.org/), KOG (http://www.ncbi.nlm.nih.gov/), COG (http://www.ncbi.nlm.nih.gov/COG/), eggNOG (http://eggnogdb.embl.de/), KEGG (http://www.genome.jp/kegg/), and GO (http://www.geneontology.org/). For statistical enrichment of differentially expressed genes in KEGG pathways, we used the KOBAS (2.0) software [15]. Before enrichment analysis, differential genes and transcripts were further screened: the criteria for differential genes were log2FC ≥ 2 and P < 0.01, while those for differential transcripts were log2FC ≥ 1 and P < 0.05.

Protein–protein interaction

After conducting differential expression analysis of transcripts, we categorized the differentially expressed transcripts into up-regulated and down-regulated fractions. These fractions were individually subjected to analysis in the STRING database (http://string-db.org/) to obtain predicted protein–protein interactions (PPIs) for these differentially expressed transcripts (DETs). Subsequently, the obtained PPIs were visualized and analyzed using Cytoscape.
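
One common way to nominate hub proteins from an interaction network is to rank nodes by their number of interaction partners (degree). The sketch below illustrates that idea in Python on an edge list such as one exported from STRING. Degree ranking is an assumption made here for illustration; the text does not state which Cytoscape analysis was used, and the function name and protein labels are hypothetical placeholders.

```python
from collections import defaultdict


def rank_hubs(edges, top_n=3):
    """Rank proteins by degree in an undirected interaction network.

    edges is an iterable of (protein_a, protein_b) pairs. Duplicate and
    reversed pairs are counted once; self-interactions are ignored.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue
        neighbors[a].add(b)
        neighbors[b].add(a)
    # Sort by descending degree, breaking ties alphabetically.
    ranked = sorted(neighbors, key=lambda n: (-len(neighbors[n]), n))
    return [(n, len(neighbors[n])) for n in ranked[:top_n]]


if __name__ == "__main__":
    # Placeholder labels, not real interaction data.
    demo = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3")]
    print(rank_hubs(demo))
```

Running the same function separately on the up-regulated and down-regulated networks mirrors the two-fraction design described above.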

Immune cell abundance distribution analysis

To assess immune cell abundance in the sequenced samples, we used xCell, a tool available on the TIMER2 platform (http://timer.cistrome.org/). xCell combines gene set enrichment analysis with deconvolution, enabling the evaluation of immune cell enrichment in the samples.

Quantitative real-time PCR (qRT-PCR)

Total RNA was extracted using Trizol reagent (15596026, Invitrogen, USA) and reverse-transcribed into cDNA using the Surescript™ First-Strand cDNA Synthesis Kit (QP056, GeneCopoeia, USA). Quantitative PCR was performed using BlazeTaq™ SYBR® Green qPCR Mix 2.0 (QP031, GeneCopoeia, USA) on a fluorescent quantitative PCR instrument (4485692, ABI, USA) under the following conditions: pre-denaturation at 95 °C for 10 min, followed by 40 cycles of denaturation at 95 °C for 10 s, annealing at 60 °C for 20 s, and extension at 72 °C for 15 s. The primer sequences used for qPCR are provided in Supplementary Table 6. Relative expression was calculated using the 2^−ΔΔCt method.
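
As a worked illustration of the ΔΔCt calculation, the Python sketch below computes the relative expression of a target gene in a case group versus a control group, normalized to a reference gene. The function name and argument layout are hypothetical, the reference gene is not specified here, and averaging Ct values per group before subtraction is a simplifying assumption.

```python
from statistics import mean


def relative_expression(target_case, ref_case, target_ctrl, ref_ctrl):
    """Return the fold change 2 ** (-ddCt) of a target gene, case vs control.

    Each argument is a list of Ct values. ASSUMPTION: Ct values are
    averaged within each group before the deltas are taken.
    """
    # dCt = Ct(target) - Ct(reference), computed per group.
    dct_case = mean(target_case) - mean(ref_case)
    dct_ctrl = mean(target_ctrl) - mean(ref_ctrl)
    # ddCt = dCt(case) - dCt(control).
    ddct = dct_case - dct_ctrl
    return 2 ** (-ddct)
```

For example, a target gene whose normalized Ct is three cycles lower in the case group than in the control group has ΔΔCt = −3 and therefore a relative expression of 2^3 = 8.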

Statistical analysis

T-tests were used for between-group analyses to compare the differences between the Sm group and the Ctrl group. R software (3.6.1) was used to perform statistical tests and visualization analyses, and P < 0.05 was considered statistically significant.

Results and discussion

ONT sequencing overview

Eight transcriptome libraries were constructed, and each sample's sequencing output yielded 3.94 GB of clean data. After filtering out short fragments and low-quality reads, we obtained 24,845,297 clean reads from the Sm group and 29,333,579 clean reads from the Ctrl group (Table S1). After rRNA removal, 20,721,646 clean reads remained in the Sm group and 21,772,416 in the Ctrl group (Table S2). The number of full-length sequences per sample ranged from 2,839,350 to 7,907,654 (Table S2). Sequences from all samples were aligned to the reference, achieving mapping efficiencies above 70%, which was sufficient for subsequent analysis (Table 1). For each sample, the consensus sequences were aligned to the reference genome, and after redundancy removal we obtained 74,033 non-redundant transcript sequences. Comparing the non-redundant transcripts from all samples with the known annotation of the reference genome, we identified 39,408 novel transcripts. These novel transcripts were then compared with nine databases, including NR, Pfam, KOG, COG, Swiss-Prot, KEGG, GO, and Swiss-Prot TrEMBL. This resulted in the annotation of a total of 20,980 isoforms, with the number of annotations for each database shown in Table 2.

Table 1 Sequencing data and reference genome mapping information
Table 2 Information of function annotation

Coding sequence of novel genes, transcription factor and lncRNAs prediction

Coding sequences (CDSs) and their corresponding amino acid sequences were predicted for the novel transcripts using TransDecoder (v.3.0.0). A total of 14,783 ORFs were obtained, including 6,049 complete ORFs. The length distribution of the complete ORFs showed that most of them ranged from 100 to 1300 amino acids (Fig. 1A). We also predicted the transcription factors (TFs) among the novel transcripts. A total of 5,409 TFs were obtained, and the different types of transcription factors are illustrated in Fig. 1B. To identify long non-coding RNAs (lncRNAs) within the transcripts, we applied four methods: CPC analysis, CNCI analysis, CPAT analysis, and Pfam protein structural domain analysis. The combined results from these methods predicted a total of 1,808 lncRNA transcripts (Fig. 1C). The lncRNAs were categorized as lincRNAs, antisense lncRNAs, intronic lncRNAs, and sense lncRNAs, as shown in Fig. 1D.

Fig. 1

Coding sequence of novel genes, transcription factor and lncRNAs prediction. A CDS length distribution of the complete ORF. B Transcription factor type distribution. C Venn diagram of the number of lncRNA transcripts analyzed by CPC, CNCI, CPAT, and Pfam. D lncRNA classification chart

RNA post-transcriptional editing events in plasma cell mastitis (alternative splicing, alternative polyadenylation and SSR)

The correlation of transcriptome profiles between the specimen and control groups was assessed (Figure S1). In the ONT data, a total of five types of alternative splicing (AS) events were identified, including exon skipping (ES), variable 3’ splice site (Alt.3’), intron retention (IR), variable 5’ splice site (Alt.5’), and variable exon events (Fig. 2A). The proportion of AS events varied among samples. ES had the highest proportion, with an average of 63.61% in the Ctrl group and 60.03% in the Sm group. Two types of AS events, intron retention and variable exons, were relatively more frequent in the Sm group than in the Ctrl group (Fig. 2B, Figure S2). To further analyze the transcripts, we screened transcripts longer than 500 bp and conducted simple sequence repeat (SSR) analysis using the MISA software. The SSR repeat units ranged from 1 to 6 bases. Among the SSRs, the most common repeat unit was p1 (23,331), followed by p3 (7,811), p2 (5,216), p4 (793), and p5 (196), with p6 (72) being the least common. Additionally, we identified 3,789 composite SSRs (c) and 187 composite SSRs with overlapping positions (c*) (Fig. 2C, Table S3).

Fig. 2

RNA post-transcriptional editing events in plasma cell mastitis. A Type of alternative splicing, (A) Exon skipping; (B) Variable 3’ splice site; (C) Mutually exclusive exon; (D) Variable 5’ splice site; (E) Intron retention. B Statistical chart of the number of alternative splicing. C Distribution of SSR type

Differential mRNA profiles in plasma cell mastitis genes and transcripts

In this study, we conducted an analysis of differential gene and transcript expression in the full-length transcriptome of plasma cell mastitis. Volcano plots were used to visualize the differences, revealing 2,172 differential genes, of which 1,203 were up-regulated and 969 were down-regulated, as well as 2,081 differential transcripts, with 1,322 up-regulated and 759 down-regulated (Fig. 3A, B, C and D, Table S4). A Venn plot demonstrated three genes with up-regulated gene expression and variable transcript expression, namely CD44, GLUL, and ONT.11364 (Fig. 3E). Additionally, we analyzed genes without differential transcript expression, finding 17 statistically significant genes (Table S5).

Fig. 3

Differential mRNA profiles of plasma cell mastitis genes and transcripts. A Volcano map of differentially expressed genes. B Histogram of differentially expressed genes. C Volcano map of differentially expressed transcripts. D Histogram of differentially expressed transcripts. E Venn diagram of differentially expressed genes vs. differentially expressed transcripts. F GO enrichment map of differentially expressed genes. G GO enrichment map of differentially expressed transcripts. H KEGG enrichment map of differentially expressed genes. I KEGG enrichment map of differentially expressed transcripts

To further understand the functional differences between differentially expressed genes (DEGs) and differentially expressed transcripts (DETs), we conducted Gene Ontology (GO) enrichment analysis on the up- and down-regulated subsets of DEGs and DETs. Both DEGs and DETs showed relatively independent biological functions. The down-regulated genes in DEGs and DETs were enriched in processes related to cell junction assembly and negative regulation of cellular component movement. The up-regulated genes in DEGs were mainly enriched in lymphocyte differentiation and regulation of hemopoiesis, while the up-regulated genes in DETs were primarily enriched in neutrophil degranulation and neutrophil activation involved in the immune response (Fig. 3F and G). Notably, the up-regulated genes in both DEGs and DETs were significantly enriched in immune-related biological pathways.

Subsequently, we performed KEGG (Kyoto Encyclopedia of Genes and Genomes) enrichment analysis, revealing no overlap in pathway enrichment between up- and down-regulated genes in DEGs, with DETs being enriched only in gene up-regulated pathways. The down-regulated genes in DEGs were predominantly enriched in pathways related to tight junctions, Rap1, and ErbB signaling. The common pathways enriched in up-regulated genes in both DEGs and DETs included the chemokine signaling pathway, human cytomegalovirus infection, cytokine signaling pathway, and cytokine-cytokine receptor interaction (Fig. 3H and I). These findings provide valuable insights into the functional differences between DEGs and DETs and highlight the significant involvement of immune-related pathways in plasma cell mastitis.

Interaction analysis of differentially expressed transcript protein networks and identification of hub genes

To gain deeper insight into the pivotal genes among the differentially expressed transcripts closely associated with the disease, we analyzed protein interaction networks. In Fig. 4A, the node proteins within the up-regulated interaction network included SPI1, CTSS, PTPRC, CD14, CCR5, and others, all of which are closely related to the immune system [16,17,18,19]. This finding is consistent with the results of the GO and KEGG enrichment analyses, further supporting the significance of immune-related pathways in the disease. Figure 4B displays the node proteins within the down-regulated network, such as CDH1, NRXN1, ESR1, NRG1, ERBB4, and more. Among them, CDH1 down-regulation can weaken cell adhesion strength in tissues, resulting in increased cell motility [20]; NRXN1 is also associated with cell adhesion [21]; ESR1 (estrogen receptor 1) has been extensively studied in breast cancer but less so in plasma cell mastitis [22], and the relationship between the two remains unclear. The identification of these pivotal genes suggests a reduction in cell adhesion strength and an enhancement of immune responses in plasma cell mastitis.

Fig. 4

Protein–protein interaction analysis. A Protein interaction network of up-regulated differential genes. B Protein interaction network of down-regulated differential genes

xCell analysis identifies infiltrating cells in the immune microenvironment of plasma cell mastitis

The xCell algorithm analyzes gene expression data by calculating enrichment scores for 64 cell types in each tissue sample, providing an estimate of the relative abundance of immune cells in the tissue. In our study, we applied xCell to the eight samples from the Sm and Ctrl groups. The 64 cell types are categorized into five main groups based on their origin: Epithelial, Hematopoietic Stem Cells (HSC), Lymphoid, Myeloid, and Stromal Cells (Stroma). Hematopoietic stem cells (HSC) and stromal cells (Stroma) did not exhibit significant differences between the Sm and Ctrl groups (Fig. 5A and B). However, notable differences were observed in lymphoid cells (Lymphoid), myeloid cells (Myeloid), and epithelial cells (Epithelial). Specifically, among lymphoid cells, 11 types of immune cells, including B-cells, Memory B-cells, CD4+ memory T-cells, CD4+ naive T-cells, Tregs, and others, were significantly higher in the Sm group than in the Ctrl group (Fig. 5C). Among myeloid cells, nine types of immune cells, including DCs, Macrophages, Monocytes, Mast cells, and more, were significantly higher in the Sm group (Fig. 5D). Conversely, epithelial cell scores were higher in the Ctrl group than in the Sm group (Fig. 5E). The xCell analysis revealed differences in immune cell patterns between plasma cell mastitis and normal tissue, indicating potential alterations in the immune microenvironment associated with the disease.

Fig. 5

Infiltrating cells in the immune microenvironment of plasma cell mastitis. A HSC scores comparison. B Stroma score comparison. C Lymphoid score comparison. D myeloid score comparison. E Epithelial score comparison

Validation results of qRT-PCR

In full-length transcriptome analysis, the reliability and accuracy of the data are commonly validated by quantitative PCR (qPCR). We selected 20 significantly differential genes from the protein interaction networks for qPCR. The qPCR results showed trends consistent with the sequencing-based expression of these genes (Fig. 6A). Specifically, the fold changes of the up-regulated genes SPI1, CTSS, and PTPRC and of the down-regulated genes CDH1, NRXN1, and ESR1 did not differ significantly between qPCR and ONT sequencing (Fig. 6B and C). The qPCR results for the other genes (Figure S3) likewise showed no significant differences, further supporting the reliability and validity of our ONT sequencing data.

Fig. 6

Differential gene expression was verified by qPCR. A Expression profiles of 20 differential genes. B qRT-PCR results for the up-regulated genes SPI1, CTSS, and PTPRC. C qRT-PCR results for the down-regulated genes CDH1, NRXN1, and ESR1

Discussion

Plasma cell mastitis (PCM), also known as ductal dilatation, is a persistent disease with an exceptionally high recurrence rate, even after complete mastectomy. Surprisingly, its incidence has been increasing, even among children and men [23, 24]. Several risk factors are now associated with the development of PCM, including high prolactin levels, smoking, and obesity [25]. Currently, the main treatment approach involves surgical excision, sometimes complemented by traditional Chinese medicine [26, 27]. Despite extensive research, the etiology and pathogenesis of PCM remain elusive. However, it is widely accepted as an autoimmune disease. Previous studies have unveiled the crucial roles of immune and intracellular inflammatory mechanisms in the disease's development, highlighting the significance of identifying specific and sensitive biomarkers to enhance PCM differentiation and diagnosis. Moreover, a systematic investigation of PCM's etiology and a comprehensive understanding of its pathogenesis are crucial. Such efforts will pave the way for accurate and standardized treatment regimens, tailored for targeted therapy of PCM.

lncRNAs are a class of endogenous non-coding RNA molecules that play diverse roles in the pathogenic processes of inflammatory diseases. However, there have been limited reports on the role of lncRNAs in PCM, with most studies focusing on bovine mastitis. In our investigation, we predicted 1808 lncRNAs, among which 130 were found to exhibit significant differences. Additionally, we identified 213 transcription factors (TFs) with significant differences. These differentially expressed lncRNAs and TFs may exert crucial immunoregulatory functions in PCM and influence immune gene expression, warranting further detailed exploration. Notably, due to the presence of post-transcriptional editing events and transcription factors, transcripts may differ from gene expression. We observed a total of 17 genes with no difference in gene expression and 8 known genes with differences in transcripts. These known genes have been linked to various functions in previous studies. For instance, TSC22D2 overexpression was found to inhibit cell growth in colorectal cancer [28], CYBRD1 was associated with immunity [29], ARRDC3 and CAV1 were associated with cell migration and invasion [30, 31], OXR1 and FBXO32 were linked to oxidative stress [32, 33], TPM1 was related to pro-inflammation [34], and KRT18P11’s role is yet to be explored in existing studies.

Moreover, the functional enrichment analysis of genes and transcripts revealed subtle differences, but both were primarily associated with cell–cell interactions and immune responses. The relationship between cell junctions and plasma cell mastitis has been previously reported [35], and the differentiation of lymphocytes represents an inflammatory response [36]. Notably, cell-secreted chemokines have been considered one of the reasons for the development of this disease [37]. In the KEGG enrichment analysis, pathways such as Rap1 and ErbB signaling have been reported in bovine mastitis but are less explored in plasma cell mastitis [38, 39]. Surprisingly, we observed enrichment of the human cytomegalovirus (HCMV) infection pathway in the KEGG analyses of both differential genes and differential transcripts. Additionally, Kaposi sarcoma-associated herpesvirus infection and cytokine-cytokine receptor interaction were enriched in the KEGG analysis of differential genes, and Human immunodeficiency virus 1 infection was enriched in the KEGG analysis of differential transcripts. As plasma cell mastitis is a non-bacterial disease, we examined these enrichments further and found that they are primarily related to various immune cell responses in the human body. Combined with the GO enrichment results, we posit that these virus-associated enrichments reflect the immune cell responses present in plasma cell mastitis and do not necessarily imply a direct association between plasma cell mastitis and viral infection.

The hub proteins identified in the Protein–Protein Interaction (PPI) analysis are primarily associated with cellular immunity, motility, and adhesion. Among these hub proteins, the role of ESR1 in PCM remains unclear. However, recent single-cell transcriptome studies have revealed that this estrogen receptor is specifically expressed in LumHR cells, a type of mammary epithelial cell, in addition to Basal and LumSec cells [40]. LumSec-HLA cells, in turn, play a role in immune cell signaling. The decrease in ESR1 estrogen receptor expression indicates a reduction in epithelial cells, which was further supported by the immune cell infiltration analysis. In this analysis, we observed significantly lower levels of epithelial cells in the Sm group compared to the Ctrl group, which may be a triggering factor for PCM. Previous studies have demonstrated that the pathological basis of ductal dilatation in PCM involves inflammation resulting from ductal epithelial cell destruction and detachment, along with the obstruction of luminal stimulation by keratinized debris and lipid secretion [41]. Simultaneously, the destruction of mammary epithelial cells leads to the absence of the natural immune barrier, making patients susceptible to autoimmune diseases and pathogen invasions. As the disease progresses, individuals with PCM may also be at higher risk of pathogen infections due to compromised immunity.

The assumption that plasma cell mastitis involves a marked infiltration of plasma cells was not strongly supported by the xCell analysis. Although plasma cells were more abundant in the Sm group, the difference was not statistically significant. However, plasmacytoid dendritic cells (pDCs) showed a more pronounced enrichment, which warrants further validation. Dendritic cells (DCs) are divided into two main subclasses: plasmacytoid (pDCs) and conventional (cDCs). pDCs are known for their specialization in producing large amounts of type I interferon (IFN-α/β), which activates macrophages and enhances cytokine production in response to inflammation [42, 43]. On the other hand, cDCs specialize in antigen presentation to T cells. cDC2 cells interact with B cells, promote plasma cell development, and activate CD4+ T cells. Our study observed extensive immune cell infiltration in PCM, and the interactions among these cells require further investigation. Lastly, we verified the expression profiles of 20 hub genes by qPCR, supporting the suitability of the ONT method for PCM analysis.

In conclusion, our study employed nanopore sequencing as a comprehensive analytical approach to explore the molecular regulatory mechanisms of plasma cell mastitis (PCM). We investigated post-transcriptional editing events, transcription factors, lncRNAs, differential mRNA profiles, hub proteins, and immune-infiltrating cells in PCM. These findings offer novel insights into the pathogenesis of PCM and provide a theoretical foundation for the development of new diagnostic and treatment strategies for the disease.

Data availability

Data are available from NCBI (BioProject: PRJNA1141192).

Abbreviations

AA: Alternative acceptor
AD: Alternative donor
APA: Alternative polyadenylation
AS: Alternative splicing
cDNA: Complementary DNA
CNCI: Coding-noncoding index
CPAT: Coding potential assessment tool
CPC: Coding potential calculator
DEGs: Differentially expressed genes
DETs: Differentially expressed transcripts
ES: Exon skipping
FDR: False discovery rate
GO: Gene ontology
HSC: Hematopoietic stem cells
IR: Intron retention
KEGG: Kyoto Encyclopedia of Genes and Genomes
lncRNAs: Long non-coding RNAs
MEE: Mutually exclusive exons
ONT: Oxford Nanopore Technologies (nanopore sequencing technology)
ORFs: Open reading frames
PCM: Plasma cell mastitis
Pfam: Protein families database (protein structural domain analysis)
qPCR: Quantitative polymerase chain reaction
SSR: Simple sequence repeat
TFs: Transcription factors

References

  1. Yu JJ, Bao SL, Yu SL, Zhang DQ, Loo WT, Chow LW, et al. Mouse model of plasma cell mastitis. J Transl Med. 2012;10 Suppl 1(Suppl 1):S11.

  2. Dixon JM, Anderson TJ, Lumsden AB, Elton RA, Roberts MM, Forrest AP. Mammary duct ectasia. Br J Surg. 1983;70(10):601–3.

  3. Hamwi MW, Winters R. Mammary Duct Ectasia. In: StatPearls. Treasure Island (FL): StatPearls Publishing; 2023.

  4. Xing M, Zhang S, Zha X, Zhang J. Current understanding and management of plasma cell mastitis: can we benefit from what we know? Breast Care (Basel). 2022;17(3):321–9.

  5. Hu J, Huang X. Combining ultrasonography and mammography to improve diagnostic accuracy of plasma cell mastitis. J Xray Sci Technol. 2020;28(3):555–61.

  6. Ortiz-Mendoza CM, Sanchez NAA, Dircio AC. Fine-needle aspiration cytology to identify a rare mimicker of breast cancer: plasma cell mastitis. Rev Bras Ginecol Obstet. 2018;40(8):491–3.

  7. Zheng Y, Wang L, Han X, Shen L, Ling C, Qian Z, et al. Combining contrast-enhanced ultrasound and blood cell analysis to improve diagnostic accuracy of plasma cell mastitis. Exp Biol Med (Maywood). 2022;247(2):97–105.

  8. Zhu YC, Zhang Y, Deng SH, Jiang Q, Shi XR, Feng LL. Evaluation of plasma cell mastitis with superb microvascular imaging. Clin Hemorheol Microcirc. 2019;72(2):129–38.

  9. Liu Y, Zhang J, Zhou YH, Zhang HM, Wang K, Ren Y, et al. Activation of the IL-6/JAK2/STAT3 pathway induces plasma cell mastitis in mice. Cytokine. 2018;110:150–8.

  10. Zhang HJ, Ding PP, Zhang XS, Wang XC, Sun DW, Bu QA, et al. MAC mediates mammary duct epithelial cell injury in plasma cell mastitis and granulomatous mastitis. Int Immunopharmacol. 2022;113(Pt A):109303.

  11. Liu Y, Zhang J, Zhou YH, Jiang YN, Zhang W, Tang XJ, et al. IL-6/STAT3 signaling pathway is activated in plasma cell mastitis. Int J Clin Exp Pathol. 2015;8(10):12541–8.

  12. Lin J, Guan L, Ge L, Liu G, Bai Y, Liu X. Nanopore-based full-length transcriptome sequencing of Muscovy duck (Cairina moschata) ovary. Poult Sci. 2021;100(8):101246.

  13. Xin H, He X, Li J, Guan X, Liu X, Wang Y, et al. Profiling of the full-length transcriptome in abdominal aortic aneurysm using nanopore-based direct RNA sequencing. Open Biol. 2022;12(2):210172.

  14. Thiel T, Michalek W, Varshney RK, Graner A. Exploiting EST databases for the development and characterization of gene-derived SSR-markers in barley (Hordeum vulgare L.). Theor Appl Genet. 2003;106(3):411–22.

  15. Xie C, Mao X, Huang J, Ding Y, Wu J, Dong S, et al. KOBAS 2.0: a web server for annotation and identification of enriched pathways and diseases. Nucleic Acids Res. 2011;39(Web Server issue):W316–22.

  16. Bararia D, Hildebrand JA, Stolz S, Haebe S, Alig S, Trevisani CP, et al. Cathepsin S alterations induce a tumor-promoting immune microenvironment in follicular lymphoma. Cell Rep. 2020;31(5):107522.

  17. Hemmatazad H, Berger MD. CCR5 is a potential therapeutic target for cancer. Expert Opin Ther Targets. 2021;25(4):311–27.

  18. Wei J, Fang D, Zhou W. CCR2 and PTPRC are regulators of tumor microenvironment and potential prognostic biomarkers of lung adenocarcinoma. Ann Transl Med. 2021;9(18):1419.

  19. Wu Z, Zhang Z, Lei Z, Lei P. CD14: Biology and role in the pathogenesis of disease. Cytokine Growth Factor Rev. 2019;48:24–31.

  20. Zhang C, Lv GQ, Cui LF, Guo CC, Liu QE. MicroRNA-572 targets CDH1 to promote metastasis of Wilms’ tumor. Eur Rev Med Pharmacol Sci. 2019;23(9):3709–17.

  21. Reissner C, Runkel F, Missler M. Neurexins. Genome Biol. 2013;14(9):213.

  22. Dustin D, Gu G, Fuqua SAW. ESR1 mutations in breast cancer. Cancer. 2019;125(21):3714–28.

  23. McHoney M, Munro F, Mackinlay G. Mammary duct ectasia in children: report of a short series and review of the literature. Early Hum Dev. 2011;87(8):527–30.

  24. Palmieri A, D’Orazi V, Martino G, Frusone F, Crocetti D, Amabile MI, et al. Plasma cell mastitis in men: a single-center experience and review of the literature. In Vivo. 2016;30(6):727–32.

  25. Jiao Y, Chang K, Jiang Y, Zhang J. Identification of periductal mastitis and granulomatous lobular mastitis: a literature review. Ann Transl Med. 2023;11(3):158.

  26. Liu C, Yu H, Chen G, Yang Q, Wang Z, Niu N, et al. An herbal drug combination identified by knowledge graph alleviates the clinical symptoms of plasma cell mastitis patients: a nonrandomized controlled trial. Elife. 2023;12:e84414.

  27. Zhang J, Xu J, Zhang J, Ren Y. Chinese herbal compound combined with western medicine therapy in the treatment of plasma cell mastitis: a protocol for systematic review and meta-analysis. Medicine (Baltimore). 2020;99(44):e22858.

  28. Liang F, Li Q, Li X, Li Z, Gong Z, Deng H, et al. TSC22D2 interacts with PKM2 and inhibits cell growth in colorectal cancer. Int J Oncol. 2016;49(3):1046–56.

  29. Qing M, Zhou J, Chen W, Cheng L. Highly expressed CYBRD1 associated with glioma recurrence regulates the immune response of glioma cells to interferon. Evid Based Complement Alternat Med. 2021;2021:2793222.

  30. Shi F, Chen X, Wang Y, Xie Y, Zhong J, Su K, et al. HOTAIR/miR-203/CAV1 crosstalk influences proliferation, migration, and invasion in the breast cancer cell. Int J Mol Sci. 2022;23(19):11755.

  31. Wedegaertner H, Pan WA, Gonzalez CC, Gonzalez DJ, Trejo J. The alpha-Arrestin ARRDC3 is an emerging multifunctional adaptor protein in cancer. Antioxid Redox Signal. 2022;36(13–15):1066–79.

  32. Al-Yacoub N, Colak D, Mahmoud SA, Hammonds M, Muhammed K, Al-Harazi O, et al. Mutation in FBXO32 causes dilated cardiomyopathy through up-regulation of ER-stress mediated apoptosis. Commun Biol. 2021;4(1):884.

  33. Volkert MR, Crowley DJ. Preventing neurodegeneration by controlling oxidative stress: the role of OXR1. Front Neurosci. 2020;14:611904.

  34. Li R, Zhang J, Wang Q, Cheng M, Lin B. TPM1 mediates inflammation downstream of TREM2 via the PKA/CREB signaling pathway. J Neuroinflammation. 2022;19(1):257.

  35. Kerro Dego O, van Dijk JE, Nederbragt H. Factors involved in the early pathogenesis of bovine Staphylococcus aureus mastitis with emphasis on bacterial adhesion and invasion. A review. Vet Q. 2002;24(4):181–98.

  36. Moro-Garcia MA, Mayo JC, Sainz RM, Alonso-Arias R. Influence of inflammation in the process of T lymphocyte differentiation: proliferative, metabolic, and oxidative changes. Front Immunol. 2018;9:339.

  37. Johnzon CF, Artursson K, Soderlund R, Guss B, Ronnberg E, Pejler G. Mastitis pathogens with high virulence in a mouse model produce a distinct cytokine profile in vivo. Front Immunol. 2016;7:368.

  38. Jia L, Wang J, Luoreng Z, Wang X, Wei D, Yang J, et al. Progress in expression pattern and molecular regulation mechanism of LncRNA in bovine mastitis. Animals (Basel). 2022;12(9):1059.

  39. Lin C, Zhu Y, Hao Z, Xu H, Li T, Yang J, et al. Genome-wide analysis of LncRNA in bovine mammary epithelial cell injuries induced by Escherichia coli and Staphylococcus aureus. Int J Mol Sci. 2021;22(18):9719.

  40. Kumar T, Nee K, Wei R, He S, Nguyen QH, Bai S, et al. A spatially resolved single-cell genomic atlas of the adult human breast. Nature. 2023;620(7972):181–91.

  41. Liu L, Zhou F, Wang P, Yu L, Ma Z, Li Y, et al. Periductal mastitis: an inflammatory disease related to bacterial infection and consequent immune responses? Mediators Inflamm. 2017;2017:5309081.

  42. Alculumbre S, Raieli S, Hoffmann C, Chelbi R, Danlos FX, Soumelis V. Plasmacytoid pre-dendritic cells (pDC): from molecular pathways to function and disease association. Semin Cell Dev Biol. 2019;86:24–35.

  43. Locati M, Curtale G, Mantovani A. Diversity, mechanisms, and significance of macrophage plasticity. Annu Rev Pathol. 2020;15:123–47.

Acknowledgements

Not applicable.

Funding

This work was financially supported by the Special Subject of Scientific Research on Traditional Chinese Medicine in Henan Province (no. 2021ZY2203 and no. 2022ZY1080).

Author information

Contributions

S.L. and H.C. completed the experimental design. Y.F., C.W., L.A., and Q.X. collected human breast tissue. W.C., S.Z., and T.T. completed the data analysis and drafted the manuscript; S.L., B.Z., M.Y., and L.Z. then revised it. All authors approved the manuscript before submission.

Corresponding author

Correspondence to Su Li.

Ethics declarations

Ethics approval and consent to participate

This study protocol was reviewed and approved by the Ethics Committee of Henan Provincial Hospital of Traditional Chinese Medicine (The Second Affiliated Hospital of Henan University of Traditional Chinese Medicine), approval number HNSZYY-20200311. The study's purpose and methods were explained in detail to the patients, and their informed consent was obtained.

Consent for publication

All authors agree to publication.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Li, S., Chen, H., Fan, Y. et al. Profiling the full-length transcriptome of plasma cell mastitis via nanopore sequencing. BMC Genom Data 26, 29 (2025). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12863-025-01312-7

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12863-025-01312-7

Keywords