Results
LC-MS/MS analysis of nine fetal discs identified 1,316 proteins. Proteins and their peptide sequences were verified against the manually curated SWISS-PROT database to remove unreviewed protein entries. Uncharacterized, putative, and fragment proteins were excluded from further analysis and are presented separately in
Supplemental Table 1.
1. Frequency of protein detection
Approximately 20% of the proteins identified in the fetal disc were also found in adult disc samples analyzed in our earlier studies. Extracellular matrix and structural constituents conferring tensile strength to the discs, such as collagens, proteoglycans, glycoproteins, and annexins, were present in all nine samples. Other frequently detected proteins included small leucine-rich proteins (SLRPs), clusterin, matrilin, ribosomal proteins, histones, globin family proteins, tubulins, keratins, and peroxiredoxins, as shown in
Fig. 3 and
Supplemental Table 1. Approximately 173 proteins were found in four to seven samples, whereas approximately 60% of proteins were found in only one sample; these low-abundance proteins mainly participate in adenosine triphosphate hydrolysis and in metabolic, cellular, catabolic, and nucleotide-binding processes.
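The detection-frequency tally described above can be sketched as follows. This is a minimal illustration, assuming each sample is available as a list of protein identifiers; the function names and the toy input are hypothetical and are not the study data or the authors' pipeline.

```python
from collections import Counter


def detection_frequency(samples):
    """Count, for each protein, the number of samples in which it was detected."""
    counts = Counter()
    for proteins in samples:
        # set() ensures a protein is counted at most once per sample
        counts.update(set(proteins))
    return counts


def summarize(counts, n_samples):
    """Bin proteins by how many samples they were detected in."""
    total = len(counts)
    single = sum(1 for c in counts.values() if c == 1)
    return {
        "total": total,
        "all_samples": sum(1 for c in counts.values() if c == n_samples),
        "four_or_more": sum(1 for c in counts.values() if c >= 4),
        "single_sample": single,
        "single_sample_pct": 100.0 * single / total if total else 0.0,
    }


# Toy input: three hypothetical samples, not the nine fetal discs of the study.
toy_samples = [
    ["COL2A1", "ACAN", "BGN", "RPS16"],
    ["COL2A1", "ACAN", "DCN"],
    ["COL2A1", "ACAN", "BGN", "TUBB"],
]
toy_counts = detection_frequency(toy_samples)
toy_summary = summarize(toy_counts, n_samples=len(toy_samples))
```

Applied to nine per-sample lists, the same summary would give the numbers of proteins seen in all samples, in four or more samples, and in a single sample.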
To profile the matrisome of fetal NP discs, proteins were filtered against the matrisome database (
http://matrisome.org). Quantitative analysis was performed based on the peptide spectral match counts of the total identified proteins. It revealed that matrisomal proteins accounted for 48% of total proteins and other proteins for 52%.
Matrisomal proteins were further divided into two major categories: core matrisome (63%) and matrisome-associated proteins (37%). The core matrisome was composed of proteoglycans (20%), collagens (20%), and glycoproteins (56%). Matrisome-associated proteins were subcategorized into matrisome-affiliated proteins (67%), regulators (30%), and secreted factors (3%).
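One way to read the quantitative step above is as a share of peptide spectral matches (PSMs) per category. The sketch below assumes that interpretation; the exact normalization used in the study is not stated, and the function name, PSM counts, and category labels are hypothetical toy values.

```python
def psm_share(psm_counts, categories):
    """Return each category's percentage of the total PSM count.

    psm_counts: dict mapping protein -> PSM count
    categories: dict mapping protein -> category label; unlisted proteins
                are grouped under "other"
    """
    totals = {}
    for protein, psms in psm_counts.items():
        cat = categories.get(protein, "other")
        totals[cat] = totals.get(cat, 0) + psms
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {cat: 100.0 * n / grand_total for cat, n in totals.items()}


# Toy values for illustration only (not measured PSM counts).
toy_psms = {"COL14A1": 30, "BGN": 10, "ANXA2": 10, "RPS16": 50}
toy_categories = {
    "COL14A1": "core matrisome",
    "BGN": "core matrisome",
    "ANXA2": "matrisome-associated",
}
toy_shares = psm_share(toy_psms, toy_categories)
```

Calling the same function a second time with only the matrisomal proteins and finer labels (collagens, proteoglycans, glycoproteins) would give the subcategory percentages.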
2. Gene ontology
Significant proteins were considered for the GO enrichment analysis. The distributions of biological processes, molecular functions, and cellular components were obtained using the clusterProfiler package. The distributions of biological processes and cellular components across the IVD are shown in
Fig. 4 along with their
p-values.
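GO over-representation p-values of this kind are commonly computed with a one-sided hypergeometric test. The sketch below illustrates that calculation in plain Python; it is not the clusterProfiler code, and the function name and example numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """One-sided over-representation p-value, P(X >= k).

    N: number of annotated proteins in the background
    K: background proteins annotated to the GO term
    n: proteins in the query list
    k: query proteins annotated to the GO term
    """
    upper = min(n, K)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible draws contribute nothing to the sum.
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)


# Toy example: background of 20 proteins, 5 annotated to a term,
# query of 4 proteins, 2 of which carry the term.
p_toy = hypergeom_enrichment_p(k=2, n=4, K=5, N=20)
```

In practice, the p-values from many GO terms are also adjusted for multiple testing before being reported.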
3. Cluster analysis
A total of 247 high-confidence proteins were considered for further bioinformatics analysis. Clustering using the DAVID Gene Functional Classification Tool identified 10 clusters of highly related proteins (
Fig. 5). The topmost cluster had an enrichment score of 17.75 with nine types of collagens. Among the different types of collagens, collagen type XIV alpha 1 chain (COL14A1) was highly expressed, followed by collagen type XI alpha 2 chain, collagen type VI alpha 3 chain, collagen type IX alpha 1 chain, and collagen type VI alpha 1 chain.
The second significant cluster consisted of ribosomal proteins with an enrichment score of 17.131. Among ribosomal proteins, ribosomal protein S16 (RPS16) was expressed nearly 1.5-fold higher than other proteins, followed by ribosomal protein S18 (RPS18), ribosomal protein S7, ribosomal protein S9, ribosomal protein L9, ribosomal protein L23a, and ribosomal protein L27a.
The third most abundant cluster, with an enrichment score of 14.77, consisted of SLRPs. It was composed of chondroadherin (CHAD), epiphycan (EPYC), lumican (LUM), biglycan (BGN), fibromodulin (FMOD), proline/arginine-rich end leucine-rich repeat protein (PRELP), chondroadherin-like (CHADL), decorin (DCN), and osteoglycin (OGN). BGN, DCN, PRELP, FMOD, and OGN were highly expressed in fetal samples, followed by LUM, CHAD, EPYC, and CHADL.
Interestingly, the fourth cluster, with an enrichment score of 11.740, comprised proteins that play a major role in maintaining extracellular matrix integrity and forming filamentous networks in the extracellular matrix of the disc. It consisted of matrilin 1 (MATN1), matrilin 2, matrilin 3 (MATN3), matrilin 4 (MATN4), thrombospondin 1 (THBS1), thrombospondin 4, thrombospondin 3, fibulin 7, and EGF-like repeats and discoidin domains 3. Among these proteins, MATN1 was found to be highly expressed, followed by MATN3, MATN4, and cartilage oligomeric matrix protein.
A cluster of key enzymes involved in proper three-dimensional folding of newly synthesized procollagen chains was also identified with an enrichment score of 11.014. It was composed of prolyl 4-hydroxylase subunit alpha 2, prolyl 3-hydroxylase 3, procollagen-lysine,2-oxoglutarate 5-dioxygenase 2, prolyl 4-hydroxylase subunit alpha 1, and procollagen-lysine,2-oxoglutarate 5-dioxygenase 1.
Other than these important clusters, we also identified clusters of peroxiredoxins, protein disulfide isomerase family proteins, annexins, tubulins, histones, and hemoglobin enriched in fetal NP discs.
4. Pathway analysis
Pathway enrichment analysis was performed using the Reactome database via STRING (
Fig. 6). Most of the proteins were found to be involved in extracellular matrix organization (
p=7.61E−26), collagen biosynthesis and modifying enzyme pathway (
p=1.26E−21), and focal adhesion (
p=5.85E−08). Interestingly, we also identified significant pathways involved in the developmental processes. Important pathways such as Hedgehog signaling (
p=0.0406), which is required for the formation of the notochord sheath and patterning of NP within the IVDs, were also noted in our samples [
3]. We also identified the EPH-ephrin signaling pathway (
p=0.0067), which is a key regulatory pathway in several important developmental processes such as cardiovascular and skeletal development, tissue patterning, and axon guidance [
4].
Discussion
IVD degeneration, despite being one of the most common causes of low back pain, continues to be the least understood musculoskeletal disorder. Surgical treatments aim to alleviate pain and therefore address only the mechanical issues arising from disc degeneration. Among the causative factors proposed for disc degeneration, the inability to maintain matrix homeostasis remains foremost. Regenerative molecular therapies have been proposed to trigger disc regeneration, and mesenchymal cell transplantation into the IVD was attempted in animal models as early as 2003 [
5]. However, further progress could not be made, and disc regeneration in humans continues to be challenging because of the inability to replicate the native biological environment. In vitro culture of human fetal spine cells and adult NP cells showed that fetal spine cells have better matrix-synthesizing capacity and hold promise for future cell therapies [
6]. However, the success of such novel therapies will depend on the ability to restore the complex homeostatic mechanisms, which requires an in-depth understanding of the normal human IVD, including that of a fetus. In this regard, molecular analysis of a healthy disc not exposed to any mechanical or environmental insults may allow us to obtain a clear picture of the physiological, biochemical, and molecular events maintaining disc homeostasis. Recently, gene regulatory networks controlling extracellular matrix synthesis in cervical discs have been identified using RNA sequencing [
7]. In this study, we documented the proteomic signature of normal human lumbar fetal IVDs for the first time, which revealed the molecular and cellular complexities involved in the synthesis of the ECM of the IVD and identified molecular targets for further research.
LC-MS/MS analysis of nine fetal IVDs identified a total of 1,316 proteins with the abundant proteins presented in
Fig. 3 and
Supplemental Table 1. Subsequent GO analysis revealed that the fetal disc proteome is significantly enriched with proteins of 14 basic biological processes and five molecular functions, representing disc homeostatic mechanisms in a normal fetal disc (
Fig. 4). Of the total proteins, 247 were found in four or more fetal samples, whereas approximately 60% were found in only one of the fetal samples. This detection pattern implies that essential high-abundance proteins dominate the spectral features and conceal the low-abundance proteins of interest. This may be mainly due to the hydrophilic or hydrophobic nature of the peptides. Hydrophilic or small peptides pass through the reverse-phase column without being ionized, and hydrophobic or large peptides can also be challenging due to poor fragmentation [
8]. However, the less frequently detected or low-abundance proteins are more interesting because their modifications or expression levels may carry significant biological information. Moreover, the 247 proteins were subjected to cluster analysis, a novel method of constructing meaningful sets of similar proteins to analyze their expression and functions, which revealed 10 significant clusters of proteins in our study.
The IVD has an outer thick annulus fibrosus, which can withstand tensile forces, and an inner soft NP, which can withstand compressive loads. The NP is derived from the notochord cells in the fetus and has higher cellularity compared to adult discs [
9]. The extracellular matrix of the NP primarily consists of two major structural proteins, namely, collagens and proteoglycans. It has been reported that early-stage NP contains abundant collagen II, and aging is associated with an increase in the amount of collagen I [
10]. Apart from these, collagen VI and IX have also been reported in NP [
11,
12]. In our proteomic evaluation, collagens formed the most significant cluster of proteins, as previously reported. Interestingly, apart from collagen I and II, we identified other members of the collagen family, namely, V, VI, IX, XI, XIV, XV, and XVI. Importantly, the most abundant collagen was XIV, followed by XI, VI, IX, XII, II, and others. Collagen XIV is a regulator of fibrillogenesis, and knockout models of this gene in rats have lesser tendinous structures [
13]. Previously, this collagen has been reported in the cornea, tendon, articular cartilage, and even skin. However, to the best of our knowledge, there is no literature on collagen XIV and its role in the human IVD. The only previous study indicating its role in the IVD was performed on the bovine fetus, where its expression was enhanced compared to other collagens [
14]. Collagen XIV is believed to have a major role in disc hydration and maintenance of disc height and may also play a role in regeneration [
15]. The abundant expression of collagen XIV in the fetal IVDs signifies its role in the development of the IVD and suggests that it could be an ideal molecule to target in regenerative therapies and tissue engineering techniques.
Ribosomal biogenesis is a basic process occurring in all dividing cells, which involves approximately 80 ribosomal proteins and >150 nonribosomal proteins. Ribosomal proteins play an important role in cell growth, proliferation, differentiation, and development [
16]. Defects in the genes encoding these proteins have been implicated in human pathologies, such as Diamond-Blackfan anemia and cancer [
17]. This highly conserved group of proteins has a 40S small subunit (ribosomal protein small subunit [RPS]) and a 60S large subunit (ribosomal protein large subunit [RPL]) in humans. In our study, ribosomal proteins formed the second most significant cluster in the fetal proteome. Among the small subunit groups, RPS16, RPS18, and RPS3 and, among the large subunit groups, RPL23, RPL27, and PO were found to be abundant. Ribosomal proteins and their association with disc degeneration have been less investigated. In a previous study, dysregulation of ribosomal genes (RPL8, RPS16, and RPS23) was found to be associated with disc degeneration [
18] upon comparison against scoliotic discs. In a recent study on the comparison between degenerated and cadaveric control discs, RPL17, RPL13A, RPL18, and RPS24 were identified among the top 10 central genes of protein-protein interaction network involved in degeneration [
19]. Against this background, the basal expression of these ribosomal proteins in the fetus in our study suggests that most of them are involved in disc homeostasis and warrants further research on their differential expression during aging and degeneration to explore the possibility of using them as molecular targets for regenerative therapy.
The third cluster comprised SLRPs, a family with five major subclasses that plays an important role in fibrillogenesis, cellular growth, tissue repair, and remodeling and may have a role in regeneration. In this study, among all proteoglycans, BGN was found to be abundant in the human fetal disc. BGN and DCN interact with other proteins to form a stable extracellular matrix, and BGN-deficient mice have been shown to have accelerated degeneration of the IVD due to loss of macromolecular complex assembly [
20]. Further, loss of BGN has also been associated with decrease in transforming growth factor-β, thereby reducing osteoblastic differentiation, and this altered bone metabolism might itself result in premature disc degeneration [
21,
22].
Matrilins and thrombospondins formed the fourth major cluster of proteins. Matrilins belong to a family of oligomeric matrix proteins with a major role in forming filamentous networks and maintaining extracellular matrix integrity. In our study, MATN1 was the most abundantly expressed protein in this cluster and has been found to form a complex with BGN/DCN, which in turn forms a link between aggrecan and collagen VI/II to form a stable extracellular matrix assembly [
23]. MATN1 polymorphism and deficiency leading to altered cartilage architecture have been associated with increased incidence and severity of adolescent idiopathic scoliosis [
24], and its abundant expression in our fetal discs emphasizes the need for its basal expression to maintain disc structure and homeostasis. It would be interesting to observe its expression in aging and degenerative disc tissues; if it is found to be downregulated, it could be explored as a target for cell therapy. Similarly, downregulation of THBS-1 (abundantly expressed in the fetus) has been documented to cause progression of disc degeneration [
25], and manipulation of either this gene or its product could again be a novel therapeutic strategy.
Overall, the dataset comprising 1,316 different proteins obtained through LC-MS/MS analysis of fetal discs would serve as a molecular repository of the IVD and form the basis for studying a series of disc-related diseases such as DDD and herniation. Characterizing proteomic changes in the IVD across various developmental stages and decades of aging will improve the understanding of the molecular pathology of DDD. For the first time, this study unraveled the proteomic constitution of the healthy human fetal disc. A simple comparison of proteins extracted from the fetal disc in this study and the human adult disc in a previous study [
26], as shown in
Fig. 7, reveals that, apart from the 224 common proteins that play a role in disc homeostasis, there are 1,092 proteins in the fetal disc that might have regulatory potential and possibly provide leads for tissue regeneration therapies. The fetal proteome included a series of proteins involved in extracellular matrix organization. For example, we identified nine types of collagens that are an essential part of the matrix. Collagen fibrils are essential and involved in mechanical strength, cell-fiber interactions, and degradation [
27]. In particular, COL14A1, which was highly expressed in the fetal disc, plays a major role in fibril assembly. Studies have shown that COL14A1 is a key regulatory molecule in different steps and transitions in fibril assembly and growth during tendon development [
28]. Only the key proteins that were most abundantly expressed in the fetal disc have been discussed briefly here, as it is beyond the scope of this article to discuss all other molecules mentioned in the Results.
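The fetal versus adult comparison described above (Fig. 7) amounts to a set intersection and difference over protein identifiers. The sketch below is a minimal illustration with hypothetical toy lists and a hypothetical function name; it also checks that the counts reported in the text are mutually consistent.

```python
def compare_proteomes(fetal, adult):
    """Split two protein lists into shared and proteome-specific sets."""
    fetal_set, adult_set = set(fetal), set(adult)
    return {
        "common": fetal_set & adult_set,
        "fetal_only": fetal_set - adult_set,
        "adult_only": adult_set - fetal_set,
    }


# Toy identifiers for illustration only (not the study's protein lists).
toy_overlap = compare_proteomes(
    fetal=["COL14A1", "BGN", "MATN1", "RPS16"],
    adult=["BGN", "COL1A1"],
)

# Counts reported in the text: 224 common + 1,092 fetal-only = 1,316 fetal proteins.
reported_total = 224 + 1092
```

With the toy lists, the shared set contains only BGN, and `reported_total` equals the 1,316 fetal proteins stated in the text.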