Identification of the complete coding cDNAs and expression analysis of B4GALT1, LALBA, ST3GAL5, ST6GAL1 in the colostrum and milk of the Garganica and Maltese goat breeds to reveal possible implications for oligosaccharide biosynthesis

Background Milk sialylated oligosaccharides (SOS) play crucial roles in many biological processes. The most abundant free SOS in goat’s milk are 3’sialyllactose (3′-SL), 6’sialyllactose (6′-SL) and disialyllactose (DSL). The production of these molecules is determined genetically by the expression of glycosyltransferases and by the availability of nucleotide sugar substrates, but the precise mechanisms regulating the differential patterns of milk oligosaccharides are not known. We aimed to identify the complete cDNAs of candidate genes implicated in SOS biosynthesis (B4GALT1, LALBA, ST3GAL5, ST6GAL1) and to analyse their expression during lactation in the Garganica and Maltese goat breeds. Moreover, we analysed the colostrum and milk contents of 3′-SL, 6′-SL and disialyllactose (DSL) and the possible correlations between expressed genes and SOS. Results We identified the complete coding cDNAs of B4GALT1 (HQ700335.1), ST3GAL5 (KF055858.2), and ST6GAL1 (HQ709167.1), the single nucleotide polymorphism (SNPs) of these genes and 2 splicing variants of the ST6GAL1 cDNA. RT-qPCR analysis showed that LALBA and ST6GAL1 were the genes with the highest and lowest expression in both breeds, respectively. The interaction effects of the breeds and sampling times were associated with higher levels of B4GALT1 and ST3GAL5 gene expression in Garganica than in Maltese goats at kidding. B4GALT1, LALBA, and ST3GAL5 gene expression changed from kidding to 60 and 120 days in Maltese goats, while in Garganica goats, a difference was observed only for the LALBA gene. Breed and lactation effects were also found for SOS contents. Positive correlations of B4GALT1, LALBA, ST3GAL5, and ST6GAL1 with 3′-SL/6′SL and DSL were found. Conclusions The genetic effect on the oligosaccharide content of milk was previously highlighted in bovines, and this study is the first to investigate this effect in two goat breeds (Garganica and Maltese) during lactation. The genetic variability of candidate genes involved in SOS biosynthesis highlights their potential role in affecting gene expression and ultimately biological function. The investigation of gene regulatory regions as well as the examination of other sialyltransferase genes will be needed to identify the genetic pattern leading to a higher SOS content in the autochtonous Garganica breed and to protect it using a focused breeding strategy.


Background
Carbohydrates and, in particular, their oligosaccharidic fraction are part of the plethora of milk components that have functional health benefits for consumers. Oligosaccharides (OS) are synthesized in the mammary gland, contain between 3 and 10 monosaccharides, and can be divided into neutral and acidic fractions. The five principal monosaccharides component of milk OS are: Dglucose (Glc), D-galactose (Gal), N-acetylglucosamine (GlcNAc), L-fucose (Fuc), and sialic acid (Sia, specifically N-acetylneuraminic acid [Neu5Ac] in humans and both Neu5Ac and N-glycolyl neuraminic acid [Neu5Gc] in most other species). Lactose (Galb1-4Glc) forms the reducing end of milk OS; 3′-SL and 6′-SL can be synthesized by the addiction of Sia to Gal in α2-3 and/or α2-6 linkages, respectively. Lactose can also be fucosylated in α1-2, α1-3 linkages to form 2'fucosyllactose (2-FL) and 3'fucosyllactose (3′-FL), respectively. These trisaccharides are referred to as short-chain milk OS [1,2]. Moreover, if a sialic acid is added in an α-2-8 linkage to 3′-SL, the disialyllactose (DSL) tetrasaccharides is formed.
It is generally accepted that most OS are resistant to the pH of the stomach in infants; they are also resistant to enzymatic hydrolysis in the small intestine and are thus largely undigested and unabsorbed. Therefore, most OS will pass through the intestinal tract and enter the colon intact [3].
Different healthful properties have been attributed to OS; they exhibit several protective and physiological roles, including showing prebiotic activity and inhibition of pathogen adhesion in the gastrointestinal tract of infants, immunoregulation, modulation of the microbiota in neonates, intestinal barrier protection, and participation in cognitive function development. The biological functions of OS in human health and nutrition have been demonstrated by a vast number of functional in vitro and in vivo studies [4][5][6][7].
OS concentrations is high in the colostrum (up to 50 g/ L or more) with an average of 10-15 g/L in mature human milk [8]. Compared with humans, the concentration of OS in the milk of the most relevant domestic species is lower by a factor of 10 to 100 [9]. Moreover, higher OS concentrations have been reported in the colostrum than during late lactation in different animal species [10][11][12].
Goats produce milk containing free milk OS at much higher levels than cattle or sheep and with a variability, similar to those of human milk [13,14]. Therefore goat milk also present potential as a supplement for infant formulae and for the development of functional foods or animal feedstuffs on an industrial scale [15,6]. The recovery and exploitation of whey (rich in OS) by membrane technology during cheesemaking may add value and encourage a decrease in environmental pollution [14][15][16][17][18][19].
In the last decade, scientists moved beyond the investigation of the components of milk itself and examined the genomics, transcriptomics and proteomics of milk; an example is provided by the Milk Genomics Consortium [20]. To understand the complex biology of milk OS metabolism, it is important to identify the genes that encode glycosylation-related enzymes and, in human, some investigation on glycosylation-related genes and their expression have been carried out [21,22]. Harduinleper [23] reported a comprehensive analysis of sialyltransferase in vertebrate genomes; however, the mechanisms regulating differential gene expression in the mammary gland or milk somatic cells of dairy animals are still almost unknown. Sialyltransferases are a subset of glycosyltransferases that use cytidine-5′-monophospho-N-acetylneuraminic acid (CMP-Neu5Ac) as an activated sugar donor to catalyse the transfer of sialic acid residues to terminal non-reducing positions in the OS chains of glycoproteins and glycolipids. They catalyse the formation of different linkages (α2-3, α2-6, and α2-8) and differ in their acceptors. They have been grouped into four families according to the carbohydrate linkages that they synthesize: α 2-3-syaliltransferase (ST3Gal I-VI), α 2-6-syaliltransferase (ST6GALI-II), GalNAc α 2-6-syaliltransferase (ST6GalNAc I-VI), α 2-8-syaliltransferase (ST8Sia I-VI) [24][25][26].
Beyond physiological (age, kidding, body weight, stage, and number of lactations), environmental and management factors, milk characteristics are influenced by animal genetics (breed and genotype) [27,28].
The study of OS biosynthesis genes and the identification of polymorphisms associated with naturally occurring OS types, in economically important dairy species and breeds, represents the first step to plan genetic improvement programs and to forecast technological/industrial applications.
In both cattle [12,[29][30][31] and goats [28,32], differences in the OS profiles of milk have been found between breeds, and a higher concentration of sialylated oligosaccharides (SOS) fractions has been reported.
Dairy goats are of economic and social importance in southern Italy due to their ability to use vegetation in marginal areas. The Maltese goat represents the lactating goat "par excellence" of the Mediterranean, and the trinomial "rusticityabsence of horns-milk production" is still valid. The absence of horns plays a fundamental role in intensive and semi-intensive breeding. The Garganica breed is an autochthonous breed of economic importance on the Gargano Promontory. It represents a wager in the optics of sustainable and multifunctional breeding (environmental protection), particularly in internal and mountain areas, and the characterization of its milk and derivatives is a key element for the protection of the breed.
We focused on the free OS fraction of goat milk and, in particular, the sialylated fraction; these prebiotic molecules have been shown to have many beneficial effects in the human body and, overall, in the gut [33,34]. A first crucial step is to determine which glycosylationrelated genes are involved in the biosynthesis of OS in goat milk somatic cells.

Results
Sequencing analysis of the B4GALT1, ST3GAL5 and ST6GAL1 genes In this work, we identified the expressed complete coding cDNAs of three genes potentially involved in oligosaccharide metabolism. For DSL, we designed primers for the α-N-acetyl-neuraminide α-2,8-sialyltransferase 6 (ST8SIA6) gene [35], but we encountered problems with cDNA amplification, and we did not study this gene further (data not shown).
The B4GALT1 coding sequence obtained up from the assembly of two PCR amplicons; it is 1508 bp long, gives rise to an open reading of 1209 nucleotides, is composed of six exons and encodes a polypeptide of 402 amino acids. Two initial ATG codons are present, similar to the known mammalian B4GALT1 long isoform of the transcript. The sequence was deposited in GenBank under accession number HQ700335.1. The cDNA sequences from 8 Maltese milk samples revealed the presence of 2 SNPs in the 3'UTR (n.1294 G > A; n.1348 G > A). The deduced protein sequences showed identity with the four functional domains of the B4GalT-1 protein; comparison with the bovine sequence showed an overall similarity of 95.52%, with 18 amino acid differences throughout the coding sequences.
The complete ST3GAL5 cDNA was compiled by merging the sequences of three amplicons; it is 1358 bp long, gives rise to an open reading frame of 1263 nucleotides, is composed of seven exons and encodes a protein of 420 amino acids. Comparison of the predicted ST3GAL5 goat protein with the bovine protein showed 97% identity and 13 amino acid differences throughout the coding sequences. The sequence was deposited in GenBank under accession number KF055858.2. Sequencing of the cDNAs from 8 Maltese milk samples showed the presence of 2 SNPs: c.213 G > A in exon 2 and c.240 T > C in exon 3. The first causes an amino acid change from Gly to Asp (aa67), and the second causes an amino acid change from Phe to Ser (aa76). The SIFT analysis predicted that Gly67Asp in exon 2 is tolerated (score 0.329), while Phe76Ser substitution in exon 3 is predicted to be deleterious for protein function (score 0.00).
The complete ST6GAL1 cDNA was compiled from the assembly of two PCR amplicons; it is 1672 bp long, spans a coding region of 1218 nucleotides, is composed of five coding exons and encodes a protein of 414 amino acids. Comparison of the predicted ST6GAL1 goat protein with the bovine protein showed 98% similarity and 10 amino acid differences, most of which were located in the first 90 amino acids. The sequence was deposited in GenBank under accession number HQ709167.1. In Fig.  1, the results of PCR gel electrophoresis of the 1593 bp amplicon from the cDNAs of 3 Maltese milk samples are shown. Sequences analysis of the weak bands indicated by arrows showed that the slower band corresponds to a cDNA with a spliced coding exon 2 and the faster band to a cDNA with a spliced coding exon 1.
Sequencing of cDNA from 8 Maltese milk samples showed the presence of 4 SNPs: c.372 T > A and c.533 G > A in exon 1, c.977 C > T in exon 4 and c.1310 C > T in exon 5. The first three SNPs cause synonymous amino acid changes, while the c.372 T > A SNP causes a Ser versus Thr amino acid change (aa69). The SIFT analysis predicted this substitution in exon 1 to be tolerated for protein function (score 0,59).
Gene expression analysis of B4GALT1, LALBA, ST3GAL5 and ST6GAL1 during lactation In this work, we aimed to study the expression of four genes (B4GALT1, LALBA, ST3GAL5, ST6GAL1) during lactation in two genetically distant goat breeds characterized by different productive traits. As an initial step, optimization of the primer concentration, annealing temperature and PCR efficiency was performed for all primer pairs used for both target and RGs. The results showed that a PCR efficiency of 92-105% of was achieved with the settled parameters (Additional file 1). The geNormPLUS analysis showed that ATP5B and SDH were the two most stable RGs (Additional file 2). For one MA and two GA animals, the 120-day sampling time point could not be analysed (dry-off). All procedures followed for the qPCR experiments complied with the Minimum Information for the Publication of Quantitative Real-time PCR Experiments (MIQE) guidelines [36].
The effects of breed and lactation on gene expression are reported in Tables 1, while the effects of the interaction between the breeds and sampling times on gene expression are reported in Table 2 and Fig. 2 (please insert Table 2 after line 197). In Fig. 2, statistically significant differences between breeds are reported.
LALBA was the most highly expressed gene, followed by ST3GAL5, B4GALT1 and ST6 (1.206, 1.171, 0.695, and 0.6 mean values, respectively). A breed effect was found for ST3GAL5, whose expression was higher in GA. Lactation effects were shown for all genes except for ST6GAL1. In particular, B4GALT1 expression increased from 0 to 60 days, LALBA expression increased from 0 and 1 day to 60 and 120 days, and ST3GAL5 expression increased from 0 to 1, 60 and 120 days (please insert here Table 2. The breed/sampling time interaction effects of B4GALT1 and ST3GAL5 at day 0 were stronger in the GA breed (p = 0.01 and p = 0.014, respectively). During lactation, B4GALT1 gene expression increased from 0 to 60 and 120 days only in the MA breed (p = 0.0026 and 0.0016, respectively). LALBA showed an increase from 0 to 60 days in the GA breed (p = 0.0025) and from 0 to 60 and 120 days in the MA breed (p = 0.0025 and 0.00028, respectively). ST3GALT5 expression increased from day 0 to 1, 60 and 120 days in the MA breed (0.0002 ≤ p ≤ 0.01). No difference was observed for the ST6GAL1 gene between either lactation stages or breeds.

Evaluation of oligosaccharide contents in colostrum and milk
Overall, 3′-SL was the most abundant SOS, followed by DSL and 6′-SL (170.77 and 52 mg/L mean values, respectively). The evaluation of SOS contents in colostrum    and milk showed a significant effect of the breed and sampling time (Table 3). A breed effect was shown for DSL at a higher concentration in MA. A sampling time effect was observed for all the SOS. In particular, the 3′-SL content decreased from 0 and 1 days to 60 and 120 days; the 6′-SL content increased from 0 to 1 days and then decreased from 60 and 120 days; and the DSL content decreased from 0 and 1 days to 60 and 120 days.
Considering the interaction effect (Table 4), differences between breeds were observed at 1 day, when 3′-SL values were higher in GA (p = 0.032), and DSL was higher in MA (p = 0.0001). (please insert here Table  4). During lactation, decreases in the contents of all SOS were observed from 1 to 60 days in both breeds, whereas there was an increase in 6′-SL from 0 to 1 days. The 3′-SL content increased from 0 to 1 day only in GA, while DSL decreased.

Correlation analysis between the analysed phenotypes
Pearson's correlation coefficients between the studied phenotypes throughout the experiment (gene expression levels and SOS contents) are shown in Table 5.

Discussion
To our knowledge, this study is the first to investigate the gene expression of candidate genes for OS biosynthesis in different goat breeds during four lactation stages. In fact, only one RNA-seq experiment in the Jersey and Friesian bovine breeds [37] and one in the Maltese goat breed [38] have been reported to date. The identification of the candidate gene cDNAs represents a first step toward gene expression analysis.
The mutations found in the ST3GAL5 coding cDNA lead to amino acid changes in the protein transmembrane domain (TM).
Compared with other Golgi-resident glycosyltransferases (GT), the deduced proteins of both ST3GAL5 and ST6GAL1 shared a typical type II membrane protein topology (consisting of a short N-terminal cytoplasmic tail, a transmembrane (TM) domain followed by a stem region and a large C-terminal catalytic domain facing the luminal side) and sialyl motifs in their catalytic domain [39,40].
The Phe76Ser change (c.240 T > C) identified in the ST3GAL5 cDNA is located in the middle of the sequence, and SIFT prediction showed the mutation to be deleterious for protein function. The N-terminal region of glycosyltransferases, and particularly the transmembrane domain, is crucial for the correct Golgi localization of the enzymes [41]. a change from an apolar to a polar in amino acid residue the hydrophobic TM could interfere with the correct localization of an enzyme and, consequently, its function. In humans, a missense change in the TM region of an ST3GAL3 family protein was shown to co-segregate with intellectual disability, and cellular and biochemical systems showed that this mutation caused ER retention and drastically impaired protein functionality [42]. It would be interesting to investigate the c.240 T > C mutation more deeply in at least the two goat breeds examined herein and in a larger number of animals to uncover its potential influence on gene expression.
The Ser69Thr amino acid change (caused by the c.372 T > A SNP) found in the ST6GAL1 cDNA is located in the stem region outside of the catalytic region of the protein, and SIFT analysis did not show deleterious effects on protein function. Considering the preliminary identification of ST6GALT1 splicing variants in GMSCs, the obtained bands were less distinct compared with the principal band. Moreover, the higher band gives rise to a deduced protein product lacking the second coding exon containing the entire L sialyl motif of the catalytic domain, while the lower band gives rise to a protein lacking the TM domain. Thus, these bands would not produce a complete protein, although the full transcript could compensate for this, or they might represent expressed pseudogene forms. as reported for bovines [43]. The whey α-lactalbumin protein coding gene (LALBA) showed the highest expression in our experiment, which agrees with previous findings in goats [44], cows [45], sheep [46] and humans [47]. The increase in LALBA expression that we observed starting from the colostrum stage was consistent with increases in both the milk protein output and lactose synthesis. LALBA is a major protein component of whey, and a high level of LALBA expression in mammary epithelial cells is required  Pearson coefficient significance: * P < 0.05; ** P < 0.01; *** P < 0.001 B4GALT1 = β-1,4-galactosyltransferase 1 LALBA = lactalbumin-α ST3GAL5 = β-galactoside α-2,3-sialyltransferase 5 ST6GAL1 = β-galactoside α-2,6-sialyltransferase 1 6′-SL = 6'sialyllactose 3′SL = 3'sialyllactose DSL = disialyllactose during lactation. At 120 days, LALBA values were shown to be high in MA, while in GA, they tended to decrease; we hypothesize that this trend could be related to the longer lactation period of the MA than the GA breed. In the mammary gland, B4GalT-1 interacts with the αlactalbumin protein to form the lactose synthase complex which is involved in the final step of lactose synthesis by transferring the galactose moiety of free UDP-Gal to Glc rather than to GlcNAc [7]. Although this process has been demonstrated to occur in intact Golgi, the precise mechanisms whereby lactose is modified by the addition of other sugars by specific glycosyltransferases during lactation are still unknown. It is assumed that the same enzymes involved in producing the termini of other glycan classes might be responsible. The B4GALT1-encoded protein is one of the best-studied galactosyltransferase enzymes; it is a trans-Golgi resident type II membrane-bound glycoprotein that catalyses the transfer of galactose to N-acetylglucosamine residues and participates in the biosynthesis of the oligosaccharide structures of glycans. Since B4GalT-1 has been identified on the plasma membrane, it may also be involved in cell-cell or cell-substrate recognition [48]. B4GALT1 gene expression was maintained at similar levels during lactation in the GA breed, while increasing expression values were shown in the MA breed; the latter is in agreement with the results reported in an RNA-seq experiment in goat [38]. The expression patterns of both LALBA and B4GALT1 suggest that the encoded proteins are present during lactation and are assembled to carry out the production of lactose, which might in turn be used as a precursor for SOS biosynthesis. A moderate positive correlation coefficient between the B4GALT1 and LALBA genes was found in the experiment, supporting our hypothesis.
Considering the ST6GAL1 gene, its expression in bovines is primarily regulated at the transcription level by several cell and development-specific promoters, producing transcripts with divergent 5′ untranslated regions (UTR) [43]. In the mouse mammary gland [49], a novel mRNA isoform of the ST6GALI gene containing a novel 5′UTR exon (L) that is associated with a drastic increase in gene expression during lactation has been identified. Dalziel et al. [50] found a similar transcript in the rat mammary gland. In contrast, no exon (L)-containing transcript was detected in the lactating bovine or human mammary gland. The authors also observed a trend of increasing ST6GAL1 gene expression in the bovine mammary gland, culminating in involution. This contrasts with findings in species such as mice, in which the greatest change in ST6GAL1 gene expression occurs between pregnancy and lactation, suggesting different roles in rodents vs. other mammals. The transcripts identified in this study showed differences in their coding regions, and we could not clarify the nature of the different 5′ regions of the transcripts because neither 5'RACE or a promoter analysis was performed.
The mammary gland is susceptible to modification and remodelling associated with cell turnover and apoptosis during lactation [51,52]. On our experimental farm, the goats are extensively reared, and kids are left with parents for a long period until the peak of lactation, when they are separated. At the beginning of weaning in the mammary gland, some modifications lead to involution; Li et al. [53] found that 5% of cells in the involuting tissue in goats were apoptotic, whereas less than 1% of cells in lactating tissue were undergoing death. During this phase, greater susceptibility to infection arises, with the consequent stimulation of the immune system, which plays an important role in the defence of the mammary gland. The review of Mills et al. [54] explains the possible roles of OS that harbour epitopes similar to selectin-binding ligands and may play a role in antiinflammatory processes.
We hypothesize that at 60 days, both the ST6GAL1 and ST3GAL5 genes might be expressed at higher levels due to their potential involvement in the synthesis of glycoconjugate sialylated structures, which are fundamental for the immune response.
Sialyltransferases are divided into four families, ST6Gal, ST3Gal, ST6GalNAc and ST8Sia, according to the glycosidic linkage formed and the monosaccharide acceptor used. Currently, 20 sialyltransferase subfamilies are known in higher vertebrates [23]. The ST3gal gene family comprises six ST3Gal sub-families, and the ST6Gal gene family comprises two sub-families. ST3Gal III, IV, V, and VI use the oligosaccharide isomers Gal β1-3/4Glc(NAc)-R.
RNA-seq studies in both goats [38] and bovines [37] showed the expression of all 6 sub-family ST3Gal genes, including ST3GAL5 and ST6GAL1. In goats, the most highly expressed genes were ST3GAL1, ST3GAL4 and ST3GAL6. Another study [55] found that the most highly expressed genes in mouse mammary glands were ST3GAL1, ST3GAL4 and ST6GAL1. By using knock-out mice for these genes and analysing 3′-SL and 6′SL levels in milk, it was shown that the ST3GAL4 and ST6GAL1 genes account for the bulk of 3′SL and 6′-SL production, respectively.
When we looked at the possible correlations of the phenotypes analysed in the experiment, we found positive correlations among genes and among SOS. A positive correlation of ST3GAL5 and ST6GAL1 expression with 3′-SL and 6′-SL colostrum/milk contents was not found. In fact, the concentration of SOS was initially high in colostrum (0 and 1 day) and then decreased, in accordance with the literature [11,32,56,57], while for gene expression, the opposite pattern was observed.
The lack of association between some of the chosen candidate genes and SOS content might have occurred because other sialyltransferase genes such as ST3GAL4 or ST3GAL6 [58] might be involved in the biosynthesis of the free OS fraction.
Overall, the higher expression of ST3GALT5 to ST6GALT1 found in this study corresponds to the higher concentration of 3′-SL compared to 6′-SL, which is supported by Crisà et al. [38] and Wickramasinghe et al. [37], who reported the same differences in gene expression in RNA-seq experiments in goats and cows.
The differences in SOS contents between the GA and MA goat breeds have been investigated by Claps et al. [32], although they sampled milk at different time points than were examined in the present study, and we could perform comparisons only for the results for 0 and 1 days. In their experiment, DSL showed the third highest mean value among the SOS after 3′-SL and 6′SL, and GA colostrum and milk contained higher levels of 3′-SL, 6′-SL and DSL than those of MA. In our experiment, DSL was the second most concentrated SOS after 3′SL, and the proportions of SOS between GA and MA differed on the different days. Only on day 1 did 3′-SL present a significantly higher value in GA, while DSL presented a higher value in MA. Given that DSL is formed from 3′-SL, we speculate that in MA, a fraction of 3′-SL might be used to synthesize DSL. This suggestion was corroborated by the results of correlation analyses in which higher and more statistically positive coefficients were found between 3′-SL and DSL than between 6′-SL and DSL. Furthermore, we do not have data on the expression of ST8SIA genes that could support this suggestion. In mice, it was shown that the ST8SiaVI product exhibited higher activity towards 3′-SL than towards 6′-SL [26].
A limitation of our study is that we considered only certain putative genes involved in the last step of SOS biosynthesis while not taking into account precursor availability for the sialic acid and lactose synthesis pathways. The goats were fed the same diet, so the diet would not be an issue explaining the observed differences, which could instead reflect an effect of parity (differences between experimental animals), season (from February to June), the duration of lactation (which is shorter in the GA breed) or production output (which is lower in the GA breed) effects. Moreover, we are aware that other intermediate biological processes exist between gene expression and the production of functional proteins.

Conclusions
This study represents the first attempt in investigating the mRNA expression profiles of candidate genes implicated in SOS biosynthesis in different goat breeds during lactation. Genetic effects on the OS content of milk have been observed in bovines, and we report candidate gene expression variability in two breeds of goats, one of which is autochthonous and economically important to the Gargano promontory. Moreover, the higher expression of ST3GALT5 with respect to ST6GALT1 corresponds to what has been reported in the literature regarding the higher concentration of 3′-SL compared to 6′-SL in goat colostrum and milk. However, no clear correlation between genes and SOS contents in colostrum and milk was shown by the analysis. Although the investigation of additional sialyltransferase genes that support and explain SOS production is needed, our findings highlight the presence of SNPs in the studied genes, one of which might impair the protein cellular localization and function of ST3GAL5. The identification of different ST6GAL1 transcripts suggests the existence of a complex regulatory process in the promoter region, as observed in other mammalian species. The investigation of SNPs in the wider Maltese and Garganica population as well the gene regulatory regions, which may account for different transcripts or expression levels, will be needed to identify associations with OS milk content. Garganica is an important autochthonous goat breed, especially for interior and mountainous areas, and the identification of genetic patterns leading to higher SOS contents might be useful for breeding purposes and a key element for breed protection.

Animals and sample collection
This study was conducted on the experimental farm of CREA (Consiglio per la Ricerca in Agricoltura e l'Analisi dell'Economia Agraria, Research Centre for Animal Production and Acquaculture, Bella (PZ)), a research institute authorized by the Italian Ministry of Health to use farm animals for experimental purposes (DM 26/96-4). Animal management and care followed the recommendations of European Union directive 86/609/EEC.
A dairy breed, Maltese (MA) goats, and a less productive breed, Garganica goats (GA), were considered in this trial, including a total of 18 does (9 GA and 9 MA). The animals were fed indoors, receiving mixed meadow hay ad libitum plus concentrate supplementation at 14% of crude protein (CP), which was 600 g/d for MA and 400 g/d for GA in relation to their level of milk production. Within each breed, the animals were grouped homogenously according to their live weight (46.1 kg and 48.6 kg on average for MA and GA, respectively) and milk production (1200 and 800 g/day of milk, respectively, in mid lactation).
At four lactation stages (colostrum (after kidding and day 1), 60 and 120 days), individual colostrum/milk samples were collected in a 50 ml tube and transported to the laboratory at 4°C for RNA extraction. For OS determination (3′-SL, 6′-SL, DSL), the samples were distributed in 10 ml tubes and immediately frozen at − 20°C until analysis. At the fourth lactation stage, some samples were not collected because the ewes had run dry.

RNA and DNA extraction
RNA was obtained from goat milk somatic cells (GMSCs) in accord with both [59], who showed that GMSCs can be used to accurately reveal the cellular dynamics of mammary gland gene expression, and other authors [60,61] who demonstrated extensive similarity between the mammary gland and MSC/ mammary epithelial cell transcriptomes. RNA was extracted from somatic cells obtained from 50 ml of goat colostrum or milk in a total of 72 samples. Individual samples were immediately centrifuged at 2000 g for 10 min at 4°C in the presence of 50 μl EDTA 0.5 M pH 8. The fat layer on the top of the supernatant was removed with a sterile pipette tip, the skim milk was discarded, and the somatic cell pellet was resuspended in 1 ml of 1X PBS with 300 μl of 0.5 M pH 8 EDTA and centrifuged again. The resulting cells were lysed with 1 ml TriReagent (Sigma-Aldrich, Milan, Italy). Total RNA extraction was performed following the manufacturer's instructions. RNA was digested to remove contaminated DNAs with an RNase-Free DNase Set (Qiagen) and successfully cleaned with an RNeasy MinElute kit (Qiagen, Milan, Italy). RNA quality and quantity were checked via a spectrophotometry (NanoPhotometer™ Pearl, Implen GmbH, München, Germany) and microfluidic (2100 Bioanalyzer, Agilent Technologies, Milan, Italy) inspection to assess RNA integrity. The RNA samples were stored at − 80°C. Only RNA with an RNA Integrity Number (RIN) > 7 was used in the gene expression analyses (6 animals of each breed, total of 45 samples). DNA was extracted from the interphase obtained from milk cell pellets following the TriReagent (Sigma-Aldrich, Milan, Italy) manufacturer's instructions.

Reverse transcription into cDNA and sequencing
RNA was reverse transcribed to produce both long cDNAs and cDNAs for qPCR. In the first case, the Maxima H Minus First Strand cDNA Synthesis kit (Thermo Scientific, Thermo Fisher Scientific, Waltham, USA) was employed starting from 1 μg of RNA in a total volume of 20 μl containing 100 pmol oligo (dT) (18-mer), 0.5 mM dNTPs, 5X RT buffer, and Maxima H Minus Enzyme mix according to the manufacturer's instructions. The resulting cDNA was diluted 1:2 and PCR amplified by using the primer pairs listed in Table 1. PCR amplification was performed using Dream Taq DNA polymerase (Thermo Scientific, Thermo Fisher Scientific, Waltham, USA) with 1 μl of the first-strand cDNA reaction and a touchdown protocol. The PCR products were gel purified using Nucleospin columns (Macherey-Nagel, GmbH & Co KG, Duren, Germany) and bi-directionally sequenced by using the BidDye Terminator v. 1.1 Cycle Sequencing kit and an ABI 3700 sequencer (Applied Biosystems, Life Technologies, Milan, Italy).
For the cDNAs to be used in gene expression experiments, the Maxima First Strand cDNA synthesis kit for RT-qPCR (Thermo Scientific, Thermo Fisher Scientific, Waltham, USA) was used. Briefly, RT reactions were prepared by adding 500 ng RNA, 4 μl of a 5X reaction mix (containing random hexamers and oligo (dT) 18, and 2 μl of enzyme mix in a final volume of 20 μl. The reaction was incubated at 25°C for 10 min, then at 50°C for 15 min, and finally at 85°C for 5 min. An RT-negative reaction was set up to check for genomic DNA contamination. The cDNA was first diluted 1:4 and then used directly for qPCR or stored at − 20°C.

Primer design and sequence analysis
At the beginning of our study, no annotations for the studied genes in goats were present in GenBank except for LALBA (NM_001285635). Primers were picked by using Primer 3 software [62, 63] on the basis of known bovine sequences. After initial trials with different primer pairs, the complete coding sequences of ST6GAL1, ST3GAL5, and B4GALT1 were obtained by merging the sequences from two or three amplicons (Table 6). Sequence analysis and alignments (nucleotidic and proteic deduced ones) were performed using DNAMAN (Version 4.15, Lynnon BioSoft, Inc., San Ramon, USA), Bio Edit software [64] and the BLAST tool [65,66].
SIFT software analysis was performed to predict whether an amino acid substitution affects protein function based on sequence homology and the physical properties of amino acids [67].
Primer Express software (Applied Biosystem, Thermo Fisher Scientific, Waltham, USA) was used to design primers for qPCR on the basis of bovine sequences. Five reference genes (RGs) selected in another study [68] were tested here: ATP synthase β polypeptide, a nuclear gene encoding a mitochondrial protein (ATP5B), eukaryotic translation initiation factor eIF-2B subunit β (EIF2B2), succinate dehydrogenase complex subunit A flavoprotein (SDHA), DNA-directed RNA polymerase fragment RNA polymerase II (POLR2A), and a TATA box-binding protein (TBP). We designed and tested new primers for the following RGs: a ubiquitously expressed transcript (UXT) and ribosomal protein S9 (RPS9). The primers were intron spanning when possible and were synthesized by Eurofins MWG Operon (Ebersberg, Germany). Table 7 provides information related to the primers used to amplify the targets and RGs. Preliminary amplifications were performed on both cDNA and DNA to validate the primers. PCR products were verified via an agarose gel run and sequenced using a 3500 Genetic Analyzer (Applied Biosystem, Thermo Fisher Scientific, Waltham, USA) Amplification specificity was verified by using the BLAST tool [65].

QPCR assay
In gene expression experiments, the suitability of genes to be used as references (RGs) is not given a priori and has to be evaluated each [69]. Preliminary qPCR experiments were performed to optimize the annealing temperature and primer concentration (100-300 nM) of both the target genes and the seven RGs.
The efficiency of PCR amplification for each gene was calculated with a five-point standard curve (1:5 dilution per point) generated on the basis of cDNA samples used as a calibrator. QPCR was conducted using Max SYBR Green/ROX qPCR Master Mix (Thermo Scientific, Thermo Fisher Scientific, Waltham, USA) in a StepOne Plus instrument (Applied Biosystems, Thermo Fisher Scientific, Waltham, USA). Each reaction was run in triplicate and contained 10 ng of cDNA template along  with the specific corresponding primer concentrations and 10 μl of 2X master mix in a final reaction volume of 20 μl. Reverse transcriptase minus (RT-) and no template controls (NTC) were included. The cycling parameters were 95°for 10 min to activate the DNA polymerase, followed by 40 cycles of 95°for 30 s and T.a. for 1 min.
To verify that the primer pairs produced only a single product, a dissociation protocol from 65°C to 95°C was added after thermocycling.

Oligosaccharide isolation and HPAEC analysis
OS were isolated from individual colostrum and milk samples as described by Mc Jarrow and Van Amelsfort-Schoonbeek [70]. Briefly, after centrifugation at 2000×g at 4°C for 10 min, the supernatant lipid layer was removed, and the proteins were precipitated by the addition of 0.5 volumes of 1.8 g 100 mL − 1 Ba (OH) 2 ·8H 2 O and 0.5 volumes of 2 g 100 mL − 1 ZnSO 4 ·7H 2 O. The blend was mixed by vortexing and centrifuged at 12.000×g in a microfuge for 10 min at 4°C. The supernatant was carefully removed and centrifuged again. The second supernatant was filtered with a nylon filter with a 0.45 μm pore size. The total OS fraction was separated using high-performance anionexchange chromatography (HPAEC) in a Dionex PA100 column (Dionex, Sunnyvale, California, USA). The eluted fractions were monitored by pulsed amperometric detection (Dionex ED40), and the gradient was controlled by a Varian Pro Star pump system, which was capable of maintaining a flow rate of 1 ml/min for the duration of the run. Data were collected and analysed with Star Chromatography Workstation 6.41 (Varian, Inc. Walnut Creek, California, USA), and 6′-SL, 3′-SL, and DSL external standards were used to generate standard curves for comparison.

QPCR data analysis and statistics
The results obtained with Sequence Detection Software (Applied Biosystems, version 2.1, Thermo Fisher Scientific, Waltham, USA)) were exported as tab-delimited text files and imported into qBase PLUS software. qBase-PLUS included a geNorm PLUS analysis to screen the optimal set of reference candidate genes [71]. The geNormplus analysis was assessed by using the seven tested candidates and nine samples belonging to four lactation stages. geNorm calculates the gene expression stability measure (M) for a reference gene as the average "pairwise variation" (V) for that gene in relation to all other tested reference genes. Stepwise exclusion of the gene with the highest M value allows the ranking of the tested genes according to their expression stability. The qBase PLUS software (version 2.5. Biogazelle, Gent, Belgium) was used to the analyse qPCR data based on the 2 -ΔΔCt method, with implementations to take multiple reference genes and gene-specific amplification efficiencies into account as well as the errors of all measured parameters along the entire calculation track. Moreover, an inter-run calibration algorithm is included to correct for run-to-run differences [72]. Cumulative normalized relative quantities (CNRQs) represent n-fold normalized expression relative to inter-run calibrators (IRC) run across all plates. The logarithm (Log) of the CNRQs was used for graphical representation and statistical analysis.
Least square (LS) means for the breeds, lactation stages and oligosaccharide contents were calculated with Proc GLM in SAS software (SAS Institute Inc., 2010) using the following model: y ijk = μ + C j + K i + (C j *K i ) + e ijk where: y ijk = phenotype (gene expression or OS content) of the i th goat at the k th lactation stage in the j th breed; C j = fixed effect of the j th breed (j = 1-2); K i = fixed effect of the i th stage of lactation (i = 1-4); e ijk = random residual effect. Pearson correlation coefficients (r) for breed, lactation stages and oligosaccharide content were calculated using Proc CORR in SAS software (SAS Institute Inc., 2010).
The significance of the analyses was determined using Tukey's test under the GLM procedure.

Ethics approval and consent to participate
This study was conducted on the experimental farm of CREA (Consiglio per la Ricerca in Agricoltura e l'Analisi dell'Economia Agraria, Research Centre for Animal Production and Acquaculture, Bella (PZ)), a research institute authorized by the Italian Ministry of Health to use farm animals for experimental purposes (DM 26/96-4). Animal management and care followed the recommendations 408 of European Union directive 86/609/EEC.

Consent for publication
Not applicable.