Impact of SNPs interplay across the locus of MBL2, between MBL and Dectin-1 gene, on women’s risk of developing recurrent vulvovaginal infections

Background: Human mannose binding lectin (MBL) and dendritic cell-associated C-type lectin-1 (Dectin-1) are two prototypical pattern recognition receptors (PRRs) of innate immunity, whose direct role in defense against recurrent vulvovaginal infections (RVVI) has been defined. Previously, MBL insufficiency was proposed as a possible risk factor for the rapid progression of RVVI, while Dectin-1 was found to play an active role in the defense. However, the complete genetic basis for the observed low MBL levels is still lacking, as our previous studies, in harmony with others, demonstrated unexpected genotype–phenotype patterns. This suggested the presence of unidentified regulatory variants that may modulate serum MBL (sMBL) levels and the risk of RVVI. Therefore, the present study was designed for a more inclusive locus-wide MBL2 analysis and for the analysis of possible non-linear interactions between the two PRRs that may impact RVVI susceptibility.

Methods: The present study extended the previous findings by investigating (1) the role of chosen additional SNPs falling in the 5′ near region in relation to sMBL levels and RVVI susceptibility, using polymerase chain reaction-restriction fragment length polymorphism; (2) interactions among SNPs within the gene by comprehensive locus-wide haplotype analyses of two MBL2 blocks; and (3) gene–gene interactions between the two PRRs, using multifactor dimensionality reduction.

Results: rs11003124_G, rs7084554_C, rs36014597_G and rs11003123_A were observed as the minor alleles in the representative North Indian cohort. RVVI cases and its types showed an appreciably high frequency of the C allele, its homozygosity and heterozygosity, explaining the observed dominant mode of inheritance of the rs7084554 polymorphism, which contributed a 1.81-fold risk of RVVI. The rs36014597 polymorphism showed the overdominant mode of inheritance, which indicates that carriers of the heterozygous genotype of this polymorphism had a more extreme phenotype than carriers of either homozygous genotype, with a 4.07-fold risk of RVVI. sMBL levels varied significantly for the rs11003124, rs36014597 and rs11003123 polymorphisms in bacterial vaginosis, and for the rs7084554 polymorphism in mixed infection. Independent analysis of the 5′ and 3′ haplotype blocks suggested a risk-modifying effect of all the additional 5′ variants, the Y/X secretor polymorphism and the 3′-UTR SNP rs10824792. Combined 5′/3′ haplotype analyses indicated the importance of rs36014597 (an additional 5′ variant), Y/X and rs10824792 polymorphisms from both blocks in regulating sMBL levels and RVVI risk. Three gene–gene interaction models involving uni-variant, bi-variant and tri-variant combinations appeared as significant predictors of RVVI risk, with cross-validation consistency of 10/10, 9/10 and 5/10, respectively.

Conclusions: The study presented a low-cost, reproducible screening design for the additional 5′ variants of MBL2, i.e. rs11003124, rs7084554, rs36014597 and rs11003123, that can act as markers of susceptibility for RVVI or any other disease. Two additional 5′ variants of MBL2, i.e. rs7084554 and rs36014597, were suggested as novel molecular markers that may contribute to RVVI risk by varying sMBL levels. Variants of the two blocks were found to have more of a combined effect than an independent effect in modulating RVVI susceptibility and sMBL levels. The study presented a weak synergistic interaction between MBL2 and CLEC7A in association with RVVI risk. The preliminary data will establish the foundation for the investigation of within-gene and between-gene interaction analyses towards RVVI susceptibility.

Electronic supplementary material: The online version of this article (10.1186/s13578-019-0300-4) contains supplementary material, which is available to authorized users.


Background
Three universal pathological conditions of recurrent vulvovaginal infection (RVVI) are bacterial vaginosis (BV), vulvovaginal candidiasis (VVC) and trichomoniasis [1,2]. Despite detailed knowledge of organismal and non-organismal pathogenesis factors, RVVI remains one of the most enigmatic mucosal problems worldwide owing to the presence of 20-30% asymptomatic cases (i.e. healthy women with the same vaginal microbiota composition as that of RVVI cases), some of whom also had predisposing non-organismal causes [3]. Moreover, women without any known predisposing factors have also been documented to acquire RVVI [4]. Therefore, identification of causal factors that modulate the propensity to RVVI in women is much needed. Mannose binding lectin (MBL), encoded by MBL2 mapped to 10q21.1, has commonly been referred to as an acute phase protein whose serum levels increase following infections [5]. It is an ideal pattern recognition receptor that binds to specific sugars on the pathogen's surface, consequently causing pathogen removal by complement activation, opsonisation and/or phagocytosis [5]. The serum or vaginal fluid levels of MBL have been assessed in RVVI cases by different studies, suggesting its key involvement in the pathogenesis of RVVI [6–13].
MBL levels have been shown to be determined by single nucleotide polymorphisms (SNPs) in the coding and promoter regions of MBL2. The former include three SNPs present in exon 1, i.e. rs5030737 (codon 52), rs1800450 (codon 54) and rs1800451 (codon 57), collectively called MBL2 structural variations [14]. The codon 54 and 57 SNPs lead to the substitution of glycine with dicarboxylic acids, while the codon 52 SNP leads to the substitution of arginine with cysteine in the collagenous region of the monomeric protein, resulting in variant monomers [15,16]. These variant monomers dramatically affect the circulating levels of higher-order functional MBL oligomers [17]. In addition, three promoter region variations, i.e. rs11003125 (L/H), rs7096206 (Y/X) and rs7095891 (P/Q), have been functionally authenticated to alter MBL2 transcription, underlining the importance of these genetic variations for MBL expression and its circulating levels [16,18].
Linkage disequilibrium (LD) between the structural and promoter polymorphisms has been described in the literature, leading to the formation of seven standard haplotypes: HYPA, LYPA, LYQA, LYPB, LXPA, LYQC and HYPD. These are commonly referred to as the secretor haplotypes because they regulate serum MBL (sMBL) levels [19]. However, additional secretor haplotypes including HXPA, LYQB, HYQA, HYQB, HXQB, LXPB, LXQB and LYPD have also been reported by various studies in different populations [11,20–23]. These novel haplotypes have been suggested to arise from genetic heterogeneity between populations and from selective advantage in response to environmental pressures such as infections or geographic location [24,25].
Historically, the structural polymorphisms of MBL2 have remained the focus of most investigations, including those on RVVI. For instance, the association of MBL2 structural polymorphisms with RVVI has been documented in different populations [6–8,26–30]. Our previous study also elucidated the involvement of standard MBL2 haplotypes in modulating sMBL levels and RVVI susceptibility [11]. However, our study, in harmony with others, found an unexpected correlation pattern of genotypes with phenotypes, suggesting the presence of unrecognised regulatory elements of MBL2 [11,19,31]. This implies the need for a more inclusive locus-wide MBL2 analysis to reveal unrecognised regulatory variants that may modulate sMBL levels and the risk of infectious diseases. For this, variants can be selected based on putative functional effects, e.g. altering transcription, translation or miRNA binding.
Our previous study shortlisted 12 putative functionally important SNPs of MBL2 using in silico analysis [32]. Of these, evaluation of the MBL2 3′-UTR SNPs revealed a novel association of the rs10824792 SNP with low sMBL levels and RVVI risk [13]. However, the role of the selected SNPs falling in the 5′ near region is yet to be elucidated. Moreover, high serum levels of Dectin-1, another important innate immune molecule, have also been shown to play an active role in defense against RVVI [12]. Dectin-1, encoded by CLEC7A mapped to 12p13.2, is a collaborative PRR that teams up with other PRRs via the Syk pathway to generate optimal immune responses [33]. Both MBL and Dectin-1 are essential innate immune components. Therefore, the relationship between the two, when co-activated against the same pathogenic stimuli, would be of interest. Our previous investigation suggested an effect of the rs3901533 CLEC7A SNP in modulating sMBL levels and RVVI susceptibility, suggesting that RVVI is a multi-factorial phenotype [12]. Although we did not find any correlation between the two proteins, the relationship between the two genes, i.e. MBL2 and CLEC7A, is yet to be elucidated, as the importance of genes before proteins has already been stated.
Against this background, the present study was planned to elucidate (1) the role of selected SNPs falling in the 5′ near region in relation to sMBL levels and RVVI susceptibility, using a conventional approach; (2) interactions among SNPs within the gene by comprehensive locus-wide haplotype analyses of MBL2, based on the LD pattern obtained across the genomic structure, using genotype data of MBL2 variants evaluated in this study and reported previously; and (3) gene-gene interactions between the two PRRs, i.e. MBL and Dectin-1, using the multifactor dimensionality reduction method. The preliminary data will establish the foundation for the investigation of within-gene and between-gene interaction analyses towards RVVI susceptibility.

Keywords: Additional 5′ variants, 3′ UTR SNP, Innate immunity, Haplotype, Gene-gene interaction, Multifactor dimensionality reduction (MDR), Reproductive infectious diseases

Study participants
The present study recruited RVVI cases (n = 258, mean age 29.33 years, ± S.D. 8.32) who attended the Bebe Nanki Mother and Child Care Centre, Department of Obstetrics and Gynaecology, Government Medical College, Amritsar (Pb) and were referred by the gynecologist. These cases were clinically pre-diagnosed with RVVI, with a minimum of four documented recurrent episodes in a year and frequent complaints of smelly discharge, itching, vaginal sores and pelvic pain. The controls (n = 203, mean age 29.33 years, ± S.D. 8.17) were matched to cases by age and had, by self-report, no history of recurrent vaginal infection. Participants were excluded if they were known carriers of HIV, had any other chronic condition, were undergoing chemotherapy or were taking immunosuppressive medications. All participants provided written informed consent. The Institutional Ethics Committee of Guru Nanak Dev University, Amritsar (Punjab), India, approved the study protocol (Approval no. 06/HG dated 02/01/2015).

Samples and RVVI categorisation
Two types of samples were collected in the present study, i.e. vaginal discharge and peripheral blood samples. Vaginal discharge samples from 200 RVVI cases were carried to the laboratory and subjected to the standard diagnostic methods specified in the European (IUSTI/WHO) guidelines on vaginal discharge management [34], as reported previously [35]. This categorised the 200 RVVI cases into three major categories, i.e. bacterial vaginosis (BV; n = 97), vulvovaginal candidiasis (VVC; n = 62) and mixed infections (MI; n = 41), the last comprising cases with both BV and VVC. However, 58 RVVI cases could not be processed, and hence could not be categorised, as these participants were either menstruating or unwilling to give vaginal samples. The peripheral blood samples (5 ml), collected from all the participants, were further processed for serum and DNA isolation by standard methodology [11–13].

SNPs selection
In silico analyses identified twelve SNPs of MBL2 with possible functional consequences for the structure and expression of MBL [32]. These twelve putative functional SNPs were rs11003125 (L/H), rs11003124, rs7084554, rs36014597, rs7096206 (Y/X), rs11003123, rs7095891 (P/Q), rs1800450 (codon 54), rs10082466, rs2165813, rs2099903 and rs2099902. Of these, the association of all the SNPs except rs11003124, rs7084554, rs36014597, rs11003123 and rs10082466 with RVVI has been reported previously [11,13]. Thus, rs11003124, rs7084554, rs36014597 and rs11003123, but not rs10082466, were investigated in the present study to assess their role in RVVI and its types. Hence, all the SNPs of MBL2 that were prioritised by in silico analyses have been evaluated in relation to RVVI except one 3′-UTR SNP, i.e. rs10082466. This SNP could not be evaluated because the different PCR approaches tried for its genotyping could not be standardised. Moreover, the SNPs flanking rs10082466 were either non-functional or of low frequency, so sequencing of other regions was considered too expensive and was not pursued.

Genotype analyses by polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP)
The present study standardised a simple, economical method, i.e. PCR-RFLP, for genotyping four MBL2 SNPs: rs11003124, rs7084554, rs36014597 and rs11003123.

NCBI's Primer-BLAST (https://www.ncbi.nlm.nih.gov/tools/primer-blast/), a freely available online tool, was used to design primers for PCR amplification. The MBL2 gene sequence (NCBI Reference Sequence: NC_000010.11) flanking the given SNP was used as input for the software. The best primer pair was custom-synthesized by Bioserve Biotechnologies (Hyderabad, India). A reaction mixture (20 µl) consisting of template DNA, dNTPs (0.025 mM), Taq DNA polymerase (0.3 U) and Taq buffer with 15 mM MgCl2 (1×) was used for each PCR. Each PCR was performed in a thermal cycler (Applied Biosystems, Life Technologies, USA) under the stipulated conditions. The primer sequences, the concentrations of specific primers and the amplification conditions for all the PCRs are provided in Table 1. The amplified products were analysed on 1.5% (w/v) EtBr-stained agarose gel (Himedia, India), after electrophoretic separation at 100 V, with a gel viewer (Alpha imager, USA).

Choice of restriction enzymes and RFLP
The choice of restriction enzymes to differentiate among variant, wild-type and heterozygous genotypes for the SNPs rs11003124, rs7084554, rs36014597 and rs11003123 was made with the help of the online software NEBcutter v 2.0 (http://nc2.neb.com/NEBcutter2/). Restriction enzyme (IU) along with CutSmart NEBuffer (1×) was used for the restriction analysis of the amplified PCR product. The RFLP conditions along with the restriction endonucleases used for the genotyping of each SNP are provided in Table 2. The pattern of restriction digestion was visualised on a gel documentation system after electrophoresis on 2.5% (w/v) agarose gel at 100 V. A DNA sample of known genotype was used as the positive control, while the negative control contained all components of the restriction digestion mixture except the respective restriction enzyme. Genotypes were ascertained by analysing the restriction digestion pattern of the PCR products (Fig. 1). About 10% of the samples with representative genotypes were also confirmed by Sanger sequencing (Fig. 1).

… rs2165813, rs2099903 and rs2099902 across the MBL2 locus, from the 5′ to the 3′ direction, were used for haplotype analyses in the present study (Table 3). For gene-gene interaction analyses, a total of 17 SNPs, i.e. 14 MBL2 SNPs and three previously reported CLEC7A SNPs (rs3901533, rs11053597 and rs11053593), were considered.

Serum MBL concentration
The serum MBL (sMBL) concentration was determined with an enzyme-linked immunosorbent assay (Human MBL ELISA kit, Ray Biotech, USA) following the manufacturer's instructions, as reported previously [11,13]. Briefly, 100 μl of standard, blank and 4000-fold pre-diluted serum sample was added to the respective wells of a microtitre plate pre-coated with anti-human MBL antibody and incubated for 2.5 h at room temperature. After 3-4 washings with 1× wash buffer, 100 μl of biotinylated anti-human MBL antibody was added and the plate was incubated at room temperature for 1 h. Following incubation, unbound biotin-conjugated anti-human antibody was removed by 3-4 washings with 1× wash buffer, and 100 μl of streptavidin-HRP was added to all the wells, followed by incubation at room temperature for 45 min. After 3-4 washings, 100 μl of 3,3′,5,5′-tetramethylbenzidine (TMB) substrate was added and incubated for 30 min at room temperature in the dark. A blue color developed in proportion to the amount of MBL present in the sample. The reaction was stopped with 50 μl of stop solution (0.2 M H2SO4), which changes the color from blue to yellow. The intensity of the color was measured at 450 nm with a microplate reader (BIO-RAD, iMark™, USA). The assay was calibrated using the MBL standard (25 ng/ml) provided in the kit, which was used to prepare the standard dilutions, i.e. 8.33, 2.778, 0.926, 0.309, 0.103 and 0.034 ng/ml, as instructed in the kit. The assay diluent A provided in the kit was used as the blank and to prepare the standard dilutions. These standard concentrations were used to obtain the MBL standard curve, and hence the standard equation, from which the concentration of MBL in the serum samples was determined. The minimum detection sensitivity of the ELISA kit was 0.03 ng/ml.
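The arithmetic behind the standard dilutions and the correction for the 4000-fold serum pre-dilution can be sketched as follows. This is a minimal illustration: the threefold step is inferred from the listed concentrations (25/8.33 ≈ 3), the function names are hypothetical, and the curve-fitting model behind the standard equation is not stated in the text and is therefore not reproduced.

```python
def standard_series(top_ng_ml=25.0, factor=3.0, steps=6):
    """Concentrations (ng/ml) obtained by serially diluting the top standard.

    The threefold factor is an assumption inferred from the reported values.
    """
    series = []
    conc = top_ng_ml
    for _ in range(steps):
        conc /= factor
        series.append(conc)
    return series


def serum_concentration(well_ng_ml, dilution_factor=4000):
    """Scale a concentration read off the standard curve back to undiluted serum."""
    return well_ng_ml * dilution_factor


if __name__ == "__main__":
    print([round(c, 3) for c in standard_series()])
    # -> [8.333, 2.778, 0.926, 0.309, 0.103, 0.034]
    # Hypothetical well reading of 0.5 ng/ml, for illustration only.
    print(serum_concentration(0.5))  # -> 2000.0 (ng/ml in undiluted serum)
```

The lowest standard (about 0.034 ng/ml) sits just above the stated detection limit of 0.03 ng/ml, so the series spans the usable range of the assay.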

Statistical analysis
To achieve a minimum of 80% power for the present study, the required sample size was calculated with the Genetic Association Study (GAS) power calculator (http://csg.sph.umich.edu/abecasis/gas_power_calculator/), assuming a 30% countrywide prevalence of abnormal vaginal discharge, 10% MAF, an odds ratio (OR) of 1.5 and a 5% error rate (α = 0.05). Allelic and genotypic frequencies of the different SNPs in cases and controls were computed by direct counting. These frequencies were then compared using odds ratio statistics with MedCalc software v 9.3.9.0 (MedCalc Software, Ostend, Belgium). The major allele and its homozygous genotype were taken as the reference (OR = 1). Deviation of each marker from Hardy-Weinberg equilibrium (HWE) was tested with SNPStats (https://www.snpstats.net/snpstats/start.htm). The best genetic model for each polymorphism was selected based on the lowest Akaike information criterion (AIC) and Bayesian information criterion (BIC) values.
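For readers who wish to reproduce this kind of comparison, the sketch below shows one common way to compute an odds ratio with a 95% confidence interval (the log, or Woolf, method) and a chi-square statistic for deviation from HWE. It is an illustration under stated assumptions rather than the exact procedure of MedCalc or SNPStats: the function names and all counts are hypothetical, and a given tool may apply an exact test for HWE instead of the chi-square shown here.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and 95% CI for a 2x2 table (log/Woolf method).

    a: cases with the allele/genotype,    b: cases without it
    c: controls with the allele/genotype, d: controls without it
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def hwe_chi_square(n_major_hom, n_het, n_minor_hom):
    """Chi-square statistic (1 degree of freedom) for deviation from HWE."""
    n = n_major_hom + n_het + n_minor_hom
    p = (2 * n_major_hom + n_het) / (2 * n)  # major allele frequency
    q = 1 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_major_hom, n_het, n_minor_hom)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study tables.
    print(odds_ratio_ci(20, 10, 10, 20))
    print(hwe_chi_square(25, 50, 25))
```

With the major allele (or its homozygous genotype) as the reference category, an OR above 1 with a confidence interval excluding 1 indicates a higher frequency in cases.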

Genetic analyses of additional 5′ variants in RVVI relative to controls
All the additional 5′ variants conformed to HWE (p > 0.05) except the rs11003123 polymorphism. rs11003124_G, rs7084554_C, rs36014597_G and rs11003123_A were found to be the minor alleles in the North Indian cohort (Table 4). Of the evaluated additional 5′ variants, allelic and genotypic frequencies varied significantly only for the rs7084554 and rs36014597 polymorphisms (Tables 4 and 5). The C allele of rs7084554 was significantly more prevalent in RVVI cases than in controls (p = 0.009; OR = 1.54; 95% CI 1.11-2.13). The frequencies of the TC heterozygous genotype (p = 0.002) and the CC homozygous genotype (p = 0.396) of rs7084554 were higher in RVVI cases than in controls. Excluding the recessive model of inheritance, all the other models were significant for the rs7084554 SNP. However, the dominant genetic model was the best, with the lowest AIC (629.5) and BIC (641.9) values, indicating that C allele carriers (in either the homozygous or heterozygous state) had a greater risk of developing RVVI than non-carriers (p = 0.002; OR = 1.81; 95% CI 1.22-2.68). Moreover, a significantly higher prevalence of the G allele of the rs36014597 SNP was observed in RVVI cases compared with controls (p < 0.0001; OR = 2.23; 95% CI 1.62-3.07). A considerably higher prevalence of the AG genotype (p < 0.0001) was also observed in RVVI cases than in controls. Of all the models that showed a significant association for the rs36014597 polymorphism, the overdominant model of inheritance was the best, with the lowest AIC (590.5) and BIC (602.9) values. The overdominant model indicates that carriers of the heterozygous genotype have a more extreme phenotype than carriers of either homozygous genotype, here a higher risk of developing RVVI (p < 0.0001; OR = 4.07; 95% CI 2.69-6.17).
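The inheritance models compared above differ only in how the three genotypes are grouped before the association is tested. The sketch below makes that grouping explicit; the function name and the model labels are illustrative assumptions, not the interface of any particular software.

```python
def encode(genotype, minor_allele, model):
    """Predictor value for one genotype under a given inheritance model.

    genotype: two-letter string such as "TC"; minor_allele: e.g. "C".
    """
    copies = genotype.count(minor_allele)  # 0, 1 or 2 copies of the minor allele
    if model == "dominant":      # carriers (1 or 2 copies) vs non-carriers
        return int(copies >= 1)
    if model == "recessive":     # minor homozygotes vs all others
        return int(copies == 2)
    if model == "overdominant":  # heterozygotes vs both homozygotes
        return int(copies == 1)
    if model == "log-additive":  # risk scales with the number of copies
        return copies
    raise ValueError(f"unknown model: {model}")


if __name__ == "__main__":
    # rs7084554 (minor allele C) under the dominant model
    print([encode(g, "C", "dominant") for g in ("TT", "TC", "CC")])      # [0, 1, 1]
    # rs36014597 (minor allele G) under the overdominant model
    print([encode(g, "G", "overdominant") for g in ("AA", "AG", "GG")])  # [0, 1, 0]
```

Each encoding yields a separate fitted model, and the one with the lowest AIC and BIC is retained, which is how the dominant and overdominant models were singled out for rs7084554 and rs36014597, respectively.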

Genetic analyses of additional 5′ variants in RVVI types relative to controls
For rs7084554, a significantly higher prevalence of the C allele was observed in BV (p = 0.03) and MI (p = 0.02) than in controls (Table 6). Heterozygosity for the C allele was appreciably higher in VVC (p = 0.007) and MI (p = 0.01) than in controls. For the rs36014597 polymorphism, the G allele was significantly more prevalent in BV (p = 0.0001), VVC (p = 0.004) and MI cases (p = 0.0002) than in controls. Heterozygosity for the G allele was considerably more prevalent in BV (p = 0.0001), VVC (p = 0.0001) and MI cases (p = 0.0007) than in controls. However, no significant difference in genotypic or allelic distribution was observed for the rs11003124 and rs11003123 variants. Also, no minor-allele homozygote of any of the additional 5′ variants was observed in VVC.

Genotype-phenotype association of additional 5′ variants
The previously measured sMBL levels were stratified on the basis of the observed genotypes of the additional 5′ variants in cases and controls (Table 7). In BV, an overall significant difference between genotypic sMBL levels was observed for the rs11003124 (p = 0.026), rs36014597 (p = 0.01) and rs11003123 (p = 0.006) polymorphisms. Further analysis of the rs11003124 polymorphism indicated that the TT (p = 0.032) and TG (p = 0.021) genotypes were associated with considerably lower sMBL levels than the GG genotype. For the rs36014597 polymorphism, the AA (p = 0.007) and AG (p = 0.014) genotypes were associated with significantly lower sMBL levels than the GG genotype. For the rs11003123 polymorphism, considerably lower sMBL levels were observed for the GA genotype (p = 0.01) than for the AA genotype. In MI, an overall significant difference between genotypic sMBL levels was observed for the rs7084554 polymorphism only (p = 0.01). Further analysis of this polymorphism indicated that the sMBL levels of the TT genotype differed significantly (p = 0.013) from those of the TC genotype. Analysis of genotypic sMBL levels in controls, RVVI and VVC revealed no significant difference for the studied 5′ variants. In addition, for the same genotypes, cases were found to have considerably lower sMBL levels than controls.

Linkage disequilibrium analyses
The LD pattern of the 14 variants across MBL2 was determined using their genotype data. As LD tends to decrease with increasing physical distance, an LD plot with two blocks was observed for MBL2 in the cohort of the present study (Fig. 2). The secretor polymorphisms along with the additional 5′ near gene variants form the 5′ block. The six 3′-UTR variants form the 3′ block of MBL2. The markers of one block were not in LD with the markers of the other. The LD analysis of the 5′ block of MBL2 indicated nearly complete LD between the P/Q and codon 54 variants (D′ = 0.96), indicating that they are co-inherited. The SNP pairs rs11003124/rs11003123, rs7084554/rs11003123, rs7084554/rs36014597, rs11003124/rs7084554 and rs36014597/rs11003123 were in strong LD, with D′ of 0.78, 0.78, 0.75, 0.73 and 0.73, respectively. In addition, the pairs rs11003124/rs36014597, LH/AB and LH/PQ showed fairly high LD, with D′ of 0.69, 0.63 and 0.6, respectively. The LD analysis indicated that all the SNPs of the 3′ block are in strong LD with each other and are co-inherited [13].
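For reference, the D′ statistic reported above can be computed from a two-locus haplotype frequency and the two allele frequencies as sketched below. The function name and the example frequencies are hypothetical; in practice the haplotype frequency has to be estimated from unphased genotypes by dedicated software, a step not shown here.

```python
def d_prime(p_ab, p_a, p_b):
    """Normalised linkage disequilibrium |D'| between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a, p_b: frequencies of alleles A and B (both loci assumed polymorphic)
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d == 0:
        return 0.0
    if d > 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return abs(d) / d_max


if __name__ == "__main__":
    # Hypothetical frequencies, for illustration only.
    print(d_prime(0.5, 0.5, 0.5))   # alleles always travel together -> 1.0
    print(d_prime(0.25, 0.5, 0.5))  # independent loci -> 0.0
    print(round(d_prime(0.2, 0.4, 0.3), 3))
```

A D′ close to 1, such as the 0.96 observed between P/Q and codon 54, means that almost no recombinant haplotypes are seen between the two sites.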

Distribution of MBL2 5′ block haplotypes
A total of fifty-nine 5′ block haplotypes, 5′H-1 to 5′H-59, with frequencies ≥ 0.001 (0.1%) in either cases or controls, were observed in the present study. An overall significant difference (global p = 0.01) was observed in the distribution of all the observed 5′ block haplotypes between the case and control groups (Additional file 1). Only six haplotypes, i.e. HTTAXGQB (5′H-1), HTTAYGQB (5′H-2), LTTAXGQB (5′H-3), LTTAYGPA (5′H-4), LGCGYAPA (5′H-5) and HTTAYAPA (5′H-6), had a frequency ≥ 0.05 (5%, in either cases or controls) and are referred to as common haplotypes; the remaining haplotypes were considered rare and were excluded from the present analysis owing to their very low frequency. The case-control comparison of common haplotypes showed a considerably low prevalence of haplotype 5′H-2 in RVVI and MI cases compared with controls (Table 8). The frequency of haplotype 5′H-4 was also significantly low in RVVI and BV cases compared with controls. The other common haplotypes did not show any statistical differences in their observed frequencies between cases and controls.

Haplotype-phenotype associations
The previously measured sMBL levels were stratified in cases and controls on the basis of independent and combined haplotypes of two blocks. The distribution and comparisons of sMBL levels of 3′ block haplotypes in cases and control groups have been reported previously, which suggested that 3′ block haplotypes may alter sMBL levels and susceptibility to RVVI [13].

Distribution of sMBL levels in 5′ block haplotypes
sMBL levels were segregated according to the common 5′ block haplotypes in cases and controls (Table 10). In controls, an overall significant difference (p = 0.004) was observed among the sMBL levels of different haplotypes. Further analysis showed that the 5′H-1 (HTTAXGQB; p = 0.002) and 5′H-4 (LTTAYGPA; p = 0.048) haplotypes conferred significantly lower levels than the 5′H-2 (HTTAYGQB) haplotype, whereas the other haplotypes in controls did not show any significant difference in sMBL levels. In RVVI cases, the sMBL levels of different haplotypes showed an overall significant difference (p = 0.008). Further analysis showed that the 5′H-3 (LTTAXGQB) haplotype accounted for significantly (p = 0.029) lower levels than the 5′H-2 haplotype, whereas the other haplotypes in RVVI cases did not show any significant difference in sMBL levels. In the RVVI types, i.e. BV, VVC and MI, neither an overall nor a pairwise significant difference was observed among the sMBL levels of different haplotypes. Furthermore, various haplotypes in RVVI and its types accounted for significantly (p < 0.05) lower sMBL levels than the corresponding haplotypes in controls.

Gene-gene interaction analyses
The multifactor dimensionality reduction (MDR) method was used to study interactions among all seventeen SNPs of the two genes, i.e. MBL2 and CLEC7A. Three gene-gene interaction models, namely uni-variant, bi-variant and tri-variant, emerged as significant (p < 0.001) predictors of RVVI risk (Table 12). Among the 17 SNPs, the one-way model comprising the MBL2 Y/X polymorphism had the maximum cross-validation consistency (CVC) of 10/10, with a testing balanced accuracy of 63.70%. The two-way model (MBL2.rs36014597 and MBL2.Y/X) and the three-way model (MBL2.Y/X, MBL2.rs10824792 and CLEC7A.rs3901533) also showed association with RVVI risk but had lower CVC, i.e. 9/10 and 5/10, respectively, than the uni-variant model. MDR analysis provides a dendrogram to interpret the nature of possible interactions between SNPs (Fig. 3a). MBL2.Y/X, MBL2.rs10824792 and CLEC7A.rs3901533, belonging to one group, showed a weak synergistic interaction in predicting RVVI risk. This group, together with MBL2.rs36014597, showed an interaction intermediate between synergy and redundancy in predicting RVVI susceptibility. MDR analysis also provides a graphical model of entropy interaction (Fig. 3b). The entropy-based analysis was positive (0.03%) for the pairwise effect of CLEC7A.rs3901533 and MBL2.rs36014597, indicating synergy, but negative for the pair CLEC7A.rs3901533 and MBL2.Y/X (−1.01%) as well as for the pair CLEC7A.rs3901533 and MBL2.rs10824792 (−1.29%), indicating redundancy towards RVVI susceptibility.
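The core step of MDR, pooling multilocus genotype combinations into high-risk and low-risk groups and scoring the resulting one-dimensional classifier by balanced accuracy, can be sketched as follows. This is a simplified illustration, not the MDR software used in the study: the function names and the toy data are hypothetical, and the exhaustive model search, the 10-fold cross-validation that yields the CVC, and the permutation testing are omitted.

```python
from collections import Counter


def mdr_high_risk_cells(genotypes, labels):
    """Genotype combinations whose case:control ratio exceeds the overall ratio.

    genotypes: one tuple of genotypes per subject; labels: 1 = case, 0 = control.
    """
    cases = Counter(g for g, y in zip(genotypes, labels) if y == 1)
    controls = Counter(g for g, y in zip(genotypes, labels) if y == 0)
    n_cases, n_controls = sum(cases.values()), sum(controls.values())
    # Cross-multiplied comparison avoids dividing by zero for empty control cells.
    return {g for g in set(cases) | set(controls)
            if cases[g] * n_controls > controls[g] * n_cases}


def balanced_accuracy(high_risk, genotypes, labels):
    """Mean of sensitivity and specificity when high-risk cells predict 'case'."""
    tp = sum(1 for g, y in zip(genotypes, labels) if y == 1 and g in high_risk)
    tn = sum(1 for g, y in zip(genotypes, labels) if y == 0 and g not in high_risk)
    n_cases = sum(labels)
    n_controls = len(labels) - n_cases
    return 0.5 * (tp / n_cases + tn / n_controls)


if __name__ == "__main__":
    # Toy two-SNP data set (8 cases, 8 controls); not taken from the study.
    genotypes = ([("YX", "AG")] * 6 + [("YY", "AA")] * 2      # cases
                 + [("YX", "AG")] * 2 + [("YY", "AA")] * 6)   # controls
    labels = [1] * 8 + [0] * 8
    high = mdr_high_risk_cells(genotypes, labels)
    print(high)                                        # {('YX', 'AG')}
    print(balanced_accuracy(high, genotypes, labels))  # 0.75
```

In a full analysis this scoring is repeated for every SNP combination within each cross-validation fold, and the CVC counts how often the same combination is selected as best across the ten folds, which is why a CVC of 5/10 for the three-way model signals a less stable model than the 10/10 of the uni-variant model.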

Discussion
The present study is the first report of the frequency distribution of the four additional 5′ near gene variants of MBL2 in a North Indian cohort, in which rs11003124_G, rs7084554_C, rs36014597_G and rs11003123_A were the minor alleles. The distribution of these variants agreed with that in all populations of the 1000 Genomes Project (Phase 3) except the African population, owing to its population substructure, high genetic diversity and lower LD between genetic loci (Additional file 3) [36]. Two SNPs, i.e. rs7084554 and rs36014597, were found to significantly predispose individuals to RVVI and its types, while the other two variants, i.e. rs11003124 and rs11003123, were not found to be associated with the disease condition. A significantly higher prevalence of the C allele, and of its homozygosity and heterozygosity, was observed in RVVI cases and its types compared with controls, explaining the observed dominant mode of inheritance of the rs7084554 polymorphism, which increases the risk of RVVI. In addition, the best-fit model for the rs36014597 polymorphism showed the overdominant mode of inheritance, which indicates that carriers of the heterozygous genotype of this polymorphism had a more extreme phenotype than carriers of either homozygous genotype in developing RVVI risk, as complemented by the observed genotype distribution. The literature search did not reveal any articles that have evaluated the role of the selected additional 5′ variants of MBL2 in association with RVVI or its types. However, these polymorphisms have been evaluated in relation to leprosy and malaria with no significant associations [23,37]. Associations of the rs11003124 and rs11003123 SNPs with an increased risk of leprosy and of hepatocellular carcinoma in patients with hepatitis B-related cirrhosis, respectively, have been shown [37,38].
Linking the genotypes of these polymorphisms with phenotypes showed that, for the same genotypes, cases had significantly lower sMBL levels than controls. Furthermore, genotypic sMBL levels varied significantly for the rs11003124, rs36014597 and rs11003123 polymorphisms in BV, and for the rs7084554 polymorphism in MI. However, sMBL levels for these loci did not vary as expected from their genetic association analysis, possibly owing to the small size of the respective groups, as the study was underpowered for the RVVI categories. Therefore, increasing the sample size may clarify the role of these SNPs in susceptibility to BV and MI. To date, no studies have examined the genotype-phenotype correlation of these polymorphisms. However, a single recent study found no significant difference in genotypic sMBL levels for the rs11003124 polymorphism in bakery workers with work-related respiratory symptoms [39]. Thus, of the 14 screened polymorphisms of MBL2, single-variant analysis has shown five polymorphisms, i.e. rs7096206 (Y/X), rs7084554, rs36014597, rs10824792 and rs2099903, to be associated with RVVI risk [11,13]. For association studies, haplotype-based analysis has been suggested to be a more informative approach, which can help avoid the risk of misinterpreting individual SNP analyses [40]. Therefore, haplotypes were constructed on the basis of linkage disequilibrium analysis as 5′ and 3′ block haplotypes. These blocks were evaluated independently as well as in combination, as it is the different combinations of amino acids in a polypeptide chain that collectively determine protein structure and function. Independent analyses of 5′ block haplotypes showed a significantly low prevalence of the common haplotype HTTAYGQB (5′H-2) in RVVI and MI cases, and of the haplotype LTTAYGPA (5′H-4) in RVVI and BV cases, relative to controls.
This indicates that the major alleles (marked by underline) of the 5′ additional variants and the Y/X secretor polymorphism collectively confer protection against RVVI, BV and MI. Evaluation of MBL2 3′ block haplotypes showed the presence of three common haplotypes, TTT GCT (3′H-1), CCG AAC (3′H-2) and CTT GCT (3′H-3), in RVVI cases and controls. Independent analyses of 3′ block haplotypes showed a risk effect of 3′H-3, the haplotype carrying the minor allele of the rs10824792 SNP only, in RVVI [13].
Combined 5′/3′ haplotype analyses showed a significantly higher prevalence of the 5′H-60/3′H-1 (HTTAXGQB/TTT GCT) haplotype in BV cases than in controls, pointing to the X allele as an important marker of disease risk. Thus, independent analysis of the two blocks suggested a risk-modifying effect of the 5′ additional variants, the Y/X secretor polymorphism and the 3′-UTR SNP rs10824792, whereas the complete haplotype analysis identified only the Y/X polymorphism as an important marker for determining disease risk. This also suggested the possibility of unidentified regulatory variants that may be masking the effect of others. Therefore, for further clarification, haplotype-phenotype correlation analysis was carried out.
Independent analyses of 5′ block haplotypes against sMBL levels showed significant differences among haplotypes in both controls and RVVI cases. Further analysis showed that the 5′H-1 (HTTAXGQB) haplotype had significantly lower levels than the 5′H-2 (HTTAYGQB) haplotype in controls, and that the 5′H-3 (LTTAXGQB) haplotype had significantly lower levels than the 5′H-2 haplotype in RVVI cases. These results again point to the contribution of the Y/X variant in modulating sMBL levels, in line with the above findings of the present study. Moreover, stratification of sMBL levels by 3′ block haplotypes showed that 3′H-3, the haplotype carrying the minor allele of the rs10824792 SNP only, was associated with significantly low sMBL levels and with RVVI risk [13]. It is therefore possible that both 5′ and 3′ haplotypes contribute to sMBL levels, but at this point it remains unclear whether these blocks affect sMBL levels independently or in combination.
Therefore, sMBL levels were further correlated with combined 5′/3′ haplotypes. Low sMBL levels were observed in various haplotypes of RVVI and its types compared with the respective haplotypes in controls. Moreover, an overall significant difference was observed among sMBL levels of different haplotypes in controls and in RVVI, BV and MI cases. Analyses in these groups showed that 5′H-10/3′H-2 haplotype having LYQB a low secretor  [11]. Also, the 5′H-35/3′H-3 haplotype, which combines the low secretor haplotype 'HXPA' with the 3′ block haplotype 'CTT-GCT' carrying only the risk allele 'C' of SNP rs10824792, was associated with significantly lower sMBL levels than the 5′H-5/3′H-1 haplotype, which combines the high secretor haplotype 'LYPA' with the major-allele haplotype of the 3′ block, in MI cases. Thus, correlation of combined haplotypes with sMBL levels underlines the importance of the Y/X and rs10824792 polymorphisms, one from each block, in regulating sMBL levels and RVVI risk. Permutation analysis of haplotype with phenotype likewise suggested that variants of the two haplotype blocks have more of a combined effect than an independent effect in regulating sMBL levels and hence RVVI risk. These results agree with studies suggesting that additional 5′ variants and SNPs in the 3′ haplotype block, along with secretor polymorphisms, modify MBL function and its circulating levels and thereby increase the risk of disease [31,[41][42][43]. However, further functional studies are needed to confirm these observations and the functional mechanism proposed for these polymorphisms by in silico analyses. Beyond these findings, some other significant associations observed in the present study could not be explained because of the unexpected pattern of the variants, suggesting that additional variants may lie outside the region of the observed haplotype blocks.
Our previous investigation suggested an effect of the rs3901533 CLEC7A SNP in modulating sMBL levels, indicating that RVVI is a polygenic phenotype. Therefore, interactions between the two genes, MBL2 and CLEC7A, were assessed using multifactor dimensionality reduction (MDR) analysis. Three gene-gene interaction models, uni-variant, bi-variant and tri-variant, appeared as significant predictors of RVVI risk. Of these, the one-locus model comprising the MBL2 Y/X polymorphism, with the maximum cross-validation consistency (CVC) of 10/10, was the best model of susceptibility to RVVI for pooled RVVI patients and controls. This strongly suggests that the MBL2 Y/X polymorphism is the fundamental candidate genetic variant determining RVVI risk. In addition, the two-way interaction model (MBL2. rs3901533) also showed an association with RVVI risk, though with a lower CVC than the uni-variant model. This further suggests the contribution of both the MBL2 (5′ and 3′ block) and CLEC7A genes to the pathogenesis of RVVI. To the best of our knowledge, this is the first interaction study to find significant associations of genetic interaction with susceptibility to RVVI. However, judged by CVC, the three-locus model was not the best model of susceptibility to RVVI. The dendrograms and interaction entropy models also suggested only a weak synergistic interaction between the variants of the three-locus model in predicting RVVI risk. Consistent with these findings, our previous results did not show any significant correlation between sMBL and sDectin-1 levels in cases and controls [12]. This further implies that MBL and Dectin-1 act independently to counter the same antigenic stimuli, i.e. RVVI. However, analyses of additional putative functional variants, especially in CLEC7A, may reveal a synergy between the two molecules. 
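To make the cross-validation consistency figures quoted above easier to interpret, the core MDR idea can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the MDR software used in the study: the function names (`mdr_rule`, `balanced_accuracy`, `cvc`) and the toy data are hypothetical, folds are assigned by simple interleaving rather than random partitioning, and only training balanced accuracy is used to pick the best model in each fold. Each multilocus genotype cell is labelled high risk when its case:control ratio exceeds the overall ratio in the training set, and CVC counts the folds in which the same locus combination is selected.

```python
from collections import Counter
from itertools import combinations

def mdr_rule(rows, labels, loci, threshold):
    """Map each multilocus genotype cell to True (high risk) or False."""
    cases, controls = Counter(), Counter()
    for row, is_case in zip(rows, labels):
        cell = tuple(row[i] for i in loci)
        (cases if is_case else controls)[cell] += 1
    # cases/controls > threshold, written without division to allow 0 controls
    return {cell: cases[cell] > threshold * controls[cell]
            for cell in set(cases) | set(controls)}

def balanced_accuracy(rule, rows, labels, loci):
    """Mean of sensitivity and specificity; unseen cells count as low risk."""
    tp = tn = pos = neg = 0
    for row, is_case in zip(rows, labels):
        pred = rule.get(tuple(row[i] for i in loci), False)
        if is_case:
            pos += 1
            tp += pred
        else:
            neg += 1
            tn += not pred
    return 0.5 * (tp / pos + tn / neg)

def cvc(rows, labels, n_loci, order, k=10):
    """Return (best locus combination, number of folds that selected it)."""
    wins = Counter()
    for fold in range(k):
        held_out = set(range(fold, len(rows), k))  # interleaved fold
        tr_rows = [r for i, r in enumerate(rows) if i not in held_out]
        tr_y = [y for i, y in enumerate(labels) if i not in held_out]
        threshold = sum(tr_y) / (len(tr_y) - sum(tr_y))  # training case:control ratio

        def train_accuracy(loci):
            rule = mdr_rule(tr_rows, tr_y, loci, threshold)
            return balanced_accuracy(rule, tr_rows, tr_y, loci)

        wins[max(combinations(range(n_loci), order), key=train_accuracy)] += 1
    return wins.most_common(1)[0]

# Hypothetical toy data: three loci, with case status driven entirely by locus 0.
rows = [(i % 2, (i // 2) % 3, (i // 6) % 3) for i in range(100)]
labels = [row[0] == 1 for row in rows]

best_model, consistency = cvc(rows, labels, n_loci=3, order=1)
print(f"best one-locus model: {best_model}, CVC = {consistency}/10")
```

By construction, the toy data make the locus at index 0 fully predictive, so the one-locus model is selected in every fold. A full MDR analysis would also report testing balanced accuracy on the held-out folds and assess significance by permutation, which this sketch omits.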
A single study has shown that MBL-mediated opsonophagocytosis of pathogens leads to increased intracellular expression of Dectin-1 [44]. That study suggested that the inhibition of neutrophil phagocytosis of C. albicans caused by Dectin-1 blockade was completely compensated by exogenous MBL. This was attributed to the direct role of MBL in opsonophagocytosis of pathogens, which is further supported by several receptors on the phagocyte surface, e.g. calreticulin and complement receptor 1, that bind MBL and mediate uptake and phagocytosis of the MBL-pathogen complex independently of complement activation. This indicates that, in C. albicans infections, the interplay between MBL and Dectin-1 may be compensatory, which in turn changes the view of how these molecules act synergistically in specific situations.
Further, estimation of serum MBL levels was preferred in the present study over methods such as mRNA expression, because the mRNA expression profile of a gene does not exactly reflect the level of its product in serum. This difference may be due to failure of protein translation, or to failure to form higher-order protein structures because of variant monomer formation. The same was shown by a recent study that found low sMBL levels in RVVC cases with high MBL mRNA expression [10]. In addition, proteins carrying a variant allele are suggested to be functionally inactive and more likely to be degraded than proteins without variations [45]. The functional implications of these variations for phagocytosis or complement activation by MBL warrant further investigation.

Conclusions
The study presented a low-cost, reproducible screening design for the additional 5′ variants of MBL2, i.e. rs11003124, rs7084554, rs36014597 and rs11003123, that can act as markers of susceptibility for vulvovaginal infections or other diseases. Evaluation of these variants revealed two SNPs, rs7084554 and rs36014597, that significantly predispose individuals to RVVI and its types by altering sMBL levels. The LD analyses of the SNP map of MBL2 indicated two haplotype blocks within the gene. Permutation analysis of haplotype with phenotype suggested that variants of the two blocks have more of a combined effect than an independent effect in regulating sMBL levels and hence RVVI risk, with associated variants from both blocks playing the crucial role. In agreement with the literature, sMBL levels did not associate exactly with the standard secretor haplotypes, pointing to a role for additional regulatory variants of MBL2 that may alter sMBL levels. The inclusion of additional regulatory variants of MBL2 in the present study has thus helped to resolve some of these inconsistencies. The study presented a weak synergistic interaction between MBL2 and CLEC7A in association with RVVI risk. However, analyses of additional putative functional variants, especially in CLEC7A, might have clarified the nature of the interactions among the polymorphisms and genes studied and provided a better understanding of the pathway implicated in the pathogenesis of RVVI. These preliminary findings call for further in-depth functional investigations to clarify the connections observed within and between genes in susceptibility to RVVI. Studies of this kind with larger data sets will provide valuable data for assessing the therapeutic possibilities of MBL for RVVI and its types. Once these are validated, MBL formulations for RVVI treatment may eventually become available over the counter, because MBL replacement therapy has already passed phase I clinical trials in patients with MBL deficiency [46][47][48][49], while recombinant MBL production and phase II and phase III trials are presently in progress. However, in contrast to MBL, more extensive research is needed to dissect the complete and specific role of Dectin-1 in RVVI before its therapeutic potential can be judged.