Metabolomic profiling of Burkholderia pseudomallei using UHPLC-ESI-Q-TOF-MS reveals specific biomarkers including 4-methyl-5-thiazoleethanol and unique thiamine degradation pathway

Burkholderia pseudomallei is an emerging pathogen that causes melioidosis, a serious and potentially fatal disease which requires prolonged antibiotics to prevent relapse. However, diagnosis of melioidosis can be difficult, especially in culture-negative cases. While metabolomics represents an uprising tool for studying infectious diseases, there were no reports on its applications to B. pseudomallei. To search for potential specific biomarkers, we compared the metabolomics profiles of culture supernatants of B. pseudomallei (15 strains), B. thailandensis (3 strains), B. cepacia complex (14 strains), P. aeruginosa (4 strains) and E. coli (3 strains), using ultra-high performance liquid chromatography-electrospray ionization-quadruple time-of-flight mass spectrometry (UHPLC-ESI-Q-TOF-MS). Multi- and univariate analyses were used to identify specific metabolites in B. pseudomallei. Principal component and partial-least squares discrimination analysis readily distinguished the metabolomes between B. pseudomallei and other bacterial species. Using multi-variate and univariate analysis, eight metabolites with significantly higher levels in B. pseudomallei were identified. Three of the eight metabolites were identified by MS/MS, while five metabolites were unidentified against database matching, suggesting that they may be potentially novel compounds. One metabolite, m/z 144.048, was identified as 4-methyl-5-thiazoleethanol, a degradation product of thiamine (vitamin B1), with molecular formula C6H9NOS by database searches and confirmed by MS/MS using commercially available authentic chemical standard. Two metabolites, m/z 512.282 and m/z 542.2921, were identified as tetrapeptides, Ile-His-Lys-Asp with molecular formula C22H37N7O7 and Pro-Arg-Arg-Asn with molecular formula C21H39N11O6, respectively. To investigate the high levels of 4-methyl-5-thiazoleethanol in B. pseudomallei, we compared the thiamine degradation pathways encoded in genomes of B. pseudomallei and B. thailandensis. While both B. pseudomallei and B. thailandensis possess thiaminase I which catalyzes degradation of thiamine to 4-methyl-5-thiazoleethanol, thiM, which encodes hydroxyethylthiazole kinase responsible for degradation of 4-methyl-5-thiazoleethanol, is present and expressed in B. thailandensis as detected by PCR/RT-PCR, but absent or not expressed in all B. pseudomallei strains. This suggests that the high 4-methyl-5-thiazoleethanol level in B. pseudomallei is likely due to the absence of hydroxyethylthiazole kinase and hence reduced downstream degradation. Eight novel biomarkers, including 4-methyl-5-thiazoleethanol and two tetrapeptides, were identified in the culture supernatant of B. pseudomallei.

Results: Principal component and partial-least squares discrimination analysis readily distinguished the metabolomes between B. pseudomallei and other bacterial species. Using multi-variate and univariate analysis, eight metabolites with significantly higher levels in B. pseudomallei were identified. Three of the eight metabolites were identified by MS/MS, while five metabolites were unidentified against database matching, suggesting that they may be potentially novel compounds. One metabolite, m/z 144.048, was identified as 4-methyl-5-thiazoleethanol, a degradation product of thiamine (vitamin B 1 ), with molecular formula C 6 H 9 NOS by database searches and confirmed by MS/MS using commercially available authentic chemical standard. Two metabolites, m/z 512.282 and m/z 542.2921, were identified as tetrapeptides, Ile-His-Lys-Asp with molecular formula C 22 H 37 N 7 O 7 and Pro-Arg-Arg-Asn with molecular formula C 21 H 39 N 11 O 6 , respectively. To investigate the high levels of 4-methyl-5-thiazoleethanol in B. pseudomallei, we compared the thiamine degradation pathways encoded in genomes of B. pseudomallei and B. thailandensis. While both B. pseudomallei and B. thailandensis possess thiaminase I which catalyzes degradation of thiamine to 4-methyl-5-thiazoleethanol, thiM, which encodes hydroxyethylthiazole kinase responsible for degradation of 4-methyl-5-thiazoleethanol, is present and expressed in B. thailandensis as detected by PCR/RT-PCR, but absent or not expressed in all B. pseudomallei strains. This suggests that the high 4-methyl-5-thiazoleethanol level in B. pseudomallei is likely due to the absence of hydroxyethylthiazole kinase and hence reduced downstream degradation.
(Continued on next page)

Background
Burkholderia pseudomallei is an emerging, highly pathogenic, Gram-negative betaproteobacterium that causes melioidosis, a potentially serious and fatal disease often manifested as severe community-acquired pneumonia and sepsis. The bacterium is classified as a category B bioterrorism agent by the Center for Disease Control, USA. Although melioidosis is mainly endemic in Southeast Asia and northern Australia, the disease has been increasingly reported in countries outside the Asia-Pacific region, such as India [1,2], Mauritius [3], South, Central and North America [4][5][6], and West and East Africa [7,8], suggesting an expanding geographical distribution. B. pseudomallei is a natural saprophyte and melioidosis is believed to be acquired through environmental contact with contaminated soil and water [9,10]. Illness can be presented as an acute, subacute, or chronic process, with an incubation period of up to 26 years [11]. The disease manifestations can range from subclinical infection, localized abscesses, pneumonia to fulminant sepsis, leading to a mortality rate of up to 19 % [12]. Besides human, melioidosis also affects a wide range of animals in endemic areas [10,13]. Treatment of melioidosis is often difficult, as B. pseudomallei is usually resistant to multiple antibiotics and prolonged antibiotics are required to prevent relapse [14,15]. Moreover, diagnostic and therapeutic resources in endemic areas are often limited, which have hindered efforts to improve treatment outcomes.
Diagnosis of melioidosis can be difficult, as the bacterium may not be readily isolated from clinical specimens. And even with positive cultures, commercial bacterial identifications often fail to differentiate between B. pseudomallei and closely related species such as B. thailandensis and members of B. cepacia complex [16]. Therefore, new molecular techniques are often required for more accurate species identification [14,[17][18][19][20][21][22][23]. Despite these new technologies, the diagnostic problems associated with culture-negative cases remain unresolved. Although different serological tests have been developed to help diagnose culture-negative melioidosis, their clinical usefulness is limited by the low sensitivities and specificities [24,25]. Similarly, PCR assays for direct detection from blood and sputum samples are often associated with suboptimal sensitivities and specificities [23]. The availability of alternative techniques for improved diagnosis of melioidosis is thus eagerly awaited, and such techniques should be able to differentiate between melioidosis and infections caused by common Gram-negative bacteria including the closely related Burkholderia species.
Metabolomics is an uprising research platform for systematic studies of the small-molecular metabolite profiles of a biological system such as cell, tissue or organism, which may be intermediate of end products of various metabolic pathways. Using statistical analyses, the metabolomes of different biological systems can be compared, which may help differentiate between them and identify potential metabolite markers specific to each system. The technique has also been recently applied to characterize various infectious diseases and pathogens, with an aim to improve laboratory diagnosis [26][27][28][29][30][31]. In addition, the application of metabolomics to pathogens has opened a new arena for studying microbial metabolomics pathways while making use of genomic data. For example, we have recently reported a novel biosynthetic pathway for monascorubin, a thousand-year-old natural food colorant, in the pathogenic dimorphic fungus, Penicillium marneffei, identified using metabolomics approach and genomics data [32]. Despite being an important pathogen, no studies have reported the use of metabolomics to explore specific biomarkers in B. pseudomallei. We hypothesize that there are potentially novel extracellular metabolites specifically produced by B. pseudomallei that may be detected in body fluids of patients with melioidosis. To search for potential biomarkers for diagnosis of melioidosis, we attempted to characterize the metabolomes of culture supernatants of B. pseudomallei and related species, using ultrahigh performance liquid chromatography-electrospray ionization-quadruple time-of-flight mass spectrometry (UHPLC-ESI-Q-TOF-MS). Multi-and univariate statistical analyses of the metabolome data were used to identify specific metabolites in B. pseudomallei.

Visual inspection of total ion chromatograms
We characterized and compared the metabolomes of culture supernatants from 15 B. pseudomallei (seven clinical and eight environmental isolates), three B. thailandensis, 14 B. cepacia complex, four P. aeruginosa and three E. coli isolates. The total ion chromatograms from the same bacterial species shared considerable similarity, whereas significant differences were observed in the chromatograms obtained between different species, except B. pseudomallei and B. thailandensis sharing higher similarity. Representative examples of chromatograms obtained from each species are shown in Fig. 1.

PCA and PLS-DA modeling
To compare the metabolomes between B. pseudomallei and other bacterial species, both multi-and univariate analyses were performed. For multi-variate analysis, principal component analysis (PCA) showed that 40.4 % of the total variance in the data was represented by the first two principal components (PCs) (Fig. 2a). The 2D-PCA score plot revealed that all bacterial strains of the same species were closely related and can be distinguished from other bacterial species based on the first two principal components, with the B. pseudomallei strains clearly separated from other species including B. thailandensis along PC1 which represented 25.9 % of the variance. In view of the significant separation achieved using PCA, supervised partialleast squares discrimination analysis (PLS-DA) (Fig. 2b) was subsequently performed to maximize the separation and identify additional metabolites to those identified using PCA. In the PLS-DA score plot, the separation between different bacterial species is more prominent. Potential metabolites were selected based on the VIP score (>1). Based on the degree of similarity of metabolite abundance profiles, hierarchical clustering analysis was performed to show the global overview of all culture supernatant metabolites detected (Fig. 3). Metabolites with similar abundance pattern were positioned closer together. The heat map and dendrogram indicated the close clustering of B. pseudomallei strains and their separation from other bacterial species, although B. pseudomallei strains were more closely related to B. thailandensis strains than to other species. However, no significant difference was observed between clinical and environmental strains of B. pseudomallei. To further confirm the specificity and significance of potential metabolites identified from PCA and PLS-DA, univariate analysis of each metabolite was performed using one-way ANOVA and Student's t-test. A total of eight potential metabolites contributing most to the variation between B. pseudomallei and other bacterial species with significantly higher level in B. pseudomallei strains were selected for further identification ( Table 1).

Identification of potential biomarkers specific to B. pseudomallei
The metabolites were identified by MS/MS fragmentation and their predicted molecular formulae were shown in Table 1. All metabolites except m/z 144.048 were found only in B. pseudomallei strains but not other bacterial species. Three (m/z 144.048, m/z 512.282 and m/z 542.2921 m/z 144.048) of the eight metabolites were identified as known compounds, while the other five metabolites may represent potentially novel metabolites with no match against known compounds or databases.
The metabolite m/z 144.048 was identified as 4-methyl-5-thiazoleethanol (metabolite no. 2 in Table 1) with molecular formula C 6 H 9 NOS by database searches in METLIN and Massbank, and confirmed by MS/MS using commercially available authentic chemical standard of 4methyl-5-thiazoleethanol (Fig. 4a). 4-methyl-5-thiazoleethanol is a degradation product of thiamine (vitamin B 1 ), an essential cofactor in most living organisms, being important for purine metabolism. Although it was found in B. pseudomallei strains with significantly higher levels ( Fig. 4b), low levels of m/z 144.048 (at approximately 100-to 1000fold lower levels) were also detected in B. thailandensis, B. cepacia complex, P. aeruginosa and E. coli (Fig. 4c).
Two metabolites, m/z 512.282 and m/z 542.2921 were identified as tetrapeptides (metabolite no. 4 and 5 in Table 1). m/z 512.282 with molecular formula C 22 H 37 N 7 O 7 was identified as Ile-His-Lys-Asp, and m/z 542.2921 with molecular formula C 21 H 39 N 11 O 6 was identified as Pro-Arg-Arg-Asn in METLIN. Some bacteria are known to produce short peptides as pheromones, which are involved in quorumsensing [33]. Future studies are required to determine the origin and biological significance of these tetrapeptides in B. pseudomallei.
Phylogenetic analysis of thiaminase I and hydroxyethyl thiazole kinase genes of B. pseudomallei and B. thailandensis To investigate the high levels of 4-methyl-5-thiazoleethanol in B. pseudomallei culture supernatant, we attempted to compare the thiamine degradation pathways in B. pseudomallei and B. thailandensis. The thiamine degrading enzyme, thiaminase I, which catalyzes the degradation of thiamine to 4-methyl-5-thiazoleethanol ( Fig. 5a), can be found in available genome sequences of B. pseudomallei and B. thailandensis [34]. Phylogenetic analysis of all bacterial thiaminase I genes available from GenBank showed that the sequences from B. pseudomallei are most closely related to that from B. thailandensis (Fig. 5b). Moreover, the thaiminase I genes of B. pseudomallei and B. thailandensis were clustered with those of two other Burkholderia species with sequences available, forming a distinct cluster among all bacterial sequences. Other known bacterial thiaminase I genes have only been found in phylogenetically distant bacterial species, such as those Gram-positive bacteria like Clostridium, Paenibacillus and Bacillus. The findings suggested that Burkholderia species is unique among Gram-negative bacteria in having acquired this gene which may be involved in specific functions. Since both B. pseudomallei and B. thailandensis possessed a thiaminase I homologue, the higher levels of 4-methyl-5-thiazoleethanol in B. pseudomallei are unlikely to be related to thiaminase I.
On the other hand, hydroxyethylthiazole kinase, encoded by the gene, thiM, catalyzes the degradation of 4-methyl-5thiazoleethanol. Unlike thiaminase I gene, thiM is widely distributed in various Gram-negative bacteria. The gene can also be found in available genome sequences of B. pseudomallei and B. thailandensis [34]. Phylogenetic analysis showed that the sequences from B. pseudomallei are most closely related to those from B. thailandensis and other Burkholderia species (Fig. 5c).
PCR for thiaminase I and hydroxyethylthiazole kinase gene and RT-PCR for mRNA detection in B. pseudomallei and B. thailandensis PCR and RT-PCR using specific primers targeting thiaminase I gene showed that it is present and expressed in all the 15 B. pseudomallei and three thailandensis strains. Therefore, it is unlikely that the increased 4-methyl-5thiazoleethanol level in B. pseudomallei is due to thiaminase I regulation. On the other hand, to test whether the higher   (Fig. 6a). To detect the mRNA expression of thiM, RT-PCR using specific primers was performed.
The results showed that the thiM gene is expressed in all the three B. thailandensis strains but not expressed in all 12 B. pseudomallei strains that possessed the thiM gene (Fig. 6b).

Discussion
Using metabolomics approach, we identified specific metabolites in culture supernatant of B. pseudomallei. As these extracellular metabolites are either secreted or released from cell wall components of B. pseudomallei, they may be present in the circulating blood or other body fluids of infected patients, and hence may represent potential biomarkers for diagnosis. The exclusion of metabolites present in Gram-negative bacteria such as Enterobacteriaceae and other non-fermenters was important, since these bacteria are common causes of bacteremia. Moreover, differentiation of B. pseudomallei from B. thailandensis in clinical isolates may be important for accurate diagnosis of melioidosis, since B. thailandensis is much less virulent and was only rarely reported to cause invasive infections in humans [10,35]. Therefore, the identification of specific biomarkers for B. pseudomallei may help differentiate melioidosis from other Gram-negative bacterial infections. In this study, both total ion chromatograms and statistical analyses showed that the metabolomes of the culture supernatant of B. pseudomallei strains are significantly different from those of other tested Gram-negative  2) catalyzes the reaction involving the replacement of organic nucleophiles with the thiazole group in thiamine. The enzyme can be found in shellfish, fish and plants, but is only reported in a limited number of bacteria such as Bacillus and Clostridium [36]. Thiaminase II (EC 3.5.99.2), although initially identified with thiamine degrading activity, is subsequently revealed to be involved in salvage pathway of thiamine pyrimidine from basedegraded thiamine [37,38]. TenA, the gene that encodes thiaminase II, can be found in bacteria, archaea and eukaryotes [38]. In B. pseudomallei, a homologue of the gene encoding thiaminase I, but not TenA, can be found in the available complete genome sequences of B. pseudomallei [39]. Therefore, 4-methyl-5-thiazoleethanol detected in B. pseudomallei is most likely a product of thiaminase I. The homologue of thiaminase I in B. pseudomallei is most closely related to that of B. thailandensis, B. oklahomensis and B. glumae. Interestingly, such homologue is absent in other common Gram-negative bacteria such as P. aeruginosa and E. coli, supporting a unique function among some Burkholderia species. Nevertheless, low levels of 4- methyl-5-thiazoleethanol (at 100-to 1000-fold lower levels) were also detected in B. cepacia complex, P. aeruginosa and E. coli in this study, which may be due to the presence of other proteins sharing conserved domains with thiaminase [40].
Since both B. pseudomallei and B. thailandensis possessed a thiaminase I homologue with mRNA expression, we hypothesize that the higher levels of 4-methyl-5thiazoleethanol in B. pseudomallei than in B. thailandensis may be due to accumulation of this compound as a result of reduced downstream degradation. Hydroxyethylthiazole kinase (EC 2.7.1.50) is the enzyme responsible for degradation of 4-methyl-5-thiazoleethanol. It is a phosphotransferase that catalyzes the transfer of a phosphorus-containing group of ATP with the alcohol group of 4-methyl-5thiazoleethanol as the acceptor, resulting in ADP and 4methyl-5-(2-phosphonooxyethyl)thiazole (Fig. 5a). Although thiM gene that encodes hydroxyethylthiazole kinase can be found in the available genomes of B. pseudomallei [34], a previous transcriptome study on B. pseudomallei strain K94263 showed no detectable expression of thiM when the bacterium was cultured under 82 different conditions [41]. In contrast, a transcriptome study on B. thailandensis type strain E264, thiM was found to be expressed [42]. Furthermore, thiM has been classified as one of the noncore genes in B. pseudomallei and was found to be variably present among 94 studied strains in an array-based comparative genomic hybridization study [41,43,44]. Interestingly, in a recent report on whole genome analysis of two B. pseudomallei strains isolated 139 months apart from the same patient, thiM was among the 221 genes lost during reduction evolution [45]. We therefore tested our B. pseudomallei and B. thailandensis strains for the presence of thiM gene and mRNA expression. Our results showed that thiM gene is either absent or not expressed in all 15 B. pseudomallei strains showing high levels of 4-methyl-5-thiazoleethanol in culture supernatants. On the contrary, thiM gene is present and expressed in the three B. thailandensis strains showing low levels of 4-methyl-5-thiazoleethanol. Based on these data, it is likely that the higher level of 4-methyl-5-thiazoleethanol in B. pseudomallei than in B. thailandensis is the result of lack of thiM or hydroxyethylthiazole kinase activity.
The present study revealed a unique metabolic pathway of thiamine degradation in B. pseudomallei resulting in the accumulation of 4-methyl-5-thiazoleethanol likely as an end product. Burkholderia species are the only Gram-negative bacteria that possess thiaminase I gene for degradation of thiamine to 4-methyl-5-thiazoleethanol. Although thiM can be found in other Gram-negative bacteria, they cannot produce 4-methyl-5-thiazoleethanol from thiamine, which explains the very low levels observed in E. coli and P. aeruginosa strains. Among the genus Burkholderia, both thiaminase I and thiM genes can be found in available genome sequences of B. pseudomallei, B. thailandensis, B. oklahomensis and B. glumae in GenBank (Fig. 5b and c). The presence of thiM gene with mRNA expression in the three B. thailandensis strains, in line with the previous transcriptome study [42], supports hydroxyethylthiazole kinase activity and explains the low 4-methyl-5-thiazoleethanol level as a result of degradation to 4-methyl-5-(2-phosphonooxyethyl)thiazole. In contrast, although thiM gene can be found in some reported B. pseudomallei strains, previous studies showed that the gene may be absent, not expressed or lost during evolution [41,[43][44][45]. As supported by the present study, B. pseudomallei likely lacks hydroxyethylthiazole kinase activity and hence is unable to degrade 4-methyl-5-thiazoleethanol. Although B. oklahomensis and B. glumae also possess both thiaminase I and thiM genes, they are not pathogenic to humans. Therefore, 4-methyl-5-thiazoleethanol may represent a specific biomarker for clinical isolates suspicious of B. pseudomallei. Further studies may be performed to elucidate potential biological and pathogenic role of this unique metabolic pathway and high levels of 4-methyl-5thiazoleethanol in B. pseudomallei.
Metabolomics is an emerging tool in studying microbes and infectious diseases. The technique offers a revolutionary approach to study both the pathogen itself as well as the host response to infections, providing insights on diagnosis, pathogenesis, host-pathogen interactions and metabolic pathways. Metabolomic data obtained from urine samples have been used to distinguish healthy subjects from patients with pneumococcal disease and urinary tract infections [46][47][48]. A study using nuclear magnetic resonance spectroscopy-based metabolomics also showed that the metabolic profile of sera from tuberculosis patients can be distinguished from those from healthy controls [49]. In another study using serum metabolomics approach on leprosy patients, higher levels of polyunsaturated fatty acids were found among patients having higher bacterial indices, which may provide clues on biological pathways involved in the immunomodulation of leprosy [50]. With the increasing applications of metabolomics technology on both microbial and clinical samples from patients with appropriate controls, we expect to witness an exponential expansion of our knowledge on microbial metabolites, including the discovery of novel metabolites and potential biomarkers for diagnosis of infections such as melioidosis. In particular, application of metabolomics for diagnosis of culture-negative melioidosis is potentially advantageous over other diagnostic methods such as serological and PCR assays. First, the detection of metabolites by mass spectrometry offers high specificity, devoid of the problem of cross-reacting antibodies as in serological assays and false-positive PCR reactions as a result of DNA from host or other bacteria. Second, the same metabolomics platform can be extended for detection of different metabolites useful for melioidosis and other infectious diseases. Although the equipment and expertise for metabolomics technology is currently not available in most clinical microbiology laboratories, the success of matrix-assisted laser-desorption ionization time-of-flight mass spectrometry as a revolutionary method for pathogen identification suggests that mass spectrometry may emerge as a new tool for diagnostic microbiology. New diagnostic tests for potential biomarkers may also be developed in other platforms that are easy to use in clinical laboratories.

Conclusion
The present study illustrates the power of the state-ofthe-art metabolomics technology in exploring potential biomarkers in B. pseudomallei. Eight metabolites with significant higher levels were identified in culture supernatants of B. pseudomallei compared to other tested bacterial species, including 4-methyl-5-thiazoleethanol and two tetrapeptides, Ile-His-Lys-Asp and Pro-Arg-Arg-Asn. The high level of 4-methyl-5-thiazoleethanol in B. pseudomallei is likely due to the absence of hydroxyethylthiazole kinase and hence reduced downstream degradation. Further studies of the B. pseudomallei-specific metabolites would help understand their biological significance and potential role as diagnostic biomarkers.

Bacterial strains and culture
Fifteen B. pseudomallei (seven clinical and eight environmental isolates), three B. thailandensis, 14 B. cepacia complex, four Pseudomonas aeruginosa and three Escherichia coli isolates were included in this study (Additional file 1: Table S1). All isolates were phenotypically identified by the API 20NE system (bioMérieux Vitek, Hazelwood, MO), supplemented by conventional biochemical methods [51]. All B. pseudomallei, B. thailandensis and B. cepacia complex strains were isolated and characterized in previous studies [23,52]. Each bacterial strain was grown on sheep blood agar at 37°C overnight. Single colony was inoculated from blood agar to 5 ml RPMI 1640 medium (Gibco, Carlsbad, CA, USA) for incubation at 37°C with shaking at 200 rpm for 24 h. The primary sub-cultures were adjusted to OD 600 0.2 and 1 mL of the diluted culture was further sub-cultured in 30 ml of fresh RPMI medium at 37°C with shaking at 200 rpm for 24 h. The secondary sub-cultures were centrifuged at 3,000 rpm for 30 min to obtain the supernatant which was filtered twice using 0.22 μm filters. Metabolic activities in the filtrates were quenched immediately by incubating the filtrates in liquid nitrogen for 10 min. The filtrates were lyophilized and stored at-80°C until sample extraction and analysis. Uninoculated culture medium was used as negative control.

Chemicals and reagents
LC-MS grade water, methanol and acetonitrile was purchased from J.T. Baker (Center Valley, PA, USA). Analytical grade acetic acid, 5 M ammonium acetate and standard chemical 4-methyl-5-thiazoleethanol were purchased from Sigma-Aldrich, Inc (Saint Louis, MO, USA).

Sample preparation
Lyophilized samples were reconstituted by dissolving in 1 mL solvent mixture containing water/methanol/acetonitrile (1:2:2). The samples were vortexed for 1 min and subsequently sonicated for 10 min at room temperature. After centrifugation at 15,000 × g for 15 min at 4°C, supernatants were transferred to LC vial for LC-MS analysis. Other source conditions were kept constant in all the experiments as follow: gas temperature was kept constant at 300°C, drying gas (nitrogen) was set at the rate of 7 L/min, and the pressure of nebulizer gas (nitrogen) was 40 psi. The sheath gas was kept at a flow rate of 10 L/min and was maintained at a temperature of 330°C. The voltages of the Fragmentor, Skimmer 1, and OctopoleRFPeak were 135 V, 65 V and 750 V respectively. The scan range was adjusted to 80-1700 m/z at the acquisition rate of 2 spectra/s. MS/MS acquisition was operated in the same parameter as in MS acquisition. Collision Energy (CE) was used at 20 or 40 eV for fragmentation of the targeted compounds.

Data processing and statistical data analysis
All mass spectral data was acquired using Agilent MassHunter Qualitative Analysis software (version B.05.00, Agilent Technologies, USA). To optimize feature detection and discovery, two software packages: Mass Hunter Qualitative Analysis and open-source software XCMS (version 1.38.0) operating in R, which adopted different peak detection and alignment algorithms, were used [53]. For Mass Hunter Qualitative Analysis software, data preprocessing including baseline correction, noise calculations and molecular features extraction were performed with built-in small molecule extraction algorithm. Data was subsequently processed using Mass Profiler Professional (MPP) (Agilent Technologies) for peak alignment, data filtering and statistical analysis. For XCMS, raw data files were first converted to mzDATA format and peak detection were performed with centWave alogrithm in XCMS [54]. Data was subsequently processed using XCMS for peak alignment and data filtering. MetaboAnalyst 2.0 (www.metaboanalyst.ca) [55] was used for statistical analysis [55]. Further data processing including normalization, scaling and filtering were performed prior to statistical analysis in both software. Only variables that are present in at least 60 % of any group and with intensity of at least 4.0E + 03 were included for analysis in order to reduce noise and low abundance metabolites. The MS data were log 2 -transformed and mean-centered with unit variance scaling for statistical analysis. PCA and hierarchical clustering were performed for unsupervised multivariate statistical analysis. PLS-DA were performed as supervised method to identify important variables with discriminative power. PLS-DA models were validated based on multiple correlation coefficient (R 2 ) and cross-validated R 2 (Q 2 ) in cross-validation and permutation test by applying 2000 iterations (p > 0.001). The significance of the biomarkers was ranked using the variable importance in projection (VIP) score (>1) from the PLS-DA model. For univariate analysis of candidate specific biomarkers in culture supernatant, statistical significance was determined using oneway ANOVA with Tukey's post-hoc test between different bacterial species (B. pseudomallei, B. thailandensis, B. cepacia complex, P. aeruginosa and E. coli) and Student's t-test for comparison between B. pseudomallei and other bacterial species. P < 0.05 was considered to be statistically significant. Volcano plot with fold change >5 and P < 0.05 was performed where appropriate. Box-whisker plots were produced using GraphPad Prism software (GraphPad Software Inc., California, USA). Extracted ion chromatograms of potential specific metabolites identified by statistical analysis were manually viewed to confirm the differences in peak areas between MTB and NTB samples. Metabolites were further filtered using CAMERA package in R, Mass Hunter and manual inspection to exclude possible fragments, dimers, adducts and isotopes [56]. Specific metabolites that were detected by both MPP and MetaboAnalyst to be statistically significant were considered to be potential biomarkers.
Phylogenetic analysis of thiaminase I and hydroxyethyl thiazole kinase genes in B. pseudomallei and B. thailandensis The protein sequences of bacterial thiaminase I and hydroxyethyl thiazole kinase were retrieved from Genbank. Phylogenetic trees were constructed using the maximumlikelihood method with 1000 bootstrap replicates with Mega 5.1 [63]. WAG + G amino acid substitution model with 5 gamma categories was used.
PCR for thiaminase I and hydroxyethyl thiazole kinase genes and RT-PCR for mRNA detection in B. pseudomallei and B. thailandensis Genomic DNA was extracted from 15 B. pseudomallei and three B. thailandensis strains using DNeasy mini Kit (Qiagen, Hilden, Germany). Total RNA was extracted using RNeasy mini Kit (Qiagen, Hilden, Germany) according to manufacturer's instructions. The purified RNA was subjected to DNase I treatment using TURBO DNA-free kit (Ambion, USA). Reverse transcription was performed using the Superscript III kit (Invitrogen, Carlsbad, CA) according to manufacturer's instructions. The resultant genomic DNA or cDNA s were then used for PCR amplification. The PCR mixture (25 μL) contained genomic DNA or cDNA (1.0 μL) as template, 0.5 μM primers, 2.5 μL 10× PCR buffer II, 2.5 mM MgCl 2 , 200 μM of each dNTPs (GeneAmp, Applied Biosystems, Waltham, Massachusetts, USA) and 1.0U Taq polymerase (AmpliTaq Gold; Applied Biosystems, Waltham, Massachusetts, USA). Thermal cycling was performed in an automated thermocycler (Veriti 96-well fast thermal cycler; Applied Biosystems, Waltham, Massachusetts, USA) with a hot-start at 95°C for 10 min; 35 cycles of 95°C for 30 s, annealing for 30 s at temperatures of 53°C and 72°C for 30 s and a final extension at 72°C for 10 min. Five microliters of each amplified product was electrophoresed in 2.5 % (w/v) agarose gel with a molecular size marker (GeneRuler 50 bp DNA Ladder; Fermentas, Pittsburgh, PA, USA) in parallel. Electrophoresis in Tris-borate-EDTA buffer was performed at 120 V for 35 min. The gel was stained with ethidium bromide (0.5 μg/mL) for 25 min, rinsed and photographed under ultraviolet light illumination. Standard precautions were taken to avoid PCR contamination, and no false-positive was observed in negative controls. The primers used are as shown in Additional file 2: Table S2. Housekeeping gene apoB was used for normalization.

Additional files
Additional file 1: Table S1. Bacterial strains used in this study.
Additional file 2: Table S2. Primers used in this study.