Phylogenetic identification of bacterial MazF toxin protein motifs among probiotic strains and foodborne pathogens and potential implications of engineered probiotic intervention in food

Background: Toxin-antitoxin (TA) systems are commonly found in bacteria and Archaea, and they are the most common mechanism involved in bacterial programmed cell death or apoptosis. Recently, MazF, the toxin component of the MazEF toxin-antitoxin module, has been categorized as an endoribonuclease, or it may have a function similar to that of an RNA interference enzyme.
Results: In this paper, using comparative data and phylogenetic analyses, we identify several potential MazF-conserved motifs in limited subsets of foodborne pathogens and probiotic strains and provide a molecular basis for the development of engineered/synthetic probiotic strains for the mitigation of foodborne illnesses. Our findings also show that some probiotic strains, as well as many bacterial foodborne pathogens, can be genetically categorized into three major groups based on phylogenetic analysis of MazF. In each group, potential functional motifs are conserved in phylogenetically distant species, including foodborne pathogens and probiotic strains.
Conclusion: These data provide important knowledge for the identification and computational prediction of functional motifs related to programmed cell death. Potential implications of these findings include the use of engineered probiotic interventions in food or use of a natural probiotic cocktail with specificity for controlling targeted foodborne pathogens.


Background
Foodborne illnesses continue to be an important public health concern in developing as well as developed countries; prevention of foodborne illness outbreaks through effective and novel interventions should therefore be given priority. The U.S. Public Health Service has identified ten important foodborne pathogens causing human illnesses, including pathogenic strains of Escherichia coli, Salmonella, Listeria, Clostridium botulinum, Shigella, and Campylobacter, which are associated with more than 250 known foodborne diseases (http://www3.niaid.nih.gov/topics/foodborne/default.htm).
In addition, according to the World Health Organization (WHO), antibiotic overuse in food animal production is a major contributor to the emergence of antibiotic resistant foodborne pathogens [1]. The use of antibiotics in food animals for growth promotion and treatment disrupts the normal beneficial commensal bacterial microflora in the animal intestinal tract [2][3][4][5][6]. Recently, the human and chicken gut microbiome projects [7][8][9][10][11][12], using new technologies such as next generation sequencing, 16S rRNA screens, metagenomics, and metaproteomics, have shed new light on the existence of a bacterial 'phylogenetic core' [13] consisting of a wide diversity of gastrointestinal bacteria. Healthful, commensal bacteria found in the GI tract might be key members of known or potential probiotic strains revealed in this 'phylogenetic core', which may include Bacillus clausii, Bacillus pumilus, Lactobacillus acidophilus, Lactobacillus reuteri, Lactobacillus rhamnosus GG (ATCC 53103), Bifidobacterium infantis, Saccharomyces boulardii, Lactobacillus ruminis, Lactobacillus johnsonii str. NCC 533, and many others. These known probiotic strains have been used as dietary supplements, as treatments for illnesses caused by foodborne pathogens, and for disease prevention. Use of probiotic strains not only reduces invasion by bacterial pathogens, but also restores and maintains an optimal balance of healthy commensal bacteria in the human gut via production of antimicrobials [14][15][16][17][18][19][20][21][22][23].
One of the major mechanisms recognized as being responsible for apoptosis, or programmed cell death, and production of toxic metabolites in bacteria is the regulation of a wide variety of bacterial toxin-antitoxin modules [24][25][26], such as MazE/MazF, a chromosomal toxin-antitoxin module [27][28][29][30], plasmid-encoded parD [31][32][33], chpBIK [26,34], relBE [35,36], and the PhoQ-PhoP system [37][38][39][40]. MazEF is one of the most well-studied toxin-antitoxin (TA) systems and has been found on the chromosomes of many bacteria. The MazF protein has been recently categorized as an endoribonuclease [41,42] or as a type of RNA interference enzyme [43]. The link between this TA system and quorum sensing has also been explored recently through a small pentapeptide (NNWNN) called the Extracellular Death Factor (EDF) [44]. Small peptide motifs such as NNWNN are known to act as an extracellular death factor in E. coli and other bacterial species. MazEF-mediated cell death requires an "extracellular death factor" (EDF), or "cell death factor" (CDF); it is a population phenomenon that depends on the activation of quorum-sensing (cell-to-cell signaling) peptides in bacteria. High cell density was found to be associated with elevated concentrations of EDF, and the presence of EDF resulted in MazF-induced cell death [44]. From a food safety and public health perspective, EDF or a similar strategy might be used in place of antibiotics, reducing antibiotic usage. One study also reported that induction of the MazF toxin and the use of antibiotics share a similar mechanism: both inhibit the transcription and/or translation of the MazE antitoxin [45].
It has been theorized that there is "one toxin for one antitoxin" and interestingly MazF, in some bacteria, exhibits a selective inhibition of ribosomes and mRNAs [43,46]. Numerous strains of probiotic bacteria, such as Lactobacillus spp., have been reported to produce antimicrobial agents [47], such as bacteriocins, that inhibit or kill closely-related species, or even different strains of the same species through the inhibition of transcription and translation by receptor binding. The antimicrobial activities of bacteriocins are due to a heterologous subgroup of ribosomally synthesized cationic peptides [48]. Nisin, a polycyclic antibacterial peptide 34 amino acid residues long, is one of the most studied bacteriocins and is produced by many strains of Lactococcus lactis. It has been approved by the FDA for use as a food preservative, and certain probiotic Lactobacillus strains have been thoroughly studied and evaluated in vitro and in vivo. For example, L. reuteri controls diarrhea in children and suppresses the growth and pathogenicity of harmful foodborne pathogens such as Salmonella, E. coli, Staphylococcus, and Listeria [49,50]. L. casei GG has been used to treat Clostridium difficile infections and to reduce intestinal permeability [51][52][53][54]. L. reuteri is known to produce a broad-spectrum antimicrobial agent, reuterin, composed of the natural metabolic compound 3-hydroxypropionaldehyde, which has been used on the surface of sausages to inhibit growth of harmful bacteria and fungi [15,16]. However, the molecular mechanisms underlying the effectiveness of individual probiotic strains have not been systematically studied and characterized. Several potential mechanisms of action, including their ability to generate diverse natural toxic metabolites, lactic acid, and other organic acids, enzymes, vitamins, and hydrogen peroxide, as well as antimicrobial peptides such as nisin, have been well-described [20].
The work reported herein explores the experimental antimicrobial possibilities and/or procedures for (1) expression of an engineered, stress-induced, recombinant secreted fusion gene encoding MazF and a small extracellular cell death factor (CDF) on the surface or in the extracellular space of recombinant probiotic bacteria, or (2) potential application of a cocktail of natural probiotic strains, chosen via experimental in vitro selection, to control foodborne pathogens, improving the safety and quality of foods as well as human health. Engineered probiotic strains, or a natural probiotic cocktail consisting of mixed probiotic strain populations, could potentially inactivate targeted foodborne pathogens selectively, even in complex food matrices. Through gene/motif reshuffling [55,56] and computational molecular modeling, this engineered and secreted bacterial fusion protein, MazF-CDF, which contains an enterokinase (EK) cleavage site [57][58][59][60] for releasing active CDF and MazF after protein secretion, should have a narrow range of applicability, limited to inactivating only specific foodborne pathogens such as E. coli O157, Salmonella, Listeria, Campylobacter, and potentially other human pathogens. This pathogen specificity is due to the fact that the genetically engineered MazF-CDF fusion protein could be modified by the inclusion of specific genomic DNA sequences from various commensal, as well as pathogenic, bacterial strains, identified through DNA/protein structure and functional comparisons. Additionally, this engineered antimicrobial polypeptide and MazF are expected to be non-toxic to beneficial (healthful/probiotic) bacteria, as well as to the host that expresses the protein/peptide.
Moreover, these engineered and secreted CDFs and MazF proteins/peptides can easily pass through infectious foodborne pathogen cell barriers mediating cell death, and thus could potentially reduce the use of deleterious compounds such as antibiotics or other harmful chemicals in the feed and food industry.

Results and discussion
General genetic analysis of probiotics and foodborne pathogen genomes
The genomic information from selected probiotic strains and foodborne pathogens is shown in Table 1. It revealed little diversity in genomic GC-content in the Bifidobacterium genus, while showing an astonishing diversity in the GC-content among Lactobacillus species, ranging from 33 to 51.5%. It has been demonstrated that genomic GC-content is correlated with a number of factors [61], including genome size [62]; in Lactobacillus species, genome size ranges from 1.8 to 3.4 Mb. This suggests that the GC-content and genome size of Lactobacillus genomes may have implications related to the biological complexity and adaptation of this genus, and could be due to the rate of recombination, which has been extensively studied in the E. coli genome [63]. Knowledge of genetic diversity is fundamental to development of synthetic probiotic genomes and/or nucleic acid sequence reshuffling strategies. In this paper, we demonstrate that the MazF protein is a suitable candidate for the determination of genetic relationships within sets of MazF proteins in combination with functional motif analysis.
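As a small illustration of the GC-content comparison above, the following sketch computes the GC percentage of a nucleotide sequence. The function name and the toy sequences are hypothetical and are not taken from Table 1; ambiguous bases are ignored here, which is an assumption about how such bases would be handled.

```python
def gc_content(sequence: str) -> float:
    """Return the GC percentage of a nucleotide sequence.

    Only unambiguous bases (A, C, G, T) are counted, so N and other
    ambiguity codes do not dilute the result (an assumed convention).
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / total


# Toy sequences (hypothetical placeholders, not real genomes)
toy_genomes = {
    "toy_low_gc": "ATATATGCAT",
    "toy_high_gc": "GCGCGCATGC",
}
gc_by_name = {name: gc_content(seq) for name, seq in toy_genomes.items()}
```

Applied to whole-genome FASTA records, the same function would give the per-genome values that a table such as Table 1 summarizes.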

Ubiquitous existence of MazE/toxin, MazF
MazEF is a toxin-antitoxin (TA) module widely distributed among many bacterial species, including both foodborne pathogens and probiotic strains (Tables 1 and 2), such as Lactobacillus acidophilus, Lactobacillus reuteri, Lactobacillus rhamnosus GG (ATCC 53103), Escherichia coli, Selenomonas sputigena, Enterobacter spp., Campylobacter spp., Citrobacter spp., Thermoanaerobacter tengcongensis, Pelotomaculum thermopropionicum, Lactobacillus casei, Lactobacillus crispatus, Lactobacillus buchneri, Bifidobacterium longum subsp. infantis, Clostridium botulinum, Clostridium difficile, Vibrio spp., Listeria spp., Bacillus spp., Klebsiella variicola, and Salmonella enterica. Recent literature and computational analyses have shown the presence of many different types of TA modules in various localizations, e.g., some TA modules have been found within prophages, prophage-like elements, and other mobile genetic elements, such as plasmids [31][32][33]. Because of the existence of varying types of toxin-antitoxin modules and possible gene duplication events, in this paper we present, in Table 2, the one-to-one best matches of chromosomally encoded mazF orthologs and homologs among 75 publicly available genome sequences of foodborne pathogens and probiotic strains (Table 1) and publicly available databases such as NCBI and UniProt (http://www.uniprot.org/).

Phylogenetic analyses and cluster analysis of MazF/antitoxin protein, a growth inhibitor
The phylogenetic tree in Figure 1 displays the phylogenetic relationships of many well-recognized genera, within Enterobacteriaceae and beyond, including several important foodborne pathogens such as pathogenic E. coli, Salmonella, Listeria, and Campylobacter, as well as some major probiotic strains. To build a phylogenetic tree from the data in Table 2, the amino acid sequences of all the MazF or growth inhibitor proteins were aligned with ClustalW and analyzed with the Neighbor-joining (NJ) method in the Geneious software package v5.5.7 (Figure 1). Three main clusters (viz., groups 1, 2, and 3) are given in Figure 1. At least two representatives of potential probiotic strains are listed for each group, depending on the complexity of the group. In group 1, the potential probiotic, non-pathogenic strains, such as Lactobacillus amylovorus, Lactobacillus crispatus, Streptococcus, Enterococcus faecium, Enterococcus, Lactobacillus plantarum, and Lactobacillus rhamnosus, are grouped with the foodborne pathogens E. coli, Vibrio vulnificus, and Vibrio cholerae. In group 2, the foodborne pathogens Enterobacter, Campylobacter upsaliensis, Klebsiella variicola, Salmonella enterica, Shigella flexneri, Shigella dysenteriae, and Citrobacter rodentium were shown to be genetically closer to Bacillus selenitireducens, Bacillus halodurans, and Enterococcus faecalis. In group 3, some probiotic strains, such as the Bacillus, Lactobacillus, Leuconostoc, and Bifidobacterium genera, are categorized along with other major foodborne pathogens, such as Clostridium difficile, Listeria monocytogenes, Listeria grayi, and an emerging foodborne pathogen, Pediococcus pentosaceus.
Phylogenetic identification of bacterial toxin MazF protein motifs and the relationship between gene structure and phylogenetic classification
Figure 2 shows that the number of candidate motifs differs slightly between groups, with a high degree of amino acid sequence variability within conserved "hot spots" between groups 1, 2, and 3. As to which motifs are the best candidates for the engineered MazF construction (discussed in the next section) in each group, recent studies [43] suggest that motifs in the N-terminal region (the first 80 amino acids in Figure 2) of the MazF protein are the most suitable. The other motifs, in the C-terminal region, might be regarded as 'incorrect' motifs. Three criteria are used for evaluating the remaining motifs as "incorrect and without biological significance": the mean hydrophobicity, the identity, and the gene structure. In the analysis shown in Figure 2, the mean hydrophobicity and identity were computed and compared for each sequence; all the compared MazF proteins have a higher degree of mean hydrophobicity, identity, and more conserved gene structure among several conserved amino acid regions, particularly in the N-terminal region. The selection of potential functional MazF motifs is discussed in the next section. The functional importance of the mean hydrophobicity has also been discussed in relation to mRNA and protein degradation and slower translation of mRNA for disordered proteins (http://employees.csbsju.edu/hjakubowski/classes/ch331/protstructure/olprotfold.html).
Table 1 Genomic information of selected microorganisms considered as potential probiotic strains and some major food-borne pathogens used in this study
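Two of the criteria named in this section, mean hydrophobicity and identity, can be sketched as follows. This is a minimal illustration, not the Geneious computation: the Kyte-Doolittle hydropathy scale is an assumption (the scale used in the original analysis is not stated), the 80-residue window follows the N-terminal region mentioned in the text, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the paper does not name one)
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydrophobicity(protein: str, n_terminal: int = 80) -> float:
    """Mean hydropathy of the first n_terminal residues of an ungapped protein."""
    window = [aa for aa in protein.upper()[:n_terminal] if aa in KYTE_DOOLITTLE]
    if not window:
        raise ValueError("no standard amino acids in the selected window")
    return sum(KYTE_DOOLITTLE[aa] for aa in window) / len(window)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical residues over alignment columns without gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)
```

Comparing these two numbers for the N-terminal and C-terminal halves of each aligned MazF sequence would be one way to apply the criteria described above.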

Bacterial probiotic cocktail strains
It is important to note that the benefits of probiotics are strain-specific [64,65]. A commercial product, VSL#3, developed by Sigma-Tau Pharmaceuticals, Inc., provides a mixture of probiotic bacteria (B. breve, B. infantis, B. longum, L. acidophilus, L. bulgaricus, L. casei, L. plantarum, and Streptococcus thermophilus) to help protect GI tract integrity [66]. In this study, the bacterial probiotic cocktail strains we propose would comprise all representatives of groups 1, 2, and 3 shown in Figure 1. The principal basis behind the composition of these probiotic cocktail strains is the assumption that a combination of organisms might be more effective than a single strain and could potentially suppress many foodborne pathogens: E. coli and the vibrios in Group 1; Enterobacter, Klebsiella variicola, Salmonella enterica, Shigella flexneri, Shigella dysenteriae, Citrobacter rodentium, and Campylobacter upsaliensis (one of the most common Campylobacter strains found in people with diarrhea) in Group 2; and Clostridium difficile, Listeria monocytogenes, and Listeria grayi in Group 3 of Figure 1. Table 1 shows an astonishing diversity in the genomic GC-content and genome size among Lactobacillus species and their diverse distribution in all groups within Figure 1, which indicates a potential to assign other closely-related Lactobacillus species (not listed in Table 1) to the three previously-described groups. The hypothesis underlying our approach is that probiotic strains found within the same group as foodborne pathogens will have a reasonable degree of genetic and molecular phylogenetic compatibility and could bridge a relationship similar to a "symbiosis" of entities, including exchanging toxin/antitoxin molecules among the probiotic and pathogenic strains. Lactobacillus species are known to produce antimicrobial substances, including bacteriocins, lactic acid, and hydrogen peroxide. The MazEF toxin component may provide a basis for bacterial growth inhibition within the same group (Figures 1 and 2). Therefore, this toxin-antitoxin module may have great potential to inhibit the growth of potentially pathogenic bacteria through a possible competitive exclusion due to selective inhibition [46]. Figures 1 and 2 list all possible cocktails of these probiotic strains. In reality, a foodborne outbreak is more likely to be associated with one particular foodborne pathogen in particular foods. For example, there is a low incidence of Campylobacter in ground beef and pork, while Campylobacter is the major foodborne pathogen associated with poultry. Therefore, the application of mixed probiotics against a single foodborne pathogen will be considered initially.
Table 2 The genetic characterization of the transcriptional modulator MazF, a chromosomal cell death factor from potential probiotic strains and some major food-borne pathogens used in this study
Molecular recombination techniques: construction of genetically engineered synthetic probiotic strains
Figure 3 details an engineered probiotic strain bearing a recombinant plasmid, to be constructed and transformed into the probiotic bacteria, that contains a stress-induced promoter followed by an in-frame gene fusion of an appropriate signal peptide, a functional cell death peptide/factor (CDF), an enterokinase (EK) cleavage site, and a genetically modified, engineered MazF gene. In this recombinant probiotic strain, environmental stress relevant to food environments will trigger expression of this fusion protein under an environmental stress-inducible promoter. The signal peptide directs the encoded fusion protein to the extracellular space of the engineered probiotic strains. The signal peptide-depleted fusion proteins will be cleaved by a biliary tract enterokinase directly in the digestive system to release the functionally active CDF and MazF. This event will occur based on the ability of probiotic strains to form stable, non-infectious biofilm-type aggregates, which may attach to other bacteria, including foodborne pathogens, in the urogenital and intestinal tracts [22]. These active, species-specific CDF and MazF proteins will bind and pass through "unfriendly" bacterial cell surface barriers into the cytoplasm. These processed CDFs and MazF proteins will selectively inactivate and/or inhibit proteins involved in cell survival and induce the synthesis of more cell death-related proteins with activity against foodborne pathogens, eventually controlling, inhibiting, or inactivating "unfriendly" cells. Although the antitoxin MazE could reverse the bactericidal effect of the overexpressed MazF, MazE cannot impede the downstream cascade already initiated by MazF at early stages of the MazF-mediated cascade [42].
The overall hypothesis for this experiment is graphically presented in Figure 3. The engineered MazF gene sequence will be designed based on the genomic sequences of all publicly available foodborne pathogenic strains by using reasoned random gene biosynthesis and/or genuine gene reshuffling to rapidly combine functions and properties of parental genes for the development of improved gene specificity and generality, together with molecular modeling (transcription factor binding site identification, etc.) and systems biology technologies/tools.
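The fusion layout described for Figure 3 (signal peptide, CDF, EK site, MazF) and its processing can be sketched at the sequence level as below. This is an illustrative sketch under stated assumptions: the signal peptide and MazF strings are hypothetical placeholders, the CDF is the NNWNN pentapeptide mentioned in the Background, and the EK site is taken to be the canonical enterokinase recognition sequence DDDDK, with cleavage after the lysine. The actual construct sequences are not given in the text.

```python
EK_SITE = "DDDDK"  # canonical enterokinase recognition site; cut follows the K


def build_fusion(signal_peptide: str, cdf: str, mazf: str,
                 ek_site: str = EK_SITE) -> str:
    """Concatenate the parts in the order given in the text:
    signal peptide, CDF, EK site, MazF."""
    return signal_peptide + cdf + ek_site + mazf


def process_fusion(fusion: str, signal_length: int,
                   ek_site: str = EK_SITE) -> tuple[str, str]:
    """Model secretion and EK cleavage of the fusion protein.

    Removes the first signal_length residues (signal peptide removal),
    then cuts immediately after the first EK site. Returns the two
    fragments: (CDF-containing fragment, MazF fragment).
    """
    secreted = fusion[signal_length:]
    position = secreted.find(ek_site)
    if position == -1:
        raise ValueError("no enterokinase site found in secreted protein")
    cut = position + len(ek_site)
    return secreted[:cut], secreted[cut:]


# Hypothetical placeholder parts (not real signal peptide or MazF sequences)
signal = "MKKLLSA"
cdf = "NNWNN"          # EDF pentapeptide named in the Background
mazf = "MSTVKRGDIW"    # placeholder standing in for an engineered MazF

fusion = build_fusion(signal, cdf, mazf)
cdf_fragment, mazf_fragment = process_fusion(fusion, signal_length=len(signal))
```

Under this part order, the released CDF fragment keeps the DDDDK residues at its C-terminus, because enterokinase cuts after the recognition site; whether that residual tag matters for CDF activity would have to be tested experimentally.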

Conclusions
Survival of foodborne pathogens in cultures or in animal GI tracts may be very genus- or species-specific. Data presented in this paper can be explored to develop effective intervention strategies applied directly during food processing and preparation, as well as in the animal feed supply, which may lead to an overall reduction in use of antibiotic growth promoters (AGP) throughout the world. Recent research in molecular biology and genomics has provided potential applications of probiotic strains as dietary supplements, which could replace AGP in animal diets, or as biotherapeutic agents in cases of antibiotic-associated diarrhea in travelers and in childhood diarrhea and other bacterial gastrointestinal illnesses. Experiments relating to this potential probiotic application may reveal a still greater range of potential benefits. For many of these potential benefits, current research is limited, and only preliminary results are available. All effects can only be attributed to the individual strain(s) tested. Results from testing a specific supplement may not be extrapolated to any other strain of the same species, and they do not imply that comparable benefits will be imparted by other LAB (or other probiotics). In this study, we have computationally explored several potential intervention strategies to control foodborne pathogens, either by using a cocktail of probiotic strains or an engineered probiotic strain.
The inhibition of pathogens by probiotic strains is mainly due to the production of antibacterial peptides [67], the release of short-chain fatty acids, reduction of the pH within the lumen [68,69] by the production of organic acids, or decreased pathogen adherence to intestinal epithelial cells [70]. Therefore, the benefits of probiotics could be very strain- or species-specific, and probiotic strains may rely on different mechanisms to suppress growth, attachment, or other metabolic processes inherent to pathogenic bacteria. Moreover, the optimal effects of probiotic strains may involve the simultaneous use of more than one strain. Our contention in this paper is not experimental proof but, rather, a clear scientifically-backed hypothesis, in the form of a detailed accompanying method, that multi-probiotic strain composites with diverse genetic backgrounds may complement [71] one another as vectors of competitive exclusion and, therefore, could maximize the potential to inhibit an array of common foodborne pathogens [72] in the gastrointestinal tract of humans or livestock, as well as in foods and animal feed.
The use of probiotic bacteria with the ability to produce CDFs and engineered MazF to selectively inactivate pathogens is a novel approach to controlling pathogens in foods, and possibly treating human infections. A number of studies suggest that this project could have practical significance and be a potentially new approach for the development of novel and cost-effective food safety intervention technologies for the control of foodborne pathogens and improving public health [73][74][75].
The approaches described above represent a first attempt to describe a systematic approach or method to test the hypothesis that "friendly" bacteria can be used to inactivate or inhibit pathogens in food based on expression of MazF. There are many specific cell death factors that may be associated with bacterial programmed cell death and multi-cellular behavior mechanisms in foodborne pathogens. Through computational modeling, remodeling, genetic recombination, or further gene reshuffling, and exploring experimental approaches, it may be possible to evaluate and elucidate more effective CDFs and MazF to be used for controlling foodborne pathogens, which will ultimately result in a reduction in the use of antimicrobial compounds in humans and animals, as well as during food processing and storage.

Genomic sequences
All of the genome and gene sequences examined in this study (Tables 1 and 2) are available in GenBank (http://www.ncbi.nlm.nih.gov/genbank/GenbankOverview.html).

Identification and analysis of bacterial toxin-antitoxin modules, MazE/MazF
The prediction accuracy of the best chromosomally encoded MazF orthologs among relatively distinct genome strains is critical for the performance of molecular phylogenetic analysis. We used a sequential BLAST workflow based on pairwise comparison by applying either an E. coli programmed cell death toxin MazF (GenBank accession: ZP_06660634.1), an endoribonuclease MazF, a MazF protein from Vibrio cholerae (ZP_01950611.1), or a PemK family transcriptional regulator protein (ZP_05296226.1) from Listeria monocytogenes to perform a BLASTP homology search of these combined protein sequences against all 75 publicly available foodborne pathogen and probiotic strain genomic sequences (presented in Table 1) and publicly available databases such as NCBI and UniProt (http://www.uniprot.org/). BLAST search results were used to list MazF or MazF-like proteins from 75 different strains representing 39 different species (Table 2).
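The step of reducing BLASTP results to one best match per strain can be sketched as follows. This is a hedged illustration rather than the workflow actually used: it assumes the hits were saved in BLAST+ tabular format (`-outfmt 6`, default columns), that the strain can be derived from the subject identifier through a user-supplied function, and that an E-value cutoff of 1e-5 is acceptable. The identifiers in the example, the cutoff, and the function names are all hypothetical.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6)
BLAST6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hit_per_strain(blast6_text, strain_of, max_evalue=1e-5):
    """Return {strain: subject id} keeping the highest-bitscore hit per strain.

    blast6_text: tab-separated BLAST hits as a single string.
    strain_of:   function mapping a subject id to a strain label (assumed).
    max_evalue:  hits with a larger E-value are discarded (assumed cutoff).
    """
    best = {}
    for row in csv.reader(io.StringIO(blast6_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        hit = dict(zip(BLAST6_COLUMNS, row))
        if float(hit["evalue"]) > max_evalue:
            continue
        strain = strain_of(hit["sseqid"])
        bitscore = float(hit["bitscore"])
        if strain not in best or bitscore > best[strain][1]:
            best[strain] = (hit["sseqid"], bitscore)
    return {strain: sseqid for strain, (sseqid, _) in best.items()}


# Hypothetical hits; subject ids are encoded as "strain|protein" for the example
example_hits = "\n".join([
    "query_mazF\tstrainA|prot1\t62.5\t110\t40\t1\t1\t110\t3\t112\t2e-40\t150",
    "query_mazF\tstrainA|prot2\t35.0\t100\t60\t3\t5\t104\t8\t107\t1e-08\t52.0",
    "query_mazF\tstrainB|prot9\t28.0\t90\t60\t4\t10\t99\t12\t101\t0.5\t30.1",
])
best = best_hit_per_strain(example_hits, strain_of=lambda s: s.split("|")[0])
```

In this toy input, strainA keeps only its stronger hit and strainB drops out because its single hit fails the E-value cutoff.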

Phylogenetic analysis and computational identification of phylogenetic motifs of the MazF protein
In order to identify potential functional motifs of MazF proteins, phylogenetic analyses of MazF proteins were conducted by applying the embedded multiple sequence alignment ClustalW program in the Geneious software package v 5.5.7 [76,77] with the Neighbor Joining method. ClustalW output for the aligned amino acid sequences and the pdf images of the alignments were generated using the Geneious software package v 5.5.7.
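For readers without access to Geneious, the Neighbor Joining step can be sketched in plain Python. This is a minimal textbook implementation, not the Geneious one: the distance model used in the original analysis is not stated, so the p-distance helper below is an assumption, and the five-taxon matrix is toy data rather than MazF distances.

```python
from itertools import combinations


def p_distance(aligned_a, aligned_b):
    """Fraction of differing residues over alignment columns without gaps."""
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def neighbor_joining(names, dist):
    """Run neighbor joining on a distance table.

    names: list of taxon labels.
    dist:  dict mapping frozenset({x, y}) to the distance between x and y.
    Returns (joins, final_edge). Each join is
    (child_a, child_b, new_node, branch_a, branch_b); final_edge is
    (node_x, node_y, length) connecting the last two nodes.
    """
    d = dict(dist)
    active = list(names)
    joins = []
    while len(active) > 2:
        n = len(active)
        # Row sums of the current distance matrix
        totals = {x: sum(d[frozenset((x, y))] for y in active if y != x)
                  for x in active}
        # Pick the pair minimizing Q(a, b) = (n - 2) d(a, b) - r_a - r_b
        best_pair, best_q = None, None
        for a, b in combinations(active, 2):
            q = (n - 2) * d[frozenset((a, b))] - totals[a] - totals[b]
            if best_q is None or q < best_q:
                best_pair, best_q = (a, b), q
        a, b = best_pair
        d_ab = d[frozenset((a, b))]
        branch_a = 0.5 * d_ab + (totals[a] - totals[b]) / (2 * (n - 2))
        branch_b = d_ab - branch_a
        node = f"node{len(joins) + 1}"  # internal node label (hypothetical naming)
        for k in active:
            if k not in (a, b):
                d[frozenset((node, k))] = 0.5 * (
                    d[frozenset((a, k))] + d[frozenset((b, k))] - d_ab
                )
        active = [k for k in active if k not in (a, b)] + [node]
        joins.append((a, b, node, branch_a, branch_b))
    x, y = active
    return joins, (x, y, d[frozenset((x, y))])


# Toy five-taxon distance matrix (illustrative values only)
toy = {("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
       ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
       ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3}
toy_dist = {frozenset(pair): value for pair, value in toy.items()}
toy_joins, toy_final_edge = neighbor_joining(["a", "b", "c", "d", "e"], toy_dist)
```

Ties in Q are broken by iteration order here, so other tools may report a different but equally valid join order. For the MazF data, the distance table would be filled by calling `p_distance` on each pair of ClustalW-aligned sequences.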