Validation of mixed-genome microarrays as a method for genetic discrimination
Broschat, Shira L.
Call, Douglas R.
Comparative genomic hybridizations have been used to examine genetic relationships among bacteria. The microarrays used in these experiments may have open reading frames from one or more reference strains (whole-genome microarrays), or they may be composed of random DNA fragments from a large number of strains (mixed-genome microarrays [MGMs]). In this work both experimental and virtual arrays are analyzed to assess the validity of genetic inferences from these experiments with a focus on MGMs. Empirical data are analyzed from an Enterococcus MGM, while a virtual MGM is constructed in silico using sequenced genomes (Streptococcus). On average, a small MGM is capable of correctly deriving phylogenetic relationships between seven species of Enterococcus with accuracies of 100% (n ≥ 100 probes) and 95% (n ≥ 46 probes); more probes are required for intraspecific differentiation. Compared to multilocus sequence methods and whole-genome microarrays, MGMs provide additional discrimination between closely related strains and offer the possibility of identifying unique strain or lineage markers. Representational bias can have mixed effects. Microarrays composed of probes from a single genome can be used to derive phylogenetic relationships, although branch length can be exaggerated for the reference strain. We describe a case where disproportional representation of different strains used to construct an MGM can result in inaccurate phylogenetic inferences, and we illustrate an algorithm that is capable of correcting this type of bias. The bias correction algorithm automatically provides bootstrap confidence values and can provide multiple bias-corrected trees with high confidence values.
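As a rough illustration of what a bootstrap confidence value means for microarray data, the sketch below resamples probes with replacement and counts how often one strain remains the closest relative of another. It is not the bias correction algorithm described in the abstract, which is not specified here. The binary presence/absence profiles, the mismatch-fraction distance, the nearest-neighbour criterion, and all names (`distance`, `nearest`, `bootstrap_support`, `toy_profiles`) are assumptions made for the example.

```python
import random

def distance(a, b):
    """Fraction of probes whose presence/absence call differs between two strains."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def nearest(profiles, target):
    """Name of the strain closest to `target` (ties broken alphabetically)."""
    others = sorted(name for name in profiles if name != target)
    return min(others, key=lambda name: distance(profiles[target], profiles[name]))

def bootstrap_support(profiles, target, partner, n_boot=200, seed=0):
    """Share of probe resamples in which `partner` is the nearest strain to `target`."""
    rng = random.Random(seed)
    n_probes = len(profiles[target])
    hits = 0
    for _ in range(n_boot):
        # Draw probe indices with replacement (one bootstrap pseudo-array).
        cols = [rng.randrange(n_probes) for _ in range(n_probes)]
        resampled = {name: [calls[i] for i in cols]
                     for name, calls in profiles.items()}
        if nearest(resampled, target) == partner:
            hits += 1
    return hits / n_boot

# Hypothetical toy data: 20 probes, 1 = hybridization detected, 0 = not detected.
toy_profiles = {
    "A": [1] * 10 + [0] * 10,
    "B": [1] * 9 + [0] * 11,   # differs from A at a single probe
    "C": [0] * 10 + [1] * 10,  # differs from A at every probe
}

if __name__ == "__main__":
    print(bootstrap_support(toy_profiles, "A", "B"))
```

With real data the support would typically fall below 1 for closely related strains, which is consistent with the abstract's observation that intraspecific differentiation needs more probes.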