{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T04:30:24Z","timestamp":1772166624773,"version":"3.50.1"},"reference-count":44,"publisher":"Oxford University Press (OUP)","issue":"1","license":[{"start":{"date-parts":[[2016,12,23]],"date-time":"2016-12-23T00:00:00Z","timestamp":1482451200000},"content-version":"vor","delay-in-days":60,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000060","name":"National Institute of Allergy and Infectious Diseases (US)","doi-asserted-by":"publisher","award":["1K08AI101005"],"award-info":[{"award-number":["1K08AI101005"]}],"id":[{"id":"10.13039\/100000060","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000060","name":"National Institute of Allergy and Infectious Diseases","doi-asserted-by":"publisher","award":["1K08AI101005"],"award-info":[{"award-number":["1K08AI101005"]}],"id":[{"id":"10.13039\/100000060","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2016,12,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Background<\/jats:title>\n                    <jats:p>Collective animal behavior, such as the flocking of birds or the shoaling of fish, has inspired a class of algorithms designed to optimize distance-based clusters in various applications, including document analysis and DNA microarrays. In a flocking model, individual agents respond only to their immediate environment and move according to a few simple rules. After several iterations the agents self-organize, and clusters emerge without the need for partitional seeds. In addition to its unsupervised nature, flocking offers several computational advantages, including the potential to reduce the number of required comparisons.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Findings<\/jats:title>\n                    <jats:p>In the tool presented here, Clusterflock, we have implemented a flocking algorithm designed to locate groups (flocks) of orthologous gene families (OGFs) that share an evolutionary history. Pairwise distances that measure phylogenetic incongruence between OGFs guide flock formation. We tested this approach on several simulated datasets by varying the number of underlying topologies, the proportion of missing data, and evolutionary rates, and show that in datasets containing high levels of missing data and rate heterogeneity, Clusterflock outperforms other well-established clustering techniques. We also verified its utility on a known, large-scale recombination event in Staphylococcus aureus. By isolating sets of OGFs with divergent phylogenetic signals, we were able to pinpoint the recombined region without forcing a pre-determined number of groupings or defining a pre-determined incongruence threshold.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Conclusions<\/jats:title>\n                    <jats:p>Clusterflock is an open-source tool that can be used to discover horizontally transferred genes, recombined areas of chromosomes, and the phylogenetic \u2018core' of a genome. Although we used it here in an evolutionary context, it is generalizable to any clustering problem. Users can write extensions to calculate any distance metric on the unit interval, and can use these distances to \u2018flock' any type of data.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1186\/s13742-016-0152-3","type":"journal-article","created":{"date-parts":[[2016,10,24]],"date-time":"2016-10-24T06:47:25Z","timestamp":1477291645000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["Clusterflock: a flocking algorithm for isolating congruent phylogenomic datasets"],"prefix":"10.1093","volume":"5","author":[{"given":"Apurva","family":"Narechania","sequence":"first","affiliation":[{"name":"1Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, NY 10024, USA"}]},{"given":"Richard","family":"Baker","sequence":"additional","affiliation":[{"name":"1Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, NY 10024, USA"}]},{"given":"Rob","family":"DeSalle","sequence":"additional","affiliation":[{"name":"1Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, NY 10024, USA"}]},{"given":"Barun","family":"Mathema","sequence":"additional","affiliation":[{"name":"2Public Health Research Institute Center, New Jersey Medical School, Rutgers Newark, NJ 07103, USA"},{"name":"5Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, NY 10032, USA"}]},{"given":"Sergios-Orestis","family":"Kolokotronis","sequence":"additional","affiliation":[{"name":"1Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, NY 10024, USA"},{"name":"4Department of Biological Sciences, Fordham University, Bronx, NY 10458, USA"}]},{"given":"Barry","family":"Kreiswirth","sequence":"additional","affiliation":[{"name":"2Public Health Research Institute Center, New Jersey Medical School, Rutgers Newark, NJ 07103, USA"}]},{"given":"Paul J","family":"Planet","sequence":"additional","affiliation":[{"name":"1Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, NY 10024, USA"},{"name":"3Department of Pediatrics, Division of Pediatric Infectious Diseases, Children's Hospital of Philadelphia & University of Pennsylvania, Philadelphia, PA 19104, USA"}]}],"member":"286","published-online":{"date-parts":[[2016,10,24]]},"reference":[{"key":"2024121814515389500_CR1","doi-asserted-by":"crossref","DOI":"10.1093\/oso\/9780198508175.001.0001","volume-title":"Living in groups","author":"Krause","year":"2002"},{"key":"2024121814515389500_CR2","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511601156.005","article-title":"Three-dimensional structure and dynamics of birds flocks","volume-title":"Animal groups in three dimensions","author":"Heppner","year":"1997"},{"key":"2024121814515389500_CR3","doi-asserted-by":"crossref","first-page":"363","DOI":"10.1007\/978-94-011-1578-0_12","article-title":"The functions of shoaling behavior","volume-title":"The Behavior of Teleost Fishes","author":"Pitcher","year":"1993"},{"key":"2024121814515389500_CR4","doi-asserted-by":"crossref","first-page":"315","DOI":"10.1007\/BF00657647","article-title":"The sensory basis of fish schools: relative role of lateral line and vision","volume":"135","author":"Partridge","year":"1980","journal-title":"J Comp Physiol"},{"issue":"1","key":"2024121814515389500_CR5","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1016\/j.tics.2008.10.002","article-title":"Collective cognition in animal groups","volume":"13","author":"Couzin","year":"2009","journal-title":"Trends Cogn Sci"},{"key":"2024121814515389500_CR6","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/0065-227X(86)90003-1","article-title":"Dynamical aspects of animal grouping: swarms, schools, flocks, and herds","volume":"22","author":"Okubo","year":"1986","journal-title":"Adv Biophys"},{"key":"2024121814515389500_CR7","doi-asserted-by":"crossref","first-page":"365","DOI":"10.1016\/S0022-5193(05)80681-2","article-title":"The simulation of the movement of fish schools","volume":"156","author":"Huth","year":"1992","journal-title":"J Theor Biol"},{"key":"2024121814515389500_CR8","doi-asserted-by":"crossref","first-page":"299","DOI":"10.1016\/S0378-4371(98)00468-3","article-title":"Collective motion of organisms in three dimensions","volume":"264","author":"Czirok","year":"1999","journal-title":"Physica A"},{"key":"2024121814515389500_CR9","doi-asserted-by":"crossref","first-page":"1375","DOI":"10.1088\/0305-4470\/30\/5\/009","article-title":"Spontaneously ordered motion of self-propelled particles","volume":"30","author":"Czirok","year":"1997","journal-title":"J Physics A"},{"issue":"1","key":"2024121814515389500_CR10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1006\/jtbi.2002.3065","article-title":"Collective memory and spatial sorting in animal groups","volume":"218","author":"Couzin","year":"2002","journal-title":"J Theor Biol"},{"issue":"4","key":"2024121814515389500_CR11","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1145\/37402.37406","article-title":"Flocks, herds, and schools: a distributed behavioral model","volume":"21","author":"Reynolds","year":"1987","journal-title":"Comput Graph"},{"issue":"1","key":"2024121814515389500_CR12","doi-asserted-by":"crossref","first-page":"86","DOI":"10.1016\/j.jbi.2005.08.008","article-title":"Tree disagreement: measuring and testing incongruence in phylogenies","volume":"39","author":"Planet","year":"2006","journal-title":"J Biomed Inform"},{"issue":"1777","key":"2024121814515389500_CR13","first-page":"20132450","article-title":"Horizontal gene transfer in the acquisition of novel traits by metazoans","volume":"281","author":"Boto","year":"2014","journal-title":"Proc Biol Sci"},{"issue":"8","key":"2024121814515389500_CR14","doi-asserted-by":"crossref","first-page":"605","DOI":"10.1038\/nrg2386","article-title":"Horizontal gene transfer in eukaryotic evolution","volume":"9","author":"Keeling","year":"2008","journal-title":"Nat Rev Genet"},{"issue":"3","key":"2024121814515389500_CR15","doi-asserted-by":"crossref","first-page":"170","DOI":"10.1016\/j.tig.2012.12.006","article-title":"Horizontal gene transfer and the evolution of bacterial and archaeal population structure","volume":"29","author":"Polz","year":"2013","journal-title":"Trends Genet"},{"key":"2024121814515389500_CR16","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1146\/annurev-genet-110711-155529","article-title":"Evolutionary implications of horizontal gene transfer","volume":"46","author":"Syvanen","year":"2012","journal-title":"Annu Rev Genet"},{"key":"2024121814515389500_CR17","first-page":"247","article-title":"Reexamining microbial evolution through the lens of horizontal transfer","volume":"92","author":"Planet","year":"2002","journal-title":"EXS"},{"issue":"10","key":"2024121814515389500_CR18","doi-asserted-by":"crossref","first-page":"2773","DOI":"10.1093\/molbev\/msr110","article-title":"Let them fall where they may: congruence analysis in massive phylogenetically messy data sets","volume":"28","author":"Leigh","year":"2011","journal-title":"Mol Biol Evol"},{"issue":"24","key":"2024121814515389500_CR19","doi-asserted-by":"crossref","first-page":"4423","DOI":"10.1093\/bioinformatics\/bti744","article-title":"mILD: a tool for constructing and analyzing matrices of pairwise phylogenetic character incongruence tests","volume":"21","author":"Planet","year":"2005","journal-title":"Bioinformatics"},{"issue":"7","key":"2024121814515389500_CR20","doi-asserted-by":"crossref","first-page":"543","DOI":"10.1038\/nrmicro2593","article-title":"Biased gene transfer in microbial evolution","volume":"9","author":"Andam","year":"2011","journal-title":"Nat Rev Microbiol"},{"issue":"1","key":"2024121814515389500_CR21","doi-asserted-by":"crossref","first-page":"104","DOI":"10.1080\/10635150801910436","article-title":"Testing congruence in phylogenomic analysis","volume":"57","author":"Leigh","year":"2008","journal-title":"Syst Biol"},{"issue":"4","key":"2024121814515389500_CR22","doi-asserted-by":"crossref","first-page":"1060","DOI":"10.1128\/JB.186.4.1060-1064.2004","article-title":"Evolution of Staphylococcus aureus by large chromosomal replacements","volume":"186","author":"Robinson","year":"2004","journal-title":"J Bacteriol"},{"key":"2024121814515389500_CR23","doi-asserted-by":"crossref","first-page":"570","DOI":"10.2307\/2413663","article-title":"Constructing a significance test for incongruence","volume":"44","author":"Farris","year":"1995","journal-title":"Syst Biol"},{"key":"2024121814515389500_CR24","volume-title":"PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods)","author":"Swofford","year":"2003","edition":"4"},{"issue":"8-9","key":"2024121814515389500_CR25","doi-asserted-by":"crossref","first-page":"505","DOI":"10.1016\/j.sysarc.2006.02.003","article-title":"A flocking based algorithm for document clustering analysis","volume":"52","author":"Cui","year":"2006","journal-title":"J Syst Arch"},{"key":"2024121814515389500_CR26","doi-asserted-by":"crossref","DOI":"10.1109\/InfRKM.2012.6204996","article-title":"A flocking based data mining algorithm for detecting outliers in cancer gene expression microarray data","volume-title":"IEEE International Conference on Information Retrieval and Knowledge Management, Malaysia","author":"Bellaachia","year":"2012"},{"key":"2024121814515389500_CR27","first-page":"47","article-title":"Optimized spatial hashing for collision detection of deformable models. vision, modeling, and visualization","volume-title":"Proc. Vision, Modeling, Visualization VMV","author":"Gross","year":"2003"},{"key":"2024121814515389500_CR28","first-page":"9","article-title":"Optimization of large-scale, real-time simulations by spatial hashing","volume-title":"Proc 2005 Summer Computer Simulation Conference","author":"Hastings","year":"2005"},{"issue":"1","key":"2024121814515389500_CR29","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1006\/jtbi.1996.0144","article-title":"The dynamics of herds: from individuals to aggregations","volume":"182","author":"Gueron","year":"1996","journal-title":"J Theor Biol"},{"issue":"3","key":"2024121814515389500_CR30","first-page":"235","article-title":"Seq-Gen: an application for the Monte Carlo simulation of DNA sequence evolution along phylogenetic trees","volume":"13","author":"Rambaut","year":"1997","journal-title":"Comput Appl Biosci"},{"key":"2024121814515389500_CR31","article-title":"R: A language and environment for statistical computing","volume-title":"R Foundation for Statistical Computing","author":"Team RC","year":"2015"},{"key":"2024121814515389500_CR32","first-page":"49","article-title":"OPTICS: ordering points to identify the clustering structure","author":"Ankerst","year":"1999"},{"key":"2024121814515389500_CR33","doi-asserted-by":"crossref","DOI":"10.1002\/9780470316801","volume-title":"Finding groups in data: an introduction to cluster analysis","author":"Kaufman","year":"1990"},{"issue":"6","key":"2024121814515389500_CR34","doi-asserted-by":"crossref","first-page":"699","DOI":"10.1093\/bioinformatics\/btk040","article-title":"OrthologID: automation of genome-scale ortholog identification within a parsimony framework","volume":"22","author":"Chiu","year":"2006","journal-title":"Bioinformatics"},{"key":"2024121814515389500_CR35","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-540-69497-7_41","article-title":"ELKI. A software system for evaluation of subspace clustering algorithms","volume-title":"20th International Conference on Scientific and Statistical Database Management, Hong Kong, China","author":"Achtert","year":"2008"},{"key":"2024121814515389500_CR36","first-page":"6","article-title":"A stability based method for discovering structure in clustered data","volume":"7","author":"Ben-Hur","year":"2002","journal-title":"Pac Symp Biocomput."},{"issue":"11","key":"2024121814515389500_CR37","doi-asserted-by":"crossref","first-page":"2573","DOI":"10.1162\/089976601753196030","article-title":"Resampling method for unsupervised estimation of cluster validity","volume":"13","author":"Levine","year":"2001","journal-title":"Neural Comput"},{"issue":"3","key":"2024121814515389500_CR38","doi-asserted-by":"crossref","first-page":"982","DOI":"10.1109\/TSMCB.2012.2220543","article-title":"Understanding and enhancement of internal clustering validation measures","volume":"43","author":"Liu","year":"2013","journal-title":"IEEE Trans Cybern"},{"issue":"52","key":"2024121814515389500_CR39","doi-asserted-by":"crossref","first-page":"219","DOI":"10.1007\/s10898-012-9854-y","article-title":"Self-learning K -means clustering: a global optimization approach","volume":"56","author":"Volkovich","year":"2013","journal-title":"J Glob Optimization"},{"issue":"9","key":"2024121814515389500_CR40","doi-asserted-by":"crossref","first-page":"755","DOI":"10.1093\/bioinformatics\/14.9.755","article-title":"Profile hidden Markov models","volume":"14","author":"Eddy","year":"1998","journal-title":"Bioinformatics"},{"issue":"5","key":"2024121814515389500_CR41","doi-asserted-by":"crossref","first-page":"1501","DOI":"10.1006\/jmbi.1994.1104","article-title":"Hidden Markov models in computational biology. Applications to protein modeling","volume":"235","author":"Krogh","year":"1994","journal-title":"J Mol Biol"},{"issue":"3","key":"2024121814515389500_CR42","doi-asserted-by":"crossref","first-page":"e1000337","DOI":"10.1371\/journal.ppat.1000337","article-title":"Natural transformation of helicobacter pylori involves the integration of short DNA fragments interrupted by gaps of variable size","volume":"5","author":"Lin","year":"2009","journal-title":"PLoS Pathog"},{"issue":"7","key":"2024121814515389500_CR43","doi-asserted-by":"crossref","first-page":"e1002151","DOI":"10.1371\/journal.ppat.1002151","article-title":"Transformation of natural genetic variation into Haemophilus influenzae genomes","volume":"7","author":"Mell","year":"2011","journal-title":"PLoS Pathog"},{"key":"2024121814515389500_CR44","doi-asserted-by":"crossref","unstructured":"Narechania A, Baker R, DeSalle R, Mathema B, Kolokotronis S, Kreiswirth B, Planet P, J. Supporting data for\u201cClusterflock: A Flocking Algorithm for Isolating Congruent Phylogenomic Datasets\u201d, 2016, GigaScience Database., 10.5524\/100247.","DOI":"10.1101\/045773"}],"container-title":["Gigascience"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/gigascience\/article-pdf\/5\/1\/s13742-016-0152-3\/61227533\/gigascience_5_1_s13742-016-0152-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/gigascience\/article-pdf\/5\/1\/s13742-016-0152-3\/61227533\/gigascience_5_1_s13742-016-0152-3.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s13742-016-0152-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,12,18]],"date-time":"2024-12-18T09:52:34Z","timestamp":1734515554000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/gigascience\/article\/doi\/10.1186\/s13742-016-0152-3\/2737427"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,10,24]]},"references-count":44,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2016,12,1]]}},"URL":"https:\/\/doi.org\/10.1186\/s13742-016-0152-3","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/045773","asserted-by":"object"}]},"ISSN":["2047-217X"],"issn-type":[{"value":"2047-217X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2016,12]]},"published":{"date-parts":[[2016,10,24]]},"article-number":"s13742-016-0152-3"}}