Opinion Page
—The Opinion Page is a forum for views and ideas of potential interest to readers— Contributions should be sent to the editor.
Bioinformatics and misinformatics: the missing links between taxonomic data and taxonomic databases
Terry A. Wheeler
Department of Natural Resource Sciences, McGill University, Macdonald Campus, Ste-Anne-de-Bellevue, QC H9X 3V9 (wheeler@nrs.mcgill.ca)
The Importance of Systematic Data
It has long been recognized that Canada is losing specialists trained in systematics and that our collective research output in the field has suffered as a result. A variety of solutions has been proposed to address this lack of basic research. The systematics community has repeatedly proposed a bold plan based on training and hiring more systematists. This innovative solution has unleashed a storm of apathy in most circles, with a few notable exceptions; the most encouraging has been the recent hiring of three systematic entomologists by Agriculture and Agri-Food Canada. Other scientists (Canadian and otherwise) have proposed grand but naïve plans to replace primary taxonomic research with so-called "DNA barcodes" based on sequencing a minute portion of the genome (e.g., Hebert et al. 2003, Tautz et al. 2003) or doing away with the tedious (at least to non-systematists) necessity of observing rules of nomenclatural priority and hard copy publications and transferring taxonomy in toto to the Web (e.g., Godfray 2002). Such sweeping and technology-driven proposals have been effective in getting systematics into the pages of Nature, Science and other prominent journals, and I suppose that is a good thing. However, these are, in the end, simplistic and flawed "solutions" that ignore the need for trained systematists to actually recognize new species in the first place, to describe taxa accurately in such a way that they are recognizable to other scientists, and to propose and test hypotheses on phylogenetic relationships.
Technological advances have obviously revolutionized the way we conduct and disseminate systematic research, but such advances should be tools, not crutches, that serve as an adjunct to good work done by well-trained systematists (Mallet and Willmott 2003, Scotland et al. 2003, Wilson 2003). Some countries have adopted a balanced view of the importance of research at all levels of the systematic process and have responded accordingly. In the United States, for example, the National Science Foundation has multiple discrete funding programs for systematic research (see www.nsf.gov/bio/deb/start.htm). There are programs to discover and describe new taxa (Partnerships for Enhancing Expertise in Taxonomy), to conduct large-scale faunal inventories (Biodiversity Surveys and Inventories), to reconstruct phylogenetic history and place species within an evolutionary context (Assembling the Tree of Life), to support curation and access to collections (Biological Research Collections), and to establish bioinformatics frameworks (Biological Databases and Informatics). This is a logical and scientifically valid approach that increases the likelihood that the taxonomic databases will be built on accurate data. The current picture is very different in Canada; other than the traditional sources of (limited) support to individual researchers from agencies like the Natural Sciences and Engineering Research Council and the employment of an ever-shrinking cohort of government systematists, Canadian government agencies have largely ignored the need for research on the identity and relationships of species. Instead, they have opted for presentation over content. No new funds have been allocated across the systematics community and support for systematic research continues to erode. 
In contrast, bioinformatics is a current hot topic and agencies involved in disseminating and using biodiversity information have embraced packaging and marketing with the zeal of a new recruit at an advertising agency. The result is that Canada can now hold her own with any other scientific power in the production and proliferation of acronyms and websites.
"Initial" Efforts
While FBG/FBP/FBIP coordinated activities within the federal government departments, a broader initiative led to the formation of the Canadian Biodiversity Information Initiative (CANBII), based on the American NBII program. CANBII quickly became CBIN (The Canadian Biodiversity Information Network), which, in turn, became BCIN (Biota of Canada Information Network), which, in the fullness of time, became BKIN (Biodiversity Knowledge Information Network). Some workshops were held and optimistic plans were made. The major "proof of concept" project resulting from the CANBII/CBIN/BCIN exercise is the Butterflies of Canada (www.cbif.gc.ca/spp_pages/butterflies/index_e.php). That project assembled specimen data from many (but not all) major insect collections across Canada on a single, well-known group of insects for which taxonomic data and curation are in good shape and which has, thus, been used repeatedly in databasing and analysis projects. The butterflies represent a small group of insects in Canada (293 species) and are unusual in that they have been so well studied by systematists that available identification tools like field guides and regional catalogs make specimen identification a simple process for competent entomologists. There have been, apparently, no other concrete products combining data from a large number of collections arising from CBIN/BCIN/BKIN, though there has been limited distribution of the reports of the workshops and identification of some vague objectives. It appears that BKIN has been subsumed within CBIF (see below).
The Global Biodiversity Information Facility (GBIF, www.gbif.org) is an international program that will coordinate national and regional efforts to compile interconnected databases of biodiversity information. Canada, one of the member countries of GBIF, has responded to its commitment to GBIF by establishing CBIF, the Canadian Biodiversity Information Facility (www.cbif.gc.ca), coordinated by FBP (or perhaps FBIP), which has assumed responsibility for the objectives previously held by CBIN, BCIN and BKIN. Under the Canadian programs, the databases of taxonomic names will be built upon, and linked to, the framework of the Integrated Taxonomic Information System (ITIS) (about which more below). One of the main weaknesses in this whole system, aside from the necessity to learn new acronyms every few months, is that the FBIP/BKIN/CBIF initiatives in Canada are overwhelmingly top-down, with federal agencies driving all decisions, meetings, workshops and consultations, as well as dispensing all budgets, much of which seems to be allocated to the aforementioned meetings, workshops and consultations. Information transfer to members of the systematics community is sporadic at best. The university community is notably absent from any substantive input into the programs. On the other hand, the actual data collection and verification is primarily bottom-up, built on the efforts of individual systematists, frequently in the university system. Between the top-down "planning" and the bottom-up execution, there is a broad no-man’s land, and the working systematists grow increasingly disillusioned and cynical with the glowing visions of a computerized utopia coming from above. Most databasing that has been done at the level of natural history collections has involved individual researchers finding small sums of money for support staff or setting aside some of their own valuable time to organize data on a portion of their own collection, often as part of a larger systematic study.
The FBIP/BKIN/CBIF vision of a community of data generators and data users sitting at the computer peering virtually into the drawers of other museums is certainly an appealing vision, but it is, at best, a little farther in the future than we are led to believe, and, at worst, an indication of how out of touch these initiatives are with the current state of raw biodiversity information for nearly all groups of arthropods.
The quality and quantity of information
The ITIS website identifies the source of its taxonomic data as the NODC Taxonomic Code, database version 8.0. The acronym "NODC" is not, unfortunately, defined on the ITIS website. However, a Google search revealed that NODC is the National Oceanographic Data Center (www.nodc.noaa.gov), which, in turn, gives no indication as to the source of its information on terrestrial organisms. These data, evidently used as the basis for launching the initiative, have simply been incorporated wholesale, with all their errors, into ITIS. Some may assume that bad data ("unverified" sensu ITIS) are better than no data (this is an erroneous assumption); some may assume that seeing such errors would encourage the appropriate specialists to volunteer their time and effort to fix them (this is also an erroneous assumption). Perhaps there was simply a desire to get as many records as possible incorporated into ITIS during the early "proof of concept" stages. There are multiple problems here. First, given the small number of specialists and the current nature of our workloads, it is unlikely that we (the working systematists) will be lining up to clean up ITIS anytime soon. Second, and in the meantime, the error-filled lists are available on the Web in databases like ITIS or Nomina Insecta Nearctica (www.neartica.com/nomina/main.htm), another widely used, incompletely verified compilation that is rife with errors, at least in Diptera.
People who are unaware of the errors incorporated in those resources use them, in turn, as their source for taxonomic names. And so the misinformation radiates out across the Web. So too does the mistaken assumption that as long as we have a name we don’t really need a systematist just to confirm what we already "know".
The dangers of misinformatics
Where do we click now?
I have been assured, on more than one occasion and by more than one web database promoter, that increased funding for their products cannot help but generate additional support for basic systematic research. After many years, and especially since the adoption of the Rio Convention more than a decade ago, I have seen no evidence for this assertion whatsoever, at least in Canada. Too much money from limited departmental budgets has already been spent on ineffectual workshops and consultation reports, all of which state, repeatedly, the painfully obvious. There is little money left over to support meaningful progress toward the long-term goals. Current federal initiatives in biodiversity databasing must acknowledge the weaknesses in their existing data and organizational structure and increase support for, and involvement of, the working systematics community. The continuing absence of involvement from academia and even of many systematists within the government system is a critical oversight that seriously weakens Canadian initiatives compared to ongoing American, European and Latin American programs. If this unification of purpose does not happen soon, future developments are obvious. Systematists will continue to lose valuable time trying to convince the bioinformatics committees and working groups of the value of our expertise and research, and of the necessity to train and employ more systematists to build the foundation that our database administrators seem to think is already in place.
The systematists will also waste too much time trying to correct the damage done by the growing body of taxonomic misinformation that litters the information superhighway. Meanwhile, the biodiversity database designers will surround themselves in pretty paper and ribbons as they gift-wrap the same empty boxes, over and over again.
References
Biological Survey of Canada (Terrestrial Arthropods)