A BOLD way to crowdsource genetic data
October 26, 2015
The Barcode of Life Data System (BOLD) is one of the largest genetic databases in the world, where users may search over 1.7M public records. More specifically, BOLD is an informatics workbench that facilitates the acquisition, storage, analysis and publication of DNA barcode records. DNA barcode records use a very short genetic sequence from a standard part of the genome to help experts identify and categorize specimens.
As of October 2015, NEON has shared over 3,000 specimen records with BOLD. In 2013, NEON even found a new ground beetle species in the Abacidus subgenus.
Leveraging different science infrastructures to provide high quality, open data
NEON uses publicly available and standardized protocols to guide sampling techniques in the field, such as setting up mosquito traps. Field technicians undergo training on species identification as part of NEON’s quality assurance efforts. However, species identification is a unique area that requires external quality control measures to reduce uncertainty: NEON submits specimens to BOLD so that taxonomic experts may check the accuracy of NEON species identifications or, in some cases, help when NEON scientists have trouble identifying a species. In turn, these results inform NEON protocols and training programs. For example, if NEON finds species that are unusual to an area, taxonomic experts at BOLD assist with identification as a quality control measure. NEON Staff Scientist and Insect Ecologist Katie LeVan states, “we want the most precision possible because that is what is valuable”.
While identifying species is an important part of NEON’s science design, providing sampling data in accessible, publicly available databases is critical to NEON’s commitment to open science. Open science is a movement to make scientific research and data accessible to broader audiences. Some fundamental goals of open science include 1) transparency in methodology and data collection; 2) public availability and reusability of data; and 3) use of open source tools to facilitate scientific collaboration. To advance collective science goals and support ongoing efforts of the science community, NEON is connecting with open access databases and initiatives like BOLD.
NEON beetle collection: from the field to species identification
NEON submits specimens and barcode records to BOLD, including specimens and records of beetles and mosquitoes. NEON is projected to provide tens of thousands of new records to BOLD by sampling at unprecedented scales and quantities across the continent. For example, there are currently roughly 24,500 Carabidae (beetle) specimen records and 2,200 beetle barcode records; over the 30-year lifetime of the Project, NEON may contribute over 50,000 new beetle barcode records.
NEON sends approximately 400 beetle specimens per field site to BOLD to analyze the accuracy of NEON species identification efforts. Of these specimens, roughly 40 are DNA barcoded for further quality control. However, these numbers vary greatly depending on the field site: in 2015, an average of 7-10k beetle specimens were collected per site, but 29k samples were collected at North Sterling.
BOLD is a resource to store, analyze and publish DNA information
BOLD is maintained by the University of Guelph in Ontario, Canada. It offers researchers a way to collect, manage and analyze DNA barcode data. There are two central DNA barcode databases: BOLD and the International Nucleotide Sequence Database Collaboration (INSDC). BOLD and the INSDC members are connected to other databases of taxonomic names and voucher specimens, such as specimens in museums. These linkages represent international open science efforts to develop integrated, standardized and reproducible methods in the field of genetics. BOLD’s assembly of molecular, morphological and distributional data is already bridging the traditional bioinformatics chasm: according to the BOLD website, “It's the that allows the products of bottom-up projects around the world to be integrated into a global initiative”. The integration and crowdsourcing of genetic data expands the scope of traditional biology: for example, the broad geographic representation of genetic data stored in BOLD may be used to develop new theories and answer questions in the area of phylogeny, otherwise known as the evolutionary history of taxonomic groups.
Looking for NEON specimens and genetic data on BOLD?
Explore NEON specimen records on BOLD by searching for the phrase “National Ecological Observatory Network, United States”. Learn more about NEON Terrestrial Organismal Sampling Methods.
More about BOLD
BOLD Systems is designed to support the generation and application of DNA barcode data and consists of four main modules: a data portal, a database of barcode clusters, an educational portal, and a data collection workbench. The project behind the database, the International Barcode of Life (iBOL), is the world’s largest biodiversity genetics initiative:
“Hundreds of biodiversity scientists, genetics specialists, technologists and ethicists from 25 nations are working together to construct a richly parameterized DNA barcode reference library that will be the foundation for a DNA-based identification system for all multi-cellular life.”